Distiluse Multilingual

Text model for computing sentence embeddings in multiple languages, based on the Sentence-Transformers framework [1].

Pre-trained models

mozuma.models.sentences.distilbert.pretrained.torch_distiluse_base_multilingual_v2

Multilingual model for semantic similarity

See distiluse-base-multilingual-cased-v2 and the SBERT documentation for more information.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| device | torch.device, Optional | The PyTorch device to initialise the model weights. Defaults to torch.device("cpu"). | required |
| enable_tokenizer_truncation | bool, Optional | Enable positional embeddings truncation with strategy only_first. Defaults to False. | required |
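For instance, this factory can be combined with mozuma's Torch inference runner to embed a list of sentences. The following is a minimal sketch, not an official example: it assumes the ListDataset, TorchInferenceRunner, TorchRunnerOptions and CollectFeaturesInMemory helpers behave as described in the general mozuma documentation.

```python
import torch

from mozuma.callbacks.memory import CollectFeaturesInMemory
from mozuma.models.sentences.distilbert.pretrained import (
    torch_distiluse_base_multilingual_v2,
)
from mozuma.torch.datasets import ListDataset
from mozuma.torch.options import TorchRunnerOptions
from mozuma.torch.runners import TorchInferenceRunner

# Pre-trained multilingual model on CPU
model = torch_distiluse_base_multilingual_v2(device=torch.device("cpu"))

# A small multilingual dataset of sentences
dataset = ListDataset(["Hello world!", "Bonjour le monde !", "Hallo Welt!"])

# Callback that accumulates the computed embeddings in memory
features = CollectFeaturesInMemory()

# Run inference
runner = TorchInferenceRunner(
    model=model,
    dataset=dataset,
    callbacks=[features],
    options=TorchRunnerOptions(tqdm_enabled=False),
)
runner.run()

print(features.features.shape)  # one 512-dimensional embedding per sentence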

Base model

This model is an implementation of a TorchModel.

mozuma.models.sentences.distilbert.modules.DistilUseBaseMultilingualCasedV2Module

Multilingual model for semantic similarity

See distiluse-base-multilingual-cased-v2 and the SBERT documentation for more information.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| device | torch.device | The PyTorch device to initialise the model weights. Defaults to torch.device("cpu"). | device(type='cpu') |
| enable_tokenizer_truncation | bool | Enable positional embeddings truncation with strategy only_first. Defaults to False. | False |
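The module can also be constructed directly, for instance when its state will be loaded afterwards from a store (see the next section). A minimal sketch using only the documented parameters:

```python
import torch

from mozuma.models.sentences.distilbert.modules import (
    DistilUseBaseMultilingualCasedV2Module,
)

# Bare module on CPU; its state is not the pre-trained one until it is
# loaded from a store (see "Pre-trained state origins" below)
module = DistilUseBaseMultilingualCasedV2Module(
    device=torch.device("cpu"),
    enable_tokenizer_truncation=False,
)
```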

Pre-trained state origins

See the stores documentation for usage.

mozuma.models.sentences.distilbert.stores.SBERTDistiluseBaseMultilingualCasedV2Store

Loads weights from SBERT's Hugging Face repository.

See Hugging Face's documentation for more information.
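Below is a minimal sketch of loading the pre-trained state into the module defined above. It assumes the store follows the generic get_state_keys/load interface described in the stores documentation; in practice, the torch_distiluse_base_multilingual_v2 helper above is the more convenient way to obtain the same result.

```python
import torch

from mozuma.models.sentences.distilbert.modules import (
    DistilUseBaseMultilingualCasedV2Module,
)
from mozuma.models.sentences.distilbert.stores import (
    SBERTDistiluseBaseMultilingualCasedV2Store,
)

model = DistilUseBaseMultilingualCasedV2Module(device=torch.device("cpu"))
store = SBERTDistiluseBaseMultilingualCasedV2Store()

# Pick the pre-trained state advertised by the store and load it
state_key = store.get_state_keys(model.state_type)[0]
store.load(model, state_key=state_key)
```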


  1. Nils Reimers and Iryna Gurevych. Making monolingual sentence embeddings multilingual using knowledge distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, November 2020. URL: https://arxiv.org/abs/2004.09813