Getting Started

  • Installation
    • Install with uv
    • Install with pip
    • Install with Conda
    • Install from Source
    • Editable Install
    • Install PyTorch with CUDA support
  • Quickstart
    • Sentence Transformer
    • Cross Encoder
    • Sparse Encoder
    • Next Steps
  • Migration Guide
    • Migrating from v5.x to v5.4+
      • Updated import paths
      • Renamed methods and parameters
      • CrossEncoder.max_length property renamed to max_seq_length
      • Trainer tokenizer parameter renamed to processing_class
      • tokenizer_kwargs renamed to processor_kwargs
      • CrossEncoder API changes
      • Removed tags parameter from push_to_hub
      • Default pooling for CausalLM models
      • Changes for custom module and loss authors
    • Migrating from v4.x to v5.x
      • Migration for model.encode
      • Migration for Asym to Router
      • Migration of advanced usage
    • Migrating from v3.x to v4.x
      • Migration for parameters on CrossEncoder initialization and methods
      • Migration for specific parameters from CrossEncoder.fit
      • Migration for CrossEncoder evaluators
    • Migrating from v2.x to v3.x
      • Migration for specific parameters from SentenceTransformer.fit
      • Migration for custom Datasets and DataLoaders used in SentenceTransformer.fit

Sentence Transformer

  • Usage
    • Computing Embeddings
      • Initializing a Sentence Transformer Model
      • Calculating Embeddings
      • Prompt Templates
      • Input Sequence Length
      • Multi-Process / Multi-GPU Encoding
    • Semantic Textual Similarity
      • Similarity Calculation
    • Semantic Search
      • Background
      • Symmetric vs. Asymmetric Semantic Search
      • Manual Implementation
      • Optimized Implementation
      • Speed Optimization
      • Elasticsearch
      • OpenSearch
      • Approximate Nearest Neighbor
      • Retrieve & Re-Rank
      • Examples
    • Retrieve & Re-Rank
      • Retrieve & Re-Rank Pipeline
      • Retrieval: Bi-Encoder
      • Re-Ranker: Cross-Encoder
      • Example Scripts
      • Pre-trained Bi-Encoders (Retrieval)
      • Pre-trained Cross-Encoders (Re-Ranker)
    • Clustering
      • k-Means
      • Agglomerative Clustering
      • Fast Clustering
      • Topic Modeling
    • Paraphrase Mining
      • paraphrase_mining()
    • Translated Sentence Mining
      • Margin Based Mining
      • Examples
    • Image Search
      • Installation
      • Usage
      • Examples
    • Embedding Quantization
      • Binary Quantization
      • Scalar (int8) Quantization
      • Additional extensions
      • Demo
      • Try it yourself
    • Creating Custom Models
      • Modular Architecture
      • Sentence Transformer Model from a Transformers Model
      • Advanced: Custom Modules
    • Evaluation with MTEB
      • Installation
      • Evaluation
      • Additional Arguments
      • Results Handling
      • Leaderboard Submission
    • Speeding up Inference
      • PyTorch
      • ONNX
      • OpenVINO
      • Benchmarks
  • Pretrained Models
    • Original Models
    • Semantic Search Models
      • Multi-QA Models
      • MSMARCO Passage Models
    • Multilingual Models
      • Semantic Similarity Models
      • Bitext Mining
    • Multimodal Models
      • Image & Text Models
      • Audio & Video Models
    • INSTRUCTOR models
    • Scientific Similarity Models
  • Training Overview
    • Why Finetune?
    • Training Components
    • Model
    • Dataset
      • Dataset Format
      • Multimodal Datasets
    • Loss Function
    • Training Arguments
    • Evaluator
    • Trainer
      • Callbacks
    • Multi-Dataset Training
    • Deprecated Training
    • Best Base Embedding Models
    • Comparisons with CrossEncoder Training
    • End-to-End Example
  • Dataset Overview
    • Multimodal Datasets
      • Accepted column types
      • Cross-modal dataset example
      • Automatic preprocessing
    • Datasets on the Hugging Face Hub
    • Pre-existing Datasets
  • Loss Overview
    • Loss Table
    • Loss modifiers
    • Regularization
    • Distillation