BGE-M3
Multilingual dense and sparse embeddings for retrieval and hybrid search
BGE-M3 from BAAI supports 100+ languages with three output types: dense vectors, sparse token-weight representations, and a hybrid of the two. Ideal for retrieval pipelines that need both semantic and keyword matching.
When to use:
- Multilingual document retrieval across 100+ languages
- Hybrid search combining dense (semantic) and sparse (keyword) signals
- Reranking pipelines using dense-sparse fusion
Input: Text string + optional fine-tuned checkpoint
Output: Dense embedding vector (1024-dim) and/or sparse token-weight dictionary
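To make the two output types concrete, here is a minimal sketch with toy data. The vectors, token weights, and the `lexical_score` helper are illustrative assumptions, not real model output; the real dense vector has 1024 dimensions.

```python
# Toy illustration of BGE-M3's two output types (values are made up,
# not produced by the model).

# Dense: a fixed-length float vector (1024-dim in the real model; 4-dim here).
dense_query = [0.1, 0.3, 0.0, 0.9]

# Sparse: a token -> weight dictionary, similar in spirit to BM25
# but with learned weights.
sparse_query = {"solar": 1.4, "panel": 1.1, "cost": 0.6}
sparse_doc = {"solar": 1.2, "panel": 0.9, "price": 0.7}

def lexical_score(q, d):
    """Sparse relevance: sum of weight products over shared tokens."""
    return sum(w * d[t] for t, w in q.items() if t in d)

# Only "solar" and "panel" overlap: 1.4*1.2 + 1.1*0.9 = 2.67
print(round(lexical_score(sparse_query, sparse_doc), 2))  # → 2.67
```

Tokens that appear in only one side contribute nothing, which is what gives the sparse representation its keyword-matching character.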
Model Settings
Embedding Type (required; default: dense; options: dense / sparse / hybrid)
Which embedding representation to produce.
- dense: Standard 1024-dim vector — use for semantic similarity search
- sparse: Token-weight dictionary (like BM25 but neural) — use for keyword-aware retrieval
- hybrid: Both dense and sparse — best retrieval accuracy when combined with a reranker
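A common way to use the hybrid output is weighted score fusion: cosine similarity on the dense vectors plus a lexical score on the sparse dictionaries. The sketch below assumes toy data and a tunable weight `w`; the fusion weight is a pipeline choice, not a fixed part of BGE-M3.

```python
import math

def cosine(a, b):
    """Dense semantic similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def lexical_score(q, d):
    """Sparse keyword similarity: weight products over shared tokens."""
    return sum(w * d[t] for t, w in q.items() if t in d)

def hybrid_score(dense_q, dense_d, sparse_q, sparse_d, w=0.7):
    """Weighted fusion; w balances semantic vs. keyword signal."""
    return w * cosine(dense_q, dense_d) + (1 - w) * lexical_score(sparse_q, sparse_d)

# Toy 4-dim vectors stand in for the model's 1024-dim output.
dq, dd = [0.1, 0.3, 0.0, 0.9], [0.2, 0.1, 0.0, 0.8]
sq, sd = {"solar": 1.4, "panel": 1.1}, {"solar": 1.2, "cost": 0.5}
print(hybrid_score(dq, dd, sq, sd))
```

In a reranking pipeline, candidates retrieved by either signal alone can be re-scored with `hybrid_score` before a cross-encoder reranker sees them.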