CLIP Cross-Encoder

CLIP-based cross-encoder for image-text and image-image similarity scoring

CLIP Cross-Encoder uses CLIP ViT-L/14 to compute pairwise similarity scores between an image and a text, or between two images. It is useful for reranking, visual similarity, and recommendation pipelines.

When to use:

  • Image-to-image similarity reranking
  • Image search reranking with text queries
  • Pipelines that already use CLIP embeddings

Input:

  • Query Image (optional): Image to match against candidates
  • Query Text (optional): Text query for image ranking
  • Candidates (required): Candidate images to score
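
The three inputs can be pictured as a small request object. The sketch below is illustrative only: `RerankRequest`, its field names, the use of file paths for images, and the rule that at least one of the two queries must be present are assumptions, not part of the documented interface.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RerankRequest:
    """Hypothetical container mirroring the documented inputs."""

    # Required: candidate images to score (file paths here, as an assumption).
    candidates: Sequence[str]
    # Optional: image to match against the candidates.
    query_image: Optional[str] = None
    # Optional: text query for image ranking.
    query_text: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.candidates:
            raise ValueError("candidates is required and must not be empty")
        # Assumption: scoring needs something to compare the candidates with.
        if self.query_image is None and self.query_text is None:
            raise ValueError("provide query_image, query_text, or both")


request = RerankRequest(candidates=["a.jpg", "b.jpg"], query_text="a red bicycle")
```

Validating in `__post_init__` keeps the required/optional split from the list above in one place, so a malformed request fails before any model work starts.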

Output:

  • Scores: Similarity or relevance scores per candidate
  • Ranking: Indices sorted by relevance
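
To make the relationship between the two outputs concrete, here is a minimal sketch. It assumes that each score is the cosine similarity between a query embedding and a candidate embedding, which is the usual way CLIP similarity is computed but is not confirmed by this page. The function names are hypothetical, and the short vectors are toy stand-ins for real CLIP embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def score_and_rank(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> Tuple[List[float], List[int]]:
    """Return one score per candidate and the candidate indices, best first."""
    scores = [cosine_similarity(query, c) for c in candidates]
    # Indices sorted by descending score; ties keep their original order.
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return scores, ranking


# Toy 3-dimensional vectors standing in for CLIP embeddings (assumption).
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
scores, ranking = score_and_rank(query, candidates)
```

Here `scores` is aligned with the input order of the candidates, while `ranking` lists candidate indices from most to least similar, so `candidates[ranking[0]]` is the best match.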

Inference Settings

There are no inference-time settings. Scores are computed deterministically, so the same inputs always produce the same scores.
