Item Embedding t-SNE
Visualize high-dimensional item embeddings reduced to 2D clusters using t-SNE
Use me when you want to see whether your model has actually learned meaningful structure from the data. I take the hundreds or thousands of numbers your model uses to represent each item and squash them down to two dimensions so you can see clusters, outliers, and unexpected neighbours. If items that should be similar are not grouped together, something is wrong with the embeddings.
Overview
An item embedding t-SNE plot uses the t-SNE (t-distributed Stochastic Neighbour Embedding) algorithm to project high-dimensional embedding vectors into 2D space. Each point represents one item — a product, word, user, image, or any entity the model has learned a representation for. Items with similar learned representations appear close together; dissimilar items appear far apart.
Best used for:
- Validating that a trained model has learned meaningful groupings
- Exploring the structure of learned representations visually
- Identifying clusters of similar items without labelling them first
- Verifying that category labels align with learned embedding clusters
- Detecting outlier items whose embeddings differ from everything else
- Communicating model quality to non-technical stakeholders
Common Use Cases
Recommendation Systems
- Product embeddings — Do items the model considers similar cluster together? Do electronics stay separate from food?
- User embeddings — Do users with similar purchase histories group up?
- Collaborative filtering validation — Confirm the latent factors capture real behavioural similarity
Natural Language Processing
- Word embeddings — Verify that semantically related words are spatially close (synonyms, topic words)
- Sentence embeddings — Check that sentences with similar meaning land in the same neighbourhood
- Document clustering — Explore topic structure without defining topics upfront
Computer Vision
- Image embeddings — Do images of the same object class cluster visually?
- Transfer learning features — Confirm that pre-trained features from a backbone capture the right structure for your task
General ML Representation Learning
- Anomaly detection — Outlier points sitting far from every cluster are candidates for anomalies
- Label quality check — If labelled items scatter randomly across clusters, the labels may be noisy
- Model comparison — Run t-SNE on two model versions to see which produces tighter, more meaningful clusters
Options
Color By
Optional — Select an attribute key from the embedding metadata to color the points by.
When a categorical attribute (e.g., product category, language, user segment) is available as metadata alongside the embedding vectors, coloring by it reveals whether the model has learned to separate those groups. Well-separated color groups mean the embedding captures that attribute implicitly; mixed colors within a cluster mean it has learned something different.
The available attribute keys are sourced dynamically from the embedding data — only attributes stored alongside the embeddings appear in the dropdown.
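As a minimal sketch of how per-point colors might be derived from such metadata (the attribute names and palette here are hypothetical, not the tool's actual implementation):

```python
# Hypothetical metadata stored alongside each embedding vector,
# one dict per item, in the same order as the embedding rows.
metadata = [
    {"item_id": "a1", "category": "electronics"},
    {"item_id": "a2", "category": "food"},
    {"item_id": "a3", "category": "electronics"},
]

# The "Color By" attribute chosen in the dropdown.
attribute = "category"

# Map each distinct attribute value to a color from a fixed palette.
values = sorted({m[attribute] for m in metadata})
palette = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
color_of = {v: palette[i % len(palette)] for i, v in enumerate(values)}

# One color per point, ready to pass to a scatter plot.
point_colors = [color_of[m[attribute]] for m in metadata]
```

Items sharing an attribute value get the same color, so well-separated color groups in the plot directly correspond to attribute values the embedding has implicitly learned.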
Interpreting the Plot
What t-SNE Preserves
t-SNE is designed so that points that are close in the original high-dimensional space stay close in 2D. Items your model considers similar will be placed close together. It does this by converting distances into probabilities and optimising a 2D layout that matches those probabilities as well as possible.
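A sketch of this projection step, assuming scikit-learn is available (the tool's backend may use a different implementation; the synthetic embeddings stand in for a trained model's output):

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for learned item embeddings: 60 items, 64 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(60, 64))

# Project to 2D. Perplexity must be smaller than the number of items.
coords = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(embeddings)

# coords now holds one (x, y) pair per item: shape (60, 2).
```

Each row of `coords` is one point in the plot; nearby rows in `embeddings` end up as nearby points.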
What t-SNE Does NOT Preserve
The distances between clusters are not meaningful. Two clusters that appear far apart on screen are not necessarily more dissimilar than two clusters that appear close. Only the structure within and immediately around clusters can be trusted. Do not draw conclusions from how spread out the full plot looks, or from which clusters appear to sit near each other globally.
Cluster Shape and Size
- Tight, round clusters — Items in that group are highly similar; the model has learned a compact representation
- Elongated or irregular clusters — Items vary along some internal dimension the model has captured
- Large, diffuse clusters — The group is heterogeneous; the model may not distinguish sub-types
Overlapping Clusters
If two category groups overlap significantly in the t-SNE projection, the model's embeddings do not cleanly separate them. This may mean the categories are genuinely similar, the model has not seen enough data to distinguish them, or the embedding dimensionality is too low.
Outlier Points
Isolated points sitting far from any cluster warrant investigation. They may be mis-labelled, genuinely unusual items, or artefacts from sparse training data.
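One way to surface such candidates programmatically is to flag items whose neighbourhood distance is unusually large in the original embedding space. This is an illustrative numpy sketch with a planted outlier, not the tool's own outlier logic:

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 16))
embeddings = np.vstack([embeddings, np.full((1, 16), 8.0)])  # plant one outlier

# Distance from each item to every other item.
diffs = embeddings[:, None, :] - embeddings[None, :, :]
dists = np.sqrt((diffs ** 2).sum(axis=-1))
np.fill_diagonal(dists, np.inf)  # exclude self-distance

# Distance to the 5th-nearest neighbour (sorted column index 4).
knn_dist = np.sort(dists, axis=1)[:, 4]

# Flag items whose neighbourhood distance is far above typical.
threshold = knn_dist.mean() + 3 * knn_dist.std()
outliers = np.where(knn_dist > threshold)[0]
```

The planted point (index 50) is the only one flagged; in practice the flagged items are the ones worth inspecting for mis-labels or genuinely unusual data.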
Tips for Effective Use
- Use Color By to validate known groups — If you have category labels, apply them as Color By immediately. Well-separated color groups are a strong signal that the model is learning the right structure.
- Run multiple times with different perplexity — t-SNE has a perplexity parameter (typically 5–50) that controls the balance between local and global structure. The AICU backend tries a sensible default, but if clusters look fragmented or merged, a different perplexity may reveal cleaner structure.
- Do not over-interpret cluster positions — The absolute coordinates of clusters change every time t-SNE runs (it is a stochastic algorithm). Compare shapes and separability, not positions.
- Check for a single giant cluster — If all items collapse into one mass, the embeddings may be collapsed (a known failure mode called representation collapse). Inspect variance in the raw embedding vectors.
- Pair with a quantitative metric — t-SNE is a qualitative tool. Pair it with a silhouette score or nearest-neighbour recall metric to get a number alongside the visual.
- This plot requires a trained model with stored embeddings — The embeddings must be saved as part of the training artefact. If you do not see the plot option, check that the model node is configured to output embeddings.
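The perplexity sweep suggested above can be sketched with scikit-learn's `TSNE` (an assumption — the AICU backend's implementation and default perplexity are not documented here):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(60, 32))  # stand-in for stored item embeddings

# Re-run the projection at several perplexities and keep each layout.
layouts = {}
for perplexity in (5, 15, 30):
    model = TSNE(n_components=2, perplexity=perplexity, random_state=0)
    layouts[perplexity] = model.fit_transform(embeddings)
```

Comparing the three layouts side by side shows whether apparent fragmentation or merging of clusters is a property of the data or an artefact of one perplexity choice.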
Important Caveats
- t-SNE is not deterministic — Re-running produces a different layout. Do not compare two separate runs directly; use the same run's output for a given model checkpoint.
- Cluster count is not reliable — t-SNE tends to fragment large clusters and can merge small ones depending on perplexity. Do not use t-SNE alone to decide how many clusters exist.
- High-dimensional distances are partially lost — t-SNE optimises for local neighbourhood structure. Items that are moderately similar (not nearest neighbours) may appear anywhere.
- Requires trained model data — This plot is in the evaluation category and only activates after a model has been trained and embeddings have been persisted.
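The quantitative pairing and representation-collapse checks recommended in the tips above can be sketched as follows, assuming scikit-learn (the two synthetic clusters stand in for real embeddings with known category labels):

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two synthetic embedding clusters with known labels.
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(40, 8))
cluster_b = rng.normal(loc=6.0, scale=0.5, size=(40, 8))
embeddings = np.vstack([cluster_a, cluster_b])
labels = np.array([0] * 40 + [1] * 40)

# Quantitative companion to the t-SNE plot: closer to 1 means the
# labelled groups are well separated in the embedding space itself.
score = silhouette_score(embeddings, labels)

# Representation-collapse check: near-zero variance in every dimension
# would mean all items map to (almost) the same vector.
collapsed = np.allclose(embeddings.var(axis=0), 0.0, atol=1e-6)
```

A high silhouette score computed on the raw embeddings corroborates what tight, separated colour groups suggest visually, without t-SNE's stochasticity.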
Related Visualizations
- Scatter Plot — For general 2D data exploration when dimensions are already meaningful
- Correlation Plot — Understand which raw features are related before training
- Global Feature Importance — See which input features contribute most to model output
- Actual vs Predicted — Evaluate regression model performance with a complementary evaluation view