
Hierarchical Clustering

Builds a tree (dendrogram) of clusters by iteratively merging or splitting groups based on distance.

When to use:

  • Want to visualize cluster hierarchy with dendrogram
  • Need clusters at multiple granularities
  • Relatively small dataset (<10k samples)
  • Want deterministic results

Strengths:

  • Creates a hierarchical structure
  • No need to specify k upfront
  • Deterministic
  • Visualizable with a dendrogram

Weaknesses:

  • Slow on large datasets
  • Sensitive to noise and outliers
  • Cannot undo merges
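A minimal sketch of the basic workflow, assuming scikit-learn's `AgglomerativeClustering` as the underlying implementation (the data and parameter values are illustrative, not from this tool):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two well-separated groups of three points each (toy data)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Merge points bottom-up until 2 clusters remain
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
labels = model.fit_predict(X)
print(labels)  # two clusters of three points each
```

Because the merge order is fully determined by the pairwise distances, rerunning this produces identical labels, which is the determinism noted above.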

Model Parameters

N Clusters (default: 2, required) Number of clusters to extract from the hierarchy.

Linkage (default: "ward") How to measure distance between clusters:

  • ward: Minimizes variance (default, best for most cases)
  • complete: Maximum distance between all point pairs
  • average: Average distance between all point pairs
  • single: Minimum distance (can create long chains)
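To see how the linkage choice is passed through, here is a hedged sketch (again assuming the scikit-learn API) that fits the same toy data under each linkage; on data this clearly separated, all four agree:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
              [4.0, 0.0], [4.5, 0.0], [5.0, 0.0]])

results = {}
for linkage in ("ward", "complete", "average", "single"):
    # Only the inter-cluster distance definition changes between runs
    results[linkage] = AgglomerativeClustering(
        n_clusters=2, linkage=linkage
    ).fit_predict(X)
    print(linkage, results[linkage])
```

On noisier or elongated data the linkages diverge: single linkage tends to chain neighbors together, while ward and complete prefer compact clusters.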

Metric (default: "euclidean") Distance metric for computing linkage:

  • euclidean: Standard distance (required for ward)
  • manhattan: L1 distance
  • cosine: Angle-based similarity
  • Others: l1, l2, correlation, etc.

Distance Threshold (optional) Stop merging when distance exceeds this threshold. If set, n_clusters should be None.
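The threshold mode can be sketched as follows (assuming the scikit-learn implementation, where `n_clusters` must be `None` when `distance_threshold` is set, and the cluster count is then read from the fitted model):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[0.0, 0.0], [0.2, 0.0],
              [5.0, 0.0], [5.2, 0.0],
              [10.0, 0.0]])

# Stop merging once the next merge would exceed distance 2.0;
# the number of clusters is discovered, not specified.
model = AgglomerativeClustering(n_clusters=None, distance_threshold=2.0)
model.fit(X)
print(model.n_clusters_)
```

Here the two close pairs merge (distance 0.2), but no further merge stays under the threshold, so the isolated point at x=10 remains its own cluster.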

Compute Full Tree (default: "auto") Whether to compute the full tree or stop early:

  • auto: Automatically decide based on parameters
  • true: Compute full dendrogram
  • false: Stop early (faster)
