Documentation (English)

BIRCH

Balanced Iterative Reducing and Clustering using Hierarchies - memory-efficient hierarchical clustering for very large datasets

When to use:

  • Very large datasets that don't fit in memory
  • Need fast online clustering
  • Roughly spherical clusters
  • Memory constraints

Strengths: Very fast, memory efficient, online/incremental learning, handles large datasets

Weaknesses: Assumes spherical clusters, sensitive to threshold, order-dependent results
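The online/incremental strength can be sketched with scikit-learn's `Birch`, which exposes a `partial_fit` method. The chunk sizes, centers, and noise level below are illustrative, standing in for a stream too large to load at once:

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)

# n_clusters=3 re-runs the final global clustering after each chunk,
# so the model is always ready to predict
model = Birch(n_clusters=3, threshold=0.5)

# Feed data in chunks, as if streaming from disk; each chunk is drawn
# around one of three hypothetical well-separated centers
centers = np.array([[-5.0, 0.0], [0.0, 0.0], [5.0, 0.0]])
for i in range(30):
    chunk = centers[i % 3] + 0.5 * rng.normal(size=(100, 2))
    model.partial_fit(chunk)

# Query points at the three centers should land in three distinct clusters
labels = model.predict(centers)
print(np.unique(labels).size)  # 3
```

Only the compact CF tree is kept between `partial_fit` calls, not the raw samples, which is what makes the memory footprint independent of the stream length.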

Model Parameters

N Clusters (default: 3, required) Number of clusters after the final clustering step.

Threshold (default: 0.5) Maximum radius a subcluster may have after a new sample is merged into it; if the merge would exceed this radius, a new subcluster is started instead. This is the key parameter.

  • Low (0.1-0.3): Many small clusters, high memory
  • Medium (0.5): Balanced
  • High (1.0+): Few large clusters, low memory
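The memory trade-off above can be observed directly: with `n_clusters=None`, scikit-learn's `Birch` returns the raw subclusters, so their count shows how the threshold controls tree size. The dataset here is a synthetic stand-in:

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

# n_clusters=None skips the final clustering step, exposing the
# subclusters themselves; a lower threshold yields more of them
counts = {}
for t in (0.3, 0.5, 1.5):
    model = Birch(threshold=t, n_clusters=None).fit(X)
    counts[t] = len(model.subcluster_centers_)
    print(f"threshold={t}: {counts[t]} subclusters")
```

Expect the subcluster count (and hence memory use) to shrink as the threshold grows, since each subcluster is allowed to absorb more samples.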

Branching Factor (default: 50) Maximum number of subclusters in each node.

  • 10-30: Deeper tree, slower
  • 50: Good default
  • 100+: Shallower tree, faster
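Putting the three parameters together, a minimal scikit-learn sketch (the dataset and parameter values are illustrative, not tuned):

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=2000, centers=4, cluster_std=0.8, random_state=1)

# threshold bounds subcluster radius; branching_factor caps subclusters
# per tree node; n_clusters sets the final clustering step's output
model = Birch(n_clusters=4, threshold=0.5, branching_factor=50)
model.fit(X)

# New points are assigned the final label of their nearest subcluster
new_points = np.array([[0.0, 0.0], [5.0, 5.0]])
print(model.predict(new_points))
```

Since `n_clusters=4` runs a global clustering over the subcluster centers, `model.labels_` contains exactly four distinct labels regardless of how many subclusters the tree built.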
