DBSCAN
Density-based clustering that detects arbitrarily shaped clusters and noise
DBSCAN groups points that are densely packed together and marks sparse points as noise (label -1). It discovers clusters of arbitrary shape without requiring the number of clusters to be specified.
When to use:
- Datasets with non-spherical or irregularly shaped clusters
- Anomaly detection: noise points (label -1) can indicate outliers
- When the number of clusters is unknown
Input: Tabular data with the feature columns defined during training
Output: Cluster label per row (-1 indicates noise/outlier)
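To make the input and output concrete, here is a minimal pure-Python sketch of the density-based grouping described above. It is an illustration under stated assumptions, not this model's actual implementation: the function name `dbscan`, the sample rows, and the choice to count a point as its own neighbor are all assumptions made for the example.

```python
from math import dist

NOISE = -1

def dbscan(points, eps=0.5, min_samples=5):
    """Return one cluster label per row; -1 marks noise."""
    n = len(points)
    # Assumption: a point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if labels[i] != NOISE or not is_core[i]:
            continue
        # Grow a new cluster outward from an unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not extend it
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels

# Hypothetical rows: two tight groups and one isolated point.
rows = [
    (0.0, 0.0), (0.0, 0.3), (0.3, 0.0), (0.3, 0.3),
    (5.0, 5.0), (5.0, 5.3), (5.3, 5.0), (5.3, 5.3),
    (10.0, 0.0),
]
labels = dbscan(rows, eps=0.5, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The number of clusters is never passed in; it falls out of the density settings, and the isolated row receives label -1.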
Model Settings (set during training, used at inference)
Eps (default: 0.5) Maximum distance between two points for them to be considered neighbors. This is the most sensitive parameter; tune it to the scale of your features.
Min Samples (default: 5) Minimum number of neighbors a point needs to count as a core point. Higher values require denser neighborhoods, which yields more conservative clusters and more points labeled as noise.
Metric (default: euclidean) Distance metric for neighbor computation.
Algorithm (default: auto) Nearest-neighbor search algorithm. auto selects the best for your data.
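The setting names and defaults above match the parameters of scikit-learn's `DBSCAN` class, so the sketch below assumes that mapping; this page does not state which library backs the model. It also shows why Eps depends on feature scale. The data and the use of `StandardScaler` are illustrative assumptions, not a prescribed preprocessing step.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two features on very different scales.
X = np.array([
    [1.0, 1000.0], [1.1, 1010.0], [0.9, 990.0], [1.0, 1005.0],
    [5.0, 5000.0], [5.1, 5010.0], [4.9, 4990.0], [5.0, 5005.0],
    [3.0, 9000.0],
])

# Assumed mapping: Eps -> eps, Min Samples -> min_samples,
# Metric -> metric, Algorithm -> algorithm.
settings = dict(eps=0.5, min_samples=3, metric="euclidean", algorithm="auto")

# On raw features the second column dominates every distance,
# so no point has a neighbor within eps and every row is noise.
raw_labels = DBSCAN(**settings).fit_predict(X)

# After standardizing, the same eps separates two clusters and one outlier.
X_scaled = StandardScaler().fit_transform(X)
scaled_labels = DBSCAN(**settings).fit_predict(X_scaled)

print(raw_labels.tolist())     # [-1, -1, -1, -1, -1, -1, -1, -1, -1]
print(scaled_labels.tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Min Samples is lowered to 3 here only because the toy dataset is tiny; the documented default is 5.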
Inference Settings
There are no dedicated inference-time settings. Each new point is assigned to the cluster of its nearest core point, or labeled as noise (-1).
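The assignment rule can be sketched as follows. This is a hedged illustration: the helper `assign_cluster` and its arguments are hypothetical, and the cutoff used to decide noise (a nearest core point farther away than Eps) is an assumption, since this page does not say how the noise decision is made at inference.

```python
from math import dist

def assign_cluster(point, core_points, core_labels, eps=0.5):
    """Label a new point from its nearest stored core point, or -1 for noise."""
    nearest = min(
        range(len(core_points)),
        key=lambda i: dist(point, core_points[i]),
    )
    if dist(point, core_points[nearest]) <= eps:
        return core_labels[nearest]
    return -1  # assumption: farther than eps from every core point means noise

# Hypothetical core points kept from training, with their cluster labels.
core_points = [(0.0, 0.0), (0.3, 0.3), (5.0, 5.0)]
core_labels = [0, 0, 1]

near_first = assign_cluster((0.1, 0.2), core_points, core_labels)   # 0
near_second = assign_cluster((5.2, 5.1), core_points, core_labels)  # 1
far_away = assign_cluster((2.5, 2.5), core_points, core_labels)     # -1
```

Because the rule depends only on the stored core points and the training-time Eps, it needs no extra settings at inference.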