Random Forest
Train a Random Forest to predict categorical outcomes
An ensemble of decision trees that vote on the final prediction. Each tree is trained on a random bootstrap sample of the data and considers a random subset of features at each split.
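A minimal sketch of fitting and scoring such an ensemble, assuming scikit-learn's `RandomForestClassifier` and a synthetic toy dataset (the dataset and sizes are illustrative):

```python
# Minimal sketch: fit a Random Forest on toy data, then score it on a holdout set.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic data: 500 rows, 10 features, 2 classes
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 trees; each is fit on a bootstrap sample with random feature subsets per split
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # mean accuracy on the holdout set
```

Predictions come from majority voting across the trees; `predict_proba` averages the per-tree class probabilities instead.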
When to use:
- Robust baseline - works well on most problems
- Handles non-linear relationships
- Can handle missing values (depending on the implementation; not all libraries support this natively)
- Feature importance needed
- More resistant to overfitting than a single decision tree
Strengths: Very accurate, handles non-linearity, robust to noise, provides feature importance
Weaknesses: Can be slow to train and predict, large model size, less interpretable than a single tree
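Since feature importance is listed as a strength, here is a short sketch of reading it out after fitting, assuming scikit-learn (variable names are illustrative):

```python
# Sketch: extract per-feature importances from a fitted forest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Mean impurity decrease per feature, normalized to sum to 1
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]  # feature indices, most important first
```

Impurity-based importances are fast but can favor high-cardinality features; permutation importance is a common cross-check.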
Model Parameters
N Estimators (default: 100) Number of trees in the forest. More trees generally improve accuracy and stability but increase training and prediction time, with diminishing returns.
- 50-100: Fast training
- 100-300: Good default
- 500+: Maximum accuracy, slower
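The accuracy/speed trade-off above can be seen directly by timing fits at different tree counts (a sketch on toy data; the counts mirror the tiers listed):

```python
# Sketch: training time grows roughly linearly with n_estimators.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=12, random_state=1)

timings = {}
for n in (50, 100, 300):
    start = time.perf_counter()
    RandomForestClassifier(n_estimators=n, random_state=1).fit(X, y)
    timings[n] = time.perf_counter() - start  # seconds to fit n trees
```

On larger datasets, `n_jobs=-1` parallelizes tree construction across cores to offset this cost.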
Max Depth (default: None) Maximum tree depth. Controls model complexity.
- None: Trees grow until pure (may overfit)
- Low values (3-10): Simple, fast, prevents overfitting
- High values (20-50): Complex patterns, may overfit
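The overfitting risk noted above shows up as a gap in training accuracy between shallow and unlimited-depth trees. A sketch on noisy toy data (the 20% label noise is illustrative):

```python
# Sketch: unlimited depth memorizes noisy training labels; a shallow cap does not.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# flip_y=0.2 randomly flips 20% of labels, simulating label noise
X, y = make_classification(n_samples=400, n_features=10, flip_y=0.2, random_state=2)

train_acc = {}
for depth in (3, None):  # shallow cap vs. grow-until-pure
    clf = RandomForestClassifier(n_estimators=50, max_depth=depth, random_state=2)
    clf.fit(X, y)
    train_acc[depth] = clf.score(X, y)  # accuracy on the training set itself
```

Near-perfect training accuracy on noisy data is a warning sign; validate depth choices on held-out data.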
Min Samples Split (default: 2) Minimum samples needed to split a node. Higher values prevent overfitting.
Min Samples Leaf (default: 1) Minimum samples in a leaf node. Higher values create smoother decision boundaries.
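One way to see the smoothing effect of `min_samples_leaf` is to count leaves: larger values force each tree to stop splitting sooner, producing smaller trees. A sketch on toy data:

```python
# Sketch: larger min_samples_leaf -> fewer leaves per tree -> smoother fit.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=3)

avg_leaves = {}
for msl in (1, 20):  # default vs. a strongly regularized setting
    clf = RandomForestClassifier(n_estimators=10, min_samples_leaf=msl, random_state=3)
    clf.fit(X, y)
    # average leaf count across the ensemble's trees
    avg_leaves[msl] = sum(t.get_n_leaves() for t in clf.estimators_) / len(clf.estimators_)
```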
Max Features Features to consider at each split:
- sqrt: Square root of total features (good default for classification)
- log2: Log2 of total features
- None: Use all features
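For a concrete sense of what `sqrt` means: with 16 features, each split examines only 4 candidates, which decorrelates the trees. A sketch assuming scikit-learn:

```python
# Sketch: "sqrt" caps the candidate features examined at each split.
import math
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=16, random_state=4)

clf = RandomForestClassifier(n_estimators=10, max_features="sqrt", random_state=4)
clf.fit(X, y)

# With 16 total features, "sqrt" means floor(sqrt(16)) = 4 candidates per split
per_split = int(math.sqrt(X.shape[1]))
```

Setting `max_features=None` removes this randomness, making trees more similar to one another and weakening the ensemble effect.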
Bootstrap (default: true) Whether to use bootstrap sampling. Keep true for better generalization.
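A useful side effect of bootstrap sampling: each tree skips roughly 37% of the rows, and those out-of-bag rows can serve as a built-in validation set. A sketch using scikit-learn's `oob_score` option:

```python
# Sketch: out-of-bag score, a free generalization estimate when bootstrap=True.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=5)

clf = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,   # required for OOB scoring
    oob_score=True,   # evaluate each row using only trees that never saw it
    random_state=5,
)
clf.fit(X, y)
oob = clf.oob_score_  # accuracy on out-of-bag samples
```

The OOB score is often close to a cross-validated estimate, at no extra training cost.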
Criterion Split quality measure:
- gini: Gini impurity (default, faster)
- entropy: Information gain (occasionally slightly more accurate, a bit slower to compute)
- log_loss: Log loss (for probability calibration)
Random State (default: 42) Seed for the random number generator. Fixing it makes bootstrap sampling and feature selection, and therefore the trained model, reproducible across runs.
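Reproducibility can be checked directly: two forests fit with the same seed produce identical predictions. A small sketch:

```python
# Sketch: identical random_state -> identical forests -> identical predictions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=7)

pred_a = RandomForestClassifier(n_estimators=20, random_state=42).fit(X, y).predict(X)
pred_b = RandomForestClassifier(n_estimators=20, random_state=42).fit(X, y).predict(X)
same = np.array_equal(pred_a, pred_b)  # True: same seed, same model
```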