LightGBM
Train LightGBM to predict categorical outcomes
Microsoft's gradient boosting framework optimized for speed and memory efficiency.
When to use:
- Large datasets (>10k rows)
- Many features
- Need fast training
- Limited memory
Strengths: Very fast training, low memory footprint, handles large datasets, accurate
Weaknesses: Can overfit small datasets, many hyperparameters to tune
Model Parameters
Similar to XGBoost with additional options:
Num Leaves (default: 31) Maximum number of leaves in one tree. More leaves allow a more complex model but increase the risk of overfitting.
Feature Fraction (default: 1.0) Equivalent to colsample_bytree in XGBoost.
Bagging Fraction (default: 1.0) Equivalent to subsample in XGBoost.
Min Data in Leaf (default: 20) Minimum number of samples required in one leaf. Larger values help prevent overfitting.
Plus: learning_rate, n_estimators, max_depth, reg_alpha, reg_lambda, random_state