LightGBM Time Series

Fast and memory-efficient gradient boosting with native categorical feature support

LightGBM for time series combines gradient boosting with lag features and rolling statistics. It typically trains faster and uses less memory than XGBoost, and supports categorical features natively, making it well suited to large-scale time series forecasting.

When to Use LightGBM Time Series

LightGBM Time Series is best suited for:

  • Large-scale time series forecasting (millions of observations)
  • When you need faster training than XGBoost
  • Datasets with many categorical features (handled natively)
  • Memory-constrained environments
  • Complex non-linear patterns with external features
  • High-dimensional feature spaces
  • Production systems requiring fast inference
  • Scenarios where XGBoost is too slow or memory-intensive

Strengths

  • Extremely fast training (histogram-based algorithm)
  • Low memory usage compared to XGBoost
  • Native categorical feature support (no one-hot encoding needed)
  • Handles large datasets efficiently
  • Captures non-linear relationships
  • Feature importance for interpretability
  • Parallel and GPU training support
  • Works well with many features
  • Robust to outliers
  • Excellent for production deployment

Weaknesses

  • Requires feature engineering (lags, rolling statistics)
  • Needs substantial historical data
  • Cannot extrapolate beyond training distribution
  • Prone to overfitting with small datasets
  • No native uncertainty quantification
  • Less interpretable than statistical models
  • Requires exogenous features at forecast time
  • May underperform XGBoost on small datasets
  • Hyperparameter tuning still needed

Parameters

Common Time Series Parameters

All time series models share these parameters:

  • Timestamp Column (required): Column containing dates/times
  • Target Column (required): Numeric value to forecast
  • Feature Columns (optional): Additional feature columns (exogenous variables)
  • Frequency (optional): Time spacing (D, H, W, M). Auto-inferred if not specified
  • Forecast Steps (required, default=1): How many periods to predict

Feature Engineering Parameters

Lag Features

  • Type: List of integers
  • Default: [1, 2, 3, 7]
  • Description: Past time steps to include as features
  • Examples:
    • Daily data: [1, 7, 14, 30] (yesterday, last week, 2 weeks, month)
    • Hourly data: [1, 24, 168] (last hour, yesterday, last week)
    • Monthly data: [1, 12] (last month, last year)
  • Guidance: Include lags at meaningful intervals for your domain

Rolling Mean Windows

  • Type: List of integers
  • Default: [7, 14]
  • Description: Window sizes for rolling average features
  • Examples:
    • Daily data: [7, 14, 30] (week, bi-week, month)
    • Hourly data: [24, 168] (day, week)
  • Purpose: Smooths noise and captures recent trends
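Both feature types can be built with pandas. A minimal sketch (column names are illustrative); note the shift(1) before rolling, so the window never leaks the current value:

```python
import pandas as pd

# Illustrative daily series
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=60, freq="D"),
    "sales": range(60),
})

# Lag features: past values at meaningful offsets
for lag in [1, 7]:
    df[f"lag_{lag}"] = df["sales"].shift(lag)

# Rolling means: shift(1) first so the window ends at t-1, not t
for window in [7, 14]:
    df[f"roll_mean_{window}"] = df["sales"].shift(1).rolling(window).mean()

# Drop warm-up rows that lack full history
df = df.dropna().reset_index(drop=True)
```

The dropped warm-up rows equal the largest lookback (here 14 via shift(1) plus a 14-wide window), so long lags trade history for features.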

LightGBM Model Parameters

Number of Trees (n_estimators)

  • Type: Integer
  • Default: 100
  • Description: Number of boosting iterations (trees)
  • Typical Range: 50-500
  • Guidance:
    • Start with 100
    • Increase to 200-300 for complex patterns
    • Use early stopping to optimize

Max Depth

  • Type: Integer
  • Default: -1 (no limit)
  • Description: Maximum depth of each tree (-1 means no limit)
  • Typical Range: 3-10, or -1
  • Guidance:
    • -1: Let LightGBM control depth via num_leaves
    • 3-5: Shallow trees, prevent overfitting
    • 7-10: Deep trees for complex interactions
  • Note: LightGBM uses leaf-wise growth, so depth is less critical than in XGBoost

Learning Rate

  • Type: Float
  • Default: 0.1
  • Description: Shrinkage applied to each tree
  • Typical Range: 0.01-0.3
  • Guidance:
    • 0.01-0.05: Slow, robust, needs more trees
    • 0.1: Standard default
    • 0.2-0.3: Fast, fewer trees, higher overfitting risk

Configuration Tips

Quick Start Configuration

For most time series:

add_lags=[1, 7]
add_rolling_mean=[7, 14]
n_estimators=100
max_depth=-1
learning_rate=0.1

Feature Engineering by Frequency

Daily Data:

add_lags=[1, 7, 14, 30]
add_rolling_mean=[7, 14, 30]

Captures daily, weekly, bi-weekly, and monthly patterns.

Hourly Data:

add_lags=[1, 24, 168]
add_rolling_mean=[24, 168]

Last hour, same time yesterday, same time last week.

Monthly Data:

add_lags=[1, 12]
add_rolling_mean=[3, 6, 12]

Last month and last year; quarterly, semi-annual, annual averages.

Hyperparameter Tuning

Conservative (Small Data, Prevent Overfitting):

n_estimators=100
max_depth=5
learning_rate=0.05
num_leaves=31  # if configurable
min_child_samples=20

Aggressive (Large Data, Capture Complexity):

n_estimators=300
max_depth=-1
learning_rate=0.1
num_leaves=127
min_child_samples=5

Using Categorical Features

LightGBM handles categorical features natively:

feature_columns=['temperature', 'day_of_week', 'store_id', 'product_category']

Advantages:

  • No one-hot encoding needed
  • Faster training
  • Better handling of high-cardinality categories
  • Less memory usage

Preparation: Ensure categorical columns are stored with dtype='category' before training.
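A minimal preparation sketch in pandas (column names illustrative); LightGBM's scikit-learn interface picks up category-dtype columns automatically:

```python
import pandas as pd

df = pd.DataFrame({
    "temperature": [21.5, 19.0, 23.1],
    "day_of_week": ["Mon", "Tue", "Wed"],
    "store_id": [101, 102, 101],
})

# Mark categorical columns explicitly; no one-hot encoding required
for col in ["day_of_week", "store_id"]:
    df[col] = df[col].astype("category")
```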

LightGBM vs XGBoost Trade-offs

Use LightGBM when:

  • Large datasets (> 10K rows)
  • Many categorical features
  • Training speed is critical
  • Memory is limited

Use XGBoost when:

  • Small datasets (< 10K rows)
  • Need maximum accuracy
  • Well-tuned for your problem

Often: Try both and compare performance.

Common Issues and Solutions

Issue: Overfitting (Training Accuracy >> Validation)

Solution:

  • Reduce max_depth to 5-7
  • Decrease learning_rate to 0.05
  • Increase min_child_samples (e.g., 20-50)
  • Reduce n_estimators
  • Add regularization (reg_alpha, reg_lambda)
  • Use fewer lag features
  • Enable early stopping with validation set

Issue: Underfitting (Poor Training and Validation)

Solution:

  • Increase n_estimators to 200-500
  • Increase max_depth (or set to -1)
  • Add more lag and rolling features
  • Include relevant external features
  • Decrease min_child_samples
  • Increase learning_rate to 0.15-0.2

Issue: Training Is Slow Despite Using LightGBM

Solution:

  • Reduce n_estimators for initial experiments
  • Use histogram-based splitting (default, but verify)
  • Reduce max_depth
  • Subsample data for prototyping
  • Enable parallel training (set n_jobs=-1)
  • Use GPU if available

Issue: Poor Long-Term Forecasts

Solution:

  • Recognize that gradient boosting cannot extrapolate trends beyond the training range
  • Use for short-term forecasts (1-30 steps)
  • Combine with statistical models (ARIMA, Prophet)
  • Ensure maximum lag >= forecast_steps
  • Consider ensemble with trend models

Issue: Categorical Features Not Handled Correctly

Solution:

  • Verify categorical columns are dtype='category'
  • Pass categorical_feature parameter if using scikit-learn interface
  • Check for missing values in categorical columns
  • Ensure categories are consistent between train and test

Issue: Need Prediction Intervals

Solution:

  • LightGBM doesn't provide native intervals
  • Use quantile regression (train for quantiles 0.1, 0.5, 0.9)
  • Bootstrap methods (multiple models on resampled data)
  • Conformal prediction
  • Ensemble with models that provide intervals (ARIMA)

Issue: Memory Errors

Solution:

  • Reduce number of lag/rolling features
  • Subsample training data
  • Reduce max_depth
  • Use smaller data types (float32 instead of float64)
  • Process data in chunks if possible

Example Use Cases

Daily E-commerce Sales with Promotions

target: daily_sales
feature_columns: [day_of_week, is_holiday, promotion_type, category]
add_lags: [1, 7, 14]
add_rolling_mean: [7, 30]
n_estimators: 200
max_depth: -1

Native categorical handling for promotion_type and category.

Hourly Server Load Prediction

target: cpu_usage
feature_columns: [hour, day_of_week, is_weekend, concurrent_users]
add_lags: [1, 24, 168]
add_rolling_mean: [24, 168]
n_estimators: 300
max_depth: 7

Fast training for real-time monitoring system.

Multi-Store Inventory Forecasting

target: units_sold
feature_columns: [store_id, product_id, price, is_promotion]
add_lags: [1, 7]
add_rolling_mean: [7, 14]
n_estimators: 150
max_depth: -1

Handles high-cardinality categorical features (store_id, product_id).

Energy Demand Forecasting

target: megawatt_demand
feature_columns: [temperature, humidity, hour, day_type]
add_lags: [1, 24, 168]
add_rolling_mean: [24, 168]
n_estimators: 200
max_depth: 8

Captures weather dependencies and temporal patterns.

Comparison with Other Models

vs XGBoost Time Series:

  • LightGBM: Faster, lower memory, better with categorical features
  • XGBoost: Sometimes more accurate on small data, more mature ecosystem

vs CatBoost Time Series:

  • LightGBM: Faster training, lower memory
  • CatBoost: Better with categorical features, less hyperparameter tuning

vs ARIMA/Prophet:

  • LightGBM: Non-linear, handles features, flexible
  • ARIMA/Prophet: Statistical framework, confidence intervals, better extrapolation

vs Neural Networks (LSTM, etc.):

  • LightGBM: Faster, less data needed, interpretable
  • Neural Networks: Better for very complex patterns, very long sequences

Advanced Tips

Feature Engineering Enhancements

  1. Time-Based Features:

    • day_of_week, month, quarter, is_weekend, is_month_start/end, week_of_year
  2. Lag Differences:

    • diff_1 = target[t] - target[t-1]
    • pct_change_1 = (target[t] - target[t-1]) / target[t-1]
  3. Rolling Statistics Beyond Mean:

    • rolling_std_7, rolling_min_7, rolling_max_7
    • rolling_median_7
  4. Interaction Features:

    • is_weekend × temperature
    • hour × day_of_week
  5. Exponentially Weighted Features:

    • ewm_mean_span_7
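Several of the enhancements above in pandas (sketch; the target column is illustrative). As with plain rolling means, shifting by one keeps the current value out of backward-looking features:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"target": np.arange(1, 31, dtype=float)})

past = df["target"].shift(1)  # exclude the current value from window features
df["diff_1"] = df["target"].diff(1)
df["pct_change_1"] = df["target"].pct_change(1)
df["rolling_std_7"] = past.rolling(7).std()
df["rolling_median_7"] = past.rolling(7).median()
df["ewm_mean_span_7"] = past.ewm(span=7).mean()
```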

Categorical Feature Engineering

For high-cardinality categories (e.g., user_id):

  • Target encoding: Encode by mean target value per category
  • Frequency encoding: Count of each category
  • LightGBM handles these natively, but encoding can sometimes help

Validation Strategy

Time Series Cross-Validation:

Split 1: Train [0:100], Test [100:110]
Split 2: Train [0:110], Test [110:120]
Split 3: Train [0:120], Test [120:130]

Never shuffle time series data!
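The expanding-window scheme above can be generated with a small helper (a sketch; sklearn's TimeSeriesSplit implements a similar strategy):

```python
def expanding_splits(n, initial, horizon):
    """Yield (train_idx, test_idx) pairs with a growing train window."""
    start = initial
    while start + horizon <= n:
        yield list(range(0, start)), list(range(start, start + horizon))
        start += horizon

splits = list(expanding_splits(n=130, initial=100, horizon=10))
# splits[0] -> train [0:100], test [100:110]; splits[1] -> train [0:110], ...
```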

Early Stopping

Enable early stopping to find optimal n_estimators:

early_stopping_rounds=50
eval_metric='rmse'

Stops training if validation metric doesn't improve for 50 rounds.

Handling Multiple Time Series

For forecasting multiple series (e.g., sales per store):

  • Option 1: Global model with categorical features (store_id)
  • Option 2: Separate models per store
  • Trade-off: Global captures cross-series patterns, local captures store-specific patterns
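Option 1 stacks all series into one long frame, with the series identifier as a categorical feature. A sketch (names illustrative); the key detail is computing lags within each series so they never cross store boundaries:

```python
import pandas as pd

# Per-store series in long format
df = pd.DataFrame({
    "store_id": ["A"] * 3 + ["B"] * 3,
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "units_sold": [10, 12, 11, 40, 38, 42],
})
df["store_id"] = df["store_id"].astype("category")

# Lags per series: groupby before shift, never a plain shift on the stacked frame
df["lag_1"] = df.groupby("store_id", observed=True)["units_sold"].shift(1)
```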

Technical Details

LightGBM Algorithm

  • Leaf-wise growth: Grows trees by splitting leaf with maximum gain (vs level-wise in XGBoost)
  • Histogram-based: Discretizes continuous features for faster splits
  • GOSS: Gradient-based One-Side Sampling (keeps large-gradient instances)
  • EFB: Exclusive Feature Bundling (combines sparse features)

These optimizations make LightGBM fast and memory-efficient.

Leaf-wise vs Level-wise

  • XGBoost: Level-wise (balanced trees, slower, less overfitting)
  • LightGBM: Leaf-wise (asymmetric trees, faster, more overfitting risk)

Control overfitting with max_depth and num_leaves.

Multi-Step Forecasting

For forecast_steps > 1:

  1. Train 1-step-ahead model
  2. Predict t+1
  3. Use prediction as lag feature for t+2
  4. Repeat recursively

Errors compound, so accuracy decreases with horizon.
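The recursive loop can be sketched with a stand-in one-step model (here simply the mean of the lag features; a trained LGBMRegressor would take its place):

```python
def forecast_recursive(history, steps, predict_one, lags=(1, 7)):
    """Roll a 1-step model forward, feeding predictions back as lags."""
    series = list(history)
    out = []
    for _ in range(steps):
        features = [series[-lag] for lag in lags]  # most recent lag values
        y_hat = predict_one(features)
        out.append(y_hat)
        series.append(y_hat)  # the prediction becomes a future lag input
    return out

# Stand-in model: average of the lag features (illustrative only)
preds = forecast_recursive(history=list(range(1, 11)), steps=3,
                           predict_one=lambda feats: sum(feats) / len(feats))
```

Because step t+2 consumes the prediction for t+1 rather than an observed value, any one-step error feeds forward, which is why accuracy degrades with the horizon.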

Production Considerations

Model Serving

LightGBM models are lightweight:

  • Fast inference (milliseconds)
  • Small model size
  • Easy serialization (pickle, joblib, LightGBM native format)

Monitoring

Track:

  • Forecast errors over time
  • Feature drift (distribution changes)
  • Concept drift (relationship changes)
  • Retrain regularly (e.g., weekly)

Retraining Strategy

  • Incremental: Add new data, keep recent history
  • Full retrain: Periodically retrain from scratch
  • Online learning: Update model with new observations (advanced)

Recommended Workflow

  1. Baseline: Start with simple configuration
  2. Feature Engineering: Add relevant lags and rolling features
  3. Hyperparameter Tuning: Use cross-validation
  4. Validation: Test on multiple time windows
  5. Compare: Benchmark against statistical models
  6. Ensemble: Combine with other models if needed
  7. Deploy: Monitor and retrain regularly
