
XGBoost Time Series

Gradient boosting with lag features for capturing complex non-linear time series patterns

XGBoost adapted for time series uses gradient boosting trees with engineered lag features and rolling statistics. It is effective at capturing complex non-linear patterns and feature interactions, and it handles exogenous variables naturally.

When to Use XGBoost Time Series

XGBoost Time Series is best suited for:

  • Time series with non-linear patterns and complex interactions
  • When you have external features (weather, promotions, economic indicators)
  • Data with abrupt changes or regime shifts
  • Scenarios where tree-based models outperform linear models
  • High-dimensional feature spaces
  • When you need feature importance insights
  • Business forecasting with many covariates
  • Ensemble forecasting (combine with statistical models)

Strengths

  • Captures non-linear relationships naturally
  • Handles many exogenous features efficiently
  • Learns feature interactions automatically
  • Robust to outliers compared to statistical models
  • Feature importance for interpretability
  • Fast training (parallel tree construction)
  • No need for stationarity assumptions
  • Works well with irregular patterns
  • Can model complex seasonality through lag features
  • Scales to large datasets

Weaknesses

  • Requires careful feature engineering (lags, rolling stats)
  • Needs substantial historical data for lag features
  • Cannot extrapolate beyond training distribution
  • Forecast uncertainty requires additional methods (quantile regression)
  • Hyperparameter tuning can be time-consuming
  • Risk of overfitting with insufficient data
  • Less interpretable than statistical models
  • Requires exogenous features to be known at forecast time
  • Not designed for very long-term forecasts (multi-step becomes iterative)

Parameters

Common Time Series Parameters

All time series models share these parameters:

  • Timestamp Column (required): Column containing dates/times
  • Target Column (required): Numeric value to forecast
  • Feature Columns (optional): Additional feature columns (exogenous variables)
  • Frequency (optional): Time spacing (D, H, W, M). Auto-inferred if not specified
  • Forecast Steps (required, default=1): How many periods to predict

Feature Engineering Parameters

Lag Features

  • Type: List of integers
  • Default: [1, 2, 3, 7]
  • Description: Past time steps to use as features (e.g., [1, 7] creates features for yesterday and last week)
  • Guidance:
    • For daily data: [1, 7, 14, 30] (yesterday, last week, 2 weeks, last month)
    • For hourly data: [1, 24, 168] (last hour, same hour yesterday, same hour last week)
    • For monthly data: [1, 12] (last month, same month last year)
  • Important: Larger lags require more historical data

Rolling Mean Windows

  • Type: List of integers
  • Default: [7, 14]
  • Description: Window sizes for rolling average features
  • Guidance:
    • For daily data: [7, 14, 30] (weekly, bi-weekly, monthly averages)
    • For hourly data: [24, 168] (daily, weekly averages)
  • Purpose: Captures recent trends and smooths noise
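
As a concrete illustration, lag and rolling-mean features of this kind can be built with pandas `shift` and `rolling`. The `make_features` helper and its column names are hypothetical, not part of this product's API; note the rolling mean is shifted by one step so each row only sees past values:

```python
import pandas as pd

def make_features(df, target="y", lags=(1, 7), roll_windows=(7,)):
    """Build lag and rolling-mean columns from a target series."""
    out = df.copy()
    for lag in lags:
        out[f"lag_{lag}"] = out[target].shift(lag)
    for w in roll_windows:
        # shift(1) first so the rolling mean never includes the current row
        out[f"roll_mean_{w}"] = out[target].shift(1).rolling(w).mean()
    return out.dropna()

df = pd.DataFrame({"y": range(1, 21)})   # 20 days of toy data
feats = make_features(df, lags=(1, 7), roll_windows=(7,))
# first usable row is day 8: lag_1=7, lag_7=1, roll_mean_7=mean(1..7)=4.0
```

The `dropna()` trims the warm-up rows that lack complete history, which is exactly why larger lags require more historical data.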

XGBoost Model Parameters

Number of Trees (n_estimators)

  • Type: Integer
  • Default: 100
  • Description: Number of boosting trees to train
  • Typical Range: 50-500
  • Guidance:
    • Start with 100
    • Increase to 200-500 for complex patterns
    • Use early stopping to find optimal number

Max Depth

  • Type: Integer
  • Default: 6
  • Description: Maximum depth of each tree
  • Typical Range: 3-10
  • Guidance:
    • 3-4: Shallow trees, prevents overfitting
    • 6-8: Moderate complexity (default range)
    • 9-10: Deep trees for very complex interactions
  • Note: Deeper trees increase overfitting risk

Learning Rate

  • Type: Float
  • Default: 0.1
  • Description: Shrinkage applied to each tree (step size)
  • Typical Range: 0.01-0.3
  • Guidance:
    • 0.01-0.05: Slow learning, needs more trees, less overfitting
    • 0.1: Standard default
    • 0.2-0.3: Fast learning, fewer trees, more overfitting risk
  • Trade-off: Lower learning rate + more trees = better generalization but slower training

Configuration Tips

Feature Engineering Strategy

Minimal Configuration:

add_lags=[1, 7]
add_rolling_mean=[7]

Uses yesterday's value, the value from the same day last week, and a 7-day rolling average.

Comprehensive Configuration (Daily Data):

add_lags=[1, 2, 3, 7, 14, 30]
add_rolling_mean=[7, 14, 30]

Captures short-term (1-3 days), weekly, bi-weekly, and monthly patterns.

Hourly Data:

add_lags=[1, 24, 168]
add_rolling_mean=[24, 168]

Last hour, same hour yesterday, same hour last week.

Determining Lag Features

  1. Domain Knowledge: What past periods are relevant?

    • Retail: day-of-week effects → lag 7
    • Energy: same hour yesterday → lag 24 (hourly data)
  2. ACF/PACF Plots: Check autocorrelation at different lags

  3. Seasonality: Include lags at seasonal periods

    • Weekly: 7 (daily), 168 (hourly)
    • Yearly: 365 (daily), 12 (monthly)
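
One lightweight way to check candidate lags, short of a full ACF/PACF plot, is to compute sample autocorrelations directly. The weekly-period synthetic series below is purely for illustration:

```python
import numpy as np

def autocorr(y, lag):
    """Sample autocorrelation of a series at a given lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return np.dot(y[:-lag], y[lag:]) / np.dot(y, y)

t = np.arange(140)
y = np.sin(2 * np.pi * t / 7)   # synthetic series with a weekly period

# strong positive autocorrelation at the seasonal lag (7),
# negative at an off-season lag (3)
print(autocorr(y, 7), autocorr(y, 3))
```

Lags with high autocorrelation are good candidates for the lag-feature list.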

Hyperparameter Tuning

Conservative (Prevent Overfitting):

n_estimators=100
max_depth=3
learning_rate=0.05

Aggressive (Capture Complexity):

n_estimators=300
max_depth=8
learning_rate=0.1

Recommended Starting Point:

n_estimators=100
max_depth=6
learning_rate=0.1

Then tune based on validation performance.

Using External Features

XGBoost shines with exogenous variables:

feature_columns=['temperature', 'is_holiday', 'promotion', 'competitor_price']

Tips:

  • Categorical features: One-hot encode or use native category handling
  • Scale features: Not needed for tree-based models, which are invariant to monotonic feature transformations
  • Feature importance: Use XGBoost's feature importance to identify useful features

Multi-Step Forecasting

For forecast_steps > 1, there are two strategies:

  1. Direct Multi-Step: Train separate models for each horizon (h=1, h=2, ..., h=n)
  2. Recursive: Use 1-step model iteratively, feeding predictions as inputs

Most implementations use recursive by default.

Common Issues and Solutions

Issue: Poor Long-Term Forecasts

Solution:

  • XGBoost cannot extrapolate beyond training range
  • Forecasts revert to mean for distant horizons
  • Use for short-term forecasts (1-30 steps)
  • Combine with statistical models (ARIMA, Prophet) for long-term
  • Ensure lag features cover forecast horizon

Issue: Predictions Are Constant or Too Smooth

Solution:

  • Not enough lag diversity or external features
  • Increase lag variety: add more lags
  • Add rolling statistics (std, min, max)
  • Include time-based features (day_of_week, month, quarter)
  • Verify training data has sufficient variation

Issue: Overfitting (Great Training, Poor Validation)

Solution:

  • Reduce max_depth to 3-4
  • Lower learning_rate to 0.05
  • Decrease n_estimators
  • Add regularization (increase reg_alpha, reg_lambda if available)
  • Reduce number of lag features
  • Use cross-validation for hyperparameter tuning

Issue: Underfitting (Poor Training and Validation)

Solution:

  • Increase max_depth to 7-10
  • Increase n_estimators to 200-500
  • Add more lag features
  • Include relevant external features
  • Engineer interaction features (e.g., is_weekend × temperature)

Issue: Training Takes Too Long

Solution:

  • Reduce n_estimators
  • Decrease max_depth
  • Subsample data for initial experiments
  • Use fewer lag/rolling features
  • Enable GPU acceleration if available

Issue: Need Prediction Intervals

Solution:

  • XGBoost doesn't natively provide confidence intervals
  • Use quantile regression (train models for different quantiles)
  • Bootstrap methods (train multiple models on resampled data)
  • Conformal prediction
  • Combine with statistical models that provide intervals

Issue: Exogenous Features Not Available at Forecast Time

Solution:

  • Only use features you can know in advance
  • Forecast exogenous variables first (separate models)
  • Use lagged versions of uncertain features
  • Consider scenario-based forecasting

Example Use Cases

Daily Retail Sales with Promotions

target: daily_sales
feature_columns: [is_promotion, discount_percent, is_holiday, temperature]
add_lags: [1, 7, 14]
add_rolling_mean: [7, 30]
n_estimators: 200
max_depth: 6

Captures weekly patterns, promotional effects, and seasonal trends.

Hourly Energy Consumption

target: hourly_demand
feature_columns: [temperature, is_weekend, hour_of_day]
add_lags: [1, 24, 168]
add_rolling_mean: [24, 168]
n_estimators: 300
max_depth: 7

Models intraday cycles, day-of-week, and temperature dependence.

Website Traffic with Marketing

target: daily_visitors
feature_columns: [ad_spend, email_sent, content_posts]
add_lags: [1, 7]
add_rolling_mean: [7, 14]
n_estimators: 150
max_depth: 5

Separates organic traffic patterns from marketing impacts.

Stock Trading Volume

target: volume
feature_columns: [price_change, volatility, market_index]
add_lags: [1, 2, 5]
add_rolling_mean: [5, 20]
n_estimators: 100
max_depth: 4

Captures market dynamics and momentum effects.

Comparison with Other Models

vs ARIMA/SARIMA:

  • XGBoost: Non-linear, handles features, no stationarity assumption
  • ARIMA: Linear, interpretable, confidence intervals, statistical framework

vs Prophet:

  • XGBoost: More flexible with features, captures complex interactions
  • Prophet: Better for seasonality, trends, holidays, easier to use

vs LightGBM/CatBoost Time Series:

  • Similar capabilities, differences in speed and categorical handling
  • LightGBM: Faster, lower memory
  • CatBoost: Better with categorical features, less tuning

vs Traditional XGBoost (Tabular):

  • Same algorithm, but time series version adds temporal feature engineering
  • Requires lag/rolling features and temporal validation

Advanced Tips

Feature Engineering Ideas

  1. Time-Based Features:

    • day_of_week, month, quarter, is_weekend, is_month_end
  2. Lag Transformations:

    • Differences: target[t] - target[t-1]
    • Percentage changes: (target[t] - target[t-1]) / target[t-1]
  3. Rolling Statistics:

    • Rolling std, min, max (not just mean)
    • Expanding windows
  4. Interaction Features:

    • is_holiday × day_of_week
    • temperature × hour_of_day

Validation Strategy

Use time series cross-validation:

  • Fixed origin: Train on [1:n], test on [n+1:n+h]
  • Rolling origin: Multiple train/test splits respecting time order
  • Never shuffle data (violates temporal structure)
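
Rolling-origin splits can be produced with scikit-learn's `TimeSeriesSplit`, which always keeps training indices strictly before test indices:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)   # 20 time-ordered observations
splits = list(TimeSeriesSplit(n_splits=3).split(X))

for train_idx, test_idx in splits:
    # every training index precedes every test index
    assert train_idx.max() < test_idx.min()
```

Each successive fold extends the training window forward, mimicking how the model would be retrained in production.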

Ensemble Approach

Combine XGBoost with statistical models:

Final Forecast = 0.5 × XGBoost + 0.5 × ARIMA

Balances XGBoost's flexibility with ARIMA's extrapolation.
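
The weighted blend above is a plain element-wise average of the two forecast vectors (the numbers here are made up):

```python
import numpy as np

xgb_forecast = np.array([102.0, 104.0, 103.0])    # hypothetical XGBoost output
arima_forecast = np.array([100.0, 101.0, 102.0])  # hypothetical ARIMA output

w = 0.5   # equal weighting, per the formula above
final = w * xgb_forecast + (1 - w) * arima_forecast
```

The weight can also be tuned on a validation window rather than fixed at 0.5.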

Handling Categorical Features

  • One-hot encoding (standard)
  • Target encoding (encode by target mean per category)
  • XGBoost native categorical (if implementation supports)
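
One-hot encoding, for instance, is a one-liner with `pandas.get_dummies` (the `store` column is hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"store": ["A", "B", "A", "C"], "y": [1.0, 2.0, 3.0, 4.0]})
encoded = pd.get_dummies(df, columns=["store"])   # one column per category
```

The result has indicator columns `store_A`, `store_B`, `store_C` alongside the untouched numeric columns.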

Technical Details

Lag Feature Creation

For add_lags=[1, 7]:

X[t] includes:
- target[t-1]  (yesterday)
- target[t-7]  (last week)

With the direct strategy, forecast_steps=3 requires lags of at least 3, because lags 1 and 2 are not yet observed at horizon h=3; the recursive strategy instead fills them with earlier predictions.

Multi-Step Recursive Forecasting

  1. Predict t+1 using lags from training data
  2. Predict t+2 using prediction from step 1 as lag
  3. Repeat for all forecast_steps

Errors accumulate, so performance degrades with horizon.
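
The recursive loop can be sketched in a model-agnostic way; `predict_one` stands for any fitted one-step model (here a naive persistence stub rather than a trained XGBoost model):

```python
import numpy as np

def recursive_forecast(history, predict_one, steps, lags=(1, 7)):
    """Multi-step forecast by feeding each prediction back as a lag input."""
    buf = list(history)
    preds = []
    for _ in range(steps):
        x = np.array([buf[-lag] for lag in lags])
        yhat = predict_one(x)   # any fitted one-step model
        preds.append(float(yhat))
        buf.append(yhat)        # the prediction becomes a future lag value
    return preds

history = [10, 11, 12, 13, 14, 15, 16]
# persistence stub in place of a trained model: predict the lag-1 value
preds = recursive_forecast(history, lambda x: x[0], steps=3)
```

Because later steps consume earlier predictions as inputs, any one-step error propagates forward, which is the accumulation effect described above.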

