Auto ARIMA
Automatically selects ARIMA parameters using a stepwise search algorithm
Auto ARIMA selects an ARIMA or SARIMA model by fitting different combinations of the non-seasonal orders (p,d,q) and the seasonal orders (P,D,Q) for a seasonal period m that you supply, then keeping the candidate with the best information criterion. It removes manual parameter tuning while favoring models that balance fit and complexity.
When to Use Auto ARIMA
Auto ARIMA is best suited for:
- ARIMA modeling without manual parameter tuning
- Univariate time series forecasting with or without seasonality
- Automated forecasting pipelines requiring minimal human intervention
- Exploring which ARIMA configuration works best for your data
- Statistically rigorous forecasts with confidence intervals
- Quick model development without domain expertise in ARIMA
Strengths
- Eliminates manual parameter tuning
- Tests many model configurations efficiently
- Uses information criteria (AIC/BIC) for model selection
- Can detect and model seasonality automatically
- Includes stationarity and unit root tests
- Provides ARIMA and SARIMA models
- Statistically principled model selection
- Good for production pipelines
- Works well with diverse time series types
Weaknesses
- Slower than fitting a single ARIMA model (searches over many models)
- Still limited to univariate forecasting
- Does not accept exogenous variables here (if you need them, pmdarima's auto_arima accepts exogenous regressors)
- May select overly complex models on noisy data
- Computational cost increases with large seasonal periods
- No guarantee of "best" model for out-of-sample forecasts
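- Opaque selection: the chosen orders come from a search rather than from reasoning about the data, so they can be harder to justify than a manually selected model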
Parameters
Common Time Series Parameters
All time series models share these parameters:
- Timestamp Column (required): Column containing dates/times
- Target Column (required): Numeric value to forecast
- Frequency (optional): Time spacing (D = daily, H = hourly, W = weekly, M = monthly). Auto-inferred if not specified
- Forecast Steps (required, default=1): How many periods to predict
Auto ARIMA-Specific Parameters
Seasonal
- Type: Boolean
- Default: true
- Description: Whether to search for seasonal ARIMA models (SARIMA)
- true: Searches both non-seasonal and seasonal models
- false: Only searches non-seasonal ARIMA models
- Guidance:
- Set to true if your data has seasonality
- Set to false for non-seasonal data or when seasonality is pre-removed
Seasonal Period (m)
- Type: Integer
- Default: 12
- Description: Number of periods in a seasonal cycle (only used if seasonal=true)
- Common Values:
- 7 for weekly seasonality in daily data
- 12 for yearly seasonality in monthly data
- 24 for daily seasonality in hourly data
- 4 for yearly seasonality in quarterly data
- Important: Must match your data's seasonal pattern
Stepwise Search
- Type: Boolean
- Default: true
- Description: Search strategy
- true: Stepwise algorithm (faster, explores promising areas)
- false: Exhaustive grid search (slower, guaranteed to find best in search space)
- Guidance:
- Use true for faster results (recommended)
- Use false if you have time and want exhaustive search
Show Progress (Trace)
- Type: Boolean
- Default: false
- Description: Whether to print search progress and model comparisons
- true: Shows each model tested and its AIC score
- false: Silent mode
- Guidance: Set to true for debugging or understanding model selection
Configuration Tips
Quick Start Configuration
For most cases, use these settings:
seasonal=true
m=[your seasonal period]
stepwise=true
trace=false
Auto ARIMA will handle the rest.
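As a rough illustration, the same settings map onto `pmdarima.auto_arima` keyword arguments as sketched below. This assumes the tool is backed by pmdarima, which this page does not confirm. The helper names `quick_start_kwargs` and `fit_auto_arima` and the variable `monthly_values` are hypothetical.

```python
def quick_start_kwargs(m, seasonal=True, stepwise=True, trace=False):
    """Translate the settings on this page into auto_arima keyword arguments."""
    if seasonal and m < 2:
        raise ValueError("a seasonal search needs m >= 2")
    return {
        "seasonal": seasonal,
        "m": m if seasonal else 1,  # m=1 means "no seasonal cycle"
        "stepwise": stepwise,
        "trace": trace,
    }


def fit_auto_arima(y, **kwargs):
    """Fit on a 1-D sequence of observations; requires `pip install pmdarima`."""
    import pmdarima as pm  # imported lazily so the helper above has no dependencies

    return pm.auto_arima(y, **kwargs)


kwargs = quick_start_kwargs(m=12)
# Assumed usage, with monthly_values standing in for your own data:
# model = fit_auto_arima(monthly_values, **kwargs)
# forecast = model.predict(n_periods=12)
```

Keeping the settings in a plain dictionary makes it easy to log exactly what was passed to the search.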
Choosing Seasonal Period (m)
Match m to your data's repeating pattern:
- Daily data with weekly patterns: m=7
- Hourly data with daily patterns: m=24
- Monthly data with yearly patterns: m=12
- Quarterly data: m=4
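The mapping above can be kept as a small lookup table. This is a minimal sketch: the function name `seasonal_period` is hypothetical, the frequency codes follow the ones listed under Frequency, and `Q` for quarterly is an added assumption.

```python
# Observations per cycle, keyed by (sampling frequency, repeating cycle).
# The "year" entries for daily and weekly data are approximations.
PERIODS_PER = {
    ("H", "day"): 24,
    ("H", "week"): 168,
    ("D", "week"): 7,
    ("D", "year"): 365,
    ("W", "year"): 52,
    ("M", "year"): 12,
    ("Q", "year"): 4,
}


def seasonal_period(frequency, cycle):
    """Return m for a sampling frequency and the cycle that repeats."""
    try:
        return PERIODS_PER[(frequency, cycle)]
    except KeyError:
        raise ValueError(
            f"no known period for {frequency!r} data with a {cycle} cycle"
        ) from None


print(seasonal_period("D", "week"))  # 7
print(seasonal_period("H", "day"))   # 24
```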
Stepwise vs Exhaustive Search
Stepwise (stepwise=true):
- Faster, because it fits only a fraction of the candidate models
- Intelligent search based on information criteria
- May miss global optimum but finds good models
- Recommended for most use cases
Exhaustive (stepwise=false):
- Tests all combinations in parameter space
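- Guaranteed to find the best-scoring model (lowest AIC or BIC) within the search bounds
- Very slow for large seasonal periods (m > 24)
- Use when you have time and want the lowest-scoring model in the search space; this still does not guarantee the best out-of-sample forecasts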
Understanding the Search Process
Auto ARIMA proceeds roughly in this order:
- Tests for stationarity (unit root tests)
- Determines d and D (differencing orders)
- Searches over p, q (non-seasonal AR, MA)
- Searches over P, Q (seasonal AR, MA)
- Selects model with lowest AIC (or BIC)
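The search over p and q can be pictured with a simplified greedy sketch. This is not the exact algorithm of any particular library: real stepwise searches also vary P, Q and the constant term. The function `toy_aic` is a made-up stand-in for fitting ARIMA(p, d, q) and returning its AIC.

```python
def stepwise_search(score, start=(2, 2), max_p=5, max_q=5):
    """Greedy neighbourhood search over (p, q): keep moving to a better
    neighbour until no neighbour improves the score."""
    best = start
    best_score = score(*best)
    evaluated = {best: best_score}
    improved = True
    while improved:
        improved = False
        p, q = best
        neighbours = [(p + dp, q + dq) for dp in (-1, 0, 1) for dq in (-1, 0, 1)]
        for cand in neighbours:
            cp, cq = cand
            if cand in evaluated or not (0 <= cp <= max_p and 0 <= cq <= max_q):
                continue
            evaluated[cand] = score(cp, cq)
            if evaluated[cand] < best_score:
                best, best_score = cand, evaluated[cand]
                improved = True
    return best, best_score, len(evaluated)


def toy_aic(p, q):
    """Made-up score surface with its minimum at (p, q) = (1, 0)."""
    return 100.0 + 3.0 * (p - 1) ** 2 + 2.0 * q ** 2


order, best_aic, n_fitted = stepwise_search(toy_aic)
print(order, best_aic, n_fitted)  # (1, 0) 100.0 14
```

On this toy surface the search scores 14 candidates, where a full grid over p and q from 0 to 5 would score 36. On a surface with several local minima, the same procedure can stop short of the global optimum.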
Handling Large Seasonal Periods
For m > 24 (e.g., hourly data with weekly seasonality, m=168):
- Consider aggregating to lower frequency (hourly → daily)
- Use stepwise=true (exhaustive will be extremely slow)
- Or use Prophet/TBATS which handle large m more efficiently
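Aggregating to a lower frequency shrinks m directly. The sketch below sums hourly values into daily totals, so a weekly cycle drops from m=168 to m=7. The helper name `aggregate` and the synthetic data are illustrative; whether to sum or average depends on what the series measures.

```python
def aggregate(values, block):
    """Sum consecutive blocks of `block` observations, dropping an incomplete tail."""
    usable = len(values) - len(values) % block
    return [sum(values[i:i + block]) for i in range(0, usable, block)]


hourly = [1.0] * (24 * 14)            # two weeks of synthetic hourly readings
daily = aggregate(hourly, block=24)   # one total per day

m_hourly = 168                        # weekly cycle measured in hours
m_daily = m_hourly // 24              # the same cycle measured in days
print(len(daily), m_daily)            # 14 7
```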
When to Disable Seasonal Search
Set seasonal=false if:
- Your data has no seasonality (e.g., stock returns)
- Seasonality already removed via preprocessing
- Seasonal period is very large (m > 52)
Common Issues and Solutions
Issue: Search Takes Too Long
Solution:
- Ensure stepwise=true (not exhaustive)
- Reduce seasonal period by aggregating data (hourly → daily)
- Set seasonal=false if seasonality isn't critical
- Consider using simpler models like Theta or Exponential Smoothing
- Reduce the max search space (if configurable in your implementation)
Issue: Selected Model Has Poor Out-of-Sample Performance
Solution:
- Auto ARIMA optimizes in-sample AIC, which may overfit
- Use time series cross-validation to validate
- Try constraining search space (e.g., max p=3, q=3)
- Compare with simpler models (Theta, Exponential Smoothing)
- Ensure sufficient data for model complexity
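Time series cross-validation can be sketched with rolling-origin splits, where each fold trains on everything up to a cutoff and tests on the next few points. The function names and the toy series are illustrative, and the naive forecaster is only a placeholder for the model under evaluation.

```python
def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_indices) with an expanding training window."""
    end = initial_train
    while end + horizon <= n_obs:
        yield list(range(end)), list(range(end, end + horizon))
        end += step


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24]
errors = []
for train_idx, test_idx in rolling_origin_splits(
    len(series), initial_train=6, horizon=2, step=2
):
    train = [series[i] for i in train_idx]
    actual = [series[i] for i in test_idx]
    # Placeholder forecaster: repeat the last training value (naive forecast).
    # Swap in a fit-and-predict call for the model being validated.
    predicted = [train[-1]] * len(actual)
    errors.append(mean_absolute_error(actual, predicted))

print(errors)  # [2.0, 2.0]
```

The test indices always come after the training indices, which is the property that ordinary shuffled k-fold validation would break.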
Issue: Model Selection Unstable (Changes with Small Data Updates)
Solution:
- This is expected with borderline AIC differences
- Consider averaging forecasts from top models
- Use a simpler, manually selected ARIMA model for stability
- Increase data history if possible
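One way to average forecasts from the top models is to weight each by its Akaike weight, so models with nearly equal AIC contribute nearly equally. This weighting is one option among several (a plain mean also works). The AIC values and forecasts below are made up for illustration.

```python
import math


def akaike_weights(aics):
    """Weights proportional to exp(-0.5 * (AIC_i - AIC_min)), normalised to sum to 1."""
    best = min(aics)
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]


def average_forecasts(forecasts, weights):
    """Weighted average across models, one value per forecast step."""
    n_steps = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(n_steps)
    ]


aics = [250.0, 250.0, 262.0]  # hypothetical AICs of three candidate models
forecasts = [[100.0, 102.0], [104.0, 106.0], [90.0, 90.0]]

weights = akaike_weights(aics)
combined = average_forecasts(forecasts, weights)
print([round(w, 4) for w in weights])
print([round(c, 2) for c in combined])
```

The two tied models share almost all of the weight, so swapping which of them ranks first after a small data update barely moves the combined forecast.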
Issue: Need External Variables (Promotions, Weather, etc.)
Solution:
- Standard Auto ARIMA doesn't support exogenous variables
- Some implementations (e.g., pmdarima) support exogenous regressors
- Alternatively, use SARIMAX with manual parameter selection
- Or use Prophet with additional regressors
Issue: Seasonal Period Unclear
Solution:
- Plot your data and look for repeating patterns
- Use autocorrelation plots (ACF) to identify lags with high correlation
- Try multiple seasonal periods and compare performance
- Use Prophet which auto-detects multiple seasonalities
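The autocorrelation check can be done without a plotting library. The sketch below computes the sample autocorrelation and returns the lag with the highest value. The function names and the synthetic day-of-week profile are illustrative. The search starts at lag 2 because lag 1 often dominates in real data without indicating seasonality.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / denom


def strongest_lag(values, min_lag=2, max_lag=28):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(range(min_lag, max_lag + 1), key=lambda k: autocorrelation(values, k))


weekly_pattern = [10, 12, 11, 13, 20, 25, 18]  # synthetic day-of-week profile
daily_sales = weekly_pattern * 10              # ten identical weeks

print(strongest_lag(daily_sales))              # 7
```

On noisy or trending data, difference or detrend the series first; otherwise the trend inflates the autocorrelation at every lag.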
Issue: Selected Model Is Too Complex
Solution:
- AIC may favor complex models on some data
- Manually specify simpler models (e.g., ARIMA(1,1,1) or SARIMA(1,1,1)(1,1,1)m)
- Use BIC instead of AIC if your implementation allows (BIC penalizes complexity more)
- Compare forecast performance vs simpler baselines
Issue: No Suitable Model Found
Solution:
- Your data may not be suitable for ARIMA
- Check for structural breaks or outliers
- Try preprocessing (outlier removal, log transformation)
- Consider non-linear models (Prophet, XGBoost)
Example Use Cases
Daily Retail Sales (Weekly Seasonality)
seasonal=true
m=7
stepwise=true
Finds best SARIMA model for day-of-week patterns.
Monthly Revenue (Yearly Seasonality)
seasonal=true
m=12
stepwise=true
Discovers optimal monthly model with yearly cycle.
Hourly Energy Consumption (Daily Cycle)
seasonal=true
m=24
stepwise=true
trace=true # see search progress
Identifies best model for intraday patterns.
Stock Prices (No Seasonality)
seasonal=false
stepwise=true
Finds best non-seasonal ARIMA (likely random walk).
Weekly Website Traffic
seasonal=true
m=52
stepwise=true
Warning: m=52 is large; consider aggregating to monthly (m=12).
Comparison with Other Models
vs Manual ARIMA:
- Auto ARIMA: Automatic, no expertise needed, explores many models
- Manual ARIMA: Faster if you know parameters, more interpretable
vs SARIMA:
- Auto ARIMA: Automatically finds SARIMA parameters
- SARIMA: Requires manual specification of (p,d,q)(P,D,Q)m
vs Prophet:
- Auto ARIMA: Univariate, no exogenous variables, statistical
- Prophet: Multiple seasonalities, external regressors, more flexible
vs Theta Method:
- Auto ARIMA: More complex, potentially better fit
- Theta Method: Much faster, simpler baseline
vs XGBoost/LightGBM:
- Auto ARIMA: Statistical model, confidence intervals, works with less data
- XGBoost: Non-linear, handles many features, needs more data
Technical Details
Model Selection Criteria
Auto ARIMA typically uses AIC (Akaike Information Criterion):
- AIC = -2 × log(likelihood) + 2 × (number of parameters)
- Lower AIC is better
- Balances fit (likelihood) and simplicity (parameters)
Some implementations offer BIC (Bayesian Information Criterion):
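- BIC = -2 × log(likelihood) + log(n) × (number of parameters), where n is the number of observations
- BIC penalizes complexity more than AIC whenever log(n) > 2, that is, for any series with 8 or more observations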
- Tends to select simpler models
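A short worked example shows how the two criteria can disagree. The log-likelihoods and parameter counts are made up. The BIC used here is the standard definition, -2 × log(likelihood) + log(n) × (number of parameters), with n the number of observations.

```python
import math


def aic(log_likelihood, n_params):
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fits on 120 observations: the larger model fits slightly better.
n_obs = 120
small = {"log_likelihood": -200.0, "n_params": 3}
large = {"log_likelihood": -195.0, "n_params": 7}

for name, fit in (("small", small), ("large", large)):
    print(name, aic(**fit), round(bic(n_obs=n_obs, **fit), 2))
```

Here AIC prefers the larger model (404 against 406), while BIC prefers the smaller one because each extra parameter costs log(120), about 4.79, instead of 2.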
Search Bounds
Typical default search spaces:
- p: 0 to 5
- d: 0 to 2
- q: 0 to 5
- P: 0 to 2
- D: 0 to 1
- Q: 0 to 2
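To get a feel for what exhaustive search costs under these bounds, the candidates can simply be counted. This sketch assumes d and D are already fixed by the stationarity tests, so only p, q, P and Q are enumerated.

```python
from itertools import product

bounds = {
    "p": range(0, 6),  # 0 to 5
    "q": range(0, 6),  # 0 to 5
    "P": range(0, 3),  # 0 to 2
    "Q": range(0, 3),  # 0 to 2
}

seasonal_grid = list(product(bounds["p"], bounds["q"], bounds["P"], bounds["Q"]))
non_seasonal_grid = list(product(bounds["p"], bounds["q"]))

print(len(seasonal_grid), len(non_seasonal_grid))  # 324 36
```

Some implementations also cap the total order, which shrinks the grid further, but the seasonal search remains roughly an order of magnitude larger than the non-seasonal one.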
Stationarity Tests
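Auto ARIMA runs unit root tests (e.g., KPSS, ADF) to choose the non-seasonal differencing order d, and a seasonal test to choose D. In pmdarima, for example, the seasonal tests are OCSB and Canova-Hansen.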
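What d and D mean in practice is easiest to see on small series: d counts how many times consecutive values are differenced, and D counts how many times values one seasonal period apart are differenced. The helper `difference` and the toy data are illustrative.

```python
def difference(values, lag=1):
    """Return values[t] - values[t - lag] for every t >= lag."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


trend = [1, 3, 6, 10, 15]
print(difference(trend))              # d=1 -> [2, 3, 4, 5]
print(difference(difference(trend)))  # d=2 -> [1, 1, 1]

quarterly = [10, 20, 30, 40, 12, 22, 32, 42]
print(difference(quarterly, lag=4))   # D=1 with m=4 -> [2, 2, 2, 2]
```

Each round of differencing shortens the series (by 1 for d, by m for D), which is one reason large seasonal periods need long histories.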
Practical Recommendations
- Start with Auto ARIMA to understand what works for your data
- Examine the selected model: Note the (p,d,q)(P,D,Q)m parameters
- Validate: Use time series cross-validation
- Compare: Benchmark against simpler models (Theta, Exponential Smoothing)
- Productionize: Consider fixing the selected model for stability
Auto ARIMA is excellent for exploration and baseline forecasts, but manual tuning or alternative models may outperform in production.