Complete SARIMAX Study Notes
1. What is SARIMAX?
SARIMAX is a forecasting model used to predict future values from past time-based data.
Full Form:
- S = Seasonal
- AR = AutoRegressive
- I = Integrated
- MA = Moving Average
- X = Exogenous Variables
It extends ARIMA by adding seasonal terms (S) and exogenous variables (X).
Example:
Predict next month's ice cream sales using:
- Past sales
- Summer season
- Temperature
2. Why Do We Use SARIMAX?
SARIMAX handles all of the following in a single model:
- ✅ Trend
- ✅ Seasonality
- ✅ Past dependency
- ✅ External factors
Example:
Electricity demand depends on:
- Past usage
- Weather
- Summer season
3. Simple Meaning of SARIMAX
SARIMAX predicts future data using:
- Past values
- Past forecast errors
- Seasonal patterns
- External factors
Example:
Predict website traffic using:
- Last week's traffic
- Weekend effect
- Ad campaign
4. Components of SARIMAX
(1) Seasonal Component
Captures repeated patterns.
Sales increase every December.
(2) AutoRegressive (AR)
Current value depends on past values.
Today's stock price depends on yesterday's price.
(3) Integrated (I)
Removes trend by differencing (subtracting the previous value from each value).
Sales: 100, 120, 140, 160 → trend exists; the differences 20, 20, 20 have no trend.
(4) Moving Average (MA)
Uses past forecast errors.
Yesterday's forecast error = +10, so the model adjusts today's forecast.
(5) Exogenous Variables (X)
External variables that affect the output.
Rainfall affects umbrella sales.
5. What is Seasonality?
Seasonality means a pattern that repeats at a fixed interval.
- Ice cream sales high every summer
- Shopping high every Diwali
6. Why Is Seasonality Important?
Ignoring seasonality gives wrong forecasts.
If AC sales are predicted to be the same in winter and summer → wrong result.
7. How to Handle Seasonality?
Use SARIMAX with seasonal parameters.
Monthly sales repeating yearly → use m = 12
8. Differencing (Integration)
Used to remove trend. Each value is replaced by its change from the previous value:
Y'(t) = Y(t) - Y(t-1)
d is the number of times this differencing is applied.
Today's sales = 200
Yesterday's sales = 180
Difference = 20
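The differencing step can be sketched in plain Python. This is a minimal illustration, not a library routine: the helper names `difference` and `difference_d_times` are made up for this example, and the series reuses the trending sales figures from the components section.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def difference_d_times(series, d):
    """Apply first differencing d times (the role of d in the model order)."""
    for _ in range(d):
        series = difference(series)
    return series


sales = [100, 120, 140, 160]          # trending series
print(difference(sales))              # [20, 20, 20] -> trend removed
print(difference_d_times(sales, 2))   # [0, 0] -> differenced twice
print(difference([180, 200]))         # [20] -> today minus yesterday
```

Each round of differencing shortens the series by one value, which is one reason not to difference more often than needed.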
9. What is Stationary Data?
A series is stationary when its mean and variance stay stable over time.
100, 102, 98, 101, 99 → stationary (values stay close to 100)
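A rough way to see this is to compare the mean and variance of the first and second half of a series. The sketch below is only an informal check under that assumption, not a formal test (formal tests such as the augmented Dickey-Fuller test exist for this). The helper `halves_summary` and the extended data are hypothetical.

```python
from statistics import mean, pvariance


def halves_summary(series):
    """Return (mean, variance) for the first and the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))


# Hypothetical data: the first five "stable" values are the ones from the notes.
stable = [100, 102, 98, 101, 99, 100, 101, 99]
trending = [100, 120, 140, 160, 180, 200, 220, 240]

for name, series in [("stable", stable), ("trending", trending)]:
    (m1, v1), (m2, v2) = halves_summary(series)
    print(f"{name}: mean {m1:.2f} -> {m2:.2f}, variance {v1:.2f} -> {v2:.2f}")
```

For the stable series the two halves have almost the same mean, while the trending series shows a large shift in mean between halves, which signals that it is not stationary.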
10. Seasonal Differencing
Used to remove seasonal effect. Each value is compared with the value from one season (m periods) earlier.
This January's sales = 500
Last January's sales = 450
Difference = 50
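Seasonal differencing is the same subtraction, but with a lag of one full season instead of one step. A minimal sketch, assuming monthly data with m = 12; the helper name `seasonal_difference` and all monthly figures other than the two January values are hypothetical.

```python
def seasonal_difference(series, m):
    """Subtract the value from one full season (m steps) earlier."""
    return [series[t] - series[t - m] for t in range(m, len(series))]


# Hypothetical monthly sales, January to December.
# Only the two January values (450 and 500) come from the notes.
last_year = [450, 420, 400, 380, 390, 410, 430, 440, 460, 480, 520, 600]
this_year = [500, 470, 450, 430, 440, 460, 480, 490, 510, 530, 570, 650]
monthly_sales = last_year + this_year

diffs = seasonal_difference(monthly_sales, m=12)
print(diffs[0])  # 50 -> this January minus last January
```

The result has m fewer values than the input, because the first season has nothing earlier to be compared with.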
11. Identify Seasonal Component
Look for a pattern that repeats at a fixed interval.
Restaurant sales are high every Sunday → weekly seasonality.
12. Trend using Moving Average
Sales = 100, 120, 140
Average = (100 + 120 + 140) / 3 = 120
Trend = increasing (the moving average rises as the window moves forward)
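A moving average only shows a trend once it is computed over several windows. The sketch below assumes a window of 3 and extends the sales series with two hypothetical values (160 and 180); `moving_average` is a helper written for this example.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# First three values are from the notes; 160 and 180 are hypothetical.
sales = [100, 120, 140, 160, 180]
trend = moving_average(sales, window=3)
print(trend)  # [120.0, 140.0, 160.0]

# The trend is increasing if every average is larger than the one before it.
is_increasing = all(later > earlier for earlier, later in zip(trend, trend[1:]))
print(is_increasing)  # True
```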
13. Detrended Series
Detrended = Original - Trend
Original sales = 150
Trend = 120
Detrended = 150 - 120 = 30
14. Residuals
The random values that remain after removing trend and seasonality.
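The trend, detrended series, seasonal component and residuals fit together as one additive decomposition: value = trend + seasonal + residual. The sketch below assumes that additive form, daily data with a weekly season (m = 7) and a centred moving average as the trend estimate. The data and the helper names are hypothetical.

```python
def centered_moving_average(series, window):
    """Trend estimate per time index; window must be odd so it is centred on t."""
    half = window // 2
    return {
        t: sum(series[t - half:t + half + 1]) / window
        for t in range(half, len(series) - half)
    }


def decompose(series, m):
    """Additive decomposition: returns (trend, seasonal, residual)."""
    trend = centered_moving_average(series, m)
    detrended = {t: series[t] - trend[t] for t in trend}
    # Seasonal effect = average detrended value at each position in the season.
    seasonal = []
    for pos in range(m):
        values = [v for t, v in detrended.items() if t % m == pos]
        seasonal.append(sum(values) / len(values))
    residual = {t: detrended[t] - seasonal[t % m] for t in detrended}
    return trend, seasonal, residual


# Hypothetical daily sales for three weeks: a rising trend plus a weekly
# pattern whose last day (Sunday) is the busiest. No noise is added.
weekly_pattern = [-10, -5, 0, 0, 5, -10, 20]
sales = [100 + 2 * t + weekly_pattern[t % 7] for t in range(21)]

trend, seasonal, residual = decompose(sales, m=7)
print(seasonal)                 # recovers the weekly pattern
print(max(residual.values()))   # 0.0 because this data has no noise
```

With real data the residuals would not be exactly zero; they are the part that neither the trend nor the seasonal pattern explains.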
15. SARIMAX Parameters
(p,d,q)(P,D,Q,m)
(1,1,1)(1,1,1,12)
16. Meaning of Parameters
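- p = AR order (number of past values used)
- d = Differencing order (number of times the series is differenced)
- q = MA order (number of past errors used)
- P = Seasonal AR order
- D = Seasonal differencing order
- Q = Seasonal MA order
- m = Season length (number of periods in one seasonal cycle)
Monthly data with a yearly cycle → m = 12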
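For reference, one common way to write the full model is as a regression with seasonal ARIMA errors. The notation below is an assumption added for illustration and does not come from these notes: B is the backshift operator (B y_t = y_{t-1}), x_t holds the exogenous variables, and ε_t is random noise.

```latex
\begin{aligned}
y_t &= \beta^\top x_t + u_t \\
\phi_p(B)\,\Phi_P(B^m)\,(1-B)^d\,(1-B^m)^D\,u_t &= \theta_q(B)\,\Theta_Q(B^m)\,\varepsilon_t
\end{aligned}
```

Here φ_p and θ_q are polynomials of degree p and q in B (the AR and MA parts), Φ_P and Θ_Q are polynomials of degree P and Q in B^m (the seasonal AR and MA parts), and the factors (1-B)^d and (1-B^m)^D are the ordinary and seasonal differencing.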
17. Example of m
| Data frequency | Seasonal cycle | m |
|---|---|---|
| Daily | Weekly | 7 |
| Monthly | Yearly | 12 |
| Quarterly | Yearly | 4 |
18. Example Model
SARIMAX(order=(1,1,1), seasonal_order=(1,1,1,12))
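This model uses:
- 1 AR term (p = 1)
- 1 differencing (d = 1)
- 1 MA term (q = 1)
- 1 seasonal AR term, 1 seasonal differencing and 1 seasonal MA term (P = D = Q = 1)
- Yearly seasonality on monthly data (m = 12)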
19. Wrong Parameters Effect
- Too much differencing removes useful patterns and adds noise
- Too few AR terms miss part of the history
- Too many AR terms cause overfitting
- Wrong m makes the model look for the wrong seasonal cycle
20. Advantages
- ✅ Handles seasonality
- ✅ Uses external variables
- ✅ Accurate forecasts when parameters are chosen well
21. Limitations
- ❌ Needs parameter tuning
- ❌ Slow to fit on very large datasets
