Time Series Forecasting and the Nixtla Ecosystem!

Python Milano meetup

Milan, May 18th 2023

github/pietroppeter

👨‍👩‍👧 Me - Pietro Peterlongo

  • past: Math (Pisa, Trieste), Climate Science (Paris), Cryptography (Trento)

  • now (since 2015): Principal Data Scientist at ToolsGroup: Supply Chain Planning Software

  • Python: started early 2000s, serious around 2015, my first presentation in... May 2023 ;)

  • ...also active in Open Source with Nim

Why Forecasting?

What is Time Series Forecasting?

t y
0 5
1 35
2 15
3 23
4 25
5 30
6 ?

Given:

$$y_t, \quad t = 0, 1, \ldots, T$$

Provide:

$\hat{y}_{T + 1}$

(+ confidence)

Domains of applications

  • events planning (how many?)

  • weather forecasting (sun or rain?)

  • economics (growth next year?)

  • control theory (reactor will heat?)

  • finance (stock up or down?)

  • web analytics (how much traffic?)

  • energy (how much consumption?)

  • supply chain (how many sales?)

How to forecast

  • equations with $t$ (ODEs, PDEs, ...)

  • judgmental forecasts

  • statistical forecasting

    • classical (ETS, ARIMA, ...)

    • machine learning (random forest, LightGBM, ...)

    • neural (NBEATS, DeepAR, ...)

  • ensemble forecasting

Dealing with Uncertainty

Supply Chain

A Crash Course

Methodology

  1. think about your why
  2. gather data (process, explore)
  3. baseline
  4. measure
  5. improve
  6. restart from step 4 or earlier

Baseline

  • Historical average
  • Naive (aka persistence in weather forecasting)
  • Moving Average
  • Seasonal Naive
  • ..., existing forecast (benchmark)
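
The first four baselines above fit in a few lines of plain Python. This is a minimal sketch on the toy series from the "What is Time Series Forecasting?" table; the function names, the window of 3 and the season length of 2 are illustrative assumptions, not part of the talk.

```python
from statistics import mean

# Toy series from the table above (t = 0..5); we forecast t = 6.
y = [5, 35, 15, 23, 25, 30]


def historical_average(history):
    """Forecast the mean of all past observations."""
    return mean(history)


def naive(history):
    """Forecast the last observed value (persistence)."""
    return history[-1]


def moving_average(history, window):
    """Forecast the mean of the last `window` observations."""
    return mean(history[-window:])


def seasonal_naive(history, season_length):
    """Forecast the value observed one season before the target period."""
    return history[-season_length]


forecasts = {
    "historical_average": historical_average(y),
    "naive": naive(y),
    "moving_average_3": moving_average(y, window=3),      # window is an assumption
    "seasonal_naive_2": seasonal_naive(y, season_length=2),  # season length is an assumption
}
for name, value in forecasts.items():
    print(f"{name}: {value:.2f}")
```

Any later model should beat these numbers under the chosen metric before it is worth its extra complexity.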

Metrics

  1. low
  2. median
  3. average

from Forecast KPIs: RMSE, MAE, MAPE & Bias

$$ \text{MAPE} = \frac{1}{N} \sum_t \frac{\left| e_t \right|}{d_t} $$

$$ \text{MAE} = \frac{1}{N} \sum_t \left| e_t \right| $$

$$ \text{RMSE} = \sqrt{\frac{1}{N} \sum_t e_t^2} $$
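
The three formulas translate directly into code. The sketch below assumes that $e_t$ is forecast minus demand and that every $d_t$ is strictly positive (MAPE is undefined otherwise); the function names and the sample numbers are made up for illustration.

```python
from math import sqrt


def errors(forecast, demand):
    """Per-period error e_t, assumed here to be forecast minus demand."""
    return [f - d for f, d in zip(forecast, demand)]


def mape(forecast, demand):
    """Mean absolute percentage error; requires every demand value > 0."""
    e = errors(forecast, demand)
    return sum(abs(e_t) / d_t for e_t, d_t in zip(e, demand)) / len(demand)


def mae(forecast, demand):
    """Mean absolute error."""
    e = errors(forecast, demand)
    return sum(abs(e_t) for e_t in e) / len(e)


def rmse(forecast, demand):
    """Root mean squared error."""
    e = errors(forecast, demand)
    return sqrt(sum(e_t ** 2 for e_t in e) / len(e))


# Illustrative numbers only.
demand = [10, 20, 40]
forecast = [12, 18, 40]

print("MAPE:", mape(forecast, demand))
print("MAE: ", mae(forecast, demand))
print("RMSE:", rmse(forecast, demand))
```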

Cross Validation (aka Backtesting)
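
One common way to backtest is with expanding windows: train on everything up to a cutoff, forecast the next `horizon` points, then move the cutoff forward and repeat. The sketch below is a hand-rolled illustration of that idea and not the implementation of any particular library; `expanding_window_splits` and its parameters are hypothetical names.

```python
def expanding_window_splits(n_obs, horizon, n_windows, step=None):
    """Return (train_indices, test_indices) pairs, oldest cutoff first.

    Each training set contains every observation before its cutoff, so it
    never sees the future it is evaluated on.
    """
    step = horizon if step is None else step
    splits = []
    for i in range(n_windows):
        cutoff = n_obs - horizon - (n_windows - 1 - i) * step
        if cutoff < 1:
            raise ValueError("not enough observations for the requested windows")
        train_idx = list(range(cutoff))
        test_idx = list(range(cutoff, cutoff + horizon))
        splits.append((train_idx, test_idx))
    return splits


# Backtest the naive forecast on the toy series, one step ahead, 3 windows.
y = [5, 35, 15, 23, 25, 30]
abs_errors = []
for train_idx, test_idx in expanding_window_splits(len(y), horizon=1, n_windows=3):
    forecast = y[train_idx[-1]]   # naive: repeat the last training value
    actual = y[test_idx[0]]
    abs_errors.append(abs(forecast - actual))

backtest_mae = sum(abs_errors) / len(abs_errors)
print("absolute errors per window:", abs_errors)
print("backtest MAE:", backtest_mae)
```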

ETS (Error, Trend, Seasonal)

AutoETS performs automatic model selection
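
To give a feel for the ETS family, here is a sketch of its simplest member, simple exponential smoothing (additive error, no trend, no seasonality). It is not AutoETS: the smoothing parameter `alpha` is fixed by hand and the level is initialised with the first observation, both of which are simplifying assumptions. An automatic procedure would instead estimate the parameters and compare candidate model forms.

```python
def simple_exponential_smoothing(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing.

    The level is updated as: level = alpha * y_t + (1 - alpha) * level.
    The forecast for the next period is the final level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # assumption: initialise the level with the first value
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


y = [5, 35, 15, 23, 25, 30]  # toy series from the table above
ses_forecast = simple_exponential_smoothing(y, alpha=0.5)
print("SES forecast for t = 6:", ses_forecast)
```

With `alpha=1` this reduces to the naive forecast, and small values of `alpha` behave more like a long-run average.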

References

FPP: Forecasting: Principles and Practice


by Rob J Hyndman and George Athanasopoulos

Monash University, Australia

3rd edition, May 2021

free! otexts.com/fpp3/

R-based

Other references

Python Libraries for Timeseries

Hyndman's suggestions

The best Python implementations for my time series methods are available from Nixtla.

Nixtla Story

Team

Why Nixtla?

  • open source
  • api
  • performance
  • vision

Risks: moving fast, iterating on business model

Nixtlaverse

import lightgbm as lgbm
from mlforecast import MLForecast
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean

mlf = MLForecast(
    models=[lgbm.LGBMRegressor()],  # list of models
    freq='MS',  # frequency of time series: month start
    differences=[12],  # differences to apply to target
    lags=[1, 12],  # lags to use as features
    lag_transforms={  # lag transformations to apply to specific lags
        1: [expanding_mean],
        12: [(rolling_mean, 24)],
    },
)

A declarative feature engineering pipeline!
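
To make the declaration concrete, here is a rough, library-free sketch of the kind of features such a configuration describes: difference the target, shift it to obtain lags, then apply a window function to a lagged series. The helpers are hypothetical stand-ins, not the `window_ops` or `mlforecast` implementations, and the toy example uses a difference of 2 and a window of 2 instead of 12 and 24 to stay readable.

```python
def difference(series, d):
    """Seasonal difference: y_t - y_{t-d}, dropping the first d points."""
    return [series[t] - series[t - d] for t in range(d, len(series))]


def lag(series, k):
    """Shift the series by k steps; the first k values are unknown (None)."""
    return [None] * k + series[: len(series) - k]


def expanding_mean(series):
    """Mean of all known values seen so far."""
    out, total, count = [], 0.0, 0
    for value in series:
        if value is None:
            out.append(None)
            continue
        total += value
        count += 1
        out.append(total / count)
    return out


def rolling_mean(series, window):
    """Mean of the last `window` values; None until a full window is known."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out


y = [1, 2, 4, 7, 11, 16]                 # illustrative data
target = difference(y, 2)                # analogous to differences=[12]
features = {
    "lag1": lag(target, 1),              # analogous to lags=[1, 12]
    "lag2": lag(target, 2),
    "expanding_mean_lag1": expanding_mean(lag(target, 1)),
    "rolling_mean_lag2_window2": rolling_mean(lag(target, 2), 2),
}
print("target:", target)
for name, column in features.items():
    print(f"{name}: {column}")
```

Writing this by hand for many series, while also reproducing the same features at prediction time, is the bookkeeping that a declarative pipeline takes off your hands.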

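from mlforecast.utils import PredictionIntervals

# df: long-format DataFrame, assumed to use the default columns unique_id, ds, y
# prediction intervals must be requested at fit time to use `level` in predict
mlf.fit(df, prediction_intervals=PredictionIntervals(n_windows=2, h=12))
forecast_df = mlf.predict(12, level=[90])

Thank you for listening!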
Tack
谢谢
धन्यवाद
Gracias

 

Grazie
ありがとう
Merci
Bedankt

 

Danke
Дякую тобі
Obrigado
Dziękuję Ci