diff --git a/tutorials/GluonTS_COVID19_Prediction/GluonTS.API.py b/tutorials/GluonTS_COVID19_Prediction/GluonTS.API.py index c442d9d75..d7f4d4ebf 100644 --- a/tutorials/GluonTS_COVID19_Prediction/GluonTS.API.py +++ b/tutorials/GluonTS_COVID19_Prediction/GluonTS.API.py @@ -6,9 +6,9 @@ # extension: .py # format_name: percent # format_version: '1.3' -# jupytext_version: 1.19.0 +# jupytext_version: 1.19.1 # kernelspec: -# display_name: Python 3 +# display_name: Python 3 (ipykernel) # language: python # name: python3 # --- @@ -54,25 +54,39 @@ # Once you're comfortable with the mechanics here, move to `GluonTS.example.ipynb` to see these models applied to real COVID-19 data. # %% [markdown] -# --- -# # ## Setup # %% +# %load_ext autoreload +# %autoreload 2 + +# System libraries. +import logging import warnings warnings.filterwarnings("ignore") -# Core GluonTS components for the API tutorial +# Third party libraries. +import matplotlib.pyplot as plt +import numpy as np +import pandas as pd +import seaborn as sns + +# %% [markdown] +# ## GluonTS imports and utilities + +# %% +from gluonts.evaluation import make_evaluation_predictions +from gluonts.torch.model.deep_npts import DeepNPTSEstimator from gluonts.torch.model.deepar import DeepAREstimator from gluonts.torch.model.simple_feedforward import SimpleFeedForwardEstimator -from gluonts.torch.model.deep_npts import DeepNPTSEstimator -from gluonts.evaluation import make_evaluation_predictions -# All our utilities are in one place - much cleaner! import GluonTS_utils as gluonts -print("Setup complete. Ready to explore GluonTS.") +# %% +_LOG = logging.getLogger(__name__) +gluonts.init_logger(_LOG) +_LOG.info("Setup complete. Ready to explore GluonTS.") # %% [markdown] # ## The GluonTS Workflow @@ -90,8 +104,6 @@ # Let's see this in action, starting with the simplest possible time series. 
# %% [markdown] -# --- -# # # Level 1: Sinusoid — The Simplest Pattern # # We begin with a pure sine wave plus a small amount of Gaussian noise. This is the easiest pattern a model can encounter: perfectly periodic, no trend, no regime changes. @@ -245,8 +257,6 @@ # %% [markdown] # > **Checkpoint:** You just completed the full GluonTS workflow — configure, train, forecast, visualize, evaluate. This is the same 5-step pattern for *every* GluonTS model. From here on we'll move faster since you know the drill. What changes is the **data** (increasing complexity) and the **model** (different architectures), not the workflow itself. # -# --- -# # # Level 2: Multi-Frequency — Adding Realism # # Real time series rarely consist of a single clean cycle. This synthetic series combines four components you'll encounter in real data: @@ -372,8 +382,6 @@ ) # %% [markdown] -# --- -# # # Level 3: Regime Change — The Hard Problem # # This is where things get interesting. The series behaves one way for the first half, then **abruptly shifts** to a different baseline, amplitude, and frequency. @@ -506,15 +514,11 @@ print(f" MAPE: {deepar_regime_metrics['mape']:.2f}%") # %% [markdown] -# --- -# # # Model Comparison # # Let's bring all results together. Each model was tested on the data type that best highlights its strengths and weaknesses. # %% -import pandas as pd - comparison = pd.DataFrame( { "Model": [ @@ -551,8 +555,6 @@ print(comparison.to_string(index=False, float_format="%.2f")) # %% [markdown] -# --- -# # # Summary # # ## What You Learned @@ -563,8 +565,6 @@ # 4. **Probabilistic output** — every forecast gives you means, medians, quantiles, and raw samples # 5. **Model choice matters** — the right model depends on your data's characteristics # -# --- -# # ## Which Model Should You Choose? # # | If your data has... 
| Try this model | Why | @@ -574,8 +574,6 @@ # | Regime shifts, unusual distributions | **DeepNPTS** | Non-parametric — adapts to changing behavior | # | No idea yet | **Start with SimpleFeedForward** | Fast to test, then try DeepAR for more accuracy | # -# --- -# # ## Quick Reference # # | Task | Code | @@ -585,8 +583,6 @@ # | 80% confidence interval | `forecast.quantile(0.1)` to `forecast.quantile(0.9)` | # | Raw sample paths | `forecast.samples` (shape: `num_samples × prediction_length`) | # -# --- -# # ## Tips for Better Results # # | Area | Tip | @@ -596,8 +592,6 @@ # | **Features** | More isn't always better — test with and without | # | **Data quality** | Handle missing values and normalize if needed | # -# --- -# # ## Troubleshooting # # | Problem | Fix | @@ -607,14 +601,10 @@ # | *Poor forecast quality* | Increase `context_length`, train longer, or try DeepNPTS for regime changes | # | *"Unexpected keyword argument"* | DeepAR: `trainer_kwargs={"max_epochs": N}`. DeepNPTS: `epochs=N`. SimpleFeedForward: no `freq` parameter | # -# --- -# # ## Resources # # - [GluonTS Documentation](https://ts.gluon.ai/) · [GitHub](https://github.com/awslabs/gluonts) · [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/) # -# --- -# # ## What's Next? # # Now that you understand how GluonTS works on clean synthetic data, move to **`GluonTS.example.ipynb`** to see these same models applied to real **COVID-19 case prediction** — with feature engineering, multiple covariates, scenario analysis, and all the messiness of real-world data. 
diff --git a/tutorials/GluonTS_COVID19_Prediction/GluonTS.example.py b/tutorials/GluonTS_COVID19_Prediction/GluonTS.example.py index 9b1ecf6d9..58d4f8ab8 100644 --- a/tutorials/GluonTS_COVID19_Prediction/GluonTS.example.py +++ b/tutorials/GluonTS_COVID19_Prediction/GluonTS.example.py @@ -6,7 +6,7 @@ # extension: .py # format_name: percent # format_version: '1.3' -# jupytext_version: 1.19.0 +# jupytext_version: 1.19.1 # kernelspec: # display_name: Python 3 (ipykernel) # language: python @@ -25,21 +25,31 @@ # **Models:** We compare DeepAR (complex patterns), SimpleFeedForward (fast baseline), and DeepNPTS (regime changes). # %% [markdown] -# --- -# -# ## 1. Setup and Imports +# ## 1. Setup and imports # # Let's get everything set up for our COVID-19 forecasting analysis. # %% +# %load_ext autoreload +# %autoreload 2 + +# System libraries. +import logging import warnings warnings.filterwarnings("ignore") -# All our utilities in one place - much cleaner! -import GluonTS_utils as gluonts +# Third party libraries. +import matplotlib.pyplot as plt +import numpy as np +import pandas as pd +import seaborn as sns -# Explicit imports for functions called without gluonts. prefix +# %% [markdown] +# ## GluonTS utilities + +# %% +import GluonTS_utils as gluonts from GluonTS_utils import ( train_deepar_covid, train_feedforward_covid, @@ -51,11 +61,12 @@ print_policy_insights, ) -print("Setup complete. Ready to forecast COVID-19 cases.") +# %% +_LOG = logging.getLogger(__name__) +gluonts.init_logger(_LOG) +_LOG.info("Setup complete. Ready to forecast COVID-19 cases.") # %% [markdown] -# --- -# # ## 2. Load and Explore COVID-19 Data # # Let's load our real COVID-19 data and take a look at what we're working with. @@ -116,8 +127,6 @@ # **Why these features?** Target (7-day MA) smooths reporting; deaths lag cases and correlate with outcomes; CFR indicates strain; mobility captures lockdown effects. # %% [markdown] -# --- -# # ## 3. 
Feature Engineering # # Our data pipeline has already engineered several features to improve model performance: @@ -148,8 +157,6 @@ ) # %% [markdown] -# --- -# # ## 4. Train All Three Models # # **Model choice:** DeepAR for complex wave patterns; SimpleFeedForward for a fast baseline; DeepNPTS for regime shifts across COVID variants. @@ -230,8 +237,6 @@ ) # %% [markdown] -# --- -# # ## 5. Compare Models # # Now that we've trained all three models, let's compare their performance! @@ -293,8 +298,6 @@ ) # %% [markdown] -# --- -# # ## 6. Scenario Analysis: Simulating Interventions # # One of the most powerful applications of forecasting is **scenario analysis** - @@ -372,8 +375,6 @@ print_policy_insights(scenario_results) # %% [markdown] -# --- -# # ## 7. Conclusions and Recommendations # # ### Key Takeaways @@ -438,8 +439,6 @@ # 4. **Scalability**: Use GPU acceleration for faster training # 5. **Interpretability**: Provide explanations alongside forecasts # -# --- -# # ## Congratulations! # # You've completed a full end-to-end COVID-19 forecasting application! @@ -455,8 +454,6 @@ # **Ready to apply these skills to your own forecasting problems?** # %% [markdown] -# --- -# # ## Additional Resources # # **GluonTS Documentation** diff --git a/tutorials/GluonTS_COVID19_Prediction/GluonTS_utils.py b/tutorials/GluonTS_COVID19_Prediction/GluonTS_utils.py index eb609085f..44a14f198 100644 --- a/tutorials/GluonTS_COVID19_Prediction/GluonTS_utils.py +++ b/tutorials/GluonTS_COVID19_Prediction/GluonTS_utils.py @@ -70,6 +70,33 @@ warnings.filterwarnings("ignore") +def init_logger(notebook_log: logging.Logger) -> None: + """ + Configure notebook display and route loggers to print for Jupyter output. + + Standalone tutorial images do not include the helpers package, so this + mirrors the essentials of helpers.hnotebook.config_notebook locally. 
+ """ + import seaborn as sns + + plt.rcParams["figure.figsize"] = (12, 6) + plt.rcParams["legend.fontsize"] = 12 + plt.rcParams["font.size"] = 12 + pd.set_option("display.max_rows", 500) + pd.set_option("display.max_columns", 500) + pd.set_option("display.width", 1000) + sns.set() + + def _info_print(msg: str, *args, **kwargs) -> None: + if args: + msg = msg % args + print(msg) + + notebook_log.info = _info_print # type: ignore[method-assign] + global _LOG + _LOG.info = _info_print # type: ignore[method-assign] + + # ############################################################################# # Analysis # ############################################################################# diff --git a/tutorials/GluonTS_COVID19_Prediction/README.md b/tutorials/GluonTS_COVID19_Prediction/README.md index 90b9809cf..659fd7321 100644 --- a/tutorials/GluonTS_COVID19_Prediction/README.md +++ b/tutorials/GluonTS_COVID19_Prediction/README.md @@ -1,3 +1,10 @@ +# Summary + +This directory is a hands-on GluonTS tutorial: synthetic fundamentals in +`GluonTS.API.ipynb`, a full COVID-19 forecasting pipeline in +`GluonTS.example.ipynb`, shared code in `GluonTS_utils.py`, and Docker scripts +for reproducible Jupyter. The README below explains setup, data, and layout. + # GluonTS Probabilistic Time Series Forecasting Welcome! This tutorial teaches you how to build probabilistic forecasting models with GluonTS. We use a **synthetic-data-first** approach: you'll learn GluonTS fundamentals with clean, interpretable synthetic data, then apply them to a real-world COVID-19 forecasting pipeline. @@ -5,6 +12,7 @@ Welcome! This tutorial teaches you how to build probabilistic forecasting models The tutorials are **interactive and focused on learning**. Implementation details live in reusable Python utilities so notebooks stay clean and readable—you focus on understanding *how* probabilistic forecasting works. 
**What you'll learn:** + - Building time series forecasts with uncertainty estimates - Comparing different GluonTS model architectures (DeepAR, SimpleFeedForward, DeepNPTS) - Synthetic data progression: sinusoid → multi-frequency → regime change @@ -14,8 +22,8 @@ The tutorials are **interactive and focused on learning**. Implementation detail ## Learning Path -1. **`GluonTS.API.ipynb`** — Start here. Learn GluonTS fundamentals with **synthetic data** (sinusoid, multi-frequency, regime change). No data download needed. Covers DeepAR, SimpleFeedForward, DeepNPTS. -2. **`GluonTS.example.ipynb`** — Real-world application. Full COVID-19 forecasting pipeline with JHU + mobility data, feature engineering, and scenario analysis. +1. **`GluonTS.API.ipynb`**: Start here. Learn GluonTS fundamentals with **synthetic data** (sinusoid, multi-frequency, regime change). No data download needed. Covers DeepAR, SimpleFeedForward, DeepNPTS. +2. **`GluonTS.example.ipynb`**: Real-world application. Full COVID-19 forecasting pipeline with JHU + mobility data, feature engineering, and scenario analysis. ## Getting Started @@ -26,6 +34,7 @@ The tutorials are **interactive and focused on learning**. Implementation detail **Automatic Download (Default)** Just run a notebook—it will: + 1. Check if data exists locally 2. Download any missing files from Google Drive 3. Continue with analysis @@ -37,64 +46,76 @@ No setup needed. Download from: https://drive.google.com/drive/folders/1qMDGBstdY8H2hYpz8xSolhzNOsVxNHMA Save to `data/` directory: -- `cases.csv` — COVID-19 confirmed cases -- `deaths.csv` — COVID-19 deaths -- `mobility.csv` — Mobility patterns + +- `cases.csv`: COVID-19 confirmed cases +- `deaths.csv`: COVID-19 deaths +- `mobility.csv`: Mobility patterns Or run: + ```bash -python GluonTS_utils.py +> python GluonTS_utils.py ``` ### Build and Run **Build Docker image:** + ```bash -./docker_build.sh +> ./docker_build.sh ``` + Takes ~1-2 minutes the first time, ~30 seconds after. 
**Start Jupyter:** + ```bash -./docker_jupyter.sh +> ./docker_jupyter.sh ``` + Opens at http://localhost:8888 **Or use interactive shell:** + ```bash -./docker_bash.sh +> ./docker_bash.sh ``` ### Files and Structure **Notebooks** -- `GluonTS.API.ipynb` — GluonTS fundamentals with synthetic data -- `GluonTS.example.ipynb` — COVID-19 end-to-end application + +- `GluonTS.API.ipynb`: GluonTS fundamentals with synthetic data +- `GluonTS.example.ipynb`: COVID-19 end-to-end application **Utilities** -- `GluonTS_utils.py` — Consolidated utilities: data I/O, download, preprocessing, - GluonTS conversion, model training, evaluation, visualization, synthetic data + +- `GluonTS_utils.py`: Consolidated utilities: data I/O, download, preprocessing, GluonTS conversion, model training, evaluation, visualization, synthetic data **Data** (auto-downloaded) -- `data/cases.csv` — Daily confirmed cases -- `data/deaths.csv` — Daily deaths -- `data/mobility.csv` — Mobility patterns + +- `data/cases.csv`: Daily confirmed cases +- `data/deaths.csv`: Daily deaths +- `data/mobility.csv`: Mobility patterns **Documentation** -- `blog_GluonTS.md` — Blog post covering GluonTS and COVID-19 forecasting + +- `blog_GluonTS.md`: Blog post covering GluonTS and COVID-19 forecasting **Docker** -- `Dockerfile` — Container setup -- `docker_build.sh` — Build image -- `docker_jupyter.sh` — Run Jupyter -- `docker_bash.sh` — Run shell -- `requirements.txt` — Python packages + +- `Dockerfile`: Container setup +- `docker_build.sh`: Build image +- `docker_jupyter.sh`: Run Jupyter +- `docker_bash.sh`: Run shell +- `requirements.txt`: Python packages ## Notebook Design The notebooks are organized for learning. Implementation details (data loading, plotting, model training) are in utility modules. Notebooks focus on the learning narrative—explanations, results, and insights. 
Instead of notebook cells with 20 lines of matplotlib code, you see: + ```python import GluonTS_utils as gluonts gluonts.plot_data_overview(train_df, test_df) @@ -156,15 +177,18 @@ flowchart TB ## Learning Resources -**GluonTS** +**GluonTS** + - [Official Documentation](https://ts.gluon.ai/) - [GitHub Repository](https://github.com/awslabs/gluonts) -**Research Papers** +**Research Papers** + - [DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks](https://arxiv.org/abs/1704.04110) - [Deep Neural Probabilistic Time Series](https://arxiv.org/abs/1906.05264) -**Data Sources** +**Data Sources** + - [JHU COVID-19 Data Repository](https://github.com/CSSEGISandData/COVID-19) - [Google COVID-19 Community Mobility Reports](https://www.google.com/covid19/mobility/) - [CDC COVID-19 Data Tracker](https://covid.cdc.gov/covid-data-tracker/)
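Review note on `init_logger` in `GluonTS_utils.py`: the helper replaces `Logger.info` with a print wrapper, which drops keyword arguments such as `exc_info` and bypasses log levels and formatters. A minimal sketch of an alternative follows, assuming the only goal is to make INFO messages appear in Jupyter cell output. The function name `route_logger_to_stdout` and the logger name `gluonts_tutorial_demo` are hypothetical and not part of this change; only standard-library `logging` calls are used.

```python
import io
import logging
import sys


def route_logger_to_stdout(
    logger: logging.Logger, stream=None, level: int = logging.INFO
) -> logging.Handler:
    """Hypothetical helper (not in the PR): emit records to `stream`.

    Defaults to sys.stdout, which Jupyter renders as normal cell output.
    """
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    # Print only the message text, matching what a bare print() would show.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    # Stop records from also reaching root-logger handlers (avoids duplicates).
    logger.propagate = False
    return handler


# Usage: write into a buffer here so the result can be inspected;
# in a notebook, omit `stream` to write to the cell output instead.
buffer = io.StringIO()
demo_log = logging.getLogger("gluonts_tutorial_demo")
route_logger_to_stdout(demo_log, stream=buffer)
demo_log.info("Setup complete. Ready to explore %s.", "GluonTS")
demo_log.debug("This DEBUG record is filtered out at INFO level.")
captured = buffer.getvalue()
print(captured, end="")
```

With this approach the logger object is left unmodified, so `%`-style arguments, levels, and `exc_info` keep their standard behavior. Whether that is preferable to mirroring `helpers.hnotebook.config_notebook` is a judgment call for the PR author.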