Integrating Nowcasts into an Ensemble of Data-Driven Forecasting Models for SARI Hospitalizations in Germany

Replication package

Daniel Wolffram, Johannes Bracher

Repository Structure

- code/: Python project (primary codebase)
  - src/: reusable Python modules
  - *.ipynb: Jupyter notebooks for tuning, training, evaluation, and plotting
  - pyproject.toml, uv.lock: Python environment
- r/: R project (separate renv environment)
  - hhh4/, tscount/, persistence/: model-specific scripts
  - nowcasting/: nowcast computation
  - illustrations/: visualizations
  - renv.lock, .Rprofile: R environment
- data/: input datasets
- figures/: generated plots
- forecasts/: generated forecasts
- nowcasts/: generated nowcasts
- results/: generated results
  - scores/: evaluation metrics
  - tuning/: hyperparameter tuning results

Environments

Python code lives in code/. R code lives in r/, with its own environment. Shared inputs and outputs (data/, forecasts/, nowcasts/, figures/, results/) live at the repo root and are accessible from both Python and R.
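As a rough illustration of how Python code can reach the shared folders regardless of the working directory, the sketch below walks upwards until it finds the repository root. The helper names `find_repo_root` and `shared_dir`, and the choice of `README.md` as the marker file, are assumptions made for this example and are not taken from the modules in src/.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = "README.md") -> Path:
    """Walk upwards from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No directory containing {marker!r} at or above {start}")


def shared_dir(name: str, start: Optional[Path] = None) -> Path:
    """Return the path of a shared top-level folder such as data/ or forecasts/."""
    root = find_repo_root(start or Path.cwd())
    return root / name
```

Called from a notebook in code/, `shared_dir("data")` would then resolve to the data/ folder at the repository root, provided the marker file exists there.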

Python

The project uses uv to manage the Python environment.
Install uv on your system as follows:

Linux / macOS:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows:

irm https://astral.sh/uv/install.ps1 | iex

(In case of problems, please refer to the official installation guide.)

Once uv is installed, set up the environment from the repository root:

uv sync

This will create a local .venv/ and install all dependencies specified in pyproject.toml and uv.lock. It will also automatically install the required Python version if it is not already available on your system.
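To confirm that code is actually running inside the environment created by uv sync, a small check along the following lines may help. This is a sketch: the function name `environment_report` is hypothetical, and it relies only on the standard library.

```python
import platform
import sys
from pathlib import Path
from typing import Optional


def environment_report(prefix: Optional[str] = None, base_prefix: Optional[str] = None) -> dict:
    """Summarize which interpreter is running and whether it lives in a virtual environment."""
    active = Path(prefix or sys.prefix)
    base = Path(base_prefix or sys.base_prefix)
    return {
        "python_version": platform.python_version(),
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": active != base,
        # uv names the project environment .venv by default.
        "env_dir_name": active.name,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Saved under a file name of your choice and started with uv run, the script should report `in_virtualenv: True` and `env_dir_name: .venv` if the setup worked.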

To run the notebooks with this environment, you must first register it as a Jupyter kernel:

uv run -m ipykernel install --user --name=replication-sari

For interactive use, you can start JupyterLab inside the managed environment:

uv run jupyter lab

This provides a browser-based interface, useful if you don't have a preferred IDE installed. After launching, select the kernel replication-sari when opening notebooks.
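If the kernel does not show up in the notebook interface, the registration can be checked programmatically. The sketch below parses the output of `jupyter kernelspec list --json`; the function names are hypothetical, and the check assumes that the jupyter command is on the PATH (for example when the script is started with uv run).

```python
import json
import shutil
import subprocess


def kernel_names(kernelspec_json: str) -> list[str]:
    """Extract kernel names from the output of `jupyter kernelspec list --json`."""
    payload = json.loads(kernelspec_json)
    return sorted(payload.get("kernelspecs", {}))


def is_kernel_registered(name: str = "replication-sari") -> bool:
    """Ask the local Jupyter installation whether a kernel called `name` exists."""
    if shutil.which("jupyter") is None:
        return False  # jupyter is not available in this environment
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return False
    return name in kernel_names(result.stdout)


if __name__ == "__main__":
    print("replication-sari registered:", is_kernel_registered())
```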

R

To ensure reproducibility, please use R 4.5.1. Dependencies are managed with renv. From the r/ folder, restore the environment with:

R -e "install.packages('renv'); renv::restore()"

This will restore all R package dependencies as specified in renv.lock.
⚠️ Unlike uv, renv does not install R itself — you must install R 4.5.1 manually.

Note: The repository includes .Rprofile files (at both the root and in r/) that automatically activate the correct renv environment and anchor the here package to the repository root. This ensures that paths like here("data", ...) always work consistently, whether you open the whole repo or just the R subproject.

Running the Pipeline

The repository contains a helper script run_pipeline.py that orchestrates the execution of all notebooks and R scripts in a defined order. This ensures reproducibility of results and allows running the full pipeline or just selected parts of it.
(If preferred, you can also open and run the individual notebooks or R scripts manually.)

Pipeline structure

The pipeline runs through the following stages:

- exploration: Exploratory data analysis and visualization.
  - plot_sari.ipynb: visualize SARI data
  - plot_ari.ipynb: visualize ARI data
  - plot_delays.ipynb: analyze reporting delays
  - autocorrelation.ipynb: investigate correlation structure of time series
- nowcasts: Real-time estimation of current case counts.
  - nowcasting/compute_nowcasts.R
- tuning: Hyperparameter tuning for machine learning models (⚠️ may take several days).
  - tuning_lightgbm.ipynb
  - tuning_tsmixer.ipynb
- forecasts: Generate forecasts with different model variants.
  - baseline_historical.ipynb: historical baseline model
  - compute_forecasts.ipynb: compute ML-based forecasts
  - persistence/persistence.R: persistence baseline
  - hhh4/hhh4_default.R, hhh4/hhh4_exclude_covid.R, hhh4/hhh4_naive.R, hhh4/hhh4_oracle.R, hhh4/hhh4_shuffle.R, hhh4/hhh4_skip.R, hhh4/hhh4_vincentization.R: hhh4 model variants
  - tscount/tscount_extended.R, tscount/tscount_simple.R: tscount models
- ensemble: Combine forecasts into an ensemble.
  - compute_ensemble.R
- scores: Compute forecast evaluation scores.
  - compute_scores.ipynb
- evaluation: Final visualization and evaluation of forecasts.
  - plot_nowcasts.ipynb
  - plot_forecasts.ipynb
  - evaluation.ipynb
  - evaluation_quantiles.ipynb
  - diebold_mariano.ipynb
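The stage layout above can be pictured as an ordered mapping from stage names to tasks. The following is an illustrative sketch only and not the contents of run_pipeline.py: the names `STAGES` and `build_command`, the use of `jupyter nbconvert` to execute notebooks, and the assumption that all notebooks sit directly in code/ and all R scripts (including compute_ensemble.R) in r/ are choices made for this example.

```python
from pathlib import Path

# Stage and file names follow the list above; the dictionary layout is an assumption.
STAGES: dict[str, list[str]] = {
    "exploration": [
        "plot_sari.ipynb", "plot_ari.ipynb", "plot_delays.ipynb", "autocorrelation.ipynb",
    ],
    "nowcasts": ["nowcasting/compute_nowcasts.R"],
    "tuning": ["tuning_lightgbm.ipynb", "tuning_tsmixer.ipynb"],
    "forecasts": [
        "baseline_historical.ipynb", "compute_forecasts.ipynb",
        "persistence/persistence.R",
        "hhh4/hhh4_default.R", "hhh4/hhh4_exclude_covid.R", "hhh4/hhh4_naive.R",
        "hhh4/hhh4_oracle.R", "hhh4/hhh4_shuffle.R", "hhh4/hhh4_skip.R",
        "hhh4/hhh4_vincentization.R",
        "tscount/tscount_extended.R", "tscount/tscount_simple.R",
    ],
    "ensemble": ["compute_ensemble.R"],
    "scores": ["compute_scores.ipynb"],
    "evaluation": [
        "plot_nowcasts.ipynb", "plot_forecasts.ipynb", "evaluation.ipynb",
        "evaluation_quantiles.ipynb", "diebold_mariano.ipynb",
    ],
}


def build_command(task: str) -> list[str]:
    """Translate a task file into a command line, depending on its file type."""
    path = Path(task)
    if path.suffix == ".ipynb":
        # Execute the notebook and write the outputs back into the same file.
        target = (Path("code") / path).as_posix()
        return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", target]
    if path.suffix == ".R":
        return ["Rscript", (Path("r") / path).as_posix()]
    raise ValueError(f"Unsupported task type: {task}")
```

Because Python dictionaries preserve insertion order, iterating over `STAGES` visits the stages in the order listed above.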

Usage

The pipeline can be executed with different options from the repository root.
(We use uv run instead of python to ensure the script is executed inside the correct environment managed by uv.)

Run the entire pipeline
```
uv run run_pipeline.py
```

Run a single stage

uv run run_pipeline.py --stage evaluation

Run a contiguous range of stages

uv run run_pipeline.py --start forecasts --end scores

Run everything except selected stages
```
uv run run_pipeline.py --skip tuning
```

⚠️ Note: The tuning stage can take a very long time (several days). If you do not want to run it, use --skip tuning.
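How the options above narrow down the list of stages can be sketched with argparse. This is an assumption about the selection logic, not the code of run_pipeline.py: the function names are hypothetical, and it is assumed here that --skip accepts one or more stage names.

```python
import argparse

STAGE_ORDER = ["exploration", "nowcasts", "tuning", "forecasts", "ensemble", "scores", "evaluation"]


def build_parser() -> argparse.ArgumentParser:
    """Define the four options described in the usage section."""
    parser = argparse.ArgumentParser(description="Select pipeline stages (illustrative sketch).")
    parser.add_argument("--stage", choices=STAGE_ORDER, help="run a single stage")
    parser.add_argument("--start", choices=STAGE_ORDER, help="first stage of a range")
    parser.add_argument("--end", choices=STAGE_ORDER, help="last stage of a range")
    parser.add_argument("--skip", nargs="+", choices=STAGE_ORDER, default=[], help="stages to leave out")
    return parser


def select_stages(args: argparse.Namespace) -> list[str]:
    """Return the stages to run, in pipeline order."""
    if args.stage:
        return [args.stage]
    first = STAGE_ORDER.index(args.start) if args.start else 0
    last = STAGE_ORDER.index(args.end) if args.end else len(STAGE_ORDER) - 1
    if first > last:
        raise ValueError("--start must not come after --end")
    return [stage for stage in STAGE_ORDER[first:last + 1] if stage not in args.skip]


# Example: the range from the usage section above.
example = select_stages(build_parser().parse_args(["--start", "forecasts", "--end", "scores"]))
print(example)
```

In this sketch --stage takes precedence over the other options, and --skip is applied after the range has been determined.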

Requirement: correct R version

When running the pipeline, make sure that the Rscript command points to the correct R version (4.5.1).
On some systems, the default Rscript may refer to an older version of R.

You can check this with:

Rscript --version
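The same check can be automated before starting a long run. The sketch below calls Rscript and compares the reported version with 4.5.1; the function names are hypothetical, and the regular expression assumes that the version banner contains the word "version" followed by a three-part version number.

```python
import re
import shutil
import subprocess
from typing import Optional

REQUIRED_R_VERSION = "4.5.1"


def parse_r_version(banner: str) -> Optional[str]:
    """Pull a version number such as 4.5.1 out of the Rscript version banner."""
    match = re.search(r"version (\d+\.\d+\.\d+)", banner)
    return match.group(1) if match else None


def rscript_version() -> Optional[str]:
    """Return the version of the Rscript found on the PATH, or None if unavailable."""
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(["Rscript", "--version"], capture_output=True, text=True, check=False)
    # The banner may appear on either stream depending on the R release, so search both.
    return parse_r_version(result.stdout + result.stderr)


if __name__ == "__main__":
    found = rscript_version()
    if found == REQUIRED_R_VERSION:
        print(f"Rscript {found} matches the required version.")
    else:
        print(f"Expected R {REQUIRED_R_VERSION}, but found: {found}")
```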
