Reference implementation and experiment suite for STFD (Sliding Time-Frequency Decomposition), a decomposition-based preprocessing method for time series forecasting. The code is built on top of the official codebase of Unpacking the Trend (Kreuzer et al., DMKD 2025), extended with STFD and the additional baselines reported in the paper.
The forecasting backbones (DLinear, LSTM, TimeMixer, iTransformer) follow the standard preprocessing-then-forecast pipeline: a series is first decomposed into per-timestep features, and a backbone is trained on those features.
# 1. Install dependencies
pip install -r requirements.txt
# 2. Prepare datasets (see the "Datasets" section below).
# Files are expected under data/UTS/ (univariate) and data/MTS/ (multivariate).
# 3. Run a single experiment (manual mode)
python exp_local.py --algorithm iTransformer --dataset ett_h1 --decomp stfd \
--stfd_window_size 12 --stfd_kernel_size 13 --seed 0 --save
# 4. Run the full grid (batch mode)
python exp_local.py --batch

STFD turns a univariate signal X of length L into a stack of per-timestep
features (paper Algorithm 1):
- Trend: T = MovingAverage(X, kernel=k).
- Seasonal: S = X - T.
- For each sliding window of S (size w, stride 1), take the FFT within the window (dim=-1) and keep the first w//2 + 1 bins (a real signal's FFT is conjugate-symmetric, so the remaining bins are redundant), split into real and imaginary parts.
- Align X, T, S with the windows and concatenate [X, T, S, real, imag] as the per-timestep feature vector.
In the default rfft mode the feature count is n_features = 3 + 2 * (w//2 + 1), which equals w + 5 for even w and w + 4 for odd w (e.g. 17 for w = 12).
Multivariate inputs are handled channel-independently (each variable is
preprocessed separately), matching the Unpacking the Trend framework. Because
STFD features are not additive, the backbone output's first component (X) is
used as the forecast, via a dedicated reconstruction wrapper.
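
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the code in decomposition/stfd.py: the function names `moving_average` and `stfd_features` are hypothetical, the edge-replicating moving average is an assumption, and aligning X, T, S to the last timestep of each window is an assumption as well. It corresponds to the unpadded case, so the output has L - w + 1 rows.

```python
import numpy as np


def moving_average(x, k):
    """Centered moving average; edges are replicated so the output keeps len(x)."""
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, (pad_left, pad_right), mode="edge")
    return np.convolve(padded, np.ones(k) / k, mode="valid")


def stfd_features(x, w=12, k=13):
    """Return an array of shape (L - w + 1, 3 + 2 * (w // 2 + 1))."""
    x = np.asarray(x, dtype=float)
    trend = moving_average(x, k)
    seasonal = x - trend
    # All stride-1 windows of the seasonal part: shape (L - w + 1, w).
    windows = np.lib.stride_tricks.sliding_window_view(seasonal, w)
    # FFT within each window; rfft keeps the first w // 2 + 1 bins.
    spectrum = np.fft.rfft(windows, axis=-1)
    # Assumption: each window is aligned with its last timestep.
    aligned = slice(w - 1, None)
    return np.column_stack(
        [x[aligned], trend[aligned], seasonal[aligned], spectrum.real, spectrum.imag]
    )


features = stfd_features(np.sin(np.arange(100) / 3.0), w=12, k=13)
print(features.shape)
```

With L = 100 and w = 12 the sketch yields 89 rows and 17 columns, matching n_features = 3 + 2 * (w//2 + 1).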
.
├── README.md # this file
├── requirements.txt
│
├── config.py # dataset / decomposition settings (STFD params included)
├── data_reading.py # dataset loaders
├── forecasting_dataset.py # decomposition dispatch + train/val/test windowing
├── model_config.py
├── metrics.py
├── utility.py
│
├── decomposition/
│ ├── moving_average.py # local: moving-average decomposition
│ ├── fourier_bandlimited.py # local: band-limited Fourier
│ ├── fourier_topk.py # local: top-k Fourier
│ ├── wavelet.py # local: wavelet
│ ├── ewt.py # local: Empirical Wavelet Transform
│ ├── modwt.py # local: MODWT (via pywt.swt)
│ ├── vmd_fixed.py # local: VMD with fixed K
│ ├── stfd.py # STFD (the proposed method)
│ ├── decomposition_pytorch.py # in-graph moving-average (used by DLinear)
│ └── global_baselines.py # global wrappers: EMD/EEMD/CEEMDAN/VMD/EWT/SSA
│
├── models/
│ ├── DLinear/, LSTM/, TimeMixer/, iTransformer/ # forecasting backbones
│ ├── input_decomposition_wrapper.py # adds the 'first' reconstruction mode (for STFD)
│ ├── create_model.py # builds backbone + decomposition wrapper
│ ├── train_model.py # training / validation / prediction loop
│ └── losses.py, masking.py, tools.py
│
└── # ----- Experiments (main results) -----
├── exp_local.py # local comparison (STFD vs local baselines, leakage-free)
└── exp_global_ensemble.py # global decomposition-ensemble (EMD-LSTM family, with leakage)
# STFD, single run
python exp_local.py --algorithm iTransformer --dataset ett_h1 --decomp stfd \
--stfd_window_size 12 --stfd_kernel_size 13 --stfd_padding_mode replicate --seed 0
# Compare against other local decompositions
python exp_local.py --algorithm iTransformer --dataset ett_h1 --decomp wavelet --seed 0
python exp_local.py --algorithm iTransformer --dataset ett_h1 --decomp none --seed 0
# Persist results
python exp_local.py ... --save

STFD-specific flags: --stfd_window_size (w), --stfd_kernel_size (k),
--stfd_padding_mode (none | reflect | cyclic | replicate), and
--stfd_use_full_fft (0/1). Defaults are taken from config.py.
# Every (algorithm, dataset, decomposition) combination
python exp_local.py --batch
# A specific subset
python exp_local.py --batch --algorithms iTransformer TimeMixer \
--datasets ett_h1 exchange_rate illness --decomps stfd wavelet moving_avg \
--n_seeds 3

The local comparison defaults to four backbones (DLinear, LSTM, TimeMixer,
iTransformer) and nine decompositions (none, moving_avg,
fourier_bandlimited, fourier_topk, wavelet, ewt, modwt, vmd_fixed,
stfd).
python exp_local.py --ablation_stfd --datasets ett_h1 --algorithms iTransformer

The global protocol decomposes the full series before splitting (so it contains data leakage by construction), trains one backbone per IMF, and sums the forecasts. This is the canonical EMD-LSTM / VMD-LSTM setup.
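
To see why decomposing before splitting leaks information, the toy sketch below changes only the test segment of a series and checks whether the training-segment components move. A centered moving average stands in for EMD/VMD here (an assumption made to keep the example dependency-free), and `centered_trend` and the split index are hypothetical names and values.

```python
import numpy as np


def centered_trend(x, k=5):
    """Toy stand-in for a global decomposition: a centered moving average."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")
    return np.convolve(padded, np.ones(k) / k, mode="valid")


rng = np.random.default_rng(0)
series = rng.normal(size=200)
split = 150  # train = series[:split], test = series[split:]

altered = series.copy()
altered[split:] += 10.0  # modify the test segment only

# Global protocol: decompose the full series, then split.
global_a = centered_trend(series)[:split]
global_b = centered_trend(altered)[:split]

# Leakage-free alternative: decompose only the data before the split.
local_a = centered_trend(series[:split])
local_b = centered_trend(altered[:split])

leaks_global = not np.allclose(global_a, global_b)
leaks_local = not np.allclose(local_a, local_b)
print(leaks_global, leaks_local)
```

In this toy case only the last few training points change, because the moving average has a short support. Decompositions such as EMD use the whole series, so the effect there is not confined to the boundary.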
python exp_global_ensemble.py --batch
python exp_global_ensemble.py --batch --decomps emd_global vmd_global ceemdan_global

Decomposition is seed-independent and expensive (e.g. CEEMDAN), so results are
cached under cache_global_decomp/. Set STFD_GLOBAL_WORKERS=N to parallelise
the per-series decompositions across N processes.
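
A seed-independent cache of this kind could look roughly like the sketch below. The helper names `cache_path` and `cached_decompose`, the hashed file name, and the fallback to a single process when STFD_GLOBAL_WORKERS is unset are all assumptions for illustration; the actual cache layout in exp_global_ensemble.py may differ.

```python
import hashlib
import os
import pickle
from concurrent.futures import ProcessPoolExecutor

CACHE_DIR = "cache_global_decomp"


def cache_path(dataset, decomp, params, cache_dir=CACHE_DIR):
    """Build a file name from everything except the seed (hypothetical scheme)."""
    key = repr((dataset, decomp, sorted(params.items())))
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]
    return os.path.join(cache_dir, f"{dataset}_{decomp}_{digest}.pkl")


def cached_decompose(series_list, decompose, dataset, decomp, params,
                     cache_dir=CACHE_DIR):
    """Decompose every series once and reuse the stored result on later calls."""
    path = cache_path(dataset, decomp, params, cache_dir)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)

    # Assumption: an unset variable means "run in the current process".
    workers = int(os.environ.get("STFD_GLOBAL_WORKERS", "1"))
    if workers > 1:
        # `decompose` must be a picklable, module-level function here.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            result = list(pool.map(decompose, series_list))
    else:
        result = [decompose(s) for s in series_list]

    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result
```

Because the seed is not part of the key, repeated runs with different seeds reuse the same decomposition file.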
Results are written as pickle files under results/ (see the "Result format"
section below for the exact structure).
Place dataset files under data/UTS/ (univariate) and data/MTS/
(multivariate). The default batch run evaluates the datasets listed in
config.py (dataset_names):
Univariate: m4_h (M4 Hourly), m4_m (M4 Monthly), m4_y (M4 Yearly).
Multivariate: ett_h1, ett_h2, ett_m1, ett_m2 (ETT), weather,
exchange_rate, illness.
data_reading.py supports additional datasets (e.g. m4_w, m4_d, m4_q,
nn5, tourism, m3_*) that can be requested explicitly via --datasets. If
you add a new dataset to dataset_names, also extend the periods,
stride_lengths, d_model, and d_ff maps in config.py.
To keep runtimes manageable, the large M4 subsets (m4_m, m4_d, m4_q) are
deterministically sub-sampled to 2,000 series (read_m4(path, max_series=2000),
random_state=42). Set max_series=None in data_reading.py to use the full
sets.
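
Deterministic sub-sampling of this kind can be sketched as below. This is an illustration only: `subsample_series` is a hypothetical helper, and the use of Python's `random.Random` is an assumption, so it will not select the same series as read_m4 in data_reading.py.

```python
import random


def subsample_series(series_list, max_series=2000, random_state=42):
    """Keep at most `max_series` items, chosen reproducibly from a fixed seed."""
    if max_series is None or len(series_list) <= max_series:
        return list(series_list)
    rng = random.Random(random_state)
    # Sort the sampled indices so the kept series stay in their original order.
    keep = sorted(rng.sample(range(len(series_list)), max_series))
    return [series_list[i] for i in keep]


subset = subsample_series(list(range(48000)), max_series=2000)
print(len(subset))
```

Passing max_series=None returns every series, mirroring the full-set option described above.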
Dataset sources: ETT (https://github.com/zhouhaoyi/ETDataset), Exchange/Weather/Illness (https://github.com/thuml/Autoformer), M4 (https://github.com/Mcompetitions/M4-methods), and the Monash time series forecasting repository for the remaining series.
results/exp_local/{algorithm}/{dataset}.pkl
-> {decomp_name: {mse, mae, mape, smape, mase, owa, std_*, n_params,
train_time, inference_time, effective_input_length,
horizon, backhorizon, decomp_params}}
results/exp_global_ensemble/{algorithm}/{dataset}.pkl
-> same shape, plus 'is_global', 'is_ensemble', 'n_imf_trained',
'n_imf_intended', and 'leakage_caveat'
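
Given that layout, a result file can be read back and ranked with a few lines of Python. The helpers `load_local_results` and `rank_by` are hypothetical and assume each entry is a dict keyed by the metric names listed above. Only unpickle files you produced yourself, since pickle can execute arbitrary code on load.

```python
import pickle
from pathlib import Path


def load_local_results(algorithm, dataset, root="results"):
    """Load results/exp_local/{algorithm}/{dataset}.pkl as a dict."""
    path = Path(root) / "exp_local" / algorithm / f"{dataset}.pkl"
    with path.open("rb") as f:
        return pickle.load(f)


def rank_by(results, metric="mse"):
    """Return (decomp_name, value) pairs sorted from best (lowest) to worst."""
    pairs = [(name, entry[metric]) for name, entry in results.items()]
    return sorted(pairs, key=lambda pair: pair[1])
```

For example, `rank_by(load_local_results("iTransformer", "ett_h1"), "mae")` would list the decompositions by MAE once the corresponding run has been saved with --save.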
The sliding FFT is taken within each window (dim=-1). The original
upstream code applied the FFT along the window-position axis (dim=1); that is
corrected here so the transform matches a proper short-time Fourier transform.
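
The difference between the two axes is easy to check on a toy signal. In the sketch below the windows form a 2-D array (positions, w), so the window-position axis is axis 0 rather than dim=1 as in a batched tensor; the test signal and variable names are illustrative assumptions.

```python
import numpy as np

w = 12
# A sinusoid with period w: every window of length w holds exactly one cycle.
signal = np.sin(2 * np.pi * np.arange(64) / w)
windows = np.lib.stride_tricks.sliding_window_view(signal, w)  # (53, 12)

# FFT within each window: one spectrum of w // 2 + 1 bins per position.
within = np.fft.rfft(windows, axis=-1)

# FFT along the window-position axis: mixes different timesteps together.
across = np.fft.rfft(windows, axis=0)

# Within-window spectra peak at bin 1 (one cycle per window) at every position.
peak_bins = np.abs(within).argmax(axis=-1)
print(within.shape, across.shape, np.unique(peak_bins))
```

Only the within-window transform gives each timestep its own local spectrum, which is what the per-timestep feature vector requires.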
With padding_mode='none' the effective input length shrinks to L - w + 1
(the paper's original behaviour). The padding modes (reflect, cyclic,
replicate) preserve length L, which keeps the input comparable to the other
baselines; replicate is used for the paper's main results.
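
The effect of the padding modes on the number of windows can be sketched as follows. The mapping from the mode names to `numpy.pad` modes and the choice to pad w - 1 samples on the left are assumptions for illustration; the implementation in decomposition/stfd.py may pad differently.

```python
import numpy as np

# Assumed mapping from the mode names above to numpy.pad modes.
PAD_MODES = {"reflect": "reflect", "cyclic": "wrap", "replicate": "edge"}


def sliding_windows(x, w, padding_mode="none"):
    """Return stride-1 windows of size w: (L - w + 1, w) unpadded, (L, w) padded."""
    x = np.asarray(x, dtype=float)
    if padding_mode != "none":
        # Assumption: pad on the left so that window i ends at timestep i.
        x = np.pad(x, (w - 1, 0), mode=PAD_MODES[padding_mode])
    return np.lib.stride_tricks.sliding_window_view(x, w)


x = np.arange(20)
for mode in ("none", "reflect", "cyclic", "replicate"):
    print(mode, sliding_windows(x, 12, mode).shape)
```

With L = 20 and w = 12, the unpadded mode yields 9 windows (L - w + 1) and each padded mode yields 20.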
See requirements.txt. Beyond the upstream dependencies, the additional
baselines require ewtpy (EWT), vmdpy (VMD), and EMD-signal (EEMD/CEEMDAN).