Variable series length support for foundation models #3125

Draft
Kurokabe wants to merge 1 commit into master from variable_length_dataset

@Kurokabe
Collaborator

Hi @dennisbader and @daidahao ,

Here is my draft PR to support variable-length fine-tuning and inference on foundation models.

The main changes are:

  • VariableLengthTorchTrainingDataset (new class in training_dataset.py): a TorchTrainingDataset subclass that accepts series shorter than input_chunk_length by left-padding the past window with NaN. This allows fit_from_dataset() to handle heterogeneous datasets without requiring per-window input_chunk_length tuning or silently dropping short series. Covariates and sample weights are intentionally not supported for now.

  • FoundationModel._build_inference_dataset override (new method in foundation_model.py): transparently left-pads short series with NaN before passing them to SequentialTorchInferenceDataset, so that predict() works on short series without any manual pre-processing from callers. The padding logic mirrors what VariableLengthTorchTrainingDataset does during training.
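The left-padding that both bullets describe can be sketched with NumPy alone. This is a minimal illustration, not the code in this PR: `left_pad_with_nan` is a hypothetical helper, and it assumes the values are laid out as a `(time, components)` array.

```python
import numpy as np


def left_pad_with_nan(values: np.ndarray, input_chunk_length: int) -> np.ndarray:
    """Left-pad a (time, components) array with NaN up to input_chunk_length.

    Hypothetical helper for illustration only; not part of Darts.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        # Treat a 1-D array as a single-component series.
        values = values[:, None]
    n_missing = input_chunk_length - values.shape[0]
    if n_missing <= 0:
        # Long enough already: keep only the most recent window.
        return values[-input_chunk_length:]
    padding = np.full((n_missing, values.shape[1]), np.nan)
    return np.concatenate([padding, values], axis=0)


short_series = np.array([1.0, 2.0, 3.0])
padded = left_pad_with_nan(short_series, input_chunk_length=5)
```

Here `padded` has five time steps: two NaN rows followed by the three observed values. A series that is already long enough is simply cropped to its last `input_chunk_length` steps.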

Note that, for now, only inference has been tested end-to-end. dev_fev_tasks_mini_validation.ipynb and fev_tasks_mini.yaml are development artifacts that I've included in case you want to reproduce the validation runs; they will be removed before merging.

One thing I can't fully explain: the notebook compares three approaches, and step 1 produces different results from steps 2 and 3. Step 1 uses an adaptive input_chunk_length, processed window-by-window. Step 2 uses a fixed input_chunk_length=32 with manual NaN pre-padding before fit(), also processed window-by-window. Step 3 uses VariableLengthTorchTrainingDataset with the same fixed input_chunk_length=32 in a single pass over all series. Steps 2 and 3 match each other exactly, which validates that VariableLengthTorchTrainingDataset is equivalent to manual pre-padding. Since the only difference between step 1 and the other two is the input_chunk_length value used per window, the discrepancy is likely a context-length effect rather than a batching artefact, but I'm not certain. Do you have any insight on this?
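One way to quantify how far step 1 drifts from steps 2 and 3 is to compare the forecast arrays directly. The following is a rough sketch, assuming each step's forecasts have been exported as NumPy arrays of the same shape; `forecast_gap` and the toy arrays are hypothetical and not taken from the notebook.

```python
import numpy as np


def forecast_gap(a: np.ndarray, b: np.ndarray) -> float:
    """Largest absolute difference between two forecast arrays.

    Positions where both arrays are NaN count as equal. If only one side
    is NaN, the result is NaN, which flags the mismatch.
    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    if a.size == 0:
        return 0.0
    diff = np.abs(a - b)
    diff[np.isnan(a) & np.isnan(b)] = 0.0
    return float(np.max(diff))


# Toy values standing in for the three steps' forecasts.
step1 = np.array([1.0, 2.5, 3.0])
step2 = np.array([1.0, 2.0, 3.0])
step3 = step2.copy()

gap_2_vs_3 = forecast_gap(step2, step3)  # identical arrays
gap_1_vs_2 = forecast_gap(step1, step2)  # differs at one position
```

A gap of exactly zero between steps 2 and 3 corresponds to the "match each other exactly" observation, while the size of the step 1 gap gives a sense of how strong the context-length effect might be.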

One idea I had for a potential follow-up: instead of a dedicated VariableLengthTorchTrainingDataset, we could relax the short-series validation in ShiftedTorchTrainingDataset (currently a hard error in _get_end_of_output_idx) and handle the NaN padding in a collate_fn passed to the DataLoader.
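The collate_fn idea could look roughly like the following. This is a sketch under stated assumptions: each sample is taken to be a `(past_target, future_target)` pair of NumPy arrays shaped `(time, components)`, which simplifies what the Darts training datasets actually return, and `make_nan_padding_collate` is a hypothetical name. Conversion to tensors is left out.

```python
import numpy as np


def make_nan_padding_collate(input_chunk_length: int):
    """Build a collate function that left-pads each past window with NaN.

    Hypothetical sketch; the (past, future) sample layout is an assumption.
    """

    def collate(batch):
        past_windows, future_windows = [], []
        for past, future in batch:
            # Keep at most the last input_chunk_length steps of the past window.
            past = np.asarray(past, dtype=float)[-input_chunk_length:]
            n_missing = input_chunk_length - past.shape[0]
            padding = np.full((n_missing, past.shape[1]), np.nan)
            past_windows.append(np.concatenate([padding, past], axis=0))
            future_windows.append(np.asarray(future, dtype=float))
        # Stack into (batch, time, components) arrays.
        return np.stack(past_windows), np.stack(future_windows)

    return collate


collate = make_nan_padding_collate(input_chunk_length=4)
batch = [
    (np.ones((2, 1)), np.zeros((3, 1))),  # short series: needs 2 NaN rows
    (np.ones((4, 1)), np.zeros((3, 1))),  # full-length series: no padding
]
past_batch, future_batch = collate(batch)
```

With PyTorch, a function like this would be passed as `DataLoader(dataset, collate_fn=collate)`. The dataset would still have to emit the short windows, which is where relaxing the validation in ShiftedTorchTrainingDataset comes in.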

Let me know what you think :)

…gth inputs and pre-pad smaller inputs during inference for foundation models
@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@codecov

codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 19.73684% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.15%. Comparing base (40af46d) to head (e71a8e2).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...arts/utils/data/torch_datasets/training_dataset.py | 9.23% | 59 Missing ⚠️ |
| darts/models/forecasting/foundation_model.py | 81.81% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3125      +/-   ##
==========================================
- Coverage   96.54%   96.15%   -0.39%     
==========================================
  Files         160      160              
  Lines       17261    17361     +100     
==========================================
+ Hits        16664    16693      +29     
- Misses        597      668      +71     

