Code repository for ATV (Adaptive Task Vectors).
To run this code, create and activate a conda environment using the provided environment.yaml file:
conda env create -f environment.yaml
conda activate ATV
./scripts/prepare_dataset.sh
./scripts/ATV_training.sh
Running the above script trains the model on all 20 in-domain datasets. After training, evaluation is performed on the test samples from all in-domain datasets.
--batch_size N: Groups N samples for GPT-2 forward.
--llama_batch: Additionally batches LLaMA forward (N samples × 3 templates at once). FP16 numerical differences cause slightly different training trajectories.
--logits_to_keep: Computes only final-token logits on compatible last-token inference paths. Adaptive training loss keeps full logits to preserve the original teacher-forcing objective.
--gradient_checkpointing: Enables LLaMA activation checkpointing during training. This substantially reduces memory by recomputing LLaMA activations during backward, with extra compute cost.
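To give a rough sense of why keeping only the final-token logits saves memory, the sketch below estimates the size of an FP16 logits tensor with and without that restriction. The sequence length and vocabulary size are assumed values chosen for illustration, not settings read from this repository, and `logits_bytes` is a hypothetical helper rather than part of the codebase.

```python
def logits_bytes(rows, seq_len, vocab_size, bytes_per_value=2, keep_last_only=False):
    """Approximate size in bytes of a logits tensor shaped (rows, positions, vocab_size)."""
    positions = 1 if keep_last_only else seq_len
    return rows * positions * vocab_size * bytes_per_value


# Hypothetical settings for illustration only.
batch_size = 2       # --batch_size 2
templates = 3        # --llama_batch forwards N samples x 3 templates at once
seq_len = 512        # assumed prompt length in tokens
vocab_size = 32000   # assumed vocabulary size

rows = batch_size * templates
full = logits_bytes(rows, seq_len, vocab_size)
last = logits_bytes(rows, seq_len, vocab_size, keep_last_only=True)

print(f"full logits:       {full / 2**20:.1f} MiB")
print(f"last-token logits: {last / 2**20:.3f} MiB")
```

Under these assumptions the saving scales with the sequence length, since only one position per row is projected onto the vocabulary. This estimate covers the logits tensor only and says nothing about activation memory, which is what --gradient_checkpointing targets.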
# GPT-2 batch only
python ATV_training.py ... --batch_size 2
# GPT-2 + LLaMA batch
python ATV_training.py ... --batch_size 2 --llama_batch
# Memory-focused mode
python ATV_training.py ... --batch_size 2 --llama_batch --gradient_checkpointing
./scripts/ATV_evaluate.sh
Running the above script enables evaluation of performance on each individual dataset within the full collection.
--batch_size N: Batches N samples for GPT-2 and LLaMA forward simultaneously.
--logits_to_keep: Computes only the final-token logits for last-token evaluation. This saves memory but can introduce tiny numerical differences.
python ATV_evaluate.py ... --batch_size 4
python ATV_analysis.py
This script enables evaluation of performance for each category.
Please make sure to modify the result_dirs variable in ATV_analysis.py to match the path to your result directory.
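A quick way to catch a wrong path before running the analysis is to check that every entry exists. The sketch below assumes result_dirs is a list of directory paths; the actual type in ATV_analysis.py may differ, and both the example paths and the `missing_dirs` helper are hypothetical.

```python
from pathlib import Path


def missing_dirs(result_dirs):
    """Return the entries of result_dirs that are not existing directories."""
    return [d for d in result_dirs if not Path(d).is_dir()]


# Hypothetical paths; replace them with your own result directories.
result_dirs = ["results/run_1", "results/run_2"]

missing = missing_dirs(result_dirs)
if missing:
    print("Missing result directories:", missing)
```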
python ATV_analysis.py
For unseen data, run ATV_unseen.py to perform the evaluation.
As above, make sure to set the correct paths accordingly.
This repository is built on top of the ELICIT project. We thank the authors for sharing their source code and for the work itself.