Code for the paper:

Understanding the staged dynamics of transformers in latent structure learning.

This codebase contains the code to train and evaluate transformer-based models on the DM Alchemy dataset.

Installation

This project is built as a Python package (dm-alchemy), along with scripts for data processing and model training.

To install the environment and the required package dependencies, run:

pip install -e .

To run the model training scripts, you will also need the following deep learning libraries:

pip install torch accelerate wandb tqdm

Usage

Training Models

The main entry point for training is src/models/train.py. The script uses argparse to configure the dataset, model architecture, training loop, and optimizer.
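As a rough illustration of an argparse-driven entry point, the sketch below declares a subset of the flags used in this README's example command. It is not the contents of src/models/train.py: the defaults, the `build_parser` function, and the `str2bool` helper are assumptions made for illustration. The helper is included because argparse's built-in `type=bool` treats any non-empty string, including "False", as true, so a flag passed as `--is_held_out_color_exp True` needs an explicit string-to-boolean conversion.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to a real boolean (hypothetical helper)."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y"}:
        return True
    if lowered in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the README's example command; defaults are illustrative only.
    parser = argparse.ArgumentParser(description="Sketch of a train.py-style CLI")
    parser.add_argument("--task_type", type=str, default="classification")
    parser.add_argument("--model_architecture", type=str, default="decoder")
    parser.add_argument("--model_size", type=str, default="xsmall")
    parser.add_argument("--is_held_out_color_exp", type=str2bool, default=False)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--epochs", type=int, default=1000)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    parser.add_argument("--weight_decay", type=float, default=0.001)
    parser.add_argument("--eta_min", type=float, default=7e-5)
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--is_held_out_color_exp", "True", "--learning_rate", "1e-4", "--eta_min", "7e-5"]
)
print(args.is_held_out_color_exp, args.learning_rate, args.eta_min)
```

Keeping every hyperparameter as a plain `--name value` flag is also what lets external tools, such as a sweep agent, drive the script without code changes.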

Example Command (Held-out Experiment):

python src/models/train.py \
    --task_type classification \
    --model_architecture decoder \
    --model_size xsmall \
    --is_held_out_color_exp True \
    --train_data_path src/data/shuffled_held_out_exps_generated_data_enhanced/compositional_chemistry_samples_167424_80_unique_stones_train_shop_1_qhop_1_single_held_out_color_4_edges_exp.json \
    --val_data_path src/data/shuffled_held_out_exps_generated_data_enhanced/compositional_chemistry_samples_167424_80_unique_stones_val_shop_1_qhop_1_single_held_out_color_4_edges_exp.json \
    --seed 42 \
    --epochs 1000 \
    --batch_size 32 \
    --learning_rate 1e-4 \
    --weight_decay 0.001 \
    --eta_min 7e-5 \
    --wandb_project <enter-your-wandb-project-name>

Note: Setting --is_held_out_color_exp True enables the held_out experiment type (exp_typ). For multi-GPU training, launch the script with accelerate launch.

The src/data directory contains the JSON files for the other tasks and hops.
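The example command sets both --learning_rate 1e-4 and --eta_min 7e-5, which suggests a learning-rate schedule with a floor. This README does not say which scheduler train.py uses. The sketch below assumes cosine annealing, the schedule implemented by PyTorch's `CosineAnnealingLR`, whose floor parameter is also named `eta_min`. It also assumes one scheduler step per epoch. The function name `cosine_annealed_lr` is hypothetical.

```python
import math


def cosine_annealed_lr(step: int, total_steps: int, lr_max: float, eta_min: float) -> float:
    """Closed-form cosine annealing: lr_max at step 0, eta_min at step total_steps."""
    progress = step / total_steps
    return eta_min + 0.5 * (lr_max - eta_min) * (1.0 + math.cos(math.pi * progress))


# Values taken from the example command; the per-epoch stepping is an assumption.
LR_MAX, ETA_MIN, TOTAL_STEPS = 1e-4, 7e-5, 1000

for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_annealed_lr(step, TOTAL_STEPS, LR_MAX, ETA_MIN):.3e}")
```

Under these assumptions the learning rate never drops below 70% of its initial value, so the schedule is a mild decay rather than an anneal to zero.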

Weights & Biases (W&B) Sweeps

Note: The train.py script is compatible with Weights & Biases sweeps. Because all hyperparameters are exposed via argparse, a sweep.yaml configuration can search over learning rates, model sizes, architectures, and scheduler configurations. The script picks up the arguments injected by the W&B agent.
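A sweep definition along these lines might look like the following, written as a Python dictionary with the same keys a sweep.yaml would carry. Everything specific here is an assumption: the metric name `val_loss`, the search method, and the value grids are placeholders rather than settings taken from this repository. The `agent_args` helper is hypothetical and only mimics how the agent's default command passes each sampled parameter as `--key=value`.

```python
import json

# Hypothetical sweep definition; metric name and value grids are placeholders.
sweep_config = {
    "program": "src/models/train.py",
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"values": [1e-4, 3e-4, 1e-3]},
        "weight_decay": {"values": [0.001, 0.01]},
        "model_size": {"value": "xsmall"},
        "model_architecture": {"value": "decoder"},
        "is_held_out_color_exp": {"value": True},
    },
}


def first_choice(spec: dict):
    """Pick a fixed 'value' if present, else the first entry of 'values'."""
    if "value" in spec:
        return spec["value"]
    return spec["values"][0]


def agent_args(params: dict) -> list:
    """Render one sampled configuration as --key=value command-line arguments."""
    return [f"--{key}={value}" for key, value in params.items()]


sample = {name: first_choice(spec) for name, spec in sweep_config["parameters"].items()}
print(json.dumps(sweep_config, indent=2))
print(agent_args(sample))
```

With the wandb package installed, a dictionary like this can be registered with `wandb.sweep(sweep=sweep_config, project=...)`, or the same structure can be saved as sweep.yaml and registered from the command line.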

Control Flow Documentation

For a detailed breakdown of how the training pipeline operates and how different arguments affect the execution path (e.g., initialization, dataset building, and model selection), please refer to control_flow.md.

