CS 325B - Wealth Changes Code

Code submission for CS 325B, wealth changes group. GitHub: https://github.com/rosikand/CS325B-wealth-changes. Please contact rsikand@stanford.edu with any questions or for help running the code.

Code base structure design (optional)

The repository is designed to be modular and configurable so that modeling iterations are efficient. For example, the learning rate is changed not by editing code but by setting the lr parameter in a config. We use Python dataclasses for the configs. All configs live in src/configs.py, and the default is the DefaultConfig class in that file. To work correctly with the rest of the codebase, each config parameter must follow certain conventions. For example, the encoder parameter must name one of the encoders supported in the codebase.
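As a rough illustration of the config pattern described above, the sketch below defines a dataclass config and a variant that overrides one field. It is not the actual contents of src/configs.py: the encoder names, the default values, and the LowLRConfig class are hypothetical, and only the field names and the optimizer options come from the parameter list in this README.

```python
from dataclasses import dataclass

# Hypothetical encoder names; the real supported set is defined in the codebase.
SUPPORTED_ENCODERS = {"encoder_a", "encoder_b"}
# Optimizer options as listed under "Config class parameters".
SUPPORTED_OPTIMIZERS = {"adam", "adamw"}


@dataclass
class DefaultConfig:
    # Default values here are illustrative, not the project's real defaults.
    country: str = "malawi"
    encoder: str = "encoder_a"
    optimizer: str = "adam"
    lr: float = 1e-3
    epochs: int = 10

    def __post_init__(self):
        # Enforce the conventions each parameter must follow.
        if self.encoder not in SUPPORTED_ENCODERS:
            raise ValueError(f"unsupported encoder: {self.encoder!r}")
        if self.optimizer not in SUPPORTED_OPTIMIZERS:
            raise ValueError(f"unsupported optimizer: {self.optimizer!r}")


@dataclass
class LowLRConfig(DefaultConfig):
    # A new experiment only overrides the fields it changes; no code edits needed.
    lr: float = 1e-4
```

Under these assumptions, a new experiment variant is just another small dataclass, and a misspelled encoder or optimizer name fails when the config is instantiated rather than midway through training.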

Secondly, we use class-based experiments, defined in src/experiments.py, to run the main functionality of each experiment. The main experiment class takes a config object and contains the training, val, and test loops. Each experiment class inherits from a trainer base class defined in src/trainer.py.
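The relationship between the trainer base class and an experiment class might look roughly like the sketch below. It is a toy in pure Python, not the code in src/trainer.py or src/experiments.py: the class names Trainer, ToyExperiment, and ToyConfig, the method names, and the one-weight regression task are all hypothetical stand-ins chosen so the example runs without any dependencies.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical config with only the fields this toy needs.
    epochs: int = 2
    lr: float = 0.1


class Trainer:
    """Hypothetical stand-in for the base class in src/trainer.py."""

    def __init__(self, config):
        self.config = config

    def evaluate(self, batch):
        # Subclasses define how one batch is turned into a loss.
        raise NotImplementedError

    def run_loop(self, batches):
        # Average the per-batch loss over one pass through `batches`.
        losses = [self.evaluate(batch) for batch in batches]
        return sum(losses) / len(losses)


class ToyExperiment(Trainer):
    """Hypothetical experiment: fits one weight w to minimise (w * x - y) ** 2."""

    def __init__(self, config):
        super().__init__(config)
        self.w = 0.0

    def evaluate(self, batch):
        x, y = batch
        return (self.w * x - y) ** 2

    def train(self, batches):
        # One gradient-descent step per batch, then report the resulting loss.
        for x, y in batches:
            grad = 2 * (self.w * x - y) * x
            self.w -= self.config.lr * grad
        return self.run_loop(batches)

    def val(self, batches):
        return self.run_loop(batches)

    def test(self, batches):
        return self.run_loop(batches)
```

The point of the split is that the generic looping logic sits in the base class, while each experiment supplies only its own loss computation and reads its hyperparameters from the config object it was given.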

In addition, several other files are included for organization purposes. The dataset classes are specified in src/datasets.py and the models are specified in src/models.py.

Finally, to train, validate, or test a model, run src/run.py, the runner script that puts everything together. It takes two command line arguments: the config class to use and whether to train, validate, or test the model. For example, a model can be trained with the default config using the following command:

$ python3 run.py -config DefaultConfig -train

Note that every training run consists of the specified number of epochs, where each epoch is a full pass through the training set followed by evaluation on the val set. At the end of the run, the model is evaluated one final time on both the val set and the test set.
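The order of operations in a training run, as described above, can be sketched as follows. The function name run_training and its callable arguments are hypothetical and are not taken from the repository; the sketch only shows the sequencing of the train, val, and test passes.

```python
def run_training(epochs, train_one_epoch, validate, test):
    """Run `epochs` train+val rounds, then a final val and test pass.

    The three callables are assumed to take no arguments and return a loss.
    """
    history = []
    for epoch in range(epochs):
        train_loss = train_one_epoch()  # full pass through the training set
        val_loss = validate()           # evaluation on the val set afterwards
        history.append({"epoch": epoch, "train": train_loss, "val": val_loss})
    # After the last epoch, evaluate once more on both val and test.
    final = {"val": validate(), "test": test()}
    return history, final
```

With epochs=2, the callables would be invoked in the order train, val, train, val, val, test.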

Specify whether to train the model or only evaluate it on the val or test set with the -train, -val, and -test flags, respectively, when running run.py. Specify only one of these flags per process.
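One way to implement this command line interface with the standard library is sketched below. The flag names match the ones in this README, but the parsing code itself is an assumption and may differ from what src/run.py actually does; the CONFIGS registry and the placeholder DefaultConfig are hypothetical.

```python
import argparse
from dataclasses import dataclass


@dataclass
class DefaultConfig:
    # Placeholder config; the real one lives in src/configs.py.
    lr: float = 1e-3


# Hypothetical registry mapping config class names to classes.
CONFIGS = {"DefaultConfig": DefaultConfig}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train, val, or test a model.")
    parser.add_argument("-config", required=True, choices=sorted(CONFIGS),
                        help="name of the config class to use")
    # Exactly one of the three mode flags must be given per process.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-train", action="store_true")
    mode.add_argument("-val", action="store_true")
    mode.add_argument("-test", action="store_true")
    args = parser.parse_args(argv)
    selected = "train" if args.train else "val" if args.val else "test"
    return CONFIGS[args.config](), selected
```

With this sketch, passing both -train and -test, or none of the three flags, makes argparse print a usage error and exit instead of starting a run.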

Footnote: the trainer.py module is adapted from the experiment.py module of the torchplate package. torchplate was created by one of the authors of this project.

Config class parameters

- country (string): one of {malawi, mozambique}
- log (bool): whether to log to wandb. Requires your wandb API key.
- description (string): description of the experiment, used to document the output log.
- seed (int): random seed for reproducibility.
- use_seed (bool): whether to use the seed.
- experiment (class): which experiment class to use from experiments.py.
- model_name (string): name of model to use from models.py.
- encoder (string): name of encoder to use from models.py.
- default_run_name (string): for wandb logging purposes. A random 5-digit number plus a timestamp.
- experiment_name (string): for wandb logging purposes. 
- label_normalize (bool): whether to normalize labels.
- optimizer (string): name of optimizer to use. One of {adam, adamw}
- lr (float): learning rate.
- schedule_lr (bool): whether to use a learning rate scheduler.
- weight_decay (float): weight decay for l2 regularization.
- scheduler_step_size (int): step size for lr scheduler.
- scheduler_gamma (float): gamma for lr scheduler.
- verbose (bool): whether to print progress information during training.
    - Note that this parameter is buggy and does not fully work as intended.
- epochs (int): number of epochs to train for.  
- resize_size (tuple): size to resize images to.
- year_1_dir (string): relative path to year 1 directory.
- year_2_dir (string): relative path to year 2 directory.
- train_filemap_path (string): relative path to train filemap.
- val_filemap_path (string): relative path to val filemap.
- test_filemap_path (string): relative path to test filemap.
- normalization_constants (string): name of normalization constants to use.
    - Options for Malawi: {standard_malawi, group_malawi, district_malawi}
    - Options for Mozambique: {standard_mozambique}
- percent_of_dataset_to_train (float): percent of dataset to use for training.
- percent_of_dataset_to_val (float): percent of dataset to use for validation.
- latent_dim (int): latent dimension of the model, i.e. the encoder output dimension. The effective dimension is twice this value due to concatenation.
- hidden_dim (int): hidden dimension of MLP head.
- save_best_model (bool): whether to save the best model during the course of training.
- save_at_end (bool): whether to save the model at the end of training.
- model_checkpoint_path (string): path to model checkpoint to load from.
- image_augmentation (bool): whether to use image augmentation.
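To make the interaction of lr, schedule_lr, scheduler_step_size, and scheduler_gamma concrete, the sketch below computes the learning rate at a given epoch. It assumes a step-decay schedule of the kind implemented by PyTorch's StepLR, where the rate is multiplied by gamma once every step_size epochs; the parameter names suggest this, but the README does not confirm which scheduler the codebase uses. The function name stepped_lr is hypothetical.

```python
def stepped_lr(lr, epoch, schedule_lr, scheduler_step_size, scheduler_gamma):
    """Learning rate at `epoch` (0-indexed) under an assumed step-decay schedule."""
    if not schedule_lr:
        # Scheduler disabled: the learning rate stays constant.
        return lr
    # Multiply by gamma once for every completed block of step_size epochs.
    return lr * scheduler_gamma ** (epoch // scheduler_step_size)
```

For example, with lr=0.1, scheduler_step_size=10, and scheduler_gamma=0.5, epochs 0 through 9 would use 0.1 and epochs 10 through 19 would use half of that.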

Instructions

1. Create a Python virtual environment.
2. Install the requirements in the venv: $ pip install -r requirements.txt
3. cd into the program directory: $ cd code
4. Set the paths to your data, relative to the current directory $*$, in configs.py (lines 18-21).
5. Run an experiment with the desired config. As of now, we support running the best model (log) via code/main.sh.

$*$: We downloaded the data from the GCP cloud bucket to our VM because reading from the bucket caused high latency during training. If you wish to use the GCP cloud bucket instead, mount the bucket to your VM and specify the paths as usual.
