Benchmark: Model benchmark - deterministic training support #731
Open

Aishwarya-Tonpe wants to merge 97 commits into main from aishwaryatonpe/deterministic-training
Commits (97):
0040b97
Add deterministic training functionality to PyTorch LLaMA benchmark
Aishwarya-Tonpe e103dd0
llama: add periodic checksum logging (deterministic-only, log-only); …
Aishwarya-Tonpe 87ff6d6
deterministic training: enable seeding + deterministic algorithms acr…
Aishwarya-Tonpe 8eee235
tests(pytorch): add strict determinism skip guards and detailed docst…
Aishwarya-Tonpe fe34247
Refactor LLaMA model tests: align strict, soft determinism, and check…
Aishwarya-Tonpe c374dfe
examples: add deterministic and strict_determinism flags and docs to …
Aishwarya-Tonpe 614f96c
Deterministic fingerprints: replace checksum with Loss+ActMean across…
Aishwarya-Tonpe 689dc44
Deterministic training + reproducible logging: align GPT-2/LLaMA/LSTM…
Aishwarya-Tonpe 33c3f6a
Adding flag: Check-frequency
Aishwarya-Tonpe f35e98b
Add Check frequency flag to tests
Aishwarya-Tonpe dd7fcbe
Code refactor: Move enable_determinism to base class, add a consolida…
Aishwarya-Tonpe d439395
Code refactor: Add a new test folder to remove redundant code, remove…
Aishwarya-Tonpe da9c85a
Code refactor: Move loss and ActMean logging to base class from indiv…
Aishwarya-Tonpe 2635aad
Code refactor: Move _benchmark() method to base class
Aishwarya-Tonpe 4a21990
Code refactor: Add method _finalize_periodic_logging to base class to…
Aishwarya-Tonpe ddd3f23
Code cleanup: Remove unnecessary imports
Aishwarya-Tonpe a9cb452
Code cleanup: Remove unnecessary imports
Aishwarya-Tonpe 52c5516
Code cleanup: Remove unnecessary imports
Aishwarya-Tonpe 6623f59
Code cleanup: Remove unnecessary imports
Aishwarya-Tonpe 8853c21
Testcase addition: Add failure testcase, rename flag
Aishwarya-Tonpe 14be806
Delete extra lines
Aishwarya-Tonpe 8cd1c19
Add Docstrings, align imports, add assertions messages
Aishwarya-Tonpe 99bdc16
Lint Checks
Aishwarya-Tonpe 4bc0445
Lint Checks
Aishwarya-Tonpe 2c8d856
Lint Checks
Aishwarya-Tonpe d8d9ca0
Failed check: Resolving failed pipeline check for creating temp file …
Aishwarya-Tonpe 8bcd801
Pipeline failure fixes : Fixing Lint failures on test, example and ba…
Aishwarya-Tonpe 315d07f
Pipeline failure fixes : Fixing Lint failures on test, example and ba…
Aishwarya-Tonpe 5ae57f0
Pipeline failure error: Github not reflecting change in base file, at…
Aishwarya-Tonpe c379c5e
Pipeline failure fixes
Aishwarya-Tonpe 3b186cf
Pipeline failure fixes
Aishwarya-Tonpe 64d7b81
Test file lint fixes
Aishwarya-Tonpe 90a6595
Pipeline Error: Mixtral create Model
Aishwarya-Tonpe 055723c
Modifying test parameters for efficiency
Aishwarya-Tonpe b47688d
Attempting to skip tests for heavy models in CI
Aishwarya-Tonpe 13ad2fe
Attempting to skip tests for heavy models in CI
Aishwarya-Tonpe 2ed5ae0
Skipping tests for CICD
Aishwarya-Tonpe 10ae1a3
Removing unnecessary code
Aishwarya-Tonpe fb21a9f
Adding Metadata Overriding logic to fetch metadata from the log file …
Aishwarya-Tonpe f3bb260
Adding Metadata Overriding logic to fetch metadata from the log file …
Aishwarya-Tonpe 172b02b
Lint Fixes
Aishwarya-Tonpe de326d5
Pipeline failure fix
Aishwarya-Tonpe 6497bf5
Adding test for coverage
Aishwarya-Tonpe 8a8599e
Pipeline failure fix
Aishwarya-Tonpe a68b4df
Pipeline failure fix
Aishwarya-Tonpe e59fc61
Adding info about deterministic training to docs
Aishwarya-Tonpe 7c6120d
Adding info about deterministic training to docs
Aishwarya-Tonpe 860f0f9
Merge branch 'main' into aishwaryatonpe/deterministic-training
polarG 2892a69
Comments resolve: Add docstrings, make changes to ensure same lengths…
Aishwarya-Tonpe 0195d98
Comment resolve: Remove process_info, deprecated
Aishwarya-Tonpe ea6f7fc
Fixing Lint errors
Aishwarya-Tonpe d8acbf2
Lint checks resolve
Aishwarya-Tonpe 8629e8b
Lint checks resolve
Aishwarya-Tonpe b15393f
Test case fixes : removing log-path from test-pytorch_determinism_all
Aishwarya-Tonpe 529ab12
Comments removed
Aishwarya-Tonpe 2cb80c0
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe 54d3449
Fixing test_pytorch_deterministic_all
Aishwarya-Tonpe e91ec63
Comments address : Removing redundant code
Aishwarya-Tonpe 8fc3d5f
Moving seeding logic to make it centralised to model base
Aishwarya-Tonpe 0848c7a
Moving seeding logic to make it centralised to model base
Aishwarya-Tonpe 42718f0
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe 615bc94
Comments resolve: removing redundant method, adding loggers
Aishwarya-Tonpe a2e2e20
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe 59cfdd1
Resolving merge conflicts
Aishwarya-Tonpe e893a5a
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe d909477
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe 436890e
Merge branch 'main' of https://github.com/microsoft/superbenchmark in…
e4d2f5e
Removing check_frequency parameter from is_finished method in train a…
d0bfd38
Comments resolve : Removing check_frequency assignment to the variable
197007a
Update superbench/benchmarks/model_benchmarks/pytorch_base.py
Aishwarya-Tonpe 4724815
Update tests/benchmarks/model_benchmarks/test_pytorch_determinism_all.py
Aishwarya-Tonpe fdc82ad
Update superbench/benchmarks/model_benchmarks/pytorch_base.py
Aishwarya-Tonpe 373fdf3
Logic change to add metrics to results_summary file, Logic change to m…
Aishwarya-Tonpe 11e945e
Moving CUBLAS_WORKSPACE_CONFIG=:4096:8 to the code base so that it do…
Aishwarya-Tonpe 4911580
Renaming --deterministic -> --enable-determinism
Aishwarya-Tonpe 67fca5c
Comments resolve: minor deletions
Aishwarya-Tonpe ce18856
Update superbench/benchmarks/model_benchmarks/pytorch_base.py
Aishwarya-Tonpe 31f46ad
Update superbench/benchmarks/model_benchmarks/pytorch_mixtral_impl.py
Aishwarya-Tonpe c5895b1
Update docs/user-tutorial/benchmarks/model-benchmarks.md
Aishwarya-Tonpe e457b83
Refactoring the code: Moving utility functions to model_log_utils
Aishwarya-Tonpe 02d568a
Merge branch 'aishwaryatonpe/deterministic-training' of https://githu…
Aishwarya-Tonpe a249916
Updating the user docs
Aishwarya-Tonpe 039b17e
Updating the test files and fixing lint errors
Aishwarya-Tonpe a26518c
Lint error fixes
Aishwarya-Tonpe c8abf0c
Pipeline errors resolve: Lint errors, function complexity error
Aishwarya-Tonpe 2f5493a
Resetting the env var cause of failing testcases in the pipeline, tes…
Aishwarya-Tonpe 8398f51
Resolving pipelines errors
Aishwarya-Tonpe 7c5405a
Resolving pipelines errors
Aishwarya-Tonpe 6b51a18
Resolving pipeline issues
Aishwarya-Tonpe c8ca973
Adding a new test file to cover the code logic in the model_utils file
Aishwarya-Tonpe 7f6bfeb
Resolving pipeline issues
Aishwarya-Tonpe 205934e
Resolving pipeline issues
Aishwarya-Tonpe 3e996f2
resolving pipeline issues
Aishwarya-Tonpe ea9f6b2
Resolving pipeline failures
Aishwarya-Tonpe 3b31c6a
Fix pipeline issues
Aishwarya-Tonpe 4384412
Minor change
Aishwarya-Tonpe b5967f7
Merge branch 'main' into aishwaryatonpe/deterministic-training
Aishwarya-Tonpe
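Several commits above ("deterministic training: enable seeding + deterministic algorithms", "Moving seeding logic to make it centralised to model base", "Moving CUBLAS_WORKSPACE_CONFIG=:4096:8 to the code base") describe a centralized determinism switch. The following is a minimal sketch of what such a helper typically does in PyTorch; the function name and exact placement are assumptions for illustration, not the PR's actual code:

```python
import os
import random

import numpy as np
import torch


def enable_determinism(seed=42):
    """Hypothetical sketch: seed all RNGs and force deterministic kernels."""
    # cuBLAS requires this workspace config before deterministic GEMMs are allowed,
    # which is why the PR sets it in code rather than relying on the user's shell.
    os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
    # Seed every RNG source a training run touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic implementations; warn (rather than raise) when an op
    # has no deterministic variant.
    torch.use_deterministic_algorithms(True, warn_only=True)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
```

Calling this once before model construction is what makes two runs with the same `--deterministic-seed` produce matching loss/activation fingerprints.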
New file, examples/benchmarks/pytorch_deterministic_example.py (+137 lines):

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""Unified PyTorch deterministic training example for all supported models.

Deterministic metrics (loss, activation mean) are automatically stored in results.json
when the --enable-determinism flag is enabled. Use --compare-log to compare against a reference run.

Commands to run:
Run A (generate reference):

    python3 examples/benchmarks/pytorch_deterministic_example.py \
        --model <model_from_MODEL_CHOICES> --enable-determinism --deterministic-seed 42

This creates results-0.json with deterministic metrics.

Run B (compare against reference):

    python3 examples/benchmarks/pytorch_deterministic_example.py \
        --model <model_from_MODEL_CHOICES> --enable-determinism --deterministic-seed 42 --compare-log results-0.json

Note: CUBLAS_WORKSPACE_CONFIG is now automatically set by the code when determinism is enabled.
"""

import argparse
import json
from pathlib import Path

from superbench.benchmarks import BenchmarkRegistry, Framework
from superbench.common.utils import logger

MODEL_CHOICES = [
    'bert-large',
    'gpt2-small',
    'llama2-7b',
    'mixtral-8x7b',
    'resnet101',
    'lstm',
]

DEFAULT_PARAMS = {
    'bert-large':
        '--batch_size 1 --seq_len 64 --num_warmup 1 --num_steps 200 --precision float32 '
        '--model_action train --check_frequency 20',
    'gpt2-small':
        '--batch_size 1 --num_steps 300 --num_warmup 1 --seq_len 128 --precision float32 '
        '--model_action train --check_frequency 20',
    'llama2-7b':
        '--batch_size 1 --num_steps 300 --num_warmup 1 --seq_len 512 --precision float32 --model_action train '
        '--check_frequency 20',
    'mixtral-8x7b':
        '--hidden_size=4096 --num_hidden_layers=32 --num_attention_heads=32 --intermediate_size=14336 '
        '--num_key_value_heads=8 --max_position_embeddings=32768 --router_aux_loss_coef=0.02 '
        '--check_frequency 20',
    'resnet101':
        '--batch_size 1 --precision float32 --num_warmup 1 --num_steps 120 --sample_count 8192 '
        '--pin_memory --model_action train --check_frequency 20',
    'lstm':
        '--batch_size 1 --num_steps 100 --num_warmup 2 --seq_len 64 --precision float16 '
        '--model_action train --check_frequency 30',
}


def main():
    """Main function for determinism example file."""
    parser = argparse.ArgumentParser(description='Unified PyTorch deterministic training example.')
    parser.add_argument('--model', type=str, choices=MODEL_CHOICES, required=True, help='Model to run.')
    parser.add_argument(
        '--enable-determinism',
        '--enable_determinism',
        action='store_true',
        help='Enable deterministic mode for reproducible results.',
    )
    parser.add_argument(
        '--compare-log',
        type=str,
        default=None,
        help='Path to reference results.json file for deterministic comparison.',
    )
    parser.add_argument(
        '--deterministic-seed',
        type=int,
        default=None,
        help='Seed for deterministic training.',
    )
    args = parser.parse_args()

    parameters = DEFAULT_PARAMS[args.model]
    if args.enable_determinism:
        parameters += ' --enable-determinism'
    if args.deterministic_seed is not None:
        parameters += f' --deterministic_seed {args.deterministic_seed}'
    if args.compare_log:
        parameters += f' --compare-log {args.compare_log}'

    context = BenchmarkRegistry.create_benchmark_context(args.model, parameters=parameters, framework=Framework.PYTORCH)
    benchmark = BenchmarkRegistry.launch_benchmark(context)
    logger.info(f'Benchmark finished. Return code: {benchmark.return_code}')

    # Save results to file for comparison
    if not args.compare_log:
        # Find next available results file name
        counter = 0
        while Path(f'results-{counter}.json').exists():
            counter += 1
        results_file = f'results-{counter}.json'

        # Parse benchmark results and create nested format like results-summary.json
        benchmark_results = json.loads(benchmark.serialized_result)

        # Create nested structure: raw_data -> benchmark_name -> metrics
        # Extract the benchmark name from the results (e.g., "pytorch-lstm")
        benchmark_name = benchmark_results.get('name', args.model)

        # Create results in the format expected by comparison logic
        nested_results = {
            'raw_data': {
                f'model-benchmarks:{args.model}/{benchmark_name}': benchmark_results.get('raw_data', {})
            }
        }

        # Write results to file
        with open(results_file, 'w') as f:
            json.dump(nested_results, f, indent=2)
        logger.info(f'Results saved to {results_file}')
        logger.info(f'To compare against this run, use: --compare-log {results_file}')
    else:
        logger.info(f'Comparison completed against {args.compare_log}')

    if hasattr(benchmark, '_model_run_metadata'):
        logger.info(f'Run metadata: {benchmark._model_run_metadata}')
    if hasattr(benchmark, '_model_run_periodic'):
        num_checkpoints = len(benchmark._model_run_periodic.get('step', []))
        logger.info(f'Periodic fingerprints collected at {num_checkpoints} checkpoints')


if __name__ == '__main__':
    main()
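The example writes results in a nested `raw_data -> benchmark key -> metrics` layout for `--compare-log` to consume. The PR's actual comparison lives in the benchmark code, but the core idea (matching per-step fingerprints such as loss and activation mean against a reference file) can be sketched as follows. The function name, tolerance handling, and the assumption that metric values are flat lists of numbers are all illustrative, not taken from the PR:

```python
import json
import math


def compare_fingerprints(reference_path, current_metrics, rel_tol=0.0, abs_tol=0.0):
    """Compare per-step fingerprint metrics against a reference results-N.json file.

    Assumes the nested layout written by the example above: raw_data -> benchmark
    key -> metric name -> flat list of numbers. Returns a list of human-readable
    mismatch descriptions; an empty list means the runs agree (exactly, with the
    default zero tolerances).
    """
    with open(reference_path) as f:
        reference = json.load(f)

    mismatches = []
    for benchmark_key, ref_metrics in reference.get('raw_data', {}).items():
        for name, ref_values in ref_metrics.items():
            cur_values = current_metrics.get(name)
            if cur_values is None:
                mismatches.append(f'{benchmark_key}/{name}: missing in current run')
            elif len(cur_values) != len(ref_values):
                mismatches.append(f'{benchmark_key}/{name}: {len(cur_values)} values vs {len(ref_values)}')
            else:
                # Step-by-step comparison; with zero tolerances this is exact equality.
                for step, (ref_v, cur_v) in enumerate(zip(ref_values, cur_values)):
                    if not math.isclose(ref_v, cur_v, rel_tol=rel_tol, abs_tol=abs_tol):
                        mismatches.append(f'{benchmark_key}/{name}[{step}]: {cur_v} != {ref_v}')
    return mismatches
```

Exact (zero-tolerance) matching is the point of strict determinism: after `enable-determinism` with the same seed, two runs should produce bitwise-identical fingerprints, so any nonzero difference signals a reproducibility break.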
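The commit history also adds a `--check_frequency` flag and periodic loss/ActMean logging, surfaced in the example through `benchmark._model_run_periodic.get('step', [])`. A hypothetical sketch of that collection logic, assuming per-step loss and activation-mean series are available (the function name and exact storage shape are assumptions):

```python
def collect_periodic_fingerprints(losses, act_means, check_frequency):
    """Record a (step, loss, act_mean) fingerprint every `check_frequency` steps.

    Returns parallel lists keyed by 'step', 'loss', and 'act_mean', mirroring the
    shape implied by benchmark._model_run_periodic in the example above.
    """
    periodic = {'step': [], 'loss': [], 'act_mean': []}
    # Steps are 1-based so the first checkpoint lands at step == check_frequency.
    for step, (loss, act_mean) in enumerate(zip(losses, act_means), start=1):
        if step % check_frequency == 0:
            periodic['step'].append(step)
            periodic['loss'].append(loss)
            periodic['act_mean'].append(act_mean)
    return periodic
```

Sampling every N steps keeps the fingerprint record small (the defaults above use `--check_frequency 20` or `30`) while still catching a divergence early in a long run.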