fix: fix issues in automatic reproduction system by Vinay-Umrethe · Pull Request #352 · p-e-w/heretic

Vinay-Umrethe · 2026-06-04T13:06:48Z

Not ready for merge yet

This PR would likely implement few fixes or rest of the checks like requirements.txt installation, SHA256 checksum

Note

Not doing auto install requirements as it would change packages at system level (reverted), might implement it with venv setup at start

gemini-code-assist

Code Review

This pull request introduces support for gated Hugging Face repositories when collecting reproducibles by retrieving a token and checking model access. The review feedback identifies a critical bug where HfApi lacks an auth_check method (recommending api.model_info instead) and points out that checking model.gated is not False will incorrectly skip public models when no token is provided. Additionally, the variable name token_arg violates the repository style guide's rule against abbreviations and should be renamed to token_or_false.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

This reverts commit 1028792.

This reverts commit f4be1ab.

Vinay-Umrethe · 2026-06-05T03:13:50Z

Status

1. Fixed: inaccessible gated models are now skipped during reproducibles collection.
2. Fixed: torch.svd_lowrank is seeded before it's use so restoring a model from trial parameters is deterministic (independent of previous trials that required re-run sequentially them using config.toml manual reproduction)
3. Add merged OR adapter in reproduce.json to know what was originally exported (version bump "2")

@p-e-w

…uce-implementations

Vinay-Umrethe · 2026-06-05T18:54:38Z

Test conducted

branch: Vinay-Umrethe:auto-reproduce-implementations

method: --reproduce

Base model: https://huggingface.co/VINAY-UMRETHE/Qwen3-0.6B-heretic-Base2

Result: Hash matches

Type: Merged model export.

https://huggingface.co/VINAY-UMRETHE/Qwen3-0.6B-heretic-Reproduce2

Base model (adapter LoRA Only): https://huggingface.co/VINAY-UMRETHE/Qwen3-0.6B-heretic-LoRA

Result: Hash matches

Type: Adapter only export

https://huggingface.co/VINAY-UMRETHE/Qwen3-0.6B-heretic-LoRA-Reproduce

Using original export strategy: adapter
Uploading LoRA adapter...
...
...
Model uploaded to VINAY-UMRETHE/Qwen3-0.6B-heretic-LoRA-Reproduce.
Verifying hashes of weight files...
adapter_model.safetensors: Hash matches

so now torch.svd_lowrank does not bother reproduction in row_normalization.

@p-e-w

p-e-w · 2026-06-06T12:57:52Z

+                try:
+                    api.auth_check(model.id, repo_type="model")
+                except GatedRepoError:
+                    continue


Yes. This is one of the very, very few cases where it's appropriate to swallow an exception completely.

p-e-w · 2026-06-06T13:02:43Z

I'm wondering if we're approaching the export strategy thing correctly. Here's an alternative idea:

We store export_strategy in Settings, type str | None, default None.
If None, we prompt, as we do now, otherwise we use the given value.

This gets rid of the need to pass the value around, and allows users to simplify their export setup by fixing the value so they don't get prompted each time.

The current approach basically creates an ad-hoc setting, which I'm not sure is correct.

We can even use an enum that way, and get automatic validation.

Vinay-Umrethe · 2026-06-06T17:04:27Z

it is simplified now, and yes that is more correct approach.

I still wonder what to do with version 1 of reproduce.json (as it does not seed the fix done in model.py)

Leave it dead OR if v1, use "settings" and run all trials sequentially instead of direct applying which is possible now

p-e-w · 2026-06-06T18:19:35Z

        print()

+    if settings.export_strategy not in [None, ExportStrategy.NONE]:
+        return settings.export_strategy.value


This logic doesn't make sense. We are warning about consequences above, but if the strategy is set in the settings, you're not giving the user a chance to react to those warnings.

p-e-w · 2026-06-06T18:20:26Z

 max_shard_size = "5GB"

+# How to export the model: "none" (prompt the user), "merge", or "adapter".
+export_strategy = "none"


No, remove this setting from this file to get the default (which is None, not "none").

but will it work with a normal None or null in toml file?

class QuantizationMethod(str, Enum): NONE = "none" BNB_4BIT = "bnb_4bit" class RowNormalization(str, Enum): NONE = "none"

# Quantization method to use when loading the model. Options: # "none" (no quantization), # "bnb_4bit" (4-bit quantization using bitsandbytes). quantization = "none"

These two use the same

There is no None or null in TOML. This is a design limitation of TOML.

And the quantization case is not the same. Setting quantization = "none" means that we do not quantize. But setting export_strategy = None does not mean that we don't use an export strategy, but that we prompt the user for it. Those two are semantically different.

You can load the model without quantization, but you can't export the model without an export strategy. That's why "none" cannot be an export strategy.

making it commented would be correct? like how we explain seed in .toml

I actually think that neither the seed nor the export strategy belong in this file even as a comment. Setting those in the configuration is almost always a bad choice for the user.

Note that there are several settings that are deliberately not in config.default.toml.

p-e-w · 2026-06-06T18:21:20Z

                                # Set the number of trials to the number of actual completed trials
                                # for the reproduction configuration.
                                settings.n_trials = count_completed_trials()
+                                settings.export_strategy = ExportStrategy(strategy)


Unfortunately, it's not so easy. You have now stored the strategy in settings, and that means the user cannot go back and export the model in a different format.

I did noticed it, but what else we have to do to ensure users have the choice anyway

I don't know the solution right now, but it can't remain like this. Not being able to change the export format is a major regression.

if --reproduce would allow users to choose strategy, and they choose different one than original, the hash matching would fail which is misleading

now it seems correct to me, no matter what was the original export_strategy we make it None anyway but store that info for only telling/print the User (to ensure hash verification is exactly correct)

But users still get the prompt and have the choice

p-e-w · 2026-06-07T03:21:59Z


-# Random seed for reproducible optimization. Set to an integer to enable.
-# Applies to Python's random module, NumPy, PyTorch, and Optuna.
-# seed = 75


Yes, I think this is better. The biggest problem with having the seed in the config file is that it might be forgotten, and then lead to suboptimal results (identical runs) that are hard to debug. The seed parameter is much better passed as a command line argument, similar to evaluate_model, which also isn't in the config file.

p-e-w · 2026-06-07T03:27:40Z

+        print(
+            f"Original export strategy was [bold]{original_export_strategy.value}[/]; "
+            "choose it for proper hash verification."
+        )


No, the user shouldn't get a choice at all when running with --reproduce. The whole point of reproducing a model is to get exactly the same model, so we don't need to ask them, they already made that choice. Otherwise we get this weird situation where we can only verify checksums if the user made the "right" choice.

You have now stored the strategy in settings, and that means the user cannot go back and export the model in a different format. Not being able to change the export format is a major regression.

So what should be done allow choice or not

The choice should be allowed except in auto-reproduction mode. If you remove the original_export_strategy logic, everything will be correct.

There should be no choice when the user has explicitly said they want to reproduce.

p-e-w · 2026-06-07T03:37:43Z


    data = {
-        "version": "1",  # Version number of the reproduce.json file format, to allow for future changes.
+        "version": "2",  # Version number of the reproduce.json file format, to allow for future changes.


Why are we upgrading the version if the schema doesn't change?

I still wonder what to do with version 1 of reproduce.json (as it does not seed the fix done in model.py)

Leave it dead OR if it's v1, use "settings" and run all trials sequentially instead of direct applying which is possible now

like said we can do this

I think this is already handled nicely by the environment check.

We will release the auto-reproduction feature in Heretic 1.4. The old reproduce.json files that were missing the seed fix were made using Heretic 1.3. This will automatically get flagged as a "critical" mismatch, so there is no expectation of byte-for-byte reproducibility.

p-e-w · 2026-06-07T08:41:44Z

Ok, I've merged this. I'll decide what to do about the version number when I wrap up #326.

p-e-w · 2026-06-07T08:47:17Z

Ah, I just realized we don't need the version number distinction to handle the export strategy. That's because with the new code, the export strategy is always set, whereas before, it was never set. But until the upcoming version 1.4, there was only one export strategy: merge.

So if the strategy is missing from the settings, we can just assume it is "merge", and it will always be correct.

* feat: load reproduction information * feat: check reproduction environment against original environment * fix: remove `trust_remote_code` setting This improves security when running Heretic with an untrusted config file. The prompt is now always shown. This is NOT a breaking change, because we currently ignore values for unknown settings, so existing configs continue to work. * feat: reproduce model from JSON file * feat: verify hashes of uploaded weight files * fix: fix issues in automatic reproduction system (#352) * fix: Check if a model is gated / accessible * fix: handle unknown gated models * feat: Auto install requirements * simplify * Revert "simplify" This reverts commit 1028792. * Revert "feat: Auto install requirements" This reverts commit f4be1ab. * fix: Seed pytorch method * reference, style * simplify token * feat: Export strategy in reproduce.json, v2 * style: Name * simplify export strategy * style: Rename * enumeration * maybe remove seed as well * fix: don't lock settings with permanent strategy * simplify no choice, use try/finally block * feat: verify hashes of locally saved weight files * fix: remove obsolete code from merge * docs: add automatic reproduction instructions to reproduce README --------- Co-authored-by: Vinay-Umrethe <vinayumrethe99@gmail.com>

fix: Check if a model is gated / accessible

a1664c3

gemini-code-assist Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread src/heretic/reproduce.py Outdated

Comment thread src/heretic/reproduce.py Outdated

Comment thread src/heretic/reproduce.py

Vinay-Umrethe added 7 commits June 4, 2026 18:46

fix: handle unknown gated models

b5d9d72

feat: Auto install requirements

f4be1ab

simplify

1028792

Revert "simplify"

96d7448

This reverts commit 1028792.

Revert "feat: Auto install requirements"

10aa173

This reverts commit f4be1ab.

fix: Seed pytorch method

72aa8ad

reference, style

fb3959c

p-e-w reviewed Jun 5, 2026

View reviewed changes

Comment thread src/heretic/reproduce.py Outdated

Comment thread src/heretic/reproduce.py

Comment thread src/heretic/reproduce.py Outdated

Vinay-Umrethe added 4 commits June 5, 2026 22:19

simplify token

27bebb0

Merge remote-tracking branch 'origin/auto-reproduce' into auto-reprod…

62624fe

…uce-implementations

feat: Export strategy in reproduce.json, v2

9e40c96

style: Name

05ce454

p-e-w reviewed Jun 6, 2026

View reviewed changes

Vinay-Umrethe added 2 commits June 6, 2026 22:24

simplify export strategy

92bfcb5

style: Rename

74f53dd

p-e-w reviewed Jun 6, 2026

View reviewed changes

Vinay-Umrethe added 3 commits June 7, 2026 00:22

enumeration

70222ad

maybe remove seed as well

f6111d3

fix: don't lock settings with permanent strategy

1ea2aaf

p-e-w reviewed Jun 7, 2026

View reviewed changes

simplify no choice, use try/finally block

54bdd25

p-e-w changed the title ~~feat: Auto reproduce implementations~~ fix: fix issues in automatic reproduction system Jun 7, 2026

p-e-w merged commit b9d5b45 into p-e-w:auto-reproduce Jun 7, 2026
1 check passed

Conversation

Vinay-Umrethe commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Not ready for merge yet

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Vinay-Umrethe commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Status

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Vinay-Umrethe commented Jun 5, 2026

Test conducted

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

p-e-w commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Vinay-Umrethe commented Jun 6, 2026

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Vinay-Umrethe Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

p-e-w Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Vinay-Umrethe commented Jun 4, 2026 •

edited

Loading

Vinay-Umrethe commented Jun 5, 2026 •

edited

Loading

p-e-w commented Jun 6, 2026 •

edited

Loading

Vinay-Umrethe Jun 6, 2026 •

edited

Loading

p-e-w Jun 7, 2026 •

edited

Loading