sync: merge upstream/master (2026-04-25) #69

Merged
RyderFreeman4Logos merged 14 commits into master from sync/upstream-2026-04-25-1333 on Apr 25, 2026
Conversation

@RyderFreeman4Logos

Owner

Summary

  • Synced with upstream/master (11 commits merged)
  • Upstream commits: reproducibility system, configurable residual processing, gemma 4 support, --help lazy import, UnboundLocalError fix, response prefix config, max_memory fix, dependency bumps
  • 7 conflict files resolved, new system.py accepted
  • Fork features preserved: LLM judge, caching, geometric median, thinking profiles, sequence KL, pipelined optimization

Quality Gates

  • ruff format
  • ruff check
  • ty check
  • uv build
  • lefthook (all 5 hooks)

Generated with Claude Code

p-e-w and others added 14 commits April 7, 2026 13:24
* fix: correct default value for max_memory.

The other does not compile.

* fix: update syntax for default value of max_memory
)

* feat: implement reproducibility features with safetensors

* feat: prompt user before creating reproducibility folder

* fix: use prompt_confirm wrapper

* style comment

* style comment

* fix: ignore None values in Settings dump for TOML compatibility

* fix: imports

* feat: auto-generate seed if none provided for full reproducibility

* style: fix ruff formatting issues

* style: ruff

* style: fix ty check errors with ty:ignore

* Update src/heretic/main.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update src/heretic/utils.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* add period at end.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Improve: Add README, checkpoint.jsonl, to Reproduce

* fix: use centralize device info, remove random states file

* feat: Add CUDA driver version

* ruff

* ruff...

* ty fix

* LGTM: Rich native strip, use nvidia-smi

* ruff fix

* ruff

* revert kaggle hack)

* normalize names for deduplication of packages/versions

* docstring

* rufff

* cleanup, add suffix for torch CUDA version, distinguish ROCm

* add PyTorch index URL detection

* revert index URL to be simple

* flip priority of index..

* add Important note

* add exact suffix for WHL in instruction

* add warning for heterogeneous GPU env

* extend driver version info (more accelerators)

* fix: style

* sync

* no abbreviation

* use multi-line string

* fix: prompt_confirm

* feat: CPU info

* strip 'slow' warning from environment.txt

* feat: Add virtual env info to environment.txt

* ruffff

* feat: AMD (Radeon) GPU driver version

* Refactor: system.py

* feat: LGTM capturing specifc installation origin of heretic

* feat: Include chosen trial into reproduce/README

* style: run ruff format on utils.py

* feat: reproduce.json

* fix: seperate values in different keys

* restore comment

* style, clean, seperate commit key

* no abbreviation, cleanup

* remove labels, store only dependencies

* missed import, ruff

* sort import

* feat: More CPU Info

* only store direct dependencies of heretic

* complete comment

* refactor: use cpuinfo package instead

* ruff import sort

* distinguish cores & threads

* move function amd-driver

* rename

* moving heretic package info,

* rufff

* Move: cleanup memory cache

* fix: model.py import

* no unknowns

* generalize all accelerator info stuff

* ruff f

* move package info

* type change

* feat: no reproducibility suite for local saving/model used

* import fix

* fix: type check

* style change

* style ruff

* feat: no env.txt, SHA256SUMS file, cleanup

* feat: ADD tip to readme

* remove trial index, two-keys only

* fix: No time-zone

* feat: No suite for local datasets allowed

* simplify

* featt: capture both direct and transitive dependencies

* style: sort readme of reproducibility suite

* feat: Store commit hash for datasets too

* add total refusal prompts for evaluation display

* remove try/except from cpu

* extend SHA256 support

* remove .txt

* only have safetensors for SHA256

* style comment

* use HF api to get commit hash

* fix: requirements containing irrelevant dependencies

* only store heretic-llm if from PyPI..

* add SELECTED tag to the trial that was pushed

* AttributeError fix

* simplify trial preservation

* add direction_index in trial info

* remove unwanted CPU info

* style: rename

---------

Co-authored-by: Vinayyyy7 <vinayumrethe99@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Bumps [pillow](https://github.com/python-pillow/Pillow) from 12.1.1 to 12.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](python-pillow/Pillow@12.1.1...12.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-version: 12.2.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…-e-w#239)

* refactor residual memory optimizations

* formatting

* Fixed config.py positioning and default

* fixed analyzier declaration in main.py

* removing del statements

* ruff

* small updates

* ty moveback ish
…w#301)

* fix: prevent UnboundLocalError when analyzer is not initialized

Move cleanup of analyzer and residuals inside the conditional block
where they are actually defined to avoid crashing when
--print-residual-geometry or --plot-residuals are not used.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address AI review feedback on residual cleanup

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
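
The failure mode described in the commit message above can be reproduced in a few lines. This is only a sketch under stated assumptions: `run_buggy`, `run_fixed`, and the `plot_residuals` flag are hypothetical stand-ins, not the actual code in main.py. It shows why cleanup that sits outside the conditional block raises UnboundLocalError when the block is skipped, and why moving it inside the block avoids the crash.

```python
def run_buggy(plot_residuals: bool) -> str:
    if plot_residuals:
        analyzer = "analyzer"  # only bound when the flag is set
    # Bug: when plot_residuals is False, analyzer was never bound,
    # so this statement raises UnboundLocalError.
    del analyzer
    return "done"


def run_fixed(plot_residuals: bool) -> str:
    if plot_residuals:
        analyzer = "analyzer"
        # Cleanup lives in the same block that creates the variable.
        del analyzer
    return "done"
```

With the flag unset, `run_fixed(False)` returns normally, while `run_buggy(False)` fails at the `del` statement.
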
Bumps [python-dotenv](https://github.com/theskumar/python-dotenv) from 1.2.1 to 1.2.2.
- [Release notes](https://github.com/theskumar/python-dotenv/releases)
- [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md)
- [Commits](theskumar/python-dotenv@v1.2.1...v1.2.2)

---
updated-dependencies:
- dependency-name: python-dotenv
  dependency-version: 1.2.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: various cleanups and improvements for the reproducibility system

* fix: save only essential settings

* fix: improve model commit handling

* feat: make including system information optional

* fix: improve formatting of reproducibility README

* fix: fix remaining issues
Bumps [mako](https://github.com/sqlalchemy/mako) from 1.3.10 to 1.3.11.
- [Release notes](https://github.com/sqlalchemy/mako/releases)
- [Changelog](https://github.com/sqlalchemy/mako/blob/main/CHANGES)
- [Commits](https://github.com/sqlalchemy/mako/commits)

---
updated-dependencies:
- dependency-name: mako
  dependency-version: 1.3.11
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…6-04-25-1333

# Conflicts:
#	src/heretic/main.py
#	src/heretic/model.py
#	uv.lock
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive reproducibility framework for the heretic tool, enabling the capture and upload of system metadata, environment details, and exact configuration settings to Hugging Face. It also implements VRAM optimizations through CPU offloading of intermediate tensors and adds support for sharded model exports. The review feedback identifies a critical NameError caused by an undefined function call, a logic error in the early help-handling mechanism that may trigger validation failures, and several violations of the project's type-annotation requirements for function signatures.

Comment thread src/heretic/main.py

# Parse and handle CLI help before importing heavyweight ML/runtime dependencies.
if _is_help_invocation():
    Settings()  # ty:ignore[missing-argument]


high

The Settings() call here is likely intended to trigger a help message and exit when -h or --help is passed. However, pydantic-settings does not automatically print help and exit upon instantiation unless specifically configured (e.g., via cli_parse_args=True in model_config). If not so configured, this call will likely raise a ValidationError if required fields like model are missing from environment variables or config files. Furthermore, if it doesn't exit, the script will proceed to import heavyweight dependencies on line 20, defeating the purpose of this "lazy help" optimization.
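
For context, the lazy-help pattern under discussion can be sketched as follows. This is not the code in main.py: it uses the standard library's argparse as a stand-in for pydantic-settings, and `is_help_invocation`, `build_parser`, and `main` are hypothetical names chosen for illustration. The point it demonstrates is that the help path has to terminate the process on its own; otherwise execution falls through to the heavyweight imports.

```python
import argparse


def is_help_invocation(argv: list[str]) -> bool:
    """Return True if the user asked for help (hypothetical helper)."""
    return any(arg in ("-h", "--help") for arg in argv)


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the settings model; argparse prints help and exits on -h/--help.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--model", required=True, help="model identifier")
    return parser


def main(argv: list[str]) -> str:
    if is_help_invocation(argv):
        # parse_args() prints the help text and raises SystemExit(0) here,
        # so nothing below this line runs for a help invocation.
        build_parser().parse_args(argv)

    # Heavy imports are deferred until after the help check.
    import json  # placeholder for a heavyweight dependency

    args = build_parser().parse_args(argv)
    return json.dumps({"model": args.model})
```

Whether `Settings()` behaves like `parse_args()` in this respect depends on how the settings class is configured, which is exactly what the comment above questions.
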

Comment thread src/heretic/main.py
if reproducibility_information != "none":
    # Set the number of trials to the number of actual completed trials
    # for the reproduction configuration.
    settings.n_trials = count_completed_trials()


high

The function count_completed_trials is not defined in this module nor imported from any of the internal modules (.config, .evaluator, .model, .system, .utils). This will cause a NameError at runtime when attempting to upload reproducibility information.

Comment thread src/heretic/system.py
)


def empty_cache():


medium

The function signature is missing the return type annotation. Per the repository style guide (rule 5), all function and method signatures must be fully type-annotated.

Suggested change
- def empty_cache():
+ def empty_cache() -> None:
References
  1. Function and method signatures must be fully type-annotated, including the return type (if any). (link)

Comment thread src/heretic/utils.py
return "\n".join(requirements) + "\n"


def set_seed(seed: int):


medium

The function signature is missing the return type annotation. Per the repository style guide (rule 5), all function and method signatures must be fully type-annotated.

Suggested change
- def set_seed(seed: int):
+ def set_seed(seed: int) -> None:
References
  1. Function and method signatures must be fully type-annotated, including the return type (if any). (link)

Comment thread src/heretic/utils.py
Comment on lines +605 to +612
def create_reproduce_folder(
    path: Path,
    settings: Settings,
    checkpoint_path: str | Path,
    trial: Trial,
    uploaded_model_hashes: dict[str, str],
    include_system_information: bool,
):


medium

The function signature is missing the return type annotation. Per the repository style guide (rule 5), all function and method signatures must be fully type-annotated.

def create_reproduce_folder(
    path: Path,
    settings: Settings,
    checkpoint_path: str | Path,
    trial: Trial,
    uploaded_model_hashes: dict[str, str],
    include_system_information: bool,
) -> None:
References
  1. Function and method signatures must be fully type-annotated, including the return type (if any). (link)

Comment thread src/heretic/utils.py
Comment on lines +678 to +685
def upload_reproduce_folder(
    repo_id: str,
    settings: Settings,
    token: str,
    checkpoint_path: str | Path,
    trial: Trial,
    include_system_information: bool,
):


medium

The function signature is missing the return type annotation. Per the repository style guide (rule 5), all function and method signatures must be fully type-annotated.

def upload_reproduce_folder(
    repo_id: str,
    settings: Settings,
    token: str,
    checkpoint_path: str | Path,
    trial: Trial,
    include_system_information: bool,
) -> None:
References
  1. Function and method signatures must be fully type-annotated, including the return type (if any). (link)

@RyderFreeman4Logos

Owner Author

PR Bot Audit Trail — Bot Findings Classification

Local CSA review: PASS (gemini-cli, session 01KQ37YY6CQKRMHNC7DZ22CW9F)
Cloud bot: gemini-code-assist — 6 findings (2 HIGH, 4 MEDIUM)

Classification: All 6 findings are FALSE POSITIVES

HIGH #1: Settings() for --help (main.py:18)
Classification: FALSE_POSITIVE
Rationale: This is upstream's lazy-help optimization pattern. _is_help_invocation() guards it. pydantic-settings with cli_parse_args handles --help correctly at instantiation. Upstream PR p-e-w#293 specifically added and tested this.

HIGH #2: count_completed_trials undefined (main.py:1651)
Classification: FALSE_POSITIVE
Rationale: count_completed_trials() is defined as a closure inside run() at line 1102: def count_completed_trials() -> int:. The bot failed to detect the nested function definition. The call sites (lines 1106, 1145, 1177, 1434, 1651) are all within run()'s scope.
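
A minimal sketch of the scoping behavior this rationale relies on, with hypothetical names: `trial_states` and `call_from_outside` are invented for illustration, and this `run()` is a toy, not the one in main.py. A function nested inside `run()` is resolvable from anywhere in `run()`'s body, but it is not a module-level name, which is presumably why a scan of top-level definitions and imports would miss it.

```python
def run(trial_states: list[str]) -> int:
    # Nested helper: visible only inside run(), like the closure described above.
    def count_completed_trials() -> int:
        return sum(1 for state in trial_states if state == "COMPLETE")

    # Any call site inside run() resolves the name through the enclosing scope.
    return count_completed_trials()


def call_from_outside() -> str:
    try:
        # Outside run(), the nested name does not exist.
        count_completed_trials()
    except NameError as error:
        return type(error).__name__
    return "ok"
```

Calling `run(["COMPLETE", "FAIL", "COMPLETE"])` counts two completed trials, while `call_from_outside()` hits the NameError the bot predicted, but only because that call is outside `run()`.
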

MEDIUM #3-6 — Missing return type annotations (system.py:26, utils.py:345, utils.py:612, utils.py:685)
Classification: FALSE_POSITIVE
Rationale: These are from upstream code accepted as-is. The project uses ty (not mypy) as its type checker, and ty check --error-on-warning passes cleanly. The ty checker does not require return type annotations on all function signatures. No project style guide mandates this — the bot inferred a non-existent "rule 5".
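
For anyone who wants to audit this independently of ty, the difference the bot flagged can be checked with the standard library. This is a sketch only, not how ty or the bot performs its check; `lacks_return_annotation` is a hypothetical helper, and the two `empty_cache` variants are stubs mirroring the signatures quoted in the review.

```python
import inspect
from collections.abc import Callable


def lacks_return_annotation(function: Callable[..., object]) -> bool:
    """Return True if the function signature has no return annotation."""
    signature = inspect.signature(function)
    return signature.return_annotation is inspect.Signature.empty


def empty_cache():  # signature as quoted in the review, body stubbed out
    pass


def empty_cache_annotated() -> None:  # the bot's suggested form
    pass
```

Here `lacks_return_annotation(empty_cache)` is true and `lacks_return_annotation(empty_cache_annotated)` is false. Both forms behave identically at runtime; the annotation only matters if a style rule or checker configuration requires it.
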

Quality Gates

  • ruff format — passed
  • ruff check — passed
  • ty check --error-on-warning — passed
  • uv build — passed
  • lefthook pre-commit — all 5 hooks passed
  • Local CSA review — PASS (no findings)
  • Cloud bot — all findings classified as false positive

@RyderFreeman4Logos RyderFreeman4Logos merged commit e00eba6 into master Apr 25, 2026
7 participants