fix: gracefully handle processor load failures for multimodal models (#82) #364

Open
umran666 wants to merge 2 commits into p-e-w:master from umran666:fix/multimodal-processor-load

umran666 commented Jun 8, 2026

Contributor

Catches configuration/dependency load failures when initializing a multimodal model's processor, logging a warning rather than crashing the model initialization loop. This allows multimodal models to be abliterated on text-only pipelines even if optional vision dependencies (such as torchvision) are missing.

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request wraps the initialization of the multimodal processor in a try-except block to handle loading failures gracefully with a warning. The reviewer recommended catching specific exceptions rather than a broad Exception to prevent masking other programming errors.
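
For illustration, the narrower catch suggested in the review might look like the sketch below. This is not the code from this PR: `load_processor_or_none` is a hypothetical helper, the loader is passed in as a callable so the snippet runs without transformers installed, and the exception tuple is an assumption about which failures are worth tolerating.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)

# Assumption: these are the failure types worth tolerating. ImportError covers
# a missing optional backend such as torchvision; OSError and ValueError are
# included here as guesses for missing or malformed processor configuration.
TOLERATED_ERRORS = (ImportError, OSError, ValueError)


def load_processor_or_none(load: Callable[[], Any], model_name: str) -> Optional[Any]:
    """Call `load` and return its result, or None if a tolerated error occurs."""
    try:
        return load()
    except TOLERATED_ERRORS as error:
        # Warn and continue so a text-only pipeline can still proceed.
        logger.warning(
            "Could not load processor for %s (%s: %s); continuing without it.",
            model_name,
            type(error).__name__,
            error,
        )
        return None
```

In practice the callable would wrap the real loader, for example `lambda: AutoProcessor.from_pretrained(model_name)`. Errors outside the tuple, such as a `TypeError` or `KeyError` caused by a bug, still propagate, which is the point of avoiding a broad `except Exception`.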

Comment thread src/heretic/model.py Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

p-e-w commented Jun 8, 2026

Owner

Does merely loading a processor require torchvision? That would surprise me.

umran666 commented Jun 8, 2026

Contributor Author

Yeah, it actually does because Apriel-1.6-15b-Thinker uses PixtralProcessor.
In Hugging Face transformers, the PixtralProcessor class has a hard requirement for torchvision defined in its backend checks. As soon as you call AutoProcessor.from_pretrained(), it tries to load the class, runs that backend check, and immediately raises an ImportError if it's missing.
Since torchvision isn't in heretic's dependencies, it crashes on startup for anyone who doesn't have it installed. Catching this exception is a simple way to let users still run text abliteration on these models without needing to install extra vision libraries.
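
As a rough illustration of the situation described above, whether an optional backend is installed can be checked up front without importing it. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper and not part of heretic or transformers.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["torchvision"])
if missing:
    print("Missing optional packages:", ", ".join(missing))
```

A check like this only reports whether the package can be found. It does not confirm that the installed torchvision build matches the installed torch version.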

p-e-w commented Jun 8, 2026

Owner

Interesting, thanks for identifying this problem.

I'll need to think about what the best fix is here though. In general, I prefer Heretic to work out of the box in all reasonable situations, so simply depending on torchvision might be the better solution.

umran666 commented Jun 9, 2026

Contributor Author

Adding torchvision to the dependencies in pyproject.toml is definitely cleaner for out-of-the-box multimodal support. The main risk is dependency resolution conflicts, since torchvision releases are tightly coupled to specific torch versions, which can get messy when installing into custom environments.

Let me know if you'd prefer me to update this PR to add torchvision to pyproject.toml (with or without the fallback catch block), or if we should stick to the defensive error-trapping warning.

p-e-w commented Jun 9, 2026

Owner

No, you don't need to do anything at this point. I'll have to think about this some more, and solving this niche issue isn't a high priority for me.
