Fix NaN handling in face bbox detection (#13) #22
Open
ansorre wants to merge 1 commit into
- Add robust NaN detection to both get_face_bboxes() functions
- Return centered fallback bbox when face keypoints are not detected
- Fixes crashes when processing videos with subjects facing away
- Three-layer safety checks: pre-computation, post-computation, pre-conversion
- Tested with 160-frame video containing multiple rear/profile poses
Summary
Fixes #13. Adds robust NaN handling to the get_face_bboxes() functions to prevent crashes when face keypoints are not detected (e.g., when the subject is facing away from the camera).
Problem
When processing videos where subjects turn away from the camera or are in profile, ViTPose may fail to detect face keypoints, resulting in NaN values. The current implementation attempts to convert these NaN values directly to integers, causing a "ValueError: cannot convert float NaN to integer" crash.
This is a common scenario in real-world video processing (dance videos, fashion shows, action sequences) and should be handled gracefully.
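The failure mode can be reproduced in isolation. The snippet below is a minimal sketch, not the project's code: `face_xs` is a hypothetical list standing in for the x coordinates of undetected face keypoints, and it only shows that NaN survives a min/max reduction and then breaks the integer conversion.

```python
# Hypothetical stand-in for the x coordinates of face keypoints that the
# pose estimator failed to detect: every value is NaN.
nan = float("nan")
face_xs = [nan, nan, nan]

# A min/max over all-NaN input is still NaN, so the bbox edge is NaN too.
x_min = min(face_xs)

try:
    int(x_min)  # the conversion that crashes
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)  # cannot convert float NaN to integer
```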
Solution
Added a three-layer NaN detection and fallback mechanism to both get_face_bboxes() functions, with checks before the bbox computation, after it, and before the integer conversion.
When NaN values are detected, the functions return a fallback bounding box: a centered region covering approximately 30% of the image dimensions, consistent with the fallback behavior described in the codebase documentation.
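The following is a rough sketch of how such a three-layer check with a centered fallback might look. It is not the code from this PR. The function names `safe_face_bbox` and `centered_fallback_bbox`, the `(x1, y1, x2, y2)` return order, the `scale` parameter, and the reading of "30% of the image dimensions" as 30% of the width and 30% of the height are all assumptions made for illustration. The actual get_face_bboxes() signatures are not shown in this description.

```python
import math


def centered_fallback_bbox(width, height, ratio=0.3):
    """Centered box covering `ratio` of each image dimension (assumed layout)."""
    box_w = round(width * ratio)
    box_h = round(height * ratio)
    x1 = (width - box_w) // 2
    y1 = (height - box_h) // 2
    return x1, y1, x1 + box_w, y1 + box_h


def safe_face_bbox(keypoints, width, height, scale=1.3):
    """Hypothetical bbox helper: square box around (x, y) keypoints, or fallback."""
    fallback = centered_fallback_bbox(width, height)

    # Layer 1 (pre-computation): reject missing or non-finite keypoints.
    if not keypoints or any(not math.isfinite(v) for pt in keypoints for v in pt):
        return fallback

    xs = [pt[0] for pt in keypoints]
    ys = [pt[1] for pt in keypoints]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    half = max(max(xs) - min(xs), max(ys) - min(ys)) * scale / 2
    coords = (cx - half, cy - half, cx + half, cy + half)

    # Layer 2 (post-computation): arithmetic can still overflow to inf or NaN.
    if any(not math.isfinite(v) for v in coords):
        return fallback

    # Clamp the box to the image bounds.
    clamped = (
        max(0.0, coords[0]),
        max(0.0, coords[1]),
        min(float(width), coords[2]),
        min(float(height), coords[3]),
    )

    # Layer 3 (pre-conversion): defensive re-check right before int().
    # It is redundant in this sketch but guards against later edits.
    if any(not math.isfinite(v) for v in clamped):
        return fallback

    return tuple(int(v) for v in clamped)


nan = float("nan")
print(safe_face_bbox([(nan, nan), (nan, nan)], 100, 200))  # (35, 70, 65, 130)
print(safe_face_bbox([(40.0, 50.0), (60.0, 50.0), (50.0, 70.0)], 100, 200, scale=1.0))  # (40, 50, 60, 70)
```

Checking at each stage rather than only at the end keeps the failure local: whichever step first produces a non-finite value short-circuits to the same fallback, so callers always receive integer coordinates.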
Changes Made
First get_face_bboxes() function (with ratio_aug parameter, ~line 43)
Second get_face_bboxes() function (without ratio_aug parameter, ~line 313)
Testing
Tested with a real-world, 160-frame video containing multiple rear and profile poses.
Testing Results
(back of head when turned away, hand/arm when covering face)
Compatibility
Related Issues
Closes #13
Development Notes
This fix was developed with assistance from Claude (Anthropic) in debugging and testing the NaN handling logic across multiple edge cases.