Description
The test TestPlanningTripleDataset::test_split_independence in tests/data/test_planning_dataset.py is flaky due to random seed sensitivity in dataset splitting.
Error
FAILED tests/data/test_planning_dataset.py::TestPlanningTripleDataset::test_split_independence
AssertionError: Train ratio 0.5711529184756392 not ~0.7
assert 0.65 < 0.5711529184756392
Root Cause
The test expects the split ratios to land close to their targets (train ~0.7, val ~0.15, test ~0.15), but random sampling introduces variance around those targets. The test uses:
assert 0.65 < train_ratio < 0.75, f"Train ratio {train_ratio} not ~0.7"
However, with small dataset sizes or when the random number generator is not seeded deterministically, the actual ratio can fall outside this range.
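To get a rough feel for how much variance to expect, the sketch below estimates the standard deviation of the train ratio as a function of dataset size. It assumes each sample is assigned to the train split independently with probability 0.7, which may not match how the dataset actually splits; `train_ratio_std` is a hypothetical helper, not part of the codebase.

```python
import math


def train_ratio_std(n, p=0.7):
    """Standard deviation of the observed train ratio for n samples.

    Assumption: each sample lands in the train split independently with
    probability p (a binomial model). The real split logic may differ.
    """
    return math.sqrt(p * (1 - p) / n)


# Compare the spread against the +/-0.05 window used by the test.
for n in (20, 100, 1000):
    print(f"n={n}: std of train ratio ~ {train_ratio_std(n):.4f}")
```

Under this model, the current window of 0.65 to 0.75 is only about one standard deviation wide on each side when the dataset has roughly 100 samples, so occasional failures would be expected at that size.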
Impact
- Severity: Low
- Scope: Test infrastructure only
- User Impact: None - does not affect production code
- Merge Impact: Does not block merges (24/25 tests pass)
Discovered In
pytest tests/data/test_planning_dataset.py -v
Suggested Fixes
- Widen tolerance: Change assertions to allow more variance
  assert 0.60 < train_ratio < 0.80, f"Train ratio {train_ratio} not ~0.7"
  Note that the failing run above (train ratio 0.571) would still fall outside this wider range, so this change alone may not eliminate the flakiness.
- Fix random seed: Ensure deterministic seeding before split
  random.seed(42)
  torch.manual_seed(42)
- Use larger sample: Increase dataset size for split test to reduce variance
- Statistical approach: Use confidence intervals instead of hard thresholds
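The seeding and statistical suggestions above could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the dataset's actual implementation: `split_indices` and `ratio_tolerance` are hypothetical helpers, the split is assumed to be a shuffle followed by fixed-size slices, and the tolerance assumes a binomial model with a z-score of 3.

```python
import math
import random


def split_indices(n, ratios=(0.7, 0.15, 0.15), seed=42):
    """Hypothetical deterministic split: shuffle with a local seeded RNG,
    then slice fixed counts so the ratios do not depend on global state."""
    rng = random.Random(seed)  # local generator, leaves the global RNG untouched
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def ratio_tolerance(n, p, z=3.0):
    """Allowed deviation from the target ratio p for n samples, assuming
    independent per-sample assignment (binomial model)."""
    return z * math.sqrt(p * (1 - p) / n)


# Usage: the tolerance scales with dataset size instead of being hard-coded.
n = 200
train, val, test = split_indices(n)
train_ratio = len(train) / n
tol = ratio_tolerance(n, 0.7)
assert abs(train_ratio - 0.7) <= tol, f"Train ratio {train_ratio} not within {tol:.3f} of 0.7"
```

With a local `random.Random(seed)` the test becomes reproducible without touching the global seed, and a size-dependent tolerance avoids having to retune hard thresholds whenever the test dataset size changes.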
Related
- tests/data/test_planning_dataset.py::TestPlanningTripleDataset::test_split_independence
Labels
- bug
- tests
- flaky-test
- low-priority
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com