Releases · BitMind-AI/gasbench

14 Jun 21:39

dylanuys

v0.7.3

a21dfc5

Release 0.7.3 Latest


Removes 12 MAVOS-DD freevc/knnvc subsets from the synthetic video pool pending correct relabeling. These datasets contain real (untampered) video with voice-converted audio only and were previously miscategorized as synthetic video. They will be reintroduced in v0.7.4 under the correct labels (real video / synthetic audio), once this subnet cycle completes.


12 Jun 23:08

dylanuys

v0.7.2

b2214e7

Release 0.7.2

Increased exam size to 5000 for image, video and audio.


12 Jun 22:43

dylanuys

v0.7.1

e0a1a22

Release 0.7.1

Summary

Two changes shipping together.

Sampling floors stopped mode budgets from meaning anything once the corpus grew: small mode asks for 600 total video samples, but the per-dataset minimums inflated that to ~24,200 across 238 datasets, pushing entrance exams past their 2-hour timeout. The floors now scale down to fit the mode's total budget; full-mode allocations are unchanged.

Version single-sourcing removes the duplicated version string — pyproject.toml is now the only place the version lives.

What changes

Sampling: per-dataset floors now respect the mode's total budget

calculate_weighted_dataset_sampling clamps every dataset's allocation to static floors (REGULAR_DATASET_MIN_SAMPLES=100, GASSTATION_DATASET_MIN_SAMPLES=500). With the video corpus at 238 datasets, these floors silently override the mode targets — the first v18 entrance exam processed 154/238 datasets in 2 hours, timed out, and the model was incorrectly failed and blocked.

The floors now cap at an even share of target_total_samples (target // num_datasets, gasstation-weighted), so a mode's budget holds regardless of corpus size:

Run	Before	After
small video, 238 datasets	~24,200 samples (timeout)	~486 samples (~20–40 min)
full video	107/dataset, ~25,900	unchanged
full image	282/dataset, ~55,000	unchanged

Full-mode allocations are numerically identical — the floors only ever bound when the budget was being ignored, so production benchmark scores are unaffected. If more per-dataset signal is wanted in exams, BENCHMARK_TOTAL_OVERRIDES["small"] is now an honest dial (e.g. 600 → 2,400 ≈ 10/dataset).
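The capping rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the body of calculate_weighted_dataset_sampling: the helper name effective_floor is hypothetical, the gasstation weighting mentioned above is left out, and the lower bound of one sample per dataset is an assumption added here so that no dataset drops to zero.

```python
# Static floors quoted in the release notes.
REGULAR_DATASET_MIN_SAMPLES = 100
GASSTATION_DATASET_MIN_SAMPLES = 500


def effective_floor(static_floor: int, target_total_samples: int, num_datasets: int) -> int:
    """Cap a static per-dataset floor at an even share of the mode's budget.

    Hypothetical helper: the real function also applies gasstation weighting,
    which this sketch omits.
    """
    # Even share of the budget; at least 1 so every dataset is still sampled
    # (that lower bound is an assumption of this sketch).
    even_share = max(1, target_total_samples // max(1, num_datasets))
    return min(static_floor, even_share)


if __name__ == "__main__":
    # Small mode: 600 samples across 238 datasets, so the floor shrinks.
    print(effective_floor(REGULAR_DATASET_MIN_SAMPLES, 600, 238))
    # Full mode: 26,000 samples across 238 datasets, so the floor does not bind.
    print(effective_floor(REGULAR_DATASET_MIN_SAMPLES, 26000, 238))
```

With a small-mode budget of 600 the even share is below the static floor, so the floor shrinks to fit; with the full-mode video budget the even share exceeds 100 and the static floor is left as it was.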

Version: `pyproject.toml` is the single source

__version__ was hardcoded in both pyproject.toml and gasbench/__init__.py and had to be bumped in lockstep. __init__.py now derives it via importlib.metadata.version("gasbench") (with a 0.0.0+unknown fallback for uninstalled source checkouts). The CLI --version flag is unaffected.
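A minimal sketch of this pattern, assuming only the standard library. The function name resolve_version is hypothetical; the notes say only that __init__.py calls importlib.metadata.version("gasbench") and falls back to 0.0.0+unknown.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str, fallback: str = "0.0.0+unknown") -> str:
    """Return the installed distribution's version, or a fallback string.

    The metadata lookup only succeeds when the package is installed
    (including editable installs), so a bare source checkout gets the fallback.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback


# In a package's __init__.py this would typically be used as:
# __version__ = resolve_version("gasbench")
```

The installed metadata is generated from pyproject.toml, which is why the version string no longer needs to be maintained in two places.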


12 Jun 18:11

dylanuys

v0.7.0

9acd544

Release 0.7.0

Summary

Two improvements shipping together.

Score composition makes holdout and gasstation data carry a declared share of competition metrics instead of whatever their sample counts happen to contribute. sn34_score previously pooled all samples equally — holdout and gasstation carried roughly 16% / 4% (image), 12.5% / 3% (video), 8% / 0% (audio) of the metric purely by accident of dataset size. The existing holdout_weight knob only ever affected the benchmark_score accuracy field, never MCC/Brier/sn34.

Dataset config restructuring replaces the monolithic per-modality YAML files with paired real_<modality>.yaml / synthetic_<modality>.yaml files, tags datasets with a content_category for vertical-specific filtering, and consolidates benchmark sizes to a single source of truth.

What changes

Score composition

Metrics.update() accepts a per-sample weight — confusion matrix, MCC, Brier, and CE are all weight-aware. Semantics: weight=w contributes exactly as w duplicate samples would (property-tested).
compute_metrics_from_df(score_composition=...) — takes a target share per provenance class, e.g. {"public": 0.5, "holdout": 0.3, "gasstation": 0.2}, classifies samples by dataset name (-holdout- / gasstation / public), and derives per-class weights from target vs realized counts. Classes absent from a run (e.g. audio has no gasstation data) are dropped and remaining shares renormalized. Results include score_composition (target), realized_composition, and provenance_weights so every run is self-documenting.
Plumbing: BenchmarkRunConfig.score_composition, run_benchmark(score_composition=...), image / video / audio bench functions, CLI --holdout-share / --gasstation-share (public gets the remainder).
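The weight derivation described above can be illustrated with a short sketch. It is not the code in compute_metrics_from_df: the function names are hypothetical, the substring matching is a simplified reading of the "-holdout- / gasstation / public" rule, and the uniform fallback when every present class has a zero target is an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_provenance(dataset_name: str) -> str:
    """Classify a dataset by name: holdout, gasstation, or public (simplified)."""
    name = dataset_name.lower()
    if "-holdout-" in name:
        return "holdout"
    if "gasstation" in name:
        return "gasstation"
    return "public"


def provenance_weights(
    dataset_names: Iterable[str], score_composition: Dict[str, float]
) -> Dict[str, float]:
    """Derive one weight per provenance class from target vs realized shares."""
    counts = Counter(classify_provenance(n) for n in dataset_names)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Drop classes that have no samples in this run, then renormalize.
    present = {c: score_composition.get(c, 0.0) for c in counts}
    share_sum = sum(present.values())
    if share_sum == 0:
        # Assumption of this sketch: fall back to unweighted pooling.
        return {c: 1.0 for c in present}
    weights = {}
    for cls, share in present.items():
        target = share / share_sum       # renormalized target share
        realized = counts[cls] / total   # share by raw sample count
        weights[cls] = target / realized
    return weights
```

For a run with nine public samples and one holdout sample, a target of {"public": 0.5, "holdout": 0.3, "gasstation": 0.2} drops the absent gasstation class, renormalizes to 0.625 / 0.375, and upweights the single holdout sample so that it carries 37.5% of the weighted total.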

Dataset config restructuring

Config files split by real vs. synthetic per modality:

Before	After
`image_datasets.yaml` + `image_human_datasets.yaml`	`real_images.yaml` + `synthetic_images.yaml`
`audio_datasets.yaml`	`real_audio.yaml` + `synthetic_audio.yaml`
`video_datasets.yaml` + `video_human_datasets.yaml`	`real_videos.yaml` + `synthetic_videos.yaml`

Content category tagging

DatasetConfig gains two new optional fields: content_category (e.g. faces, documents) and generator_family. A new --content-category <CATEGORY> CLI flag filters a run to only datasets matching that tag — useful for vertical-specific evaluations without maintaining separate config files.

gasbench run --image-model ./my_model/ --content-category faces
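Conceptually, the filter keeps only the datasets whose tag matches the requested category. The sketch below uses a hypothetical DatasetEntry dataclass as a stand-in for DatasetConfig, with made-up dataset names; the case-insensitive comparison is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class DatasetEntry:
    """Hypothetical stand-in for DatasetConfig with the two new optional fields."""
    name: str
    content_category: Optional[str] = None
    generator_family: Optional[str] = None


def filter_by_content_category(
    datasets: Iterable[DatasetEntry], category: Optional[str]
) -> List[DatasetEntry]:
    """Keep datasets tagged with `category`; no category means no filtering."""
    if category is None:
        return list(datasets)
    wanted = category.strip().lower()
    # Untagged datasets never match a specific category.
    return [d for d in datasets if (d.content_category or "").lower() == wanted]


if __name__ == "__main__":
    pool = [
        DatasetEntry("example-faces-a", content_category="faces"),
        DatasetEntry("example-docs-a", content_category="documents"),
        DatasetEntry("example-untagged"),
    ]
    print([d.name for d in filter_by_content_category(pool, "faces")])
```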

Benchmark size consolidation

Full-mode sizes were embedded in per-YAML benchmark_size fields and a parallel set of bolted-on size configs. Both are replaced by a single declaration in config.py:

"full": {"image": 55000, "video": 26000, "audio": 37000}

Legacy image_benchmark_size / video_benchmark_size dual-format handling removed. Custom --dataset-config YAMLs still override size for that run.

Cleanup

Dead code removed from dataset/cache.py
dataset/config.py: dict-to-DatasetConfig path consolidated; size config removed
dataset/download.py: exception handling tightened; per-run download cap at 1,000 files

Multi-OS dependency support

onnxruntime and decord are now platform-conditional in pyproject.toml:

Platform	onnxruntime	decord
Linux	`onnxruntime-gpu==1.24.2`	`decord==0.6.0`
macOS	`onnxruntime>=1.22.0`	—
Windows	`onnxruntime>=1.22.0`	—

processing/media.py gains an opencv fallback when decord is unavailable (macOS dev installs, non-Linux CI).
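One way to express this kind of fallback is to probe for the optional module before choosing a decoder. The sketch below is an assumption about the general pattern, not the logic in processing/media.py; pick_video_backend is a hypothetical name, and cv2 is the import name of the opencv Python package.

```python
import importlib.util
from typing import Optional, Sequence


def pick_video_backend(preferred: Sequence[str] = ("decord", "cv2")) -> Optional[str]:
    """Return the first importable module name from `preferred`, else None.

    find_spec checks availability without importing the module, so the probe
    stays cheap even for heavy native libraries.
    """
    for module_name in preferred:
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    return None


if __name__ == "__main__":
    backend = pick_video_backend()
    print(f"video backend: {backend or 'none available'}")
```

On a Linux install with decord present this would select decord; on macOS or Windows, where decord is not installed, it would fall through to cv2 if opencv is available.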

Backwards compatibility

Score metrics: default paths unchanged. Without score_composition, all metrics compute exactly as before; the legacy holdout_weight accuracy-only path is preserved unchanged. Verified by back-compat tests (test_no_composition_is_backward_compatible, test_legacy_holdout_weight_only_affects_accuracy, test_uniform_composition_matches_pooled).

Config: existing --dataset-config custom YAMLs still work; benchmark_size in a custom config is still respected. The cpu extras group in pyproject.toml is removed — platform markers now handle the onnxruntime split automatically.

Tests

12 new unit tests in tests/unit/test_weighted_metrics.py:

weight == duplication equivalence across MCC / Brier / CE / sn34
weight scale-invariance
provenance classification; target-share weight derivation (incl. absent-class renormalization, zero-target fallback)
composition shifts sn34 toward holdout performance; pooled-equivalence golden test
legacy holdout_weight still affects accuracy only
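The duplication-equivalence property in the first test can be illustrated for MCC with a small standalone function. This is a sketch of the idea only; weighted_mcc is a hypothetical name and does not reproduce Metrics.update() or the actual test code.

```python
import math
from typing import Optional, Sequence


def weighted_mcc(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Matthews correlation coefficient over a weighted confusion matrix."""
    if weights is None:
        weights = [1.0] * len(y_true)
    tp = tn = fp = fn = 0.0
    for truth, pred, w in zip(y_true, y_pred, weights):
        if truth == 1 and pred == 1:
            tp += w
        elif truth == 0 and pred == 0:
            tn += w
        elif truth == 0 and pred == 1:
            fp += w
        else:
            fn += w
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Weighted run: the first sample has weight 3, the last has weight 2.
    weighted = weighted_mcc([1, 1, 0, 0], [1, 0, 0, 0], weights=[3, 1, 1, 2])
    # Same data with those samples physically duplicated instead.
    duplicated = weighted_mcc([1, 1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0])
    print(weighted, duplicated)
```

Because each weight is simply added into the confusion-matrix cell, an integer weight w lands in the same place as w copies of the sample, and multiplying every weight by a constant cancels between numerator and denominator, which is the scale-invariance property in the second test.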


04 Jun 19:31

dylanuys

v0.6.7

684604f

Release 0.6.7

Released dataset name	Obfuscated holdout name	Modality	Media type
`casia_web_face_part1`	`real-image-holdout-e7a7377b`	`image`	`real`
`casia_web_face_part10`	`real-image-holdout-9de807d2`	`image`	`real`
`casia_web_face_part11`	`real-image-holdout-a08902c2`	`image`	`real`
`casia_web_face_part12`	`real-image-holdout-09f696f5`	`image`	`real`
`casia_web_face_part13`	`real-image-holdout-1f3d2834`	`image`	`real`
`human-fg-net-real-face`	`real-image-holdout-cfa0ec04`	`image`	`real`
`inst-it-dataset-videos-raw`	`real-image-holdout-de9fe39e`	`image`	`real`
`inst-it-dataset-videos-vpt`	`real-image-holdout-315ec756`	`image`	`real`
`real-human-faces-data-set`	`real-image-holdout-2f459c5f`	`image`	`real`
`shhq-1.0`	`real-image-holdout-3099c51c`	`image`	`real`
`vtuav-dataset-test-lt-001`	`real-image-holdout-57c9ef9d`	`image`	`real`
`arc2face-1`	`synthetic-image-holdout-b210ba6d`	`image`	`synthetic`
`posedreamer28`	`synthetic-image-holdout-14eb8c65`	`image`	`synthetic`
`posedreamer29`	`synthetic-image-holdout-bb33c84e`	`image`	`synthetic`
`posedreamer30`	`synthetic-image-holdout-6db55ca9`	`image`	`synthetic`
`posedreamer31`	`synthetic-image-holdout-8169044e`	`image`	`synthetic`
`posedreamer32`	`synthetic-image-holdout-4ecace03`	`image`	`synthetic`
`sg2-n10k-arc-r14-lang-v1`	`synthetic-image-holdout-524c2534`	`image`	`synthetic`
`spoof_png`	`synthetic-image-holdout-abd2c011`	`image`	`synthetic`
`dh-facevid-1k-0000-part-08`	`real-video-holdout-05e1f42c`	`video`	`real`
`dh-facevid-1k-0000-part-09`	`real-video-holdout-b451e823`	`video`	`real`
`dh-facevid-1k-0000-part-10`	`real-video-holdout-b2369df1`	`video`	`real`
`dh-facevid-1k-0005-part_2`	`real-video-holdout-a86f29a5`	`video`	`real`
`dh-facevid-1k-0005-part_3`	`real-video-holdout-45cdc454`	`video`	`real`
`dh-facevid-1k-0005-part_4`	`real-video-holdout-ef672baf`	`video`	`real`
`dh-facevid-1k-0005-part_5`	`real-video-holdout-3018d860`	`video`	`real`
`haa500-v1-0`	`real-video-holdout-5e96e53f`	`video`	`real`
`mavos-dd-arabic-real`	`real-video-holdout-e421ee99`	`video`	`real`
`mavos-dd-english_real`	`real-video-holdout-6ab0ed11`	`video`	`real`
`mavos-dd-german_real`	`real-video-holdout-e71c3e03`	`video`	`real`
`mavos-dd-hindi_real`	`real-video-holdout-d0960f2d`	`video`	`real`
`mavos-dd-mandarin_real`	`real-video-holdout-3a63331d`	`video`	`real`
`oops-dataset`	`real-video-holdout-5d1b9f6b`	`video`	`real`
`vatex-3`	`real-video-holdout-6941c006`	`video`	`real`
`deepaction-v1`	`synthetic-video-holdout-bc6a27d4`	`video`	`synthetic`
`mcnet`	`synthetic-video-holdout-29358cff`	`video`	`synthetic`
`mobileswap`	`synthetic-video-holdout-ea6535cf`	`video`	`synthetic`
`mraa`	`synthetic-video-holdout-9d4fd175`	`video`	`synthetic`
`oneshot`	`synthetic-video-holdout-41e9d0ce`	`video`	`synthetic`
`pirender`	`synthetic-video-holdout-704b092a`	`video`	`synthetic`


05 May 08:38

dylanuys

v0.6.6

af192b3

Release 0.6.6

Released dataset name	Obfuscated holdout name	Modality	Media type
`v13-e4s`	`synthetic-video-holdout-90cc07d9`	`video`	`synthetic`
`v13-echonet-synthetic-v1`	`synthetic-video-holdout-589f8477`	`video`	`synthetic`
`v13-lemonade`	`real-video-holdout-0ea2b302`	`video`	`real`
`v13-live-whisperx-526k`	`real-video-holdout-40577fd5`	`video`	`real`
`v14-real-hallo3-training-data`	`real-video-holdout-e99a6cd3`	`video`	`real`
`v14-real-moments-in-time-raw`	`real-video-holdout-c627820a`	`video`	`real`

Released v15 video holdouts — Human

73 datasets.

Released dataset name	Obfuscated holdout name	Modality	Media type
`DH-FaceVid-1K-0003-part_3`	`real-video-holdout-bb599c24`	`video`	`real`
`DH-FaceVid-1K-0003-part_4`	`real-video-holdout-0c5992ea`	`video`	`real`
`DH-FaceVid-1K-0003-part_5`	`real-video-holdout-a778c46c`	`video`	`real`
`MAVOS-DD-english_sonic`	`synthetic-video-holdout-30ca29f6`	`video`	`synthetic`
`MAVOS-DD-freevc`	`synthetic-video-holdout-476bb844`	`video`	`synthetic`
`MAVOS-DD-german_echomimic`	`synthetic-video-holdout-4cda34f8`	`video`	`synthetic`
`MAVOS-DD-german_freevc`	`synthetic-video-holdout-0ddd910c`	`video`	`synthetic`
`MAVOS-DD-german_hififace`	`synthetic-video-holdout-2d68a443`	`video`	`synthetic`
`MAVOS-DD-german_inswapper`	`synthetic-video-holdout-ad7af038`	`video`	`synthetic`
`MAVOS-DD-german_knnvc`	`synthetic-video-holdout-319f0b94`	`video`	`synthetic`
`MAVOS-DD-german_liveportrait`	`synthetic-video-holdout-cd2b3dba`	`video`	`synthetic`
`MAVOS-DD-german_memo`	`synthetic-video-holdout-5f57b6ea`	`video`	`synthetic`
`MAVOS-DD-german_roop`	`synthetic-video-holdout-0f4d071b`	`video`	`synthetic`
`MAVOS-DD-german_sonic`	`synthetic-video-holdout-567ebf9e`	`video`	`synthetic`
`MAVOS-DD-hindi_echomimic`	`synthetic-video-holdout-ca8ba835`	`video`	`synthetic`
`MAVOS-DD-hindi_hififace`	`synthetic-video-holdout-b83ad09b`	`video`	`synthetic`
`MAVOS-DD-hindi_inswapper`	`synthetic-video-holdout-8ae3fcab`	`video`	`synthetic`
`MAVOS-DD-hindi_knnvc`	`synthetic-video-holdout-5f319c69`	`video`	`synthetic`
`MAVOS-DD-hindi_liveportrait`	`synthetic-video-holdout-28922fda`	`video`	`synthetic`
`MAVOS-DD-hindi_memo`	`synthetic-video-holdout-d36cb85b`	`video`	`synthetic`
`MAVOS-DD-hindi_roop`	`synthetic-video-holdout-1359e798`	`video`	`synthetic`
`MAVOS-DD-hindi_sonic`	`synthetic-video-holdout-34b113dc`	`video`	`synthetic`
`MAVOS-DD-knnvc`	`synthetic-video-holdout-71c4ccd8`	`video`	`synthetic`
`MAVOS-DD-mandarin_echomimic`	`synthetic-video-holdout-60de4b3a`	`video`	`synthetic`
`MAVOS-DD-mandarin_freevc`	`synthetic-video-holdout-421732bd`	`video`	`synthetic`
`MAVOS-DD-mandarin_hififace`	`synthetic-video-holdout-ef35d61b`	`video`	`synthetic`
`MAVOS-DD-mandarin_inswapper`	`synthetic-video-holdout-f4873f15`	`video`	`synthetic`
`MAVOS-DD-mandarin_knnvc`	`synthetic-video-holdout-488f6e35`	`video`	`synthetic`
`MAVOS-DD-mandarin_liveportrait`	`synthetic-video-holdout-1f619331`	`video`	`synthetic`
`MAVOS-DD-mandarin_memo`	`synthetic-video-holdout-1f087f85`	`video`	`synthetic`
`MAVOS-DD-mandarin_roop`	`synthetic-video-holdout-e44fa7cd`	`video`	`synthetic`
`MAVOS-DD-mandarin_sonic`	`synthetic-video-holdout-b4f9beec`	`video`	`synthetic`
`MAVOS-DD-romanian_echomimic`	`synthetic-video-holdout-8ea00713`	`video`	`synthetic`
`MAVOS-DD-romanian_freevc`	`synthetic-video-holdout-91ec85a3`	`video`	`synthetic`
`MAVOS-DD-romanian_hififace`	`synthetic-video-holdout-343d88d7`	`video`	`synthetic`
`MAVOS-DD-romanian_inswapper`	`synthetic-video-holdout-1bc66d5d`	`video`	`synthetic`
`MAVOS-DD-romanian_knnvc`	`synthetic-video-holdout-79b89c16`	`video`	`synthetic`
`MAVOS-DD-romanian_liveportrait`	`synthetic-video-holdout-ffff1093`	`video`	`synthetic`
`MAVOS-DD-romanian_memo`	`synthetic-video-holdout-23ccac96`	`video`	`synthetic`
`MAVOS-DD-romanian_roop`	`synthetic-video-holdout-b2ab16d9`	`video`	`synthetic`
`MAVOS-DD-romanian_sonic`	`synthetic-video-holdout-97a8e023`	`video`	`synthetic`
`MAVOS-DD-russian_echomimic`	`synthetic-video-holdout-41a03d06`	`video`	`synthetic`
`MAVOS-DD-russian_freevc`	`synthetic-video-holdout-1a2148f3`	`video`	`synthetic`
`MAVOS-DD-russian_hififace`	`synthetic-video-holdout-213d65fa`	`video`	`synthetic`
`v15-human-vid-celebv-hq`	`real-video-holdout-f10a679d`	`video`	`real`
`v15-human-vid-dfdm_cfr23-dfaker`	`synthetic-video-holdout-032b8470`	`video`	`synthetic`
`v15-human-vid-dfdm_cfr23-dfl-h128`	`synthetic-video-holdout-c92645dc`	`video`	`synthetic`
`v15-human-vid-dfdm_cfr23-iae`	`synthetic-video-holdout-9eee885c`	`video`	`synthetic`
`v15-human-vid-dfdm_cfr23-lightweight`	`synthetic-video-holdout-d127ad71`	`video`	`synthetic`
`v15-human-vid-dfdm_cfr23-real`	`real-video-holdout-bd7b715b`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0002-part_1`	`real-video-holdout-a39f716a`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0002-part_2`	`real-video-holdout-055386d3`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0002-part_3`	`real-video-holdout-3ac018ea`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0002-part_4`	`real-video-holdout-62772f42`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0002-part_5`	`real-video-holdout-f4dab550`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0003-part_1`	`real-video-holdout-8b493ecc`	`video`	`real`
`v15-human-vid-dh-facevid-1k-0003-part_2`	`real-video-holdout-1cf5d2f9`	`video`	`real`
`v15-human-vid-digifakeav_echomimic_21501_22000`	`synthetic-video-holdout-00a91199`	`video`	`synthetic`
`v15-human-vid-digifakeavfvfa_with_audio`	`synthetic-video-holdout-18c2b9ce`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-echomimic`	`synthetic-video-holdout-ec3d3cd0`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-hififace`	`synthetic-video-holdout-0920c4a7`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-inswapper`	`synthetic-video-holdout-af05e52b`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-liveportrait`	`synthetic-video-holdout-f61b042c`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-roop`	`synthetic-video-holdout-982fab07`	`video`	`synthetic`
`v15-human-vid-mavos-dd-arabic-sonic`	`synthetic-video-holdout-2ebd76d7`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_echomimic`	`synthetic-video-holdout-118ac16f`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_freevc`	`synthetic-video-holdout-4b760ab9`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_hififace`	`synthetic-video-holdout-51cd64cb`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_inswapper`	`synthetic-video-holdout-9451219f`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_knnvc`	`synthetic-video-holdout-909d326b`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_liveportrait`	`synthetic-video-holdout-372db1bb`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_memo`	`synthetic-video-holdout-19d06111`	`video`	`synthetic`
`v15-human-vid-mavos-dd-english_roop`	`synthetic-video-holdout-9f9485b2`	`video`	`synthetic`


24 Apr 22:27

dylanuys

v0.6.5

d543cff

Release 0.6.5

Removed shooter-fake image dataset, previously incorrectly marked synthetic
Fixed the label for cosyvoice-instruct


17 Apr 06:04

dylanuys

v0.6.4

c60bd8b

Release 0.6.4

Released dataset name	Obfuscated holdout name	Modality	Media type
`v14-real-vcapcv-vggsound-test-15446-audio-cut`	`real-audio-holdout-ba647a50`	`audio`	`real`
`v14-real-vggsound-test-15446-video-cut`	`real-audio-holdout-4e846c0f`	`audio`	`real`
`v14-real-kallaama`	`real-audio-holdout-574a0401`	`audio`	`real`
`v14-real-chichewa-dataset`	`real-audio-holdout-dcb3cc7b`	`audio`	`real`
`v14-real-vivos`	`real-audio-holdout-74d82871`	`audio`	`real`
`v14-real-nisqa-corpus-dataset`	`real-audio-holdout-2c36fa90`	`audio`	`real`
`v14-real-natural-odss`	`real-audio-holdout-b8338bca`	`audio`	`real`
`v14-real-fastpitch-hifigan`	`real-audio-holdout-930542a5`	`audio`	`real`
`v14-real-daps`	`real-audio-holdout-b20f4647`	`audio`	`real`
`v14-real-bci-datasets`	`real-audio-holdout-aff9e8fe`	`audio`	`real`
`v14-real-ravdess-speech-16k`	`real-audio-holdout-ff483ea4`	`audio`	`real`
`v14-fake-somos`	`synthetic-audio-holdout-222cd5cc`	`audio`	`synthetic`
`v14-fake-diffgan-tts-aux`	`synthetic-audio-holdout-19638fa3`	`audio`	`synthetic`
`v14-fake-grad-tts`	`synthetic-audio-holdout-847ccfa5`	`audio`	`synthetic`
`v14-fake-tacotron2-dca-diffwave`	`synthetic-audio-holdout-7d903c0d`	`audio`	`synthetic`
`v14-fake-wavegrad2`	`synthetic-audio-holdout-4c09a9a9`	`audio`	`synthetic`
`v14-fake-diffgan-tts-naive`	`synthetic-audio-holdout-435ce3e0`	`audio`	`synthetic`
`v14-fake-natspeech-diffspeech`	`synthetic-audio-holdout-e6a66c72`	`audio`	`synthetic`
`v14-fake-fast-pitch`	`synthetic-audio-holdout-b9a6a0d3`	`audio`	`synthetic`
`v14-fake-tacotron2-dca`	`synthetic-audio-holdout-527cb6c8`	`audio`	`synthetic`
`v14-fake-tacotron2-dca-wavegrad`	`synthetic-audio-holdout-3d18d856`	`audio`	`synthetic`
`v14-fake-diffgan-tts-shallow`	`synthetic-audio-holdout-65bf3c65`	`audio`	`synthetic`
`v14-fake-prodiff`	`synthetic-audio-holdout-97af50ad`	`audio`	`synthetic`
`v14-fake-glow-tts`	`synthetic-audio-holdout-3051cd6a`	`audio`	`synthetic`
`v14-fake-tacotron2-dca-bddm`	`synthetic-audio-holdout-6c72fa6a`	`audio`	`synthetic`
`v14-fake-vits-1`	`synthetic-audio-holdout-0a918a26`	`audio`	`synthetic`
`v14-fake-vits`	`synthetic-audio-holdout-11335f29`	`audio`	`synthetic`
`v14-fake-vcapv-t2a`	`synthetic-audio-holdout-b2340f5f`	`audio`	`synthetic`
`v14-MLAAD-Fake-part_01`	`fake-audio-holdout-4aeede7d`	`audio`	`fake`
`v14-MLAAD-Fake-part_02`	`fake-audio-holdout-182aa55f`	`audio`	`fake`
`v14-MLAAD-Fake-part_03`	`fake-audio-holdout-823cfdef`	`audio`	`fake`
`v14-MLAAD-Fake-part_04`	`fake-audio-holdout-0a2f31b0`	`audio`	`fake`
`v14-MLAAD-Fake-part_05`	`fake-audio-holdout-065f3770`	`audio`	`fake`
`v14-MLAAD-Fake-part_06`	`fake-audio-holdout-df10edeb`	`audio`	`fake`
`v14-MLAAD-Fake-part_07`	`fake-audio-holdout-9b51a670`	`audio`	`fake`
`v14-MLAAD-Fake-part_08`	`fake-audio-holdout-0c25b759`	`audio`	`fake`
`v14-MLAAD-Fake-part_09`	`fake-audio-holdout-7ce01af9`	`audio`	`fake`
`v14-MLAAD-Fake-part_10`	`fake-audio-holdout-418c4a90`	`audio`	`fake`
`v14-dag-asr-audio`	`real-audio-holdout-c9491bb5`	`audio`	`real`
`v14-WaxalNLP-TTS-part_01`	`real-audio-holdout-de4859f5`	`audio`	`real`
`v14-WaxalNLP-TTS-part_02`	`real-audio-holdout-c1c223f4`	`audio`	`real`
`v14-WaxalNLP-TTS-part_03`	`real-audio-holdout-e53dfd16`	`audio`	`real`
`v14-WaxalNLP-TTS-part_04`	`real-audio-holdout-7f031cf1`	`audio`	`real`
`v14-WaxalNLP-TTS-part_05`	`real-audio-holdout-fcdd0715`	`audio`	`real`
`v14-WaxalNLP-TTS-part_06`	`real-audio-holdout-a6702abe`	`audio`	`real`
`v14-WaxalNLP-TTS-part_07`	`real-audio-holdout-c462f462`	`audio`	`real`
`v14-WaxalNLP-TTS-part_08`	`real-audio-holdout-a80cd5b3`	`audio`	`real`
`v14-WaxalNLP-TTS-part_09`	`real-audio-holdout-a0791dd4`	`audio`	`real`
`v14-WaxalNLP-TTS-part_10`	`real-audio-holdout-14d066a5`	`audio`	`real`
`v14-real-mmhu-h-videos`	`real-video-holdout-d4f56405`	`video`	`real`
`v14-real-mmhu-t-videos`	`real-video-holdout-14cc82ed`	`video`	`real`
`v14-real-mmhu-v-videos`	`real-video-holdout-b7174cb7`	`video`	`real`
`v14-real-vivid`	`real-video-holdout-8534b595`	`video`	`real`
`v14-real-soccernet-10s-5class`	`real-video-holdout-74616031`	`video`	`real`
`v14-real-ofdvdnet`	`real-video-holdout-b1b60d9c`	`video`	`real`
`v14-real-or-video-mov`	`real-video-holdout-aacc44e4`	`video`	`real`
`v14-real-poultry-videos`	`real-video-holdout-acaccb84`	`video`	`real`
`v14-real-spatialvid-group-001`	`real-video-holdout-a5173e55`	`video`	`real`
`v14-real-spatialvid-group-002`	`real-video-holdout-b4180215`	`video`	`real`
`v14-real-spatialvid-group-003`	`real-video-holdout-908daf61`	`video`	`real`
`v14-real-spatialvid-group-004`	`real-video-holdout-52ee57f0`	`video`	`real`
`v14-real-spatialvid-group-005`	`real-video-holdout-58bed19f`	`video`	`real`
`v14-real-videoespresso-train-video-01`	`real-video-holdout-2546c150`	`video`	`real`
`v14-real-videoespresso-train-video-02`	`real-video-holdout-e444b81e`	`video`	`real`
`v14-real-open-o3-video`	`real-video-holdout-73a9d20e`	`video`	`real`
`v14-real-panflow-1`	`real-video-holdout-51fd7e37`	`video`	`real`
`v14-real-panflow-2`	`real-video-holdout-2ed3f604`	`video`	`real`
`v14-real-panflow-3`	`real-video-holdout-4d085ece`	`video`	`real`
`v14-real-panflow-4`	`real-video-holdout-e4fe8a2c`	`video`	`real`
`v14-real-dh-facevid-1k-0001`	`real-video-holdout-6fc0b313`	`video`	`real`
`v14-real-tracking-any-granularity-videos`	`real-video-holdout-ccf53e73`	`video`	`real`
`v14-real-wild-animal-recognition-video-dataset`	`real-video-holdout-d2b5f026`	`video`	`real`
`v14-real-wlasl-videos`	`real-video-holdout-2e1dc2db`	`video`	`real`
`v14-real-wlasl-videos-1`	`real-video-holdout-1c782169`	`video`	`real`
`v14-real-wlasl-raw-videos-mp4`	`real-video-holdout-1122cfce`	`video`	`real`
`v14-real-youtubeclips`	`real-video-holdout-b055743a`	`video`	`real`
`v14-fake-allegro`	`synthetic-video-holdout-089c6870`	`video`	`synthetic`
`v14-fake-animatediffturbo`	`synthetic-video-holdout-d7c4ecc2`	`video`	`synthetic`
`v14-fake-ltxvideo`	`synthetic-video-holdout-3785e46b`	`video`	`synthetic`
`v14-fake-mochi1`	`synthetic-video-holdout-c4ccd94d`	`video`	`synthetic`
`v14-fake-pyramidflow`	`synthetic-video-holdout-38ebcccf`	`video`	`synthetic`
`v14-fake-videocrafter2`	`synthetic-video-holdout-0b71f5df`	`video`	`synthetic`
`v14-fake-animatediff`	`synthetic-video-holdout-b4755652`	`video`	`synthetic`
`v14-fake-cogvideox`	`synthetic-video-holdout-22f24e88`	`video`	`synthetic`
`v14-fake-fastsvd`	`synthetic-video-holdout-58631be3`	`video`	`synthetic`
`v14-fake-lavie`	`synthetic-video-holdout-e14b7524`	`video`	`synthetic`
`v14-fake-modelscope`	`synthetic-video-holdout-4df93a32`	`video`	`synthetic`
`v14-fake-opensora12`	`synthetic-video-holdout-8ab9a23d`	`video`	`synthetic`
`v14-fake-opensora`	`synthetic-video-holdout-59a9b3c9`	`video`	`synthetic`
`v14-fake-t2vturbo`	`synthetic-video-holdout-dd2a8901`	`video`	`synthetic`
`v14-fake-vcapav-t2v`	`synthetic-video-holdout-e31c3965`	`video`	`synthetic`
`v14-fake-cameraclone-0316`	`synthetic-video-holdout-82bcb0bf`	`video`	`synthetic`
`v14-fake-cameraclone-0317`	`synthetic-video-holdout-5a0cbed0`	`video`	`synthetic`
`v14-fake-cameraclone-0401`	`synthetic-video-holdout-d4b3ea97`	`video`	`synthetic`
`v14-fake-cameraclone-0402`	`synthetic-video-holdout-e194c1a8`	`video`	`synthetic`
`v14-fake-cameraclone-0404`	`synthetic-video-holdout-dc7cf915`	`video`	`synthetic`
`v14-fake-cameraclone-0407`	`synthetic-video-holdout-8e30a656`	`video`	`synthetic`
`v14-fake-cameraclone-0410`	`synthetic-video-holdout-decdec92`	`video`	`synthetic`
`v14-real-chinese-mp4-in-audio`	`real-video-holdout-6e020e98`	`video`	`real`
`Synthetic-Images-Fire-Scenario`	`synthetic-image-holdout-8fae48b4`	`image`	`synthetic`
`synthetic-dataset`	`synthetic-image-holdout-e4b4fa03`	`image`	`synthetic`
`Midjourneyv5-5K`	`synthetic-image-holdout-8dc0c9de`	`image`	`synthetic`
`fake_sdxl_12k-part-1`	`synthetic-image-holdout-edd5dd1b`	`image`	`synthetic`
`fake_sdxl_12k-part-2`	`synthetic-image-holdout-0f3be6af`	`image`	`synthetic`
`fake_sdxl_12k-part-3`	`synthetic-image-holdout-f291d1b7`	`image`	`synthetic`
`fake_sdxl_12k-part-4`	`synthetic-image-holdout-64663d9f`	`image`	`synthetic`
`Synthetic-Dog-Images`	`synthetic-image-holdout-c11a8170`	`image`	`synthetic`
`synthetic_data_0.1`	`synthetic-image-holdout-67a8e625`	`image`	`synthetic`
`syntheticdata_0.15`	`synthetic-image-holdout-af60fa23`	`image`	`synthetic`
`ptd-synthetic`	`synthetic-image-holdout-061fbf94`	`image`	`synthetic`
`image_patches_raw`	`synthetic-image-holdout-6405b7db`	`image`	`synthetic`
`stable-imagenet1k-flat`	`synthetic-image-holdout-3b5b6e0b`	`image`	`synthetic`
`Shooter-fake`	`synthetic-image-holdout-29eb8247`	`image`	`synthetic`
`SDv15R-dpmsolver-25-15K-part0`	`synthetic-image-holdout-1d5da5bc`	`image`	`synthetic`
`SDv15R-dpmsolver-25-15K-part1`	`synthetic-image-hol...


12 Apr 20:32

dylanuys

v0.6.3

370c3b8

Release 0.6.3

Deprecating old cache policy logic, previously used to determine what samples to keep in the gasstation cache when fool rates were more dynamic.


05 Apr 17:51

dylanuys

v0.6.2

d5d1369

Release 0.6.2

Parallelize gasbench data loading & fix memory leaks

Problem

Image and video benchmarks ran unacceptably slowly when data came from a NAS (not noticeable on local setups), and occasionally ran out of memory (OOM) deep into runs.

Root causes identified:

Sequential disk I/O from network volumes — The DatasetIterator reads each image/video file one-by-one in the producer thread. Each read from a NAS incurs network latency, paid serially N times.
"Drain-all-futures" stall — PrefetchPipeline accumulated num_workers * 2 futures then blocked on ALL of them (for future in futures: future.result()). The pipeline stalls on the slowest task even when other workers are idle.
Only 3 worker threads — With I/O-bound work (network volume reads + PIL decode), 3 threads underutilize available concurrency.
Memory leak from large images — Datasets with very large source images (100+ megapixels observed in logs) cause multi-GB memory spikes because image bytes are held in multiple places simultaneously: the sample dict, the result dict, and the batch queue. No explicit cleanup of PIL Image objects in multi-threaded workers.

Changes

gasbench/src/gasbench/dataset/iterator.py

Added lazy_read: bool parameter to DatasetIterator
When True, image samples yield {"image_path": ...} instead of reading file bytes; video samples yield {"video_path": ...} for file-based videos (frame directories are already lazy)
Iterating the dataset becomes near-instant (path collection only, no I/O)
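The lazy mode can be pictured as a generator that yields paths instead of file contents. The sketch below is illustrative only: iter_image_samples and the suffix list are hypothetical, and the eager key image_bytes is assumed from the "heavy keys" mentioned for image_bench.py rather than taken from DatasetIterator itself.

```python
from pathlib import Path
from typing import Dict, Iterator, Union

# Hypothetical set of image extensions for this sketch.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def iter_image_samples(
    root: Path, lazy_read: bool = False
) -> Iterator[Dict[str, Union[str, bytes]]]:
    """Yield one sample dict per image file under `root`.

    With lazy_read=True only the path is yielded, so iteration does no file
    reads; the bytes are read later, inside whichever worker handles the sample.
    """
    for path in sorted(root.rglob("*")):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if lazy_read:
            yield {"image_path": str(path)}
        else:
            # Eager mode pays the (possibly networked) read in the producer.
            yield {"image_bytes": path.read_bytes()}
```

Deferring the read moves the per-file network latency out of the single producer thread and into the worker pool, where reads can overlap.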

gasbench/src/gasbench/benchmarks/image_bench.py

Rewrote PrefetchPipeline with three fixes:
- Parallel I/O: New _read_and_preprocess() does file read + PIL decode + augmentation as a single unit inside worker threads — 8 threads read from the network volume concurrently
- Bounded sliding window: Uses wait(FIRST_COMPLETED) with max_in_flight = num_workers * 4 = 32 instead of submit-all. Prevents unbounded memory growth from completed-but-unconsumed futures
- Sample metadata stripping: Drops heavy keys (image, image_bytes, image_path) from result dicts immediately after preprocessing — tracker only needs metadata fields
Default num_workers increased from 3 → 8
DatasetIterator created with lazy_read=True
executor.shutdown() now uses cancel_futures=True for clean teardown
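The bounded sliding window can be sketched with the standard library alone. This is a simplified illustration of the pattern, not the PrefetchPipeline code: bounded_map is a hypothetical name, results are yielded in completion order rather than input order, and cancel_futures requires Python 3.9 or newer.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def bounded_map(
    fn: Callable[[T], R],
    items: Iterable[T],
    num_workers: int = 8,
    max_in_flight: Optional[int] = None,
) -> Iterator[R]:
    """Apply `fn` to `items` in threads, keeping at most max_in_flight futures."""
    if max_in_flight is None:
        max_in_flight = num_workers * 4  # 8 workers * 4 = 32, as in the notes
    iterator = iter(items)
    executor = ThreadPoolExecutor(max_workers=num_workers)
    in_flight = set()
    exhausted = False
    try:
        while True:
            # Top the window back up before waiting again.
            while not exhausted and len(in_flight) < max_in_flight:
                try:
                    item = next(iterator)
                except StopIteration:
                    exhausted = True
                    break
                in_flight.add(executor.submit(fn, item))
            if not in_flight:
                break
            # Wake as soon as any task finishes instead of draining them all.
            done, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                yield future.result()
    finally:
        # Drop queued work on early exit or error.
        executor.shutdown(wait=False, cancel_futures=True)


if __name__ == "__main__":
    print(sorted(bounded_map(lambda x: x * x, range(10), num_workers=4)))
```

Compared with submitting a batch and blocking on every future in it, the window refills as soon as one task completes, so a single slow read no longer idles the other workers, and the cap on in-flight futures bounds how many decoded samples can be held at once.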

gasbench/src/gasbench/benchmarks/video_bench.py

Same rewrite applied to VideoPrefetchPipeline
Default num_workers increased from 3 → 4 (fewer than image due to heavier per-sample memory)
max_in_flight = num_workers * 3 = 12 (tighter bound for video frames)
Strips video_bytes and video_path from result dicts

gasbench/src/gasbench/processing/media.py

Added explicit image.close() in process_image_sample() after extracting the numpy array — prevents PIL Image objects from lingering in multi-threaded workers

Expected impact

Metric	Before	After
Image I/O concurrency	1 (serial)	8 threads
Video I/O concurrency	1 (serial)	4 threads
Pipeline stall pattern	Drain all 6, block on slowest	`FIRST_COMPLETED`, no stalls
Peak in-flight samples (image)	6	32 (bounded)
Peak in-flight samples (video)	6	12 (bounded)
Image bytes in result dict	Held until tracker consumes	Stripped immediately
PIL Image cleanup	GC-dependent	Explicit `.close()`
Est. image benchmark time	~5 hours (52 datasets)	~1-2 hours

