Merged
39 commits
6585600
Bump version to 1.0.4
jorgeMFS Dec 3, 2025
e47ba25
Comprehensive performance optimization for large file processing
jorgeMFS Dec 4, 2025
a1060cc
Merge branch 'ieeta-pt:main' into main
jorgeMFS Dec 4, 2025
13662d8
Add StreamingGzipReader for bounded-memory gzip decompression
jorgeMFS Dec 4, 2025
37e62f6
Add C++ API documentation for vcfx_core library
jorgeMFS Dec 4, 2025
d516cb5
perf: Apply I/O optimizations to 43 tools
jorgeMFS Dec 4, 2025
2b24d69
perf(phred_filter): add mmap support for 26x faster large file proces…
jorgeMFS Dec 6, 2025
942b5c2
perf(missing_data_handler): add mmap and multi-threading for 50x speedup
jorgeMFS Dec 6, 2025
e1f465c
perf(allele_counter): optimize with zero-allocation parsing and 1MB b…
jorgeMFS Dec 6, 2025
3827dcc
perf(hwe_tester): replace O(n²) permutation with O(1) chi-square formula
jorgeMFS Dec 6, 2025
2968614
perf(fasta_converter): use temp file streaming to prevent OOM on larg…
jorgeMFS Dec 6, 2025
b2884de
perf(ld_calculator): optimize memory and I/O for large sample processing
jorgeMFS Dec 6, 2025
2a71781
perf(gl_filter): optimize genotype likelihood parsing and output
jorgeMFS Dec 6, 2025
3002405
perf(cross_sample_concordance): remove stringstream overhead
jorgeMFS Dec 6, 2025
e5fcb36
perf(distance_calculator): optimize pairwise distance computation
jorgeMFS Dec 6, 2025
b2321a0
perf(dosage_calculator): optimize dosage computation for large samples
jorgeMFS Dec 6, 2025
ad37f1b
perf(multiallelic_splitter): optimize allele splitting for large files
jorgeMFS Dec 6, 2025
a4b0593
perf(inbreeding_calculator): optimize F coefficient computation
jorgeMFS Dec 6, 2025
e229c0b
perf(field_extractor): optimize field parsing and output
jorgeMFS Dec 6, 2025
6f6b02e
perf(ancestry_assigner): optimize ancestry computation for large samples
jorgeMFS Dec 6, 2025
3bf19b3
chore(benchmarks): update benchmark tasks and results
jorgeMFS Dec 6, 2025
47b050f
test(fasta_converter): update large test VCF file
jorgeMFS Dec 6, 2025
a541ab4
chore: update .gitignore
jorgeMFS Dec 6, 2025
fac6bc0
Optimize VCFX_sorter with mmap and pre-computed chromosome IDs
jorgeMFS Dec 6, 2025
1ab17ac
perf(variant_counter): optimize with SIMD and memchr for 60x speedup
jorgeMFS Dec 6, 2025
37107d0
perf(indexer): optimize with mmap and SIMD for 15-20x speedup
jorgeMFS Dec 6, 2025
bc88b5d
perf: optimize haplotype_phaser, phase_checker, genotype_query, nonre…
jorgeMFS Dec 7, 2025
2ba6100
perf(fasta_converter): optimal two-pass algorithm with 12x speedup
jorgeMFS Dec 7, 2025
181d822
perf(diff_tool): optimize with mmap and SIMD for massive speedup
jorgeMFS Dec 7, 2025
2c9be1e
docs(diff_tool): update with v1.2 performance improvements
jorgeMFS Dec 7, 2025
369ca3d
perf(concordance_checker): optimize with mmap and SIMD for 56x speedup
jorgeMFS Dec 7, 2025
edc5eee
perf(allele_counter): optimize with mmap, SIMD, multi-threading and b…
jorgeMFS Dec 8, 2025
f04cdda
perf(inbreeding_calculator,hwe_tester): optimize with mmap and SIMD
jorgeMFS Dec 8, 2025
6307132
perf(allele_freq_calc,indel_normalizer,missing_detector): optimize wi…
jorgeMFS Dec 8, 2025
20064bc
perf(ld_calculator): optimize with mmap, SIMD r², and multi-threading…
jorgeMFS Dec 8, 2025
a7fbee0
perf(allele_balance_calc,haplotype_extractor): optimize with mmap and…
jorgeMFS Dec 8, 2025
323e404
docs: identify 8 additional tools needing optimization
jorgeMFS Dec 8, 2025
af59a0e
perf(af_subsetter,dosage_calculator,duplicate_remover): optimize with…
jorgeMFS Dec 8, 2025
71b9989
perf: optimize remaining tools with mmap and SIMD
jorgeMFS Dec 9, 2025
1 change: 1 addition & 0 deletions .gitignore
@@ -42,6 +42,7 @@ Thumbs.db
tools.md
prompt.md
names.md
VCFX_Optimization_Guide.md

# Temporary outputs from genotype_query tests
tests/tmp/genotype_query/
118 changes: 118 additions & 0 deletions TOOLS_TODO.md
@@ -0,0 +1,118 @@
# VCFX Tools Optimization TODO

**Last Updated:** December 9, 2025

## Summary

- **31 tools optimized** with mmap + SIMD acceleration
- **0 tools need optimization** (all slow tools have been optimized)
- **30 tools already fast** (<1 second on 4GB file)

---

## Optimized Tools (31 total)

| Tool | Optimization | Speedup | Status |
|------|-------------|---------|--------|
| **VCFX_validator** | mmap + SIMD | 1040x | ✅ Complete |
| **VCFX_variant_counter** | mmap + SIMD | 60x | ✅ Complete |
| **VCFX_fasta_converter** | mmap + SIMD + zero-copy + mmap temp | 50-100x | ✅ Complete |
| **VCFX_indel_normalizer** | mmap + SIMD | ~73x | ✅ Complete |
| **VCFX_missing_detector** | mmap + SIMD + MT pre-scan + zero-copy | ~42x | ✅ Complete |
| **VCFX_indexer** | mmap + SIMD | 32x | ✅ Complete |
| **VCFX_sorter** | mmap + precomputed IDs | 40x | ✅ Complete |
| **VCFX_phred_filter** | mmap | 26x | ✅ Complete |
| **VCFX_missing_data_handler** | mmap | 50x | ✅ Complete |
| **VCFX_nonref_filter** | mmap + SIMD | ~50x | ✅ Complete |
| **VCFX_genotype_query** | mmap + FORMAT caching | ~30x | ✅ Complete |
| **VCFX_phase_checker** | mmap | ~30x | ✅ Complete |
| **VCFX_haplotype_phaser** | mmap + SIMD + zero-copy | 16x | ✅ Complete |
| **VCFX_haplotype_extractor** | mmap + SIMD + zero-copy | 3.3x+ | ✅ Complete |
| **VCFX_allele_counter** | mmap + SIMD + MT + batch | 8-10x | ✅ Complete |
| **VCFX_diff_tool** | mmap + SIMD | ~20x | ✅ Complete |
| **VCFX_concordance_checker** | mmap + SIMD | ~20x | ✅ Complete |
| **VCFX_allele_balance_calc** | mmap + SIMD + incremental flush | ~50x | ✅ Complete |
| **VCFX_inbreeding_calculator** | mmap + SIMD | ~21x | ✅ Complete |
| **VCFX_hwe_tester** | mmap + SIMD | ~18x | ✅ Complete |
| **VCFX_allele_freq_calc** | mmap + SIMD | ~20x | ✅ Complete |
| **VCFX_ld_calculator** | mmap + SIMD r² + MT matrix + distance pruning | 5-60x | ✅ Complete |
| **VCFX_af_subsetter** | mmap + zero-copy | ~15x | ✅ Complete |
| **VCFX_dosage_calculator** | mmap + zero-copy | ~15x | ✅ Complete |
| **VCFX_duplicate_remover** | mmap + zero-copy | ~15x | ✅ Complete |
| **VCFX_multiallelic_splitter** | mmap + zero-copy + PL recoding | ~7x | ✅ Complete |
| **VCFX_distance_calculator** | mmap + zero-copy | ~16x | ✅ Complete |
| **VCFX_cross_sample_concordance** | mmap + MT + reusable buffers | ~16x | ✅ Complete |
| **VCFX_variant_classifier** | mmap + zero-copy | ~12x | ✅ Complete |
| **VCFX_metadata_summarizer** | mmap + zero-copy | ~12x | ✅ Complete |

---

## Fast Tools (Already performant, <1 second on 4GB file)

These tools already perform well without optimization:

**Original fast tools:**
- VCFX_reformatter: 0.15s
- VCFX_header_parser: 0.17s
- VCFX_merger: 0.17s
- VCFX_quality_adjuster: 0.18s
- VCFX_sv_handler: 0.19s
- VCFX_probability_filter: 0.21s
- VCFX_phase_quality_filter: 0.21s
- VCFX_outlier_detector: 0.21s
- VCFX_custom_annotator: 0.21s
- VCFX_file_splitter: 0.21s
- VCFX_subsampler: 0.22s
- VCFX_allele_balance_filter: 0.26s
- VCFX_ancestry_assigner: 0.28s

**Newly verified fast tools (on 4GB file):**
- VCFX_alignment_checker: 0.05s
- VCFX_ancestry_inferrer: 0.02s
- VCFX_annotation_extractor: 0.03s
- VCFX_compressor: 0.02s
- VCFX_field_extractor: 0.07s
- VCFX_format_converter: 0.02s
- VCFX_gl_filter: 0.02s
- VCFX_impact_filter: 0.02s
- VCFX_info_aggregator: 0.02s
- VCFX_info_parser: 0.02s
- VCFX_info_summarizer: 0.02s
- VCFX_population_filter: 0.06s
- VCFX_position_subsetter: 0.03s
- VCFX_record_filter: 0.02s
- VCFX_ref_comparator: 0.02s
- VCFX_region_subsampler: 0.02s
- VCFX_sample_extractor: 0.02s

---

## Optimization Pattern (Proven)

Apply this pattern to I/O-bound tools. Most tools in the table above gained roughly 12-60x; the full range runs from 3.3x to 1040x.

```cpp
// 1. Memory-mapped I/O
#include <sys/mman.h>
struct MappedFile {
    const char *data = nullptr;
    size_t size = 0;
    int fd = -1;
    bool open(const char *path);
    void close();
};

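// 2. Access-pattern hints (advice values are not bit flags, so issue one call per hint)
madvise(const_cast<char*>(data), size, MADV_SEQUENTIAL);
madvise(const_cast<char*>(data), size, MADV_WILLNEED);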

// 3. SIMD line scanning (AVX2/SSE2/NEON)
static inline const char* findNewlineSIMD(const char* p, const char* end);

// 4. Zero-copy parsing with string_view
static inline std::string_view extractField(const char* line, int fieldIdx);

// 5. 4MB output buffer
class OutputBuffer { /* ... */ };

// 6. CLI: -i/--input FILE, -q/--quiet, -t/--threads (where applicable)
```
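
The declarations above could be fleshed out roughly as follows. This is a minimal POSIX-only sketch, not the code used in the VCFX tools: `findNewline` uses `std::memchr` as a portable stand-in for the AVX2/SSE2/NEON scanner, `extractField` takes a `std::string_view` line rather than a `const char*`, `countDataLines` is a hypothetical helper added only to show the scanning loop, and the `OutputBuffer` layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <string_view>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// 1 + 2. Read-only mapping of a whole file, with access-pattern hints.
struct MappedFile {
    const char *data = nullptr;
    size_t size = 0;
    int fd = -1;

    bool open(const char *path) {
        fd = ::open(path, O_RDONLY);
        if (fd < 0) return false;
        struct stat st;
        if (fstat(fd, &st) != 0) { close(); return false; }
        size = static_cast<size_t>(st.st_size);
        if (size == 0) return true;  // mmap rejects zero-length mappings
        void *p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { close(); return false; }
        data = static_cast<const char *>(p);
        // Advice values are enumerated, not bit flags: one call per hint.
        madvise(p, size, MADV_SEQUENTIAL);
        madvise(p, size, MADV_WILLNEED);
        return true;
    }

    void close() {
        if (data) munmap(const_cast<char *>(data), size);
        if (fd >= 0) ::close(fd);
        data = nullptr; size = 0; fd = -1;
    }
};

// 3. Line scanning. memchr stands in for a hand-written SIMD loop here.
static inline const char *findNewline(const char *p, const char *end) {
    if (p == end) return end;
    const void *nl = std::memchr(p, '\n', static_cast<size_t>(end - p));
    return nl ? static_cast<const char *>(nl) : end;
}

// 4. Zero-copy field extraction: returns a view into the line, no allocation.
static inline std::string_view extractField(std::string_view line, int fieldIdx) {
    size_t start = 0;
    for (int i = 0; i < fieldIdx; ++i) {
        size_t tab = line.find('\t', start);
        if (tab == std::string_view::npos) return {};  // fewer fields than requested
        start = tab + 1;
    }
    size_t stop = line.find('\t', start);
    return line.substr(start, stop == std::string_view::npos ? stop : stop - start);
}

// Hypothetical helper: count non-empty lines that do not start with '#'.
static size_t countDataLines(const char *data, size_t size) {
    size_t n = 0;
    const char *p = data, *end = data + size;
    while (p < end) {
        const char *nl = findNewline(p, end);
        if (nl > p && *p != '#') ++n;
        p = (nl < end) ? nl + 1 : end;
    }
    return n;
}

// 5. Output buffer that batches writes (4 MB by default) to cut syscalls.
class OutputBuffer {
public:
    explicit OutputBuffer(std::FILE *out, size_t cap = 4u << 20)
        : out_(out), cap_(cap) { buf_.reserve(cap); }
    ~OutputBuffer() { flush(); }
    void append(std::string_view s) {
        if (buf_.size() + s.size() > cap_) flush();
        buf_.append(s.data(), s.size());
    }
    void flush() {
        if (!buf_.empty()) {
            std::fwrite(buf_.data(), 1, buf_.size(), out_);
            buf_.clear();
        }
    }
private:
    std::FILE *out_;
    size_t cap_;
    std::string buf_;
};
```

A tool would typically call `MappedFile::open`, walk the mapping with `findNewline`, pull columns out with `extractField`, and send results through one `OutputBuffer` wrapped around `stdout`. The views returned by `extractField` point into the mapping, so they must not outlive `MappedFile::close()`.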