- Standardized and structurally validated 4 more local-archive helper skills: `xcommon-sh`, `xfetch`, `xfilter`, and `xinfo`.
- Captured several important local-vs-remote corrections in that batch, especially `xcommon.sh` being a shared shell library rather than a real CLI, `xfetch` fetching from a configured local archive instead of remote EFetch, `xfilter` querying local postings through `rchive -query`, and `xinfo` listing local postings fields/counts rather than calling remote `einfo`.
- Structural backlog is now down to 20 skills still missing at least one house-style section.
- Working cumulative estimate is now about 364 manually standardized skills.
- Standardized and structurally validated 4 more diagnostic/text-helper skills: `test-pcre`, `test-pmc-index`, `test-pubmed-index`, and `word-at-a-time`.
- Captured several high-value corrections in that batch, especially that the real regex test binary is `test_pcre` rather than `test-pcre`, that both local archive smoke tests (`test-pmc-index`, `test-pubmed-index`) have no safe help/version path and will misfire noisily without `EDIRECT_LOCAL_ARCHIVE`, and that `word-at-a-time` is simply a lowercase alphanumeric tokenizer.
- Structural backlog is now down to 24 skills still missing at least one house-style section.
- Working cumulative estimate is now about 360 manually standardized skills.
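The "lowercase alphanumeric tokenizer" description of `word-at-a-time` can be illustrated with a short Python sketch. This is not the tool's source. The function name `word_at_a_time` and the restriction to ASCII letters and digits are assumptions made for illustration.

```python
import re


def word_at_a_time(text: str) -> list[str]:
    """Lowercase the text and return each maximal run of ASCII letters/digits.

    Assumption: "alphanumeric" means [a-z0-9] after lowercasing; the real
    script may treat other characters differently.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


if __name__ == "__main__":
    # Emit one token per line, mirroring a one-word-per-line text helper.
    for token in word_at_a_time("Beta-2 Adrenergic Receptor (ADRB2)"):
        print(token)
```

Punctuation and whitespace act only as separators here, so hyphenated terms split into separate tokens.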
- Standardized and structurally validated 4 more SPDI / diagnostic wrapper skills: `spdi2tbl`, `tbl2prod`, `test-edirect`, and `test-eutils`.
- Captured several strong semantic corrections in that batch, especially `spdi2tbl` flattening `<SPDI>` XML rather than arbitrary SPDI text, `tbl2prod` actually consuming `spdi2tbl`-style 8-column variant rows to emit reference/altered product sequences, `test-edirect` being a long-form live example suite rather than a terse pass/fail checker, and `test-eutils` using progress dots / `x` markers for endpoint health checks.
- Structural backlog is now down to 32 skills still missing at least one house-style section.
- Working cumulative estimate is now about 356 manually standardized skills.
- Standardized and structurally validated 4 more line-sorting helper skills: `sort-by-length`, `sort-table`, `sort-uniq-count`, and `sort-uniq-count-rank`.
- Captured several important semantic corrections in that batch, especially `sort-by-length` operating on plain text lines rather than FASTA records, `sort-table` being just `grep '.' | sort -t '\t'`, `sort-uniq-count` sorting internally instead of requiring pre-sorted input, and `sort-uniq-count-rank` always ranking by descending count after case-insensitive grouping.
- Structural backlog is now down to 36 skills still missing at least one house-style section.
- Working cumulative estimate is now about 352 manually standardized skills.
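The ranking behavior noted for `sort-uniq-count-rank` (case-insensitive grouping, then descending count, with no pre-sorting needed) can be sketched in Python as follows. This is an illustration of the described behavior, not the wrapper's actual pipeline. Folding keys to lowercase, breaking ties alphabetically, and returning `(key, count)` pairs are all assumptions.

```python
from collections import Counter
from typing import Iterable


def sort_uniq_count_rank(lines: Iterable[str]) -> list[tuple[str, int]]:
    """Group lines case-insensitively and rank groups by descending count.

    Assumptions (not confirmed by the log): keys are reported in lowercase
    and ties are broken alphabetically.
    """
    counts = Counter(line.rstrip("\n").lower() for line in lines)
    # Negative count gives descending order; the key itself breaks ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ["Human", "mouse", "human", "RAT", "Mouse", "human"]
    for key, count in sort_uniq_count_rank(sample):
        print(f"{count}\t{key}")
```

Because the counting step builds its own table, the input order does not matter, which matches the note that pre-sorted input is not required.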
- Standardized and structurally validated 4 more short-source wrapper skills: `run-roh-pl`, `skip-if-file-exists`, `snp2hgvs`, and `snp2tbl`.
- Captured several meaningfully corrective behaviors in that batch, especially `run-roh.pl` producing per-sample `.bcf`, `.txt.gz`, and `.log` files before writing `merged.txt`, `skip-if-file-exists` being a stdin path filter rather than a command wrapper, `snp2hgvs` emitting structured `<HGVS>` XML from dbSNP docsum input, and `snp2tbl` actually being the chained pipeline `snp2hgvs | hgvs2spdi | spdi2tbl`.
- Structural backlog is now down to 40 skills still missing at least one house-style section.
- Working cumulative estimate is now about 348 manually standardized skills.
- Standardized and structurally validated 4 more lightly evidenced wrapper skills: `easel`, `plot-roh-py`, `remove-dup`, and `run-ncbi-converter`.
- Captured several high-signal runtime and source-derived quirks in that batch, especially `easel` being blocked here by missing `libopenblas.so.0` while still exposing its dispatcher interface through binary strings, `plot-roh.py` requiring gzipped `GT` plus 8-column `RG` records instead of raw `bcftools roh` output, `removeDup` dropping every read at loci whose depth meets the cutoff rather than preserving one representative alignment, and `run-ncbi-converter` being an FTP bootstrapper with no safe local help/version path.
- Structural backlog is now down to 44 skills still missing at least one house-style section.
- Working cumulative estimate is now about 344 manually standardized skills.
- Standardized and structurally validated 10 more plotting, cache, and table-helper skills: `plot-ampliconstats`, `plot-bamstats`, `plot-vcfstats`, `propmapped`, `rchive`, `ref-cache`, `ref2pmid`, `refseq-nm-cds`, `reorder-columns`, and `repair`.
- Captured several environment-critical behaviors in that batch, especially `plot-ampliconstats` requiring `gnuplot`, `plot-bamstats` being blocked here by missing Perl module `URI::Escape`, `plot-vcfstats -P` successfully bypassing the missing LaTeX/PDF toolchain, `propmapped` only yielding useful output with `-o`, `rchive` preferring `-version` over `--version`, `ref2pmid` being just `transmute -r2p`, `refseq-nm-cds` defaulting to human and triggering large download/process jobs, `reorder-columns` being a tab-only awk wrapper, and `repair` injecting dummy mates unless `-d` is set.
- Structural backlog is now down to 48 skills still missing at least one house-style section.
- Working cumulative estimate is now about 340 manually standardized skills.
- Standardized and structurally validated 4 additional legacy holdouts: `analyse-seqs`, `b2ct`, `md5fa`, and `md5sum-lite`.
- Captured several non-obvious behaviors in that batch, especially `AnalyseSeqs` requiring stdin-terminated sequence blocks plus optional taxa-prefixed PostScript output, `b2ct` only having a confirmed stdin-to-stdout conversion path, `md5fa` emitting per-record plus `>ordered` / `>unordered` digests, and `md5sum-lite` behaving like a stripped-down HTSlib `md5sum` with stdin labeled as `-`.
- Structural backlog is now down to 48 skills still missing at least one house-style section.
- Working cumulative estimate is now about 340 manually standardized skills.
- Standardized and structurally validated the final 2 BLAST-related outliers still missing the five-section house style: `blast2sam-pl` and `tblastn-vdb`.
- Corrected `tblastn_vdb` supporting reference capture to match the live binary's single-dash `-help` and `-version` behavior instead of stale autogenerated GNU-style guesses.
- Added source-derived option semantics for `blast2sam.pl`, including the real meanings of `-s` and `-d`.
- Closed out the current BLAST cleanup batch at `REMAINING 0` for the two residual nonstandard skills in that family.
- Standardized and structurally validated 9 additional legacy SAM conversion helpers: `ace2sam`, `bowtie2sam-pl`, `export2sam-pl`, `maq2sam-long`, `maq2sam-short`, `novo2sam-pl`, `psl2sam-pl`, `soap2sam-pl`, and `zoom2sam-pl`.
- Standardized and structurally validated the remaining 2 true SAM-adjacent holdouts in that scan line: `interpolate-sam-pl` and `sam2vcf-pl`.
- Standardized the final non-SAM name-collision outlier caught by the same scan: `disambiguate-nucleotides`.
- Cleared the current `*sam*` structural scan to `REMAINING 0`.
- Standardized and structurally validated 19 RNA / kinetics skills: `rna2-dfold`, `rnaaliduplex`, `rnaalifold`, `rnacofold`, `rnaconsensus`, `rnadistance`, `rnaforester`, `rnaheat`, `rnainverse`, `rnalalifold`, `rnalfold`, `rnalocmin`, `rnamultifold`, `rnapaln`, `rnaplot`, `rnapvmin`, `kinfold`, `kinwalker`, and `rnaseq-pipeline`.
- Cleared both current `rna*` and `kin*` structural scans to `REMAINING 0`.
- Verified or captured key runtime quirks for this batch, especially `RNAconsensus --version` failure, `kinwalker --version` fallback to usage, and the `RNApvmin` startup failure caused by missing `libopenblas.so.0`.
- Standardized and structurally validated 7 EDirect archive-wrapper skills: `archive-nihocc`, `archive-nlmnlp`, `archive-nmcds`, `archive-pids`, `archive-pmc`, `archive-pubmed`, and `archive-taxonomy`.
- Standardized and structurally validated 4 EDirect UID-list helpers: `combine-uid-lists`, `difference-uid-lists`, `exclude-uid-lists`, and `intersect-uid-lists`.
- Corrected several stale autogenerated assumptions in that batch, especially that the archive wrappers have safe `--help`/`--version` output and that the UID-list tools require pre-sorted input or expose clean custom help.
- Closed the current `archive-*` and UID-list EDirect scans to `REMAINING 0`.
- Standardized and structurally validated 15 EDirect XML / JSON converter skills: `asn2xml`, `csv2xml`, `fsa2xml`, `gbf2xml`, `gff2xml`, `ini2xml`, `json2xml`, `jsonl2xml`, `scn2xml`, `tbl2xml`, `toml2xml`, `yaml2xml`, `xml2fsa`, `xml2json`, and `xml2tbl`.
- Corrected converter-specific quirks in that batch, especially the hidden `transmute`/`xtract` PATH dependency, `json2xml` turning `--help`/`--version` into literal XML, `jsonl2xml` emitting multi-root output, and `xml2json` failing on the missing `XML::Simple.pm` dependency.
- Closed the current XML / JSON converter scan to `REMAINING 0`.
- Standardized and structurally validated 10 additional helper skills across VCF, text filtering, and GTF extraction: `fill-aa`, `fill-an-ac`, `fill-fs`, `fill-ref-md5`, `filter-columns`, `filter-genbank`, `filter-record`, `filter-stop-words`, `extract-exons-py`, and `extract-splice-sites-py`.
- Captured several real behavior quirks in that batch, especially `fill-an-ac`'s diploid-only AC/AN recalculation, `fill-fs` only using the first ALT allele plus command-order-sensitive mask settings, and the HISAT2 extractor scripts merging exon gaps of 5 bp or less before output.
- Closed the current `fill-*`, `filter-*`, and `extract-*-py` scans to `REMAINING 0`.
- Standardized and structurally validated 5 GenBank flatfile helper skills: `gbf2facds`, `gbf2fsa`, `gbf2info`, `gbf2ref`, and `gbf2tbl`.
- Captured the real pipeline composition in that batch, especially that `gbf2fsa` and `gbf2tbl` are composed wrappers, while `gbf2facds` exposes distinct nucleotide/protein CDS modes and `gbf2info` emits structured `GenBankInfo` XML.
- Closed the current `gbf2*` scan to `REMAINING 0`.
- Standardized and structurally validated 7 additional GFF / interval helper skills: `gff-sort`, `gff2gff`, `gff2gff-py`, `flatten-gtf`, `fuse-ranges`, `fuse-segments`, and `find-in-gene`.
- Captured several runtime and source-level quirks in that batch, especially `gff-sort`'s hidden EDirect PATH dependency plus comment stripping, `gff2gff`'s stderr repair summaries, `gff2gff.py`'s missing `gffutils` dependency and scratch-DB positional argument, `flattenGTF`'s nonstandard help/version behavior, the bogus `0 0 1` empty-input sentinel in both `fuse-*` wrappers, and `find-in-gene` actually requiring `strand min max`.
- Standardized and structurally validated 8 additional EDirect text / interval helpers: `accn-at-a-time`, `align-columns`, `args2slice`, `between-two-genes`, `expand-current`, `gene2range`, `join-into-groups-of`, and `just-top-hits`.
- Captured several deceptive-name and wrapper quirks in that batch, especially `accn-at-a-time` being only a tokenizer, `align-columns` depending on `transmute`, `expand-current` doing destructive rebuild work while still exiting `0` in a broken environment, and `just-top-hits` counting first-column groups instead of scoring rows.
- Standardized and structurally validated 4 more citation / annotation helpers: `amino-acid-composition`, `annot-tsv`, `asn2ref`, and `cit2pmid`.
- Corrected several autogenerated misconceptions in that batch, especially that `amino-acid-composition` handles FASTA records, that `annot-tsv -h` is help, and that `cit2pmid` supports clean help/version metadata flags.
- Standardized and structurally validated 7 additional EDirect / operational helper wrappers: `download-ncbi-software`, `download-pmc`, `exact-snp`, `fasta-sanitize-pl`, `get-species-taxids-sh`, `gm2ranges`, and `gm2segs`.
- Captured several non-obvious runtime quirks in that batch, especially `download-ncbi-software`'s effectively empty `sra-toolkit` Linux path, `download-pmc`'s verification-and-delete retry flow, `exactSNP`'s real `-v` version flag plus VCF output, and the exact dependency/output shape of `gm2segs`.
- Standardized and structurally validated 5 more short-source wrapper skills: `pair-at-a-time`, `color-chrs-pl`, `pma2apa`, `pma2pme`, and `nhance-sh`.
- Corrected several misleading autogenerated assumptions in that batch, especially that `pair-at-a-time` is a read-pair utility, that `color-chrs.pl` is a generic plotter instead of a human-karyotype SVG renderer, that `pma2apa`/`pma2pme` expose normal help flags, and that `nhance.sh` currently runs cleanly in this environment.
- Standardized and structurally validated 5 more text-formatting and quality-helper skills: `print-columns`, `print-missing-subranges`, `quote-grouped-elements`, `qualfa2fq-pl`, and `quality-scores`.
- Corrected several wrapper-specific gotchas in that batch, especially `print-columns` requiring single-quoted expressions, `print-missing-subranges` implicitly anchoring at `1`, `quote-grouped-elements` being only a simple sed-based formatter, `qualfa2fq.pl` silently trusting FASTA / QUAL record order, and `qualityScores` emitting per-read comma-separated vectors rather than summary statistics.
- Standardized and structurally validated 5 more variant / docsum helpers: `guess-ploidy-py`, `hgvs2spdi`, `ds2pme`, `bsmp2info`, and `gen-random-reads`.
- Captured several workflow-critical behaviors in that batch, especially `guess-ploidy.py`'s PNG-only plotting path, `hgvs2spdi`'s stdin-HGVS-plus-optional-transform-file contract, `ds2pme` expecting docsum rather than full PubMed XML, `bsmp2info` producing compact XML with lowercased harmonized tags, and `genRandomReads` defaulting to one million reads when `--totalReads` is omitted.
- Standardized and structurally validated 5 more legacy binary / BLAST helpers: `ct2db`, `datatool`, `popt`, `clustalw2`, and `blst2gm`.
- Captured several high-signal CLI quirks in that batch, especially `ct2db`'s clean help/version path, `datatool`'s NCBI single-dash long-option style, `popt` relying on the embedded `RNAsubopt -s < seq | popt` contract, `clustalw2` entering an interactive menu on bare invocation, and `blst2gm` failing with an explicit `xtract` stdin error when no data is supplied.
- Standardized and structurally validated 5 more EDirect / PMC helper skills: `blst2tkns`, `ecommon-sh`, `ecollect`, `pmc2info`, and `pmc2bioc`.
- Captured several wrapper-critical behaviors in that batch, especially `blst2tkns` being a `Seq-align-set_E` tokenization recipe rather than a generic BLAST converter, `ecommon.sh` being source-only library code with silent direct execution, `ecollect`'s PubMed-specific `-count`/`-subset` modes plus sorted UID output, and both `pmc2info`/`pmc2bioc` depending on `xtract`/`transmute` and real PMC `<article>` XML.
- Standardized and structurally validated 3 more transport / alignment helper skills: `nquire`, `analyse-dists`, and `alimask`.
- Captured several environment-critical behaviors in that batch, especially `nquire`'s working EUtils GET path but failing FTP listing path, `AnalyseDists` using a capitalized executable name plus a singular typo in its usage string, and `alimask` being blocked by missing `libopenblas.so.0` while still exposing useful option text through `strings`.
- Total structural backlog is now down to 48 skills still missing at least one house-style section.
- Working cumulative estimate is now about 340 manually standardized skills.
- Resumed manual skill standardization from prior bedtools-focused work.
- Finished remaining bedtools wrappers and validated full bedtools family against the five-section standard.
- Standardized `clustalw`, `iqtree3`, `hmmsim`, and `wgsim`.
- Standardized legacy helper skills: `wgsim-eval-pl`, `vcfutils-pl`, `split-at-intron`, `samtools-pl`.
- Standardized `STAR`/`STARlong` plain wrappers plus CPU-specific builds.
- Standardized and structurally validated 11 ViennaRNA skills: `rnaplfold`, `rnaduplex`, `rnapdist`, `rnaup`, `rnasubopt`, `rnaeval`, `rnaplex`, `rnasnoop`, `rnapkplex`, `rnados`, `rnaparconv`.
- Standardized and structurally validated 7 Bowtie2 wrapper skills: `bowtie2-align-l`, `bowtie2-align-s`, `bowtie2-build-l`, `bowtie2-build-s`, `bowtie2-inspect`, `bowtie2-inspect-l`, `bowtie2-inspect-s`.
- Standardized and structurally validated 7 HISAT2 core wrapper skills: `hisat2-align-l`, `hisat2-align-s`, `hisat2-build-l`, `hisat2-build-s`, `hisat2-inspect`, `hisat2-inspect-l`, `hisat2-inspect-s`.
- Standardized and structurally validated 6 HISAT2 helper skills: `hisat2-extract-exons-py`, `hisat2-extract-snps-haplotypes-ucsc-py`, `hisat2-extract-snps-haplotypes-vcf-py`, `hisat2-extract-splice-sites-py`, `hisat2-read-statistics-py`, `hisat2-simulate-reads-py`.
- Standardized and structurally validated 8 Easel core skills: `esl-sfetch`, `esl-afetch`, `esl-reformat`, `esl-seqstat`, `esl-alistat`, `esl-alimask`, `esl-alimanip`, `esl-translate`.
- Standardized and structurally validated 6 Easel alignment and comparison skills: `esl-alimap`, `esl-alimerge`, `esl-alipid`, `esl-alirev`, `esl-compalign`, `esl-compstruct`.
- Standardized and structurally validated the remaining 9 Easel skills: `esl-construct`, `esl-histplot`, `esl-mask`, `esl-mixdchlet`, `esl-selectn`, `esl-seqrange`, `esl-shuffle`, `esl-ssdraw`, `esl-weight`.
- Closed out the full current `esl-*` family at `REMAINING 0`.
- Verified local runtime quirks for this batch, including the `esl-histplot` default-output mismatch, `esl-seqrange` 1-based worker indexing, `esl-shuffle -G` help/man disagreement, and `esl-weight` startup failure on missing `libopenblas.so.0`.
- Standardized and structurally validated 4 HMMER profile utility skills: `hmmbuild`, `hmmconvert`, `hmmemit`, `hmmlogo`.
- Standardized and structurally validated 2 HMMER daemon skills: `hmmpgmd`, `hmmpgmd-shard`.
- Closed out the current HMMER family at `REMAINING 0` across `hmm*` plus `jackhmmer`, `nhmmer`, `nhmmscan`, and `phmmer`.
- Verified local runtime quirks for the HMMER batch, including `hmmemit` multi-model library emission, `hmmlogo` table-style default output, and the shared-library startup failures affecting `hmmbuild`, `hmmpgmd`, and `hmmpgmd_shard`.
- Standardized and structurally validated the remaining 2 Subread skills that still used generic docs: `subread-fullscan`, `sublong`.
- Closed out the current Subread family at `REMAINING 0` across `feature-counts`, `subread-align`, `subread-buildindex`, `subread-fullscan`, `subjunc`, and `sublong`.
- Verified Subread-specific CLI quirks for this batch, especially that `subread-fullscan` takes a literal read string and `sublong` expects a full one-block index.
- Working cumulative estimate is now about 198 manually standardized skills.
- Added persistent planning files to the project root so future batches can track status on disk.
- Select the next cohesive high-value cluster outside the now-completed HISAT2, Easel, HMMER, Subread, archive-wrapper, UID-list, XML/JSON converter, and current helper families.
- Continue preferring tools with real local executables, help text, or man pages over purely autogenerated summaries.
- Preserve batch discipline: inspect runtime behavior first, patch the five standard sections, then run structural validation immediately.
- Best next target is now the remaining lightly evidenced cluster around `analyse-seqs`, `b2ct`, `bioinformatics-toolkit`, `biomni`, and possibly `easel`, while continuing to defer opaque binary-only holdouts until there is stronger evidence to document them safely.
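The "patch the five standard sections, then run structural validation" step can be pictured as a small check over each skill document. The sketch below is a guess at what such a check might look like, not the validator used for this log. The section titles in `REQUIRED_SECTIONS` are placeholders (the log never names the five sections), and treating sections as Markdown headings is also an assumption.

```python
import re

# Placeholder titles: the log only says "five standard sections".
REQUIRED_SECTIONS = ("Overview", "Usage", "Options", "Examples", "Caveats")

# Matches ATX-style Markdown headings such as "## Usage".
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(doc_text, required=REQUIRED_SECTIONS):
    """Return the required section titles that have no matching heading."""
    present = {m.group(1).lower() for m in HEADING.finditer(doc_text)}
    return [name for name in required if name.lower() not in present]


def structural_backlog(docs, required=REQUIRED_SECTIONS):
    """Map each skill name to its missing sections, omitting complete skills."""
    report = {}
    for skill, text in docs.items():
        gaps = missing_sections(text, required)
        if gaps:
            report[skill] = gaps
    return report


if __name__ == "__main__":
    docs = {
        "complete-skill": "# Overview\n## Usage\n## Options\n## Examples\n## Caveats\n",
        "partial-skill": "# Overview\n## Usage\n",
    }
    backlog = structural_backlog(docs)
    # The output wording here is illustrative only.
    print(f"REMAINING {len(backlog)}")
    for skill, gaps in backlog.items():
        print(f"{skill}: missing {', '.join(gaps)}")
```

Under this reading, the backlog figure quoted in each batch corresponds to the number of entries in the report.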
- Standardized and structurally validated 4 more local helper skills: `xlink`, `xsearch`, `xa2multi-pl`, and `uniq-table`.
- Reduced the five-section structural backlog from `20` remaining skills to `16`.
- Corrected four stale autogenerated assumptions in this batch: `xsearch` is a local archive/postings search wrapper rather than a remote Entrez client, `xlink` resolves local link targets through `xlink.ini`, `xa2multi.pl` has no real help/version interface, and `uniq-table` removes invariant columns instead of deduplicating rows.
- Captured live/runtime evidence for this batch, including the missing-`EDIRECT_LOCAL_ARCHIVE` failure path in `xsearch`, the current `xlink.ini` target mapping (`CITED`, `CITES`, `PMCID`), the exact secondary-alignment expansion behavior of `xa2multi.pl`, and the row-2 baseline rule inside `uniq-table`.
- Working cumulative estimate is now about 368 manually standardized skills.
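The difference between removing invariant columns and deduplicating rows can be shown with a short Python sketch. It is one plausible reading of the `uniq-table` notes above, not the script's code. It assumes row 1 is a header, row 2 is the baseline, the table is rectangular, and a column is kept only if some later row differs from the baseline in that column.

```python
def drop_invariant_columns(rows):
    """Drop columns whose value never differs from the row-2 baseline.

    Assumptions: rows[0] is a header, rows[1] is the baseline, and every row
    has the same number of fields. Rows are never removed, even if repeated.
    """
    if len(rows) < 3:
        # Nothing to compare against the baseline; return a copy unchanged.
        return [list(row) for row in rows]
    baseline = rows[1]
    keep = [
        col
        for col in range(len(baseline))
        if any(row[col] != baseline[col] for row in rows[2:])
    ]
    return [[row[col] for col in keep] for row in rows]


if __name__ == "__main__":
    table = [
        ["id", "species", "count"],
        ["a", "human", "1"],
        ["b", "human", "2"],
        ["b", "human", "2"],
    ]
    for row in drop_invariant_columns(table):
        print("\t".join(row))
```

In the sample table the constant `species` column disappears, while the two identical `b` rows both survive.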
- Standardized and structurally validated 4 more helper skills: `run-with-lock`, `seq-cache-populate-pl`, `subindel`, and `starlong`.
- Reduced the five-section structural backlog again from `16` remaining skills to `12`.
- Corrected four more stale autogenerated assumptions in this batch: `run_with_lock` is a broken-but-identifiable NCBI lock wrapper rather than a self-documenting generic helper, `seq_cache_populate.pl` builds MD5-keyed `REF_CACHE` trees rather than a loose FASTA cache, `subindel` exposes a usage-only interface with ambiguous output-prefix semantics, and `STARlong` in this environment is a CPU-dispatch wrapper rather than a single binary.
- Captured live/runtime evidence for this batch, including the missing `get_lock` dependency in `run_with_lock`, real cache paths and `REF_CACHE` output from `seq_cache_populate.pl`, the invalid-`-h` / unrecognized-`--version` behavior of `subindel`, and the `bash -x` proof that `STARlong` selects `STARlong-avx2` on this host.
- Working cumulative estimate is now about 372 manually standardized skills.
- Standardized and structurally validated the final 4 concrete-CLI residual skills in the current backlog: `project-tree-builder`, `roh-viz`, `systematic-mutations`, and `vrfs-variances`.
- Reduced the five-section structural backlog from `12` remaining skills to `8`, leaving only meta/project-style skills.
- Corrected several high-value documentation traps in this batch: `roh-viz` actually requires `-i` for the ROH file even though its own example/error text says `-r`, `systematic-mutations` is a stdin-only `transmute` wrapper rather than an option-driven CLI, `vrfs-variances` mixes stdout/stderr outputs in default mode and can duplicate the terminal site in `-s` mode, and `project_tree_builder` can succeed silently on `-dryrun`.
- Captured live/runtime evidence for this batch, including `project_tree_builder` version `4.12.3`, real `systematic-mutations` expansion output, reproducible `vrfs-variances` `MEAN`/`VAR2` and `-v` behavior on toy input, and the `roh-viz` parser mismatch between its true `-i` option and broken built-in example text.
- Working cumulative estimate is now about 376 manually standardized skills.
- Standardized and structurally validated the remaining 8 meta/project skills: `bioinformatics-toolkit`, `biomni`, `evo2`, `phage-design`, `protein-structure`, `rfdiffusion`, `sequence-analysis`, and `yeast_database`.
- Closed the five-section structural backlog to `TOTAL_MISSING 0`.
- Converted the last batch from vague autogenerated overviews into workspace-grounded gateway skills tied to real local assets under `repositories/active/` and `projects/`.
- Captured critical environment reality in the final batch: Biomni top-level import works but deeper tool imports fail on missing `langchain_core`; Evo 2 import fails on missing `vortex` and its Docker image is not built; RFdiffusion repo is present but its image is not built; the yeast project is a teaching project whose real entrypoints are Bash scripts plus `pipeline.py --steps`.
- Working cumulative estimate is now about 384 manually standardized skills.