Guillain-RDCDE/FLAC_Detective

🎵 FLAC Detective


Find the fake FLACs in your music library.

Anyone can take an MP3, re-save it as FLAC, and it looks lossless, but the quality is already gone. FLAC Detective reads each file, spots the fingerprints a lossy codec leaves behind, and tells you which files are real and which are fakes.

pip install flac-detective       # needs Python 3.10+
flac-detective /path/to/music    # scan a file or a whole folder

Every file gets a verdict, like a traffic light:

✅ AUTHENTIC      real lossless         → keep it
❓ WARNING        borderline            → give it a listen
⚠️  SUSPICIOUS     probably a transcode  → likely a fake
❌ FAKE_CERTAIN   definitely a fake     → replace it

The scan only reads your files; it never changes anything.

🟢 New to all this? → Start Here: the 5-minute beginner's guide. No command line, no jargon. From "what is this?" to "I checked my music".


📊 See why a file was flagged

Add --format html and you get a single self-contained page: a triage table sorted worst-first, plus a spectrum plot for every flagged file. The MP3 "cliff" (a sharp drop well below the real ceiling) is plain to see, with the detected cutoff marked.

[Figure: FLAC Detective HTML report, showing the triage table and per-file spectrum cliffs]

Three transcodes at different MP3 bitrates show the wall falling at different frequencies (96 kbps ~11 kHz, 128 kbps ~16 kHz, 160 kbps ~17.5 kHz); the authentic file runs full-range.
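As a rough illustration of how those three figures could be used, the sketch below maps a measured cutoff to the nearest of the bitrates quoted in the caption. The table and the function name `nearest_bitrate_kbps` are hypothetical and are not part of FLAC Detective; real encoders vary with version and settings, so this is only a rule of thumb.

```python
# Cutoff frequencies (Hz) quoted in the caption for three MP3 bitrates.
# These are rough anchors, not an exhaustive or authoritative table.
CAPTION_CUTOFFS_HZ = {96: 11_000, 128: 16_000, 160: 17_500}


def nearest_bitrate_kbps(cutoff_hz: float) -> int:
    """Return the bitrate whose quoted cutoff lies closest to cutoff_hz."""
    return min(
        CAPTION_CUTOFFS_HZ,
        key=lambda kbps: abs(CAPTION_CUTOFFS_HZ[kbps] - cutoff_hz),
    )


print(nearest_bitrate_kbps(15_800))  # 128
```

A cutoff far above 17.5 kHz still maps to 160 here, which is why a lookup like this can only hint at a bitrate and cannot prove a transcode by itself.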


πŸ” How it works

An MP3 re-saved as FLAC is lossless as a container, but the audio already passed through a lossy codec, and that leaves fingerprints. The clearest is the spectral cliff: MP3 discards everything above a bitrate-dependent frequency, so the spectrum falls off a wall where a real recording keeps going.
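To make the cliff idea concrete, here is a minimal sketch that looks for the steepest drop between neighbouring bins of an already-computed spectrum. It is not FLAC Detective's actual rule: the function `find_cliff`, the 30 dB threshold, and the synthetic spectra are all assumptions chosen for illustration.

```python
def find_cliff(freqs_hz, levels_db, min_drop_db=30.0):
    """Return the frequency just before the steepest bin-to-bin drop.

    freqs_hz    -- ascending bin centre frequencies
    levels_db   -- level of each bin in dB
    min_drop_db -- hypothetical threshold; smaller drops count as natural roll-off

    Returns None when no drop reaches min_drop_db.
    """
    best_drop, best_freq = 0.0, None
    for i in range(len(levels_db) - 1):
        drop = levels_db[i] - levels_db[i + 1]
        if drop > best_drop:
            best_drop, best_freq = drop, freqs_hz[i]
    return best_freq if best_drop >= min_drop_db else None


# Synthetic spectra, one bin per kHz from 0 to 22 kHz.
freqs = [k * 1000 for k in range(23)]
# "Transcode": gentle slope up to 16 kHz, then nothing but a -90 dB floor.
transcode = [-20 - 0.5 * k if k <= 16 else -90.0 for k in range(23)]
# "Authentic": a steady 2 dB-per-bin roll-off with no wall.
authentic = [-20.0 - 2 * k for k in range(23)]

print(find_cliff(freqs, transcode))  # 16000
print(find_cliff(freqs, authentic))  # None
```

A single threshold like this would misfire on real material such as vinyl rips or quiet recordings, which is the kind of case the protection rules described below are there to handle.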

FLAC Detective scores each file with 11 heuristic rules built around that idea (cutoff frequency, MP3-bitrate signatures, compression artefacts) plus protection rules so genuine vinyl rips, cassette transfers and naturally quiet recordings aren't flagged. An optional 12th rule, a small CNN, sharpens borderline verdicts. The rules sum to a 0–150 score:

Verdict            Score    What to do
✅ AUTHENTIC       ≤ 30     keep it
❓ WARNING         31–54    borderline, check manually
⚠️ SUSPICIOUS      55–85    likely a transcode
❌ FAKE_CERTAIN    ≥ 86     definitely transcoded
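The score bands can be expressed as a small lookup. This is a sketch of the thresholds as listed, assuming integer scores; the function name `verdict_for` is hypothetical and not taken from the package.

```python
def verdict_for(score: int) -> str:
    """Map a 0-150 score to its verdict, using the listed thresholds."""
    if not 0 <= score <= 150:
        raise ValueError(f"score must be between 0 and 150, got {score}")
    if score <= 30:
        return "AUTHENTIC"
    if score <= 54:
        return "WARNING"
    if score <= 85:
        return "SUSPICIOUS"
    return "FAKE_CERTAIN"


for s in (30, 31, 54, 55, 85, 86):
    print(s, verdict_for(s))
```

Printing the boundary values makes the band edges easy to check by eye: 30 is still AUTHENTIC, 31 is the first WARNING, and 86 is the first FAKE_CERTAIN.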

The guiding principle is "protect authentic files first": a false alarm on real music is worse than missing a borderline fake. Treat AUTHENTIC as "no evidence of transcoding", not a guarantee.

→ Every rule explained: Technical Details.


βš™οΈ Usage

flac-detective /path/to/music              # scan a folder
flac-detective                             # interactive (prompts for a path)

flac-detective /music --format csv  -o triage.csv   # spreadsheet, worst-first
flac-detective /music --format html -o report.html  # visual report (see above)
flac-detective /music --deep                        # catch high-bitrate AAC/Opus/Vorbis (slower)

Analyses FLAC, WAV, ALAC (.m4a) and APE (.ape). Detection is codec-agnostic, and a lossy .m4a is correctly rejected (the real codec is probed, never trusted by extension).

→ Full guide & every flag: User Guide.

Install options & upgrading
pip install flac-detective                 # base
pip install "flac-detective[ml]"           # + optional CNN (Rule 12)
docker pull ghcr.io/guillain-rdcde/flac_detective:latest   # or Docker (amd64 + arm64)

pip install does not upgrade an existing install; use -U to get the latest release:

pip install -U flac-detective
flac-detective --version
Use it from Python or beets

Python API:

from flac_detective import FLACAnalyzer

result = FLACAnalyzer().analyze_file("song.flac")
print(result["verdict"])   # AUTHENTIC, WARNING, SUSPICIOUS, or FAKE_CERTAIN
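Building on that snippet, a whole folder could be summarised by counting verdicts. The helper below is a sketch, not part of the package: `tally_verdicts` is a hypothetical name, and it assumes only what the snippet shows, namely a callable that takes a path and returns a dict with a "verdict" key. The analyzer is passed in as an argument so the helper can be tried with a stub.

```python
from collections import Counter
from pathlib import Path


def tally_verdicts(folder, analyze):
    """Count verdicts for every .flac file found under folder.

    analyze -- callable taking a path string and returning a dict with a
               "verdict" key, e.g. FLACAnalyzer().analyze_file
    """
    counts = Counter()
    # rglob walks subfolders too; the pattern is case-sensitive on most systems.
    for path in sorted(Path(folder).rglob("*.flac")):
        counts[analyze(str(path))["verdict"]] += 1
    return counts


# Usage with the real analyzer (requires flac-detective to be installed):
# from flac_detective import FLACAnalyzer
# print(tally_verdicts("/path/to/music", FLACAnalyzer().analyze_file))
```

For large libraries the command-line scan with --format csv is likely the more convenient route; a helper like this is mainly useful when the counts feed into other Python code.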

beets plugin (triage transcodes without leaving your workflow):

pip install "flac-detective[beets]"
# in config.yaml:  plugins: flacdetective

beet flacdetective                          # analyse & tag the whole library
beet ls flacdetective_verdict:FAKE_CERTAIN  # list the certain fakes

Stores flacdetective_verdict and flacdetective_score on each item; an optional auto: yes analyses files as they're imported.


🤖 The ML side: a case study worth reading

Rule 12's model went through a real R&D saga, written up as a learning resource: a false-positive audit over 11 234 real FLACs, four instructive dead-ends, a debunked "AUC 0.99" caught by cross-validation, and a twist where a "fundamental limit" turned out to be an artifact of listening in mono, fixed by going stereo. Real-world specificity on 11 234 authentic FLACs climbed from 80 % to 95 %.

📖 Read the ML detective story →


📚 More


Licensed under the MIT License.

About

Detect MP3-to-lossless transcodes (FLAC, ALAC, APE, WAV) with an 11-rule spectral analysis plus an optional CNN classifier. CLI + Python API + multi-arch Docker. Auto-repairs corrupted FLACs.
