The Audio De-identification Tool is a Python script designed to remove Protected Health Information (PHI) from audio tracks by replacing specified time intervals with beep sounds. It supports both audio-only files and video files with audio tracks.
- Python 3.x
- FFmpeg (required by moviepy for video processing)
    pip install virtualenv
    python -m venv env
    source env/bin/activate  # Use ".\env\Scripts\activate" on Windows
    pip install -r requirements.txt

The script is executed from the command line with the following parameters:
- --source: Path to the source audio or video file
- --json: Path to the JSON file containing PHI time intervals
- --output: Path where the scrubbed result will be saved
- --target_video: Path to a different video file to reattach the scrubbed audio to
- --log: Enable detailed logging to file (logs saved in the logs/ directory)
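As a rough illustration of how these parameters could be wired up, here is a minimal `argparse` sketch. The flag names come from the list above. Which flags are mandatory, the help strings, and the `build_parser` helper are assumptions made for this example, not a description of how `scrub.py` is written.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command-line parameters."""
    parser = argparse.ArgumentParser(
        description="Replace PHI time intervals in an audio track with beeps."
    )
    # Assumption: the three path arguments are mandatory.
    parser.add_argument("--source", required=True,
                        help="Path to the source audio or video file")
    parser.add_argument("--json", required=True,
                        help="Path to the JSON file containing PHI time intervals")
    parser.add_argument("--output", required=True,
                        help="Path where the scrubbed result will be saved")
    # Assumption: these two are optional, with no target video and no logging by default.
    parser.add_argument("--target_video", default=None,
                        help="Different video file to reattach the scrubbed audio to")
    parser.add_argument("--log", action="store_true",
                        help="Enable detailed logging to the logs/ directory")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--source", "audio.mp3", "--json", "phi_intervals.json",
     "--output", "scrubbed_audio.mp3", "--log"]
)
print(args.source, args.output, args.log, args.target_video)
```

Using `action="store_true"` makes `--log` a plain switch, which matches how it appears in the documented commands.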
Process an audio file and save scrubbed audio:

    python scrub.py --source audio.mp3 --json phi_intervals.json --output scrubbed_audio.mp3 --log

Extract audio from video, scrub it, and save as new video:

    python scrub.py --source video.mp4 --json phi_intervals.json --output scrubbed_video.mp4 --log

Extract audio from one video, scrub it, then attach to a different video:

    python scrub.py --source original.mp4 --json phi_intervals.json --output final.mp4 --target_video processed.mp4 --log

This is useful when you have a video that's already been processed (e.g., face-blurred) and want to add the scrubbed audio to it.
The JSON file must contain PHI time intervals in one of two formats:
    {
      "word_segments": {
        "1": {
          "word": "**NAME**",
          "start": "12.5",
          "end": "15.2",
          "score": "0.95",
          "speaker": "SPEAKER_01"
        },
        "2": {
          "word": "**LOCATION**",
          "start": "28.1",
          "end": "29.8",
          "score": "0.87",
          "speaker": "SPEAKER_02"
        }
      }
    }

If your JSON file contains a full transcript with nested segments and words, the tool will automatically extract the word_segments section.
- word: The detected PHI text (words surrounded by ** are processed)
- start: Start time in seconds (as string or float)
- end: End time in seconds (as string or float)
- score: Confidence score (optional)
- speaker: Speaker identifier (optional)
Only segments where the word field matches the pattern **text** will be replaced with beeps.
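The selection rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual code: the `phi_intervals` helper and the `PHI_PATTERN` regular expression are hypothetical names, and the pattern assumes that a PHI marker is any non-empty text wrapped in double asterisks.

```python
import json
import re
from typing import List, Tuple

# Assumption: a PHI marker is a word wrapped in double asterisks, e.g. **NAME**.
PHI_PATTERN = re.compile(r"^\*\*.+\*\*$")


def phi_intervals(data: dict) -> List[Tuple[float, float]]:
    """Return sorted (start, end) pairs, in seconds, for PHI-marked words."""
    intervals = []
    for segment in data.get("word_segments", {}).values():
        word = str(segment.get("word", "")).strip()
        if not PHI_PATTERN.match(word):
            continue  # ordinary words are left untouched
        # float() accepts both "12.5" (string) and 12.5 (number).
        intervals.append((float(segment["start"]), float(segment["end"])))
    return sorted(intervals)


sample = json.loads("""
{
  "word_segments": {
    "1": {"word": "**NAME**", "start": "12.5", "end": "15.2"},
    "2": {"word": "hello", "start": "16.0", "end": "16.4"},
    "3": {"word": "**LOCATION**", "start": 28.1, "end": 29.8}
  }
}
""")
print(phi_intervals(sample))  # [(12.5, 15.2), (28.1, 29.8)]
```

Only the two marked words yield intervals. The unmarked word "hello" is skipped even though it carries timestamps.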
- Audio processing: Generates MP3 files with PHI segments replaced by beeps
- Video processing: Generates MP4 files with original video and scrubbed audio
- Logging: Optional detailed logs saved in the logs/ directory with timestamps and processing information
  - Log files are named with timestamp: log_YYYYMMDD_HHMMSS.log
  - PHI content is sanitized in logs (shown as [REDACTED])
  - Includes progress indicators and warnings for any issues
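The log naming and sanitizing behaviour described in this list could look roughly like the following. Treat it as a sketch: `sanitize` and `log_filename` are hypothetical helpers, and the redaction assumes PHI reaches log messages in the same `**text**` form used in the JSON input.

```python
import re
from datetime import datetime

# Assumption: PHI shows up in log messages wrapped in double asterisks.
PHI_MARKER = re.compile(r"\*\*.+?\*\*")


def sanitize(message: str) -> str:
    """Replace every **...** span in a log message with [REDACTED]."""
    return PHI_MARKER.sub("[REDACTED]", message)


def log_filename(now: datetime) -> str:
    """Build a log file name in the log_YYYYMMDD_HHMMSS.log form."""
    return now.strftime("log_%Y%m%d_%H%M%S.log")


print(sanitize("Replacing **John Smith** at 12.5s"))
# Replacing [REDACTED] at 12.5s
print(log_filename(datetime(2024, 1, 31, 9, 5, 7)))
# log_20240131_090507.log
```

The non-greedy `.+?` keeps two markers on one line from being merged into a single redaction.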
- Memory Efficient: Uses streaming audio processing to handle large files
- Privacy Protected: PHI words are replaced with [REDACTED] in all log outputs
- Robust Error Handling: Handles videos without audio tracks gracefully
- Automatic Cleanup: Temporary files are created in system temp directory and cleaned up
- Overlap Detection: Warns if PHI intervals overlap in the JSON file
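One way the overlap check mentioned above might work is to sort the intervals and compare each one against the furthest end time seen so far. The `find_overlaps` function below is a hypothetical sketch, not the tool's implementation. It assumes that intervals which merely touch (one ends exactly where the next starts) do not count as overlapping.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def find_overlaps(intervals: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Return (earlier, later) pairs where the later interval starts before
    the furthest-reaching earlier interval has ended."""
    ordered = sorted(intervals)
    overlaps = []
    if not ordered:
        return overlaps
    furthest = ordered[0]  # earlier interval with the latest end time so far
    for current in ordered[1:]:
        if current[0] < furthest[1]:
            overlaps.append((furthest, current))
        if current[1] > furthest[1]:
            furthest = current
    return overlaps


for earlier, later in find_overlaps([(28.1, 29.8), (12.5, 15.2), (14.0, 16.0)]):
    print(f"Warning: interval {later} overlaps {earlier}")
# Warning: interval (14.0, 16.0) overlaps (12.5, 15.2)
```

Tracking the furthest end time, rather than only the previous interval, also catches a short interval nested inside a long earlier one.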
- Audio: MP3, WAV, M4A, FLAC
- Video: MP4, AVI, MOV, MKV
- The beep.mp3 file must be present in the same directory as the script
- Temporary files are created in the system temp directory and automatically cleaned up
- Processing shows progress indicators (e.g., "Processing interval 1/37")
- Videos without audio tracks will raise an informative error
- All logs are organized in the logs/ subdirectory for better file management
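To make the core idea concrete, the sketch below overwrites time intervals in a list of raw samples and prints a progress line per interval. It deliberately swaps in a different technique from the tool: the real script relies on a beep.mp3 file and on audio libraries, whereas this standard-library example synthesizes a sine tone. The `beep_over` function, the 1 kHz tone, and the volume are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def beep_over(samples: Sequence[float], sample_rate: int,
              intervals: Sequence[Tuple[float, float]],
              tone_hz: float = 1000.0, volume: float = 0.3) -> List[float]:
    """Return a copy of samples with each (start, end) interval, given in
    seconds, overwritten by a sine tone."""
    scrubbed = list(samples)
    total = len(intervals)
    for index, (start, end) in enumerate(intervals, start=1):
        print(f"Processing interval {index}/{total}")
        # Convert seconds to sample indices, clamped to the audio length.
        first = max(0, int(start * sample_rate))
        last = min(len(scrubbed), int(end * sample_rate))
        for n in range(first, last):
            scrubbed[n] = volume * math.sin(2 * math.pi * tone_hz * n / sample_rate)
    return scrubbed


# One second of silence at 8 kHz, with 0.25 s to 0.5 s marked as PHI.
audio = [0.0] * 8000
result = beep_over(audio, 8000, [(0.25, 0.5)])
print(len(result), result[1999], result[4000])
```

Samples outside the marked interval are copied through unchanged, so the output has the same length and timing as the input.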