A hands-on introduction to passive acoustic data analysis using Ocean Networks Canada (ONC) hydrophone recordings.
This repository contains a Jupyter notebook covering common acoustic-analysis workflows, from loading audio and visualizing spectrograms to exploring machine-learning embeddings and computing simple acoustic metrics.
In this workshop, we will:
- Explore hydrophone recordings from Ocean Networks Canada (ONC)
- Learn how acoustic data can be accessed through ONC's API
- Compute and visualize spectrograms
- Generate or load machine-learning embeddings from audio clips
- Visualize acoustic structure using PCA and UMAP projections, and group similar clips with DBSCAN clustering
- Inspect candidate embedding clusters with spectrograms
- Compare simple relative soundscape metrics across days
No prior machine-learning experience is required. Familiarity with basic Python and bioacoustics concepts is helpful but not necessary.
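To give a feel for the last item in the list above, here is a minimal sketch of one possible relative soundscape metric: the RMS level of a clip in dB relative to full scale, compared across days against a chosen reference day. The function names `rms_level_db` and `relative_levels` are hypothetical, and the notebook may compute different metrics. Without hydrophone calibration these values are only meaningful relative to each other, not as absolute sound pressure levels.

```python
import math


def rms_level_db(samples):
    """RMS level in dB relative to full scale, for samples scaled to [-1.0, 1.0]."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    # 10*log10 of the mean square equals 20*log10 of the RMS amplitude.
    return 10.0 * math.log10(mean_square)


def relative_levels(levels_by_day, reference_day):
    """Express each day's level as a dB difference from the reference day."""
    reference = levels_by_day[reference_day]
    return {day: level - reference for day, level in levels_by_day.items()}
```

For example, `relative_levels({"20240707": -40.0, "20240723": -37.0}, "20240707")` reports the second day as 3 dB above the first.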
- oceans_acoustic_workshop.ipynb: main workshop notebook
- models/hallo_seq_kw_det_v1.kt: detector model used for embedding generation
- environment.yml: recommended Conda environment
- requirements.txt: alternative pip installation
- workshop_outputs/: outputs generated while running the notebook
The notebook creates plots, tables, and embedding outputs inside workshop_outputs/ and never modifies the source audio files.
The workshop uses a preselected subset of ONC hydrophone recordings rather than the full ONC archive.
Download the workshop audio folder from Dropbox:
https://www.dropbox.com/scl/fo/2dxd2yjgozansw2rw225u/AJIcF-BrnoqBqpX_tUdAgzU?rlkey=9jxaa5iw78b00amb4ibsm2bpl&st=h88icz36&dl=0
After downloading, move the folder into this repository and rename it, if needed, so the path is:
echo3_h1_day_view_begin_middle_end/
The folder should contain one subfolder for each selected day:
echo3_h1_day_view_begin_middle_end/
├── 20240707/
├── 20240723/
└── 20240806/
If your audio files are stored elsewhere, update WORKSHOP_AUDIO_DIR in the notebook configuration section.
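A quick way to confirm the folder layout before opening the notebook is to list the audio files found under each day folder. This is a minimal sketch, not the notebook's own loader: the helper `list_audio_by_day` is hypothetical, and the `.wav`/`.flac` suffixes are an assumption about the downloaded files.

```python
from pathlib import Path

# Mirrors the WORKSHOP_AUDIO_DIR setting in the notebook configuration section.
WORKSHOP_AUDIO_DIR = Path("echo3_h1_day_view_begin_middle_end")

# Assumption: recordings are WAV or FLAC; adjust to match the downloaded files.
AUDIO_SUFFIXES = {".wav", ".flac"}


def list_audio_by_day(audio_dir):
    """Map each day subfolder name (e.g. '20240707') to its sorted audio files."""
    audio_dir = Path(audio_dir)
    if not audio_dir.is_dir():
        raise FileNotFoundError(f"Audio folder not found: {audio_dir}")
    files_by_day = {}
    for day_dir in sorted(p for p in audio_dir.iterdir() if p.is_dir()):
        files_by_day[day_dir.name] = sorted(
            f for f in day_dir.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
        )
    return files_by_day
```

Calling `list_audio_by_day(WORKSHOP_AUDIO_DIR)` should return one entry per day folder. A `FileNotFoundError` usually means the folder was not moved into the repository or still needs renaming.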
One section of the workshop demonstrates how embeddings can be extracted from a trained killer whale detector model.
The notebook is designed to work even if a detector model is not available.
If no model is provided, the notebook automatically generates deterministic dummy embeddings so that all visualization and analysis sections can still be completed.
The default model path is repo-local:
models/hallo_seq_kw_det_v1.kt
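The fallback described above can be sketched as follows. This is an illustration only, not the notebook's implementation: `embedding_mode` and `dummy_embedding` are hypothetical names, the embedding size of 128 is an assumption, and the sketch uses the standard library where the notebook most likely works with NumPy arrays. It shows one way to make dummy embeddings deterministic, by seeding a random generator from a hash of the clip identifier.

```python
import hashlib
import random
from pathlib import Path

MODEL_PATH = Path("models/hallo_seq_kw_det_v1.kt")  # default path from this README
EMBEDDING_DIM = 128  # assumption: the real size depends on the detector model


def embedding_mode(model_path=MODEL_PATH):
    """Return 'model' when the detector file exists, otherwise 'dummy'."""
    return "model" if Path(model_path).exists() else "dummy"


def dummy_embedding(clip_id, dim=EMBEDDING_DIM):
    """Return a pseudo-random vector that is always the same for a given clip_id."""
    # Hash the identifier so the seed is stable across sessions; Python's
    # built-in hash() is salted per process for strings, so it is avoided here.
    digest = hashlib.sha256(clip_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]
```

Because the seed depends only on the clip identifier, rerunning the notebook reproduces the same projections and clusters. Dummy embeddings carry no acoustic information, so any structure seen in them is meaningless.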
Create a new environment:
conda create -n onc-acoustic-workshop python=3.10
Activate it:
conda activate onc-acoustic-workshop
Install the required packages:
conda install -c conda-forge \
pandas \
numpy=1.26.4 \
matplotlib \
pysoundfile \
scikit-learn \
jupyter \
librosa \
umap-learn \
scipy \
scikit-image \
pip
Install the detector-model dependencies:
pip install tensorflow==2.8.0 ketos==2.7.1
Launch Jupyter:
jupyter lab
Alternatively, create the environment directly from the included file:
conda env create -f environment.yml
conda activate onc-acoustic-workshop
If you already have a Python environment available:
pip install \
pandas \
numpy==1.26.4 \
matplotlib \
soundfile \
scikit-learn \
jupyter \
librosa \
umap-learn \
scipy \
scikit-image \
tensorflow==2.8.0 \
ketos==2.7.1
Or install from:
pip install -r requirements.txt
Open the notebook:
jupyter lab oceans_acoustic_workshop.ipynb
or open it directly in VS Code.
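Whichever installation route you choose, it can help to confirm that the packages are importable before running the notebook. The sketch below checks each one without importing it. The helper `missing_packages` is hypothetical, and treating `tensorflow` and `ketos` as optional is an assumption based on the notebook's dummy-embedding fallback. Several packages install under one name and import under another, which the mapping makes explicit.

```python
from importlib.util import find_spec

# Install name -> import name (they differ for several packages).
REQUIRED_MODULES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "soundfile": "soundfile",
    "scikit-learn": "sklearn",
    "librosa": "librosa",
    "umap-learn": "umap",
    "scipy": "scipy",
    "scikit-image": "skimage",
}

# Assumption: only needed when extracting real detector embeddings.
OPTIONAL_MODULES = {"tensorflow": "tensorflow", "ketos": "ketos"}


def missing_packages(modules):
    """Return the install names whose import module cannot be found."""
    return sorted(
        install_name
        for install_name, import_name in modules.items()
        if find_spec(import_name) is None
    )
```

Run `missing_packages(REQUIRED_MODULES)` and `missing_packages(OPTIONAL_MODULES)` in the first notebook cell. An empty list means every package in that group was found.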
Run the notebook section by section.
The workshop is organized as a sequence of independent steps:
- Load and inspect audio files
- Visualize spectrograms
- Generate embeddings
- Explore PCA and UMAP projections
- Perform similarity searches
- Compute acoustic metrics
- Visualize changes through time
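As an illustration of the similarity-search step, the sketch below ranks clips by cosine similarity to a query clip's embedding. It is a simplified, standard-library example under stated assumptions: the function names are hypothetical, embeddings are plain lists of floats keyed by clip identifier, and the notebook may use a different distance measure or a vectorized implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=3):
    """Return the top_k (clip_id, similarity) pairs closest to the query clip."""
    query = embeddings[query_id]
    scored = [
        (clip_id, cosine_similarity(query, vector))
        for clip_id, vector in embeddings.items()
        if clip_id != query_id  # skip the query itself
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Inspecting the spectrograms of the top-ranked clips is a quick check on whether neighbors in embedding space also sound alike.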