SPEECHCOG/PFML


A PyTorch implementation of the PFML algorithm for speech, EEG, and multi-sensor inertial measurement unit (IMU) data

This repository contains code for pre-training models with the Prediction of Functionals from Masked Latents (PFML) algorithm on speech, EEG, and multi-sensor IMU data, as well as code for fine-tuning the pre-trained models with labeled data. The code is implemented in PyTorch. For a thorough description of the PFML algorithm, see Section III of the publication cited below. An arXiv pre-print version of the paper is also available.

The present PFML implementation has been used in the following publication: E. Vaaras, M. Airaksinen, and O. Räsänen, "PFML: Self-Supervised Learning of Time-Series Data Without Representation Collapse", IEEE Access, vol. 13, pp. 60233–60244, 2025.

If you use the present code or its derivatives, please cite the repository URL and/or the aforementioned publication.

Requirements

Any PyTorch version newer than 1.9.0 should work. Installation instructions for PyTorch are available at https://pytorch.org/get-started/locally/. You also need NumPy, scikit-learn, Librosa, and SciPy.
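
A quick way to confirm that the requirements above are met is a small check script. The following is only a sketch and is not part of the repository: the helper names find_missing_modules and version_tuple are hypothetical, and the import names (for example sklearn for scikit-learn) are the usual ones for these packages.

```python
import importlib.util

# Import names for the packages listed above. Note that scikit-learn
# is imported as "sklearn".
REQUIRED_MODULES = ["torch", "numpy", "sklearn", "librosa", "scipy"]


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def version_tuple(version_string):
    """Turn a version string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    core = version_string.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        # Keep only the leading digits of each piece (e.g. '0rc1' -> 0).
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch
        if version_tuple(torch.__version__) <= (1, 9, 0):
            print("PyTorch", torch.__version__,
                  "may be too old; a version newer than 1.9.0 is recommended.")
        else:
            print("All requirements appear to be installed.")
```

The check only tests whether the packages can be located; it does not verify that they work with your hardware or CUDA setup.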

Repository contents

  • conf_finetune_pfml_pretrained_eeg_models.py: Example configuration file for fine-tuning pre-trained models for EEG data, using the same configuration settings that were used in the present paper.
  • conf_finetune_pfml_pretrained_imu_models.py: Example configuration file for fine-tuning pre-trained models for multi-sensor IMU data, using the same configuration settings that were used in the present paper.
  • conf_finetune_pfml_pretrained_speech_models.py: Example configuration file for fine-tuning pre-trained models for speech data, using the same configuration settings that were used in the present paper.
  • conf_pfml_pretrain_eeg.py: Example configuration file for PFML pre-training for EEG data, using the same configuration settings that were used in the present paper.
  • conf_pfml_pretrain_imu.py: Example configuration file for PFML pre-training for multi-sensor IMU data, using the same configuration settings that were used in the present paper.
  • conf_pfml_pretrain_speech.py: Example configuration file for PFML pre-training for speech data, using the same configuration settings that were used in the present paper.
  • finetune_pfml_pretrained_eeg_models.py: A script for fine-tuning a pre-trained model using labeled EEG data.
  • finetune_pfml_pretrained_imu_models.py: A script for fine-tuning a pre-trained model using labeled multi-sensor IMU data.
  • finetune_pfml_pretrained_speech_models.py: A script for fine-tuning a pre-trained model using labeled speech data.
  • pfml_data_loader.py: A file containing data loaders for PFML pre-training and fine-tuning for all three different data modalities (speech, multi-sensor IMU, and EEG data).
  • pfml_model.py: A file containing the neural network model implementations of the present paper, including data modality-specific encoders for framed speech, multi-sensor IMU, and EEG data.
  • pfml_pretrain_eeg.py: A script for running PFML pre-training and/or using a pre-trained model to extract features for EEG data.
  • pfml_pretrain_imu.py: A script for running PFML pre-training and/or using a pre-trained model to extract features for multi-sensor IMU data.
  • pfml_pretrain_speech.py: A script for running PFML pre-training and/or using a pre-trained model to extract features for speech data.
  • py_conf_file_into_text.py: An auxiliary script for converting .py configuration files into lists of text that can be used for printing or writing the configuration file contents into a text file.
  • transformer_encoder_pytorch.py: A file containing a slightly modified version of PyTorch's Transformer encoder implementation.
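
To illustrate what converting a .py configuration file into a list of text lines can look like (the role described above for py_conf_file_into_text.py), here is a minimal sketch. It is an assumption about the general idea, not the repository's actual implementation: the function name conf_file_to_lines and the settings max_epochs and learning_rate are hypothetical placeholders.

```python
import os
import tempfile


def conf_file_to_lines(conf_path):
    """Read a .py configuration file and return its contents as a list of lines."""
    with open(conf_path, "r", encoding="utf-8") as conf_file:
        return [line.rstrip("\n") for line in conf_file]


# Demo: write a tiny placeholder configuration file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "conf_demo.py")
    with open(demo_path, "w", encoding="utf-8") as demo_file:
        demo_file.write("max_epochs = 10\nlearning_rate = 1e-4\n")
    demo_lines = conf_file_to_lines(demo_path)

# The lines can now be printed or written to a log file next to the results.
print("\n".join(demo_lines))
```

Storing the configuration text alongside the training output makes it easier to see later which settings produced a given model.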

Examples of how to use the code

How to run PFML pre-training:

For speech data, for example, you can use either the command

python pfml_pretrain_speech.py

or

python pfml_pretrain_speech.py <configuration_file>

to run PFML pre-training. The first option requires a configuration file named conf_pfml_pretrain_speech.py in the same directory as pfml_pretrain_speech.py. In the second option, <configuration_file> is a .py configuration file containing the hyperparameters you want to use during pre-training. By default, the configuration file conf_pfml_pretrain_speech.py uses the LibriSpeech dataset.
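
The two invocation styles above amount to "use the file given on the command line, otherwise fall back to the default file next to the script". The sketch below shows one way such a lookup could be written. It is an assumption for illustration and may differ from what pfml_pretrain_speech.py actually does: resolve_conf_path, load_conf_module, and the batch_size setting are hypothetical names.

```python
import importlib.util
import os
import tempfile

DEFAULT_CONF = "conf_pfml_pretrain_speech.py"


def resolve_conf_path(argv, default_conf=DEFAULT_CONF):
    """Return argv[1] if a configuration file was given, else the default
    file located in the same directory as the script (argv[0])."""
    if len(argv) > 1:
        return argv[1]
    script_dir = os.path.dirname(os.path.abspath(argv[0]))
    return os.path.join(script_dir, default_conf)


def load_conf_module(conf_path):
    """Import a .py configuration file from an arbitrary path as a module."""
    spec = importlib.util.spec_from_file_location("conf", conf_path)
    conf = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(conf)
    return conf


# Demo with a throwaway configuration file containing one placeholder setting.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_conf_path = os.path.join(tmp_dir, "my_conf.py")
    with open(demo_conf_path, "w", encoding="utf-8") as demo_file:
        demo_file.write("batch_size = 8\n")
    demo_conf = load_conf_module(demo_conf_path)

print(demo_conf.batch_size)
```

In a real script, resolve_conf_path would be called with sys.argv, and the settings would then be read as attributes of the loaded module.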

How to fine-tune pre-trained models:

For speech data, for example, you can use either the command

python finetune_pfml_pretrained_speech_models.py

or

python finetune_pfml_pretrained_speech_models.py <configuration_file>

to fine-tune pre-trained models. The first option requires a configuration file named conf_finetune_pfml_pretrained_speech_models.py in the same directory as finetune_pfml_pretrained_speech_models.py. In the second option, <configuration_file> is a .py configuration file containing the hyperparameters you want to use during fine-tuning.
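
The same command pattern applies to the EEG and IMU fine-tuning scripts listed under "Repository contents". If you want to launch fine-tuning for several modalities from one Python script, a sketch along the following lines may help. The helper finetune_command is hypothetical and not part of the repository; only the script file names are taken from the repository contents.

```python
import subprocess
import sys

MODALITIES = ("speech", "eeg", "imu")


def finetune_command(modality, conf_file=None):
    """Build the command list for fine-tuning one data modality.

    If conf_file is None, the script is run without arguments and is
    expected to use its default configuration file.
    """
    if modality not in MODALITIES:
        raise ValueError(f"Unknown modality: {modality!r}")
    command = [sys.executable, f"finetune_pfml_pretrained_{modality}_models.py"]
    if conf_file is not None:
        command.append(conf_file)
    return command


if __name__ == "__main__":
    for modality in MODALITIES:
        command = finetune_command(modality)
        print("Would run:", " ".join(command))
        # Uncomment to actually start fine-tuning from the repository directory:
        # subprocess.run(command, check=True)
```

Using sys.executable keeps the child process in the same Python environment (and therefore the same PyTorch installation) as the launching script.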
