This is the repository for the pseudonymisation part of the BBMRI.cz data catalog.
It pseudonymizes predictive numbers, collects clinical data, and removes unnecessary files before the data is moved to SensitiveCloud at ICS-MUNI.
Miseq, New Miseq, MammaPrint
- Install requirements
pip install -r requirements.txt
- Run main.py
python main.py -s /path/to/runs/for/pseudonymization -d /path/to/sensitive/cloud/destination \
    -t /path/to/pseudonymisation/tables/folder -l /path/to/libraries \
    -lsc /path/to/sensitive/cloud/libraries
docker compose -f compose.dev.yml up -d --build
/seq/NO-BACKUP-SPACE/test/
├── Libraries/ # Required library files for pseudonymisation
├── logs/ # Logs from test runs
└── TRANSFER/ # Input data to be pseudonymized
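Before starting a test run, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_test_dirs` is hypothetical, and only the directory names are taken from the tree above.

```python
from pathlib import Path

# Sub-directories expected under the test root, as listed in the tree above.
EXPECTED_DIRS = ("Libraries", "logs", "TRANSFER")


def missing_test_dirs(root):
    """Return the expected sub-directories that do not exist under root.

    Hypothetical helper for illustration; the pseudonymisation script
    itself may check its inputs differently.
    """
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_test_dirs("/seq/NO-BACKUP-SPACE/test")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Test layout looks complete.")
```

An empty list means all three directories are present.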
- Copy the run you want to test into */test/TRANSFER/:
cp -a /path/to/original/run/ /seq/NO-BACKUP-SPACE/test/TRANSFER/
- Switch to export user and navigate to script folder:
su export
cd ~/data-catalogue-pseudonymisation
- Start the pseudonymization script:
docker compose -f compose.test.yml up --build
Logs for each run are in the /seq/NO-BACKUP-SPACE/test/logs directory.
To view all service logs:
docker compose -f compose.test.yml logs
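To find the most recent per-run log file in the logs directory, something like the sketch below could be used. It is an illustration only: `latest_log` is a hypothetical helper, and it assumes that each run writes a regular file directly into the logs directory, which the source does not state.

```python
from pathlib import Path


def latest_log(log_dir):
    """Return the most recently modified file in log_dir, or None if empty.

    Hypothetical helper; assumes logs are regular files placed directly
    in log_dir (no sub-directories are searched).
    """
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_log("/seq/NO-BACKUP-SPACE/test/logs"))
```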
# connect to seq server
su export
cd /home/export/data-catalogue-pseudonymisation
docker compose -f compose.prod.yml up --build -d
# connect to seq server
su export
crontab -e
# setting cron to run every Monday, Wednesday, Friday at 22:00
0 22 * * 1,3,5 /usr/local/bin/docker-compose -f /home/export/data-catalogue-pseudonymisation/compose.prod.yml up -d >> /home/export/logs/`date +\%Y\%m\%d\%H\%M\%S`.log 2>&1
su export
cd /home/export/data-catalogue-pseudonymisation
git switch main
git pull
The new version should automatically start in production once the cronjob is run.
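To estimate when a freshly pulled version will first run, the cron schedule `0 22 * * 1,3,5` (Monday, Wednesday, Friday at 22:00) can be worked through in a few lines of Python. This is a sketch under stated assumptions: `next_run` is a hypothetical helper, it uses naive local time, and it ignores daylight-saving changes.

```python
from datetime import datetime, timedelta

# Cron weekdays "1,3,5" are Monday, Wednesday, Friday.
# datetime.weekday() counts Monday as 0, so these map to 0, 2, 4.
RUN_WEEKDAYS = {0, 2, 4}
RUN_HOUR = 22


def next_run(now):
    """Return the first scheduled run strictly after `now` (naive local time)."""
    candidate = now.replace(hour=RUN_HOUR, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's 22:00 slot has already passed (or is exactly now).
        candidate += timedelta(days=1)
    while candidate.weekday() not in RUN_WEEKDAYS:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print("Next scheduled run:", next_run(datetime.now()))
```

For example, a pull on a Tuesday morning would be picked up on Wednesday at 22:00, and a pull late on Friday night would wait until Monday at 22:00.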