The pipeline has been designed in close collaboration with TACC. It is not expected to be usable on another platform without substantial effort.
tl;dr: most of the work happens in folders with the suffix "_app", specifically in the bash scripts named either runner-template.sh or runner.sh.
Several files in this repository are not related to processing but are instead configuration files necessary for the computational environment. If you would like to know more about how the repository is structured, please review
Each "app" comprises a distinct module of analysis (e.g., a pipeline like mriqc, conversion to BIDS, aggregation of files). Apps are triggered by "actors".
Each app is a bash script that is defined by several parameters (e.g., the mriqc app uses BIDS_DIRECTORY, which defines the location of the BIDS directory that will be passed to the mriqc container). The parameters are set for each job through a JSON configuration file (e.g., job.json). Tapis interprets the app structure and uses the job script both to fill the parameters into the runners and to embed the runners in a script suitable for scheduling on a cluster (TACC uses SLURM, so the resulting script has its #SBATCH lines filled in). In short, an app receives a JSON file, uses it to configure a cluster batch script, and then runs that script on the cluster.
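To make the flow concrete, here is a minimal sketch of what such a job configuration might look like. Only BIDS_DIRECTORY comes from this repository; the other field names (name, appId) and the path are illustrative assumptions, not copied from an actual job.json:

```json
{
  "name": "mriqc-NS10042V1",
  "appId": "mriqc_app",
  "parameters": {
    "BIDS_DIRECTORY": "/path/to/products/mris/NS_northshore/bids/NS10042V1"
  }
}
```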
The JSON files are created by the actors. The parameters that will be filled are set in YAML files (e.g., mriqc_actor/config.yml), which are processed by the actors' reactor.py scripts. After creating a job, the actor monitors the job's status, and if the job is successful the actor may trigger another actor.
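For illustration, an actor's YAML might pair an app with the parameters its reactor.py fills in per session, along these lines. The keys shown are hypothetical stand-ins, not the actual contents of mriqc_actor/config.yml:

```yaml
# hypothetical sketch of an actor config (cf. mriqc_actor/config.yml)
app_id: mriqc_app
parameters:
  # reactor.py would substitute the session-specific path here
  BIDS_DIRECTORY: "{bids_directory}"
```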
Note that the Tapis apps in this repository are analogous to BIDS Apps, but there is not a one-to-one correspondence. For example, in this repository, the QSIprep, MRIQC, and fMRIPrep apps are essentially light wrappers around an individual BIDS App, wrappers that define a particular way of calling the containers. However, there is no official CAT12 BIDS App for this repo's CAT12 app to use. Similarly, there is no HeuDiConv BIDS App, because HeuDiConv is designed for getting data into a BIDS format in the first place.
Scans are processed on a per-subject, per-app basis.
```
A2CPS
[...]
├── products                      # files generated by the pipeline
│   ├── consortium-data           # beyond mri, A2CPS generates several kinds of products
│   ├── development
│   ├── dirc-data
│   └── mris                      # outputs of mri_imaging_pipeline live mainly under A2CPS/products/mris
│       ├── all_sites             # results that aggregate across several sites
│       │   ├── bids
│       │   ├── cat12
│       │   ├── fcn
│       │   [...]
│       └── NS_northshore         # most products are stored in a subdirectory associated with the pipeline
│           ├── aa-fmri-phantom-qa
│           ├── bids
│           │   ├── NS10042V1     # pipelines process individual sessions
│           │   ├── NS10047V1     # each pipeline subdirectory (e.g., cat12, dicoms, fmriprep) is filled with
│           │   └── NS10047V3     # a similar collection of products associated with individual sessions
│           ├── cat12
│           ├── dicoms
│           │   ├── NS10042V1.zip # note that the zip files have names that match the outputs of other pipelines
│           │   ├── NS10047V1.zip
│           │   └── NS10047V3.zip
│           [...]
└── submissions                   # original data, restricted access! sites upload scans into their assigned folder
    ├── a2cps_testdata            # files under /submissions are not touched by us; any modifications (e.g., editing a DICOM header) are done by the site
    ├── a2dtn01
    ├── NS_northshore
    ├── NS_northshore_EHR
    [...]
```
After a scan has been completed, sites export the DICOMs from the scanner[^1], zip[^2] them[^3], and then upload the scan[^4] to TACC[^5]. The existence of new uploads is monitored with a cronjob (see cronjob.sh). When the cronjob detects a new upload, it submits a job to the dicom_reader_actor, which in turn triggers the dicom_reader_app to make a copy of the DICOMs[^6].
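The detection step can be sketched in Python. This is a hypothetical re-implementation of the kind of check cronjob.sh performs; the real script is bash, and it may track already-processed uploads differently:

```python
from pathlib import Path


def find_new_uploads(submissions: Path, seen: set[str]) -> list[Path]:
    """Return zipped scan uploads that have not been processed yet.

    Hypothetical sketch: scan each site's folder under /submissions for
    zip files whose names are not in the set of already-seen uploads.
    """
    return sorted(
        p
        for p in submissions.glob("*/*.zip")  # e.g., NS_northshore/NS10042V1.zip
        if p.name not in seen
    )
```

Each new path found this way would then be handed to the dicom_reader_actor as a job.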
After a successful copy, the dicom_reader_actor then triggers the HeuDiConv actor/app to convert the DICOM files into BIDS[^7]. This app not only runs heudiconv but also takes several steps to clean and check its outputs[^8].
If the conversion is successful, then the scan will be in a BIDS format and the HeuDiConv actor will trigger more Tapis actors. Currently, these include MRIQC[^9], fMRIPrep, QSIprep, and CAT12.
Outputs are aggregated weekly with the aggregator actor/app (scheduled via cronjob). The aggregator_app gathers outputs from products/mris/ and stores them together in products/mris/all_sites. Participants are aggregated only when all possible derivatives have been generated (e.g., a participant will not be aggregated if they have gone through the entire pipeline except CAT12). Analyses typically rely on aggregated outputs (e.g., the aggregated bids output). These aggregated outputs are a superset of releases, typically including more participants and more types of derivatives.
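The gating rule ("aggregate only when every derivative exists") can be sketched as follows. The REQUIRED list and directory layout are illustrative assumptions, not the aggregator_app's actual configuration:

```python
from pathlib import Path

# Pipelines whose outputs must all exist before a session is aggregated.
# This list is illustrative; the aggregator_app defines the real set.
REQUIRED = ("bids", "cat12", "fmriprep", "mriqc", "qsiprep")


def ready_for_aggregation(site_dir: Path, session: str) -> bool:
    """True only when every required derivative exists for this session."""
    return all((site_dir / pipeline / session).exists() for pipeline in REQUIRED)
```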
QC results are generated by the aggregator_qc actor/app (scheduled via cronjob). This step uses outputs from the aggregator app and several others. The primary output of this app is a table of ratings[^10], one rating per scan[^11].
| site | sub | ses | scan | rating | source | date | notes |
|---|---|---|---|---|---|---|---|
| NS | 10042 | V1 | CUFF1 | green | auto | | |
| NS | 10042 | V1 | CUFF2 | green | auto | | |
| NS | 10042 | V1 | DWI | green | auto | | |
| NS | 10042 | V1 | REST1 | green | auto | | |
| NS | 10042 | V1 | REST2 | green | auto | | |
| NS | 10042 | V1 | T1w | green | auto | 2022-09-20 | |
| NS | 10047 | V1 | CUFF1 | green | psadil | 2022-09-09 | wrap-around, ghosts-other, uncategorized |
| NS | 10047 | V1 | CUFF2 | green | auto | | |
| NS | 10047 | V1 | DWI | green | auto | | |
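A downstream check might consume this table along the following lines. The rows and the helper are illustrative (only a subset of the columns is shown), not part of the aggregator_qc app:

```python
import csv
from io import StringIO

# Invented sample rows mimicking the ratings table (subset of columns).
RATINGS = """site,sub,ses,scan,rating,source
NS,10042,V1,T1w,green,auto
NS,10047,V1,CUFF1,yellow,psadil
"""


def needs_review(rows: list[dict]) -> list[dict]:
    """Flag scans whose rating is not an automatic green."""
    return [r for r in rows if r["rating"] != "green" or r["source"] != "auto"]


rows = list(csv.DictReader(StringIO(RATINGS)))
flagged = needs_review(rows)
```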
For details about how ratings are assigned, see qaqc-summary.pdf and qa-qc-strategy.pptx.
Several apps/actors are scheduled to run via cron -- either in an admin's crontab or using the cron features of Tapis Actors.
| job | schedule |
|---|---|
| dicom_reader_app | every 10 minutes |
| aggregator_app | weekly on Tuesday |
| imaging_log | nightly at 11pm |
| qc_aggregator_actor | weekly on Tuesday |
| aggregator_phantom | weekly on Tuesday |
| fcn_actor | weekly on Tuesday |
| fslanat_actor | weekly on Tuesday |
| signatures_actor | weekly on Tuesday |
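As a rough illustration, the admin-crontab entries could look like the following. The paths and times of day are hypothetical; the table above only specifies the day and frequency, and several of these jobs are actually scheduled through Tapis Actors rather than a crontab:

```
# hypothetical crontab sketch (paths and clock times are illustrative)
*/10 * * * *  /path/to/cronjob.sh           # check for new uploads (dicom_reader_app)
0 23 * * *    /path/to/imaging_log.sh       # nightly imaging log
0 6  * * 2    /path/to/run_aggregator.sh    # weekly aggregation, Tuesdays
```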
Aspects of the QA/QC pipeline are summarized in several reports, some of which are generated automatically (e.g., daily, via cronjobs) and some of which are curated. An example of an automated report is available here, which is produced by code in this github repo.
Note that although the reports summarize information about the pipeline (e.g., quality of scans received), they also draw on other sources of information. In particular, the research assistants and technologists at the sites record information about each visit in REDCap (e.g., whether there were deviations from the protocol, a rating of the anatomical image, information about the task, and other notes).
In addition to regular reports, several parts of the pipeline generate automated notifications. The notifications go to a Slack channel in a workspace that is dedicated to A2CPS. These notifications are helpful for the following reasons:
- They confirm that scans are being processed, which matters because processing is triggered automatically.
- They help quickly identify when there is an issue with a scan (e.g., the acquisition parameters were not as expected).
Footnotes

[^1]: On each scanner, the runs are named according to the ReproIn specification. The specification allows for conversion by HeuDiConv into BIDS with a heuristic that is built into the heudiconv package. However, not all sites followed this specification, and so we have needed to use these heuristic modifications.
[^2]: Several of the scanners export 2D DICOM, which means that functional runs can comprise tens of thousands of files. We have found that this many files can "stress" the filesystem, which manifests as either general slowness or even a refusal to create new files. See the wikipedia article on inodes.
[^3]: One site was unable to zip the DICOM files. This kind of lack of standardization should be avoided, because it creates several unexpected headaches.
[^4]: Participants are associated with a unique label that follows the pattern `<site><numeric_id>V<visit>`. For example, the scans associated with the first visit of a participant from site NS that has ID 10001 would be associated with the label NS10001V1. These labels are stored in the PatientName DICOM header field and used to name the uploaded zip files.
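A label of this form can be parsed with a regular expression. The pattern below is inferred from examples like NS10001V1 (two-letter site code, numeric id, visit number) and is not the pipeline's canonical definition:

```python
import re

# Inferred from examples such as "NS10001V1"; not the canonical definition.
LABEL = re.compile(r"(?P<site>[A-Z]{2})(?P<sub>\d+)V(?P<ses>\d+)")


def parse_label(label: str) -> dict:
    """Split a session label into site, participant id, and visit."""
    m = LABEL.fullmatch(label)
    if m is None:
        raise ValueError(f"not a recognized session label: {label!r}")
    return m.groupdict()
```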
[^5]: Each site has read and write permissions for just one folder on a secure storage system at TACC, and the sites can access that folder over ssh.
[^6]: Although most of the apps could run participants in parallel, we group each session into a distinct BIDS dataset and then process those datasets separately.
[^7]: By default, HeuDiConv makes a gzipped tar archive of each DICOM series. This means that we are storing three copies of the DICOMs (the original data that was submitted by the site, a zipped copy, and the copy stored by HeuDiConv in the BIDS folders). Each of these serves a slightly different purpose, but they take up lots of space (not an issue for our environment).
[^8]: As a few examples: files that do not follow the BIDS standard are removed, the resulting JSON sidecars are compared against reference values for that site (e.g., image dimensions, phase encoding direction, scan length), and the final output is checked with the bids validator.
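The sidecar comparison can be sketched as follows. The reference fields and values are invented for illustration; the real app keeps per-site reference values:

```python
# Illustrative reference values; the real app keeps per-site references.
REFERENCE = {"PhaseEncodingDirection": "j-", "RepetitionTime": 0.8}


def check_sidecar(sidecar: dict, reference: dict) -> list[str]:
    """Return one message for each field that deviates from the reference."""
    return [
        f"{key}: expected {expected!r}, found {sidecar.get(key)!r}"
        for key, expected in reference.items()
        if sidecar.get(key) != expected
    ]
```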
[^9]: The A2CPS paradigm includes two tasks, and we process these as separate jobs. One of the measures produced by MRIQC (and fMRIPrep) is fd_perc, the proportion of frames in a run that are above a given motion threshold. The two tasks were expected to produce different amounts of motion, so a different threshold was selected for each task; MRIQC (and fMRIPrep) does not allow task-specific thresholds, hence the decision to process the jobs separately. Note that, after collecting several hundred scans, there is not yet substantial evidence to support the use of different thresholds.
[^10]: site: location of scan; sub: numeric id associated with participant; ses: session/visit identifier; scan: name of scan in session (encodes run number); rating: quality rating of scan (green, yellow, or red); source: source of the rating (auto: based on automatically derived features; researcher name: that researcher made a judgment call); date: day when the rating was made; notes: standardized notes about the scan[^11].
[^11]: Most of the notes are derived from the MRIQC visual reports. That report has a method of rating scans and provides a list of common scanner artifacts. The notes in the rating table are derived from that standardized list of artifacts.