Updatecode #2
| # aws s3 sync \ | ||
| # --no-sign-request \ | ||
| # s3://openneuro.org/ds000224 \ | ||
| # ./data/ds000224 \ | ||
| # --exclude "*" \ | ||
| # --include "sub-MSC01/ses-struct*/anat/*T1w*" \ | ||
| # --include "sub-MSC01/ses-struct*/anat/*T2w*" \ | ||
| # --include "sub-MSC02/ses-struct*/anat/*T1w*" \ | ||
| # --include "sub-MSC02/ses-struct*/anat/*T2w*" \ | ||
| # --include "participants.tsv" \ | ||
| # --include "dataset_description.json" |
We should find a variant that does not rely on a Python package. Since the dataset is licensed as CC0, we could just redistribute the relevant files as Julia artifacts (https://pkgdocs.julialang.org/v1/artifacts/).
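A minimal sketch of the artifact route, assuming the relevant files have already been downloaded once into a local directory. The function name `bundle_dataset` and the artifact name `ds000224_subset` are made up for illustration; `create_artifact`, `bind_artifact!`, and `artifact_path` come from `Pkg.Artifacts`. For actual redistribution the artifact would additionally have to be archived, hosted somewhere, and bound with download information, which is not shown here.

```julia
using Pkg.Artifacts

# Hypothetical helper: copy an already-downloaded dataset subset into a new
# content-addressed artifact and record it in an Artifacts.toml file.
function bundle_dataset(src_dir::AbstractString, artifacts_toml::AbstractString;
                        name::AbstractString = "ds000224_subset")
    hash = create_artifact() do artifact_dir
        # cp copies directories recursively, so the BIDS folder layout is preserved
        for entry in readdir(src_dir)
            cp(joinpath(src_dir, entry), joinpath(artifact_dir, entry))
        end
    end
    # force = true overwrites an existing binding with the same name
    bind_artifact!(String(artifacts_toml), String(name), hash; force = true)
    return artifact_path(hash)
end
```

The returned directory could then take the place of `./data/ds000224` in the tutorial, so that no external download tool is needed at run time.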
| ref_path = joinpath( | ||
| dataset_dir, | ||
| measurements[1, :subject], | ||
| measurements[1, :session], | ||
| "anat", | ||
| measurements[1, :file] | ||
| ) |
Maybe we should save the whole file path, starting from the dataset root, in measurements.
I have updated generate_measurements to save the full BIDS-relative path in the :file column.
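As an illustration of what this change allows, here is a small sketch. It is not the actual generate_measurements code: the helper `bids_relative_path` and the scan file name are hypothetical, and a NamedTuple stands in for one row of measurements.

```julia
# Hypothetical helper: path of an anatomical scan relative to the dataset root.
bids_relative_path(subject, session, filename) =
    joinpath(subject, session, "anat", filename)

# One measurements-like row, represented as a NamedTuple for this sketch.
# The file name is made up for illustration.
row = (
    subject = "sub-MSC01",
    session = "ses-struct01",
    file = bids_relative_path("sub-MSC01", "ses-struct01",
                              "sub-MSC01_ses-struct01_T1w.nii.gz"),
)

# With the relative path stored in :file, the lookup needs only one joinpath.
dataset_dir = joinpath("data", "ds000224")
ref_path = joinpath(dataset_dir, row.file)
```

Compared with the snippet above, the caller no longer has to know that the scans live in an "anat" subfolder.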
| n_row = maximum(measurements.subject_number) | ||
| n_col = maximum(measurements.session_number) | ||
| n_voxels = isempty(coordinates) ? 0 : maximum(coordinates.voxel) |
Since coordinates indexes the voxels by their original ID, this produces a large array in which every entry for a voxel ID that does not appear in coordinates is missing.
I changed the first dimension of vw_data to nrow(coordinates) instead of the maximum voxel ID, and added a voxel_idx column to coordinates that maps each voxel to its index in the compact array. The first dimension of the array now equals the number of voxels in coordinates.
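The compact indexing described here can be sketched with plain arrays. The voxel IDs below are invented, and a `Dict` stands in for the voxel_idx column; this is not the code in voxel_wise_data.jl.

```julia
# Hypothetical original voxel IDs, as they might appear in coordinates.voxel
voxel_ids = [1050, 1051, 2300, 2301]
n_subjects, n_sessions = 2, 2

# Map each original ID to its position in the compact array (the role of voxel_idx)
voxel_idx = Dict(id => i for (i, id) in enumerate(voxel_ids))

# First dimension is the number of voxels, not the maximum voxel ID
vw_data = Array{Union{Missing, Float64}}(missing, length(voxel_ids), n_subjects, n_sessions)

# Store a value for original voxel 2300, subject 1, session 2
vw_data[voxel_idx[2300], 1, 2] = 0.75
```

With maximum(voxel_ids) as the first dimension, the same data would need 2301 × 2 × 2 entries instead of 4 × 2 × 2.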
| # For this tutorial the array shape will be (max_voxel_index, 2, 2) | ||
| # where 2 subjects and 2 sessions each contribute one T1w scan. | ||
| vw_data = voxel_wise_data(dataset_dir, measurements, coordinates) |
At the moment, this generates an array that consists mostly of missings; see my comment in voxel_wise_data.jl.
Changed: the array is now sized exactly to the number of active brain voxels, so it no longer contains unnecessary missings.
| # Save the log. condition_filename turns the named tuple into a filename string, | ||
| # e.g. (modality="T1w",) → "modality_T1w.jld2" | ||
| save_log(log, (modality = "T1w",)) |
This is throwing an error:
ERROR: MethodError: no method matching names(::@NamedTuple{modality::String})
The function names exists, but no method is defined for this combination of argument types.
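The error suggests that names is being called on a NamedTuple, for which Base defines keys, values, and pairs instead. One possible fix is sketched below, assuming the filename is built from the field names and values as in the comment above. `condition_filename_sketch` is a hypothetical stand-in, not the package's condition_filename.

```julia
# Hypothetical stand-in for condition_filename: turn a NamedTuple into a filename.
# pairs(::NamedTuple) yields Symbol => value pairs, so names() is not needed.
function condition_filename_sketch(condition::NamedTuple; ext::AbstractString = ".jld2")
    parts = ["$(key)_$(value)" for (key, value) in pairs(condition)]
    return join(parts, "_") * ext
end

filename = condition_filename_sketch((modality = "T1w",))
```

For the condition used in the tutorial this yields "modality_T1w.jld2", which matches the example in the code comment.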
| save_voxel_wise_data(vw_data, "data/vw_data.jld2") | ||
| ############################################################################################ | ||
| # STEP 3 — Preprocessing and logging |
Maybe we could:
- add an outlier removal step
- use a few more voxels, not only those in the middle of the brain, so that outliers etc. are actually detected and removed in this tutorial
I added an outlier removal step and expanded the mask to 11×11×11 (1,331 voxels), so that the outlier detection step now has an effect.
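For reference, a centred cubic mask of this size could be built as in the sketch below. The function name `centred_cube_mask` and the 64×64×64 volume size are assumptions for illustration, and the sketch assumes the cube fits inside the volume; the tutorial's own mask code may differ.

```julia
# Hypothetical helper: Boolean mask with a cube of side `edge` around the volume centre.
# Assumes edge <= every entry of dims.
function centred_cube_mask(dims::NTuple{3, Int}, edge::Int)
    mask = falses(dims)
    half = edge ÷ 2
    centre = map(d -> (d + 1) ÷ 2, dims)
    # One index range of length `edge` per axis
    ranges = ntuple(i -> (centre[i] - half):(centre[i] - half + edge - 1), 3)
    mask[ranges...] .= true
    return mask
end

mask = centred_cube_mask((64, 64, 64), 11)   # volume size is an assumption
n_mask_voxels = count(mask)                  # 11^3 selected voxels
```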
| # For this tutorial we create a small | ||
| # 5×5×5 voxel mask in the centre of the volume so the pipeline runs quickly on any machine. | ||
We can create this mask and distribute it as an artifact together with the data. This way, it does not have to be created during the tutorial.
| # transpose to (n_sessions × n_subjects) as expected by the SEM | ||
| model_vox = replace_observed( | ||
| model; | ||
| data = voxel_matrix', |
Maybe we should have the data in the correct shape from the beginning, so we don't have to transpose?
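One way to do this, sketched under the assumption that the array is laid out as (voxel, subject, session): permute the last two dimensions once, so that every per-voxel slice already has the (n_sessions × n_subjects) shape. The toy numbers are invented.

```julia
# Toy array laid out as (voxel, subject, session): 2 voxels, 2 subjects, 3 sessions
vw = reshape(collect(1.0:12.0), 2, 2, 3)

# Swap the subject and session axes once, up front
vw_sessions_first = permutedims(vw, (1, 3, 2))   # (voxel, session, subject)

# A per-voxel slice now has the expected shape without a transpose
voxel_matrix = vw_sessions_first[1, :, :]
```

Alternatively, the array could be filled in this layout when it is first created, which avoids the extra copy made by permutedims.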
| println("\nresults (first 5 rows):") | ||
| println(first(results, 5)) |
The results are super unstable because we only use data from 2 participants and 2 sessions. I think we should use a bit more data, so that this tutorial at least produces sensible results.
Co-authored-by: Maximilian Ernst <34346372+Maximilian-Stefan-Ernst@users.noreply.github.com>