Running Event-Based GLMs in SPM

Today I’ll jot down some notes on running event-based GLMs in SPM. These may be a bit more scattered than usual, but hopefully they will be useful for documenting the steps for running basic GLMs.

First you will want to create a stimulus onset file for extracting the betas from your nifti files. I’ve posted about stimulus onsets (SOTS) previously, and you can reference this page on Andy’s Brain Blog for more detail on why timing files are important.

makeSOTS.m

Below is a code chunk from a larger script that consolidates creating SOTS files for the different GLMs (TCB_makeSOTS.m). Basically, there are 4 regressors we care about (cue onsets for the 4 conditions), plus a task-bug interval regressor, which serves as a nuisance regressor because we want to remove the BOLD signal from intervals where there was a task bug in the code (oops). Regardless, this will create a .mat file (handling both subjects with the task bug and subjects without it; the study code was fixed halfway through the study), which will be fed into the batch script for specifying the model in SPM.
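
The gist of what the script produces is the standard SPM “multiple conditions” set of variables. A minimal sketch, with placeholder condition names, times, and filename (not the actual study values):

% Minimal sketch: variables SPM expects in a "multiple conditions" .mat file.
% Condition names, onsets, and durations below are placeholders.
names     = {'Cue1', 'Cue2', 'Cue3', 'Cue4', 'TaskBug'};
onsets    = {[0 45 120], [15 75 150], [30 90 180], [60 105 210], [240]};  % seconds
durations = {0, 0, 0, 0, 30};  % 0 = event (stick function); the nuisance interval gets its length

save('TCB_Event_sots.mat', 'names', 'onsets', 'durations');  % placeholder filename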

Here’s an example of a Parametric modulator GLM:
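
In terms of the SOTS variables, the parametric modulator version adds a pmod struct alongside the names/onsets/durations from the sketch above. A rough sketch, with placeholder modulator name and values:

% Sketch: attach a parametric modulator to the first condition in the SOTS file.
pmod = struct('name', {}, 'param', {}, 'poly', {});  % struct array, one element per condition
pmod(1).name{1}  = 'RewardLevel';  % placeholder modulator name
pmod(1).param{1} = [1 3 2];        % one value per onset of condition 1 (placeholder values)
pmod(1).poly{1}  = 1;              % first-order (linear) modulation

save('TCB_Pmod_sots.mat', 'names', 'onsets', 'durations', 'pmod');  % placeholder filename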

Finally, here is the GLM for "AllIntervals", which extracts the beta estimates for cue, interval, and feedback for each interval. This will be most useful if we want to do follow-up analyses in R or another software program down the road.

I’ve done this for both "Event" and "Parametric modulator" GLMs, as well as the "AllIntervals" GLM, and here are the SOTS files for all of the GLMs I’ve set up. You can ignore any of them that have "Event_AllIntervPmod" in them; the bottom 10 are the relevant GLMs for the current analysis (e.g., TCB_Event… and TCB_Pmod…).

makeRegressor.m

Next, we run a "makeRegressor.m" script that creates the mean intercepts across the different runs (M1-M7) and the trends across the different runs (T1-T8). This helps account for differences between runs at the run level (for a given run X, MX = 1 for that run's volumes), as well as any trend or drift within each run (a scaled vector from -1 to 1, binned by the number of TRs (repetition times) within each nifti). In my current study, the number of TRs = 310 for each run. Below is a screenshot demonstrating how these matrices are created. Note that even though there are 8 runs, there are only 7 mean intercepts. That is because the Level 1 analysis within SPM estimates a global mean, so we do not need to account for the last run separately. Thus, you should have (2n - 1) columns, where n = number of runs. Overall, this file is important because it will be fed into the fMRI Model Spec Estimate later.
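
As a rough sketch of that logic (run and TR counts are taken from the description above; the variable names and output filename are mine):

% Sketch: build mean-intercept (M1-M7) and linear-trend (T1-T8) columns for 8 concatenated runs.
nRuns = 8;
nTRs  = 310;                                           % TRs per run in this study
R = zeros(nRuns * nTRs, (nRuns - 1) + nRuns);          % (2n - 1) columns total

for r = 1:nRuns
    rows = (r - 1) * nTRs + (1:nTRs);
    if r < nRuns
        R(rows, r) = 1;                                % mean intercept for run r (last run is absorbed by the global mean)
    end
    R(rows, (nRuns - 1) + r) = linspace(-1, 1, nTRs);  % linear trend within run r
end

save('TCB_runRegressors.mat', 'R');                    % "multiple regressors" file fed into the model spec batch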

runLevel1_fMRIModelSpec_Estimate.m

Next, we will want to create the batch file that feeds into SPM. You will need to specify a few variables, which can be done from the GUI to generate these batch files. Because I already generated these from the GUI a while back, I’m going to use the scripts directly. More info about batch processing with the GUI can be found here.

Importantly, you will want to make sure that SPM is initialized if you plan to run these analyses on the cluster using parallel processing and the command line.
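
A minimal initialization block (placed before running the batch) might look like this, assuming matlabbatch has been defined earlier in the script and the SPM12 path below is a placeholder:

% Initialize SPM for non-interactive (command-line / cluster) execution
addpath('/path/to/spm12');          % placeholder path to your SPM12 installation
spm('defaults', 'FMRI');            % load the fMRI defaults
spm_jobman('initcfg');              % initialize the batch system
spm_get_defaults('cmdline', true);  % suppress GUI dialogs

spm_jobman('run', matlabbatch);     % run the model specification/estimation batch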

runLevel1_contrast.m

I haven’t modified these contrast files yet, but we can additionally specify contrasts of interest. This requires putting "1" and "-1" values on the regressors being contrasted, as well as including zeros for the nuisance regressors (task bug, mean intercepts, trends).
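
When I do fill them in, each contrast will be a batch entry roughly like the following (the contrast name, path, and weights are placeholders; the trailing zeros cover the task-bug, mean-intercept, and trend columns):

% Sketch of a t-contrast in the SPM contrast manager batch
matlabbatch{1}.spm.stats.con.spmmat = {'/path/to/glm1/SPM.mat'};                    % placeholder path
matlabbatch{1}.spm.stats.con.consess{1}.tcon.name    = 'Cue1_vs_Cue2';              % placeholder name
matlabbatch{1}.spm.stats.con.consess{1}.tcon.weights = [1 -1 0 0 0 zeros(1, 15)];   % 4 cues + task bug + 15 nuisance columns
matlabbatch{1}.spm.stats.con.consess{1}.tcon.sessrep = 'none';
matlabbatch{1}.spm.stats.con.delete = 0;                                            % keep any existing contrasts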

runLevel1_parallel.sh

Finally, we will want to run these scripts using our bash script. I have set this up with sbatch headers so that we can parallelize the Level 1 analyses and each subject is run as a separate job (an array is created from the participant numbers; a sketch of such a header appears after the list below). I’ve also added some "echo" statements with notes so that we can read the log file and understand what is happening while the script is running. Once we have clarified the GLMs we want to run, we can call all of these MATLAB scripts to do the following:

  1. make SOTS files
  2. make Regressor files
  3. runLevel1 Model Spec Estimate
  4. runLevel1 contrast
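
For reference, here is a skeleton of what the sbatch header and MATLAB call could look like, assuming each of the .m files above is a function that takes a subject ID (the job settings and participant IDs below are placeholders, not the actual study values):

#!/bin/bash
#SBATCH -J level1_glm                # job name (placeholder)
#SBATCH --time=4:00:00               # walltime per subject (placeholder)
#SBATCH --mem=16G                    # memory per subject (placeholder)
#SBATCH --array=1,2,3,5,8            # one array task per participant number (placeholder IDs)

subject=${SLURM_ARRAY_TASK_ID}
echo "Starting Level 1 pipeline for subject ${subject}"

matlab -nodisplay -nosplash -r "makeSOTS(${subject}); makeRegressor(${subject}); runLevel1_fMRIModelSpec_Estimate(${subject}); runLevel1_contrast(${subject}); exit"

echo "Finished Level 1 pipeline for subject ${subject}"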

Note that Steps 1 and 2 are redundant, since they re-create the files for each subject every single time for the same GLM, but the redundancy doesn’t hurt and I haven’t yet come up with a cleverer or more efficient way to do this. Maybe I could add an if-statement so they only run if the files don’t already exist for a GLM, but if I ever need to re-run these regressors for any reason it might be useful to have them in there.

To run the scripts, we just submit them from the command line.
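
In practice, that just means submitting the wrapper script above:

sbatch runLevel1_parallel.sh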

Also, a useful tip to maximize resources is to check the efficiency of your jobs.
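
If your cluster provides Slurm’s seff utility (an assumption about your setup), you can check how much of the requested CPU time and memory a completed job actually used:

# Report CPU and memory efficiency for a finished job (the job ID is a placeholder)
seff 1234567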

Beta Estimates / Nifti Output

When we are done running the Level 1 analysis, we will get beta files for the regressors from the model. Here, for glm1, betas 1-5 will be the task regressors (4 cues + the task-bug regressor), followed by the 15 mean intercept and trend regressors, followed by 1 global regressor (5 + 15 + 1 = 21 betas).

We can confirm this in the SPM.mat file, in the variable “SPM.xX.X”, which should contain the values of the design matrix.
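
A quick way to sanity-check this from the MATLAB command line (the path is a placeholder):

% Inspect the design matrix that SPM stored for this model
load('/path/to/glm1/SPM.mat');   % loads the SPM struct
size(SPM.xX.X)                   % rows = scans, columns = regressors (21 here)
SPM.xX.name'                     % the name of each column/beta, in order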

We can open this in SPM as well (SPM > Review > SPM.mat), which should display the design matrix for the GLM of interest. We want to check the bottom row of "parameter estimability": if any of the boxes are gray, it suggests collinearity (the regressors overlap with one another), which is problematic for running the GLM and estimating the variance attributed to a specific regressor.

Okay, I think that is it for now. Will add additional notes as I continue to run analyses. 🙂


What is orthogonalization and how do you implement it in SPM?

Over the past few days, I’ve been learning about how to deal with orthogonalization of regressors in fMRI Level 1 GLMs. Consequently, I’ve fallen deep into a new rabbit hole, and I figured it would be useful to the universe to share my collated knowledge from the last 24 hours.

For more leisurely learning, I would highly recommend Jeanette Mumford’s excellent video tutorial on why orthogonalization is not ideal for dealing with collinearity among your regressors.

Anyway, it turns out (as of June 2021) that orthogonalization of parametric modulators is turned ON by default in SPM12, in case you were wondering. So we’re going to want to turn this off in our Level 1 GLMs. If you set up your models via the batch editor GUI, you can just do this manually. However, that is incredibly tedious in practice, and you may want to just tweak existing batch scripts you’ve already generated… for example. 😅

After much googling for how to actually change these settings via batch scripting, I stumbled across this useful link on the SPM forum, where some folks were discussing how to turn off orthogonalization of parametric modulator regressors in SPM12. In brief, there are two ways to do this.

Option 1: Set orthogonalization on and off flexibly within each condition

To accomplish this, you will need to define each condition separately in your matlabbatch script. This is more tenable if you are entering your conditions manually in the GUI, or even in a script with few conditions, but it gets unwieldy fast depending on the number of conditions you need to set.

matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).name = 'ConditionName';
matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).onset = yourOnsetVector;
matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).pmod(1).name = 'PmodName';
matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).pmod(1).param = yourPmodVector;
matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).pmod(1).poly = 1;   % first-order (linear) modulation
matlabbatch{2}.spm.stats.fmri_spec.sess(1).cond(1).orth = 0;           % 0 = orthogonalization off, 1 = on
% (the condition's duration and tmod fields are omitted from this snippet)

Here’s a screenshot of my whole batch, for reference, in case you were wondering where this code snippet sits with respect to the other lines of code and spm commands.

Option 2: Set orthogonalization on and off flexibly using the “multiple conditions” option

This option assumes that you’ve nicely set up your stimulus onsets in a .mat file, with cell arrays for the "names", "onsets", "durations", and "pmod" variables for your model (note that pmod should be a struct). You will also want to define an "orth" cell array (true/false or 1/0) indicating which of these conditions you would like to be orthogonalized. If you want to remove orthogonalization for all of them, this is pretty straightforward: you set all of the array entries to 0. One additional note is that "orth()" is technically a function in MATLAB; you can assign values to a variable named orth without any problem, but you cannot just call orth by itself or you will run into errors.
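
Concretely, the only new ingredient is the orth cell array, which you can append to an existing SOTS file. A minimal sketch (the number of entries is a placeholder and should match your number of conditions; the filename is just the one referenced in the batch snippet further down):

% Sketch: add an "orth" cell array (one entry per condition) to an existing SOTS .mat file
orth = {0, 0, 0, 0};   % 0 (or false) = do NOT orthogonalize that condition's modulators
save('../SOTS/TCB_Event_AllIntervPmod_RewPenRTacc_sots_allTcat.mat', 'orth', '-append');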

Here is a screenshot of all of these arrays. Basically the important thing to keep in mind is that the number of columns should be the same across all of your arrays/struct variables, and they should correspond to your regressors. There’s a bit more coding that goes into how I’ve set this up, but I have not gone into that here.

Here is a snippet of code for the multiple conditions version, as well as a screenshot

% Leave the per-condition struct empty; conditions come from the multiple-conditions file instead
matlabbatch{2}.spm.stats.fmri_spec.sess.cond = struct('name',{},'onset',{},'duration',{},'tmod',{}, 'pmod', {}, 'orth', {});
% Point to the SOTS .mat file containing names/onsets/durations/pmod/orth
matlabbatch{2}.spm.stats.fmri_spec.sess.multi = {'../SOTS/TCB_Event_AllIntervPmod_RewPenRTacc_sots_allTcat.mat'};
matlabbatch{2}.spm.stats.fmri_spec.sess.regress = struct('name', {}, 'val', {});

Alright, that is pretty much it. Feel free to let me know if there’s anything I’ve missed, or if you have any other useful suggestions for implementing orthogonalization in your GLM models in spm12!

fMRIPrep – Parallelizing Bash Scripts

It has been a while, but alas I am resurrecting the fMRIPrep preprocessing pipeline. There have been a few updates and tweaks to the xnat-tools put out by the CCV folks, which also have great documentation. Shoutout to Isabel Restrepo. 🙂

I won’t put the scripts in this post, but you should be able to access and download them from my GitHub Repo.

Step 1: TCB_xnat2bids_bidspostprocess_parallel.sh
This script pulls DICOMs from XNAT to the server and stores them in an xnat-export folder (note: naming irregularities should be noted in a .json file), converts the DICOMs to nifti format, and then converts them to BIDS format, which is stored in a bids folder. Note that you will need an accession number associated with each subject, which you can find from the XNAT GUI. Alternatively, you can go to your "https://bnc.brown.edu/xnat/data/experiments/" URL to identify the accession number associated with your subject. Different institutions may have different specifics with respect to the URL, but it should generate a list of your acquired scans like the following:

List of the scans on XNAT

Next, you will want to edit the scripts and submit them to the cluster. A minor addition to these scripts is that, since we have collected field maps, the "bidspostprocess" function updates the associated .json file to include the runs that the fieldmaps should be applied to (more detail below). My scripts are set up so that all you will need to edit is the input to the associative array. Note that our T1 mprage scan spits out both an RMS and an expanded image, and we want to ignore the non-RMS T1 scan. Once we update this info, we can run the script.
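
For illustration, the associative array is essentially a lookup from participant ID to XNAT accession number; the exact keys and values below are placeholders, not the real study entries:

# Placeholder mapping of participant IDs to XNAT accession numbers
declare -A accession
accession["001"]="XNAT_E00123"
accession["002"]="XNAT_E00456"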

Navigate to the directory with the scripts and then submit the job to the cluster. Use "sbatch" to submit jobs, "sacct" to check the status of a submitted job, and "myq" to list running and queued jobs.
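
Concretely, using the Step 1 script name from above:

# Submit the job, then check on it
sbatch TCB_xnat2bids_bidspostprocess_parallel.sh
myq     # list your running and queued jobs
sacct   # status of recent and ongoing jobs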

Once the job has been completed successfully, we should see the following:

Step 2: TCB_bids-validator.sh
Next, we will want to check that the data are BIDS-validated. If you look in the "bids" folder, and specifically in the func folder (for task-based fMRI), the data should be in BIDS format.

Additionally, you will want to check that bidspostprocess worked; if it did, you will see an "IntendedFor" argument in the .json file listing the associated functional runs it will be applied to.

bidspostprocess adds an “IntendedFor” argument in the .json files for each fieldmap to indicate which BOLD runs to apply fieldmap correction for

Next, you can run the bids-validator script, which tests for BIDS compliance within the bids folder. Here, I haven’t specified the events.tsv files yet (will do that later), so the bids-validator just gives me a warning. But otherwise it looks great.

Step 3: TCB_fmriprep_fieldmap_ica_parallel.sh
Next, we will run fMRIPrep with fieldmap correction and ICA-AROMA. The script is parallelized so that participants are sent to separate nodes. (There is more information in the header of the bash script, not included below.)

Step 4: TCB_postfmriprep_data.sh
This step unzips the .nii.gz files that are generated by fMRIPrep. It puts all of the relevant functional and anatomical nifti images into an spm-data folder for GLM analyses. This is an optional step, and it is highly variable depending on what software you wish to perform your analyses in.
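
A minimal sketch of this step (the paths and the filename pattern are placeholders for however your derivatives are organized):

# Copy fMRIPrep's preprocessed outputs and unzip them for SPM (which expects .nii, not .nii.gz)
derivs=/path/to/bids/derivatives/fmriprep   # placeholder path
outdir=/path/to/spm-data                    # placeholder path

mkdir -p "${outdir}"
find "${derivs}" -name "*preproc*.nii.gz" -exec cp {} "${outdir}/" \;
gunzip -f "${outdir}"/*.nii.gz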

Anyways, that’s all for preprocessing. Stay tuned for more updated scripts regarding generating stimulus onset files, and setting up first and second level GLMs in SPM.

Transferring data from XNAT and Converting to BIDS format

In today’s post, I will cover transferring data from XNAT (a storage server for MRI data) and converting it to BIDS format. This is an important precursor step to running fMRIPrep. Unfortunately, the steps I’ll walk through are specific to the Brown community.

Before I get too deep into the weeds, I want to give a HUGE shoutout to the Behavioral Neuroimaging Core (BNC), who have provided the bulk of these tools to the neuroimaging community. I wouldn’t have been able to implement half of these tools without their support and heroic efforts to set up this infrastructure. Also, FYI for those following along at home outside of the Brown community, all of these tools are available on GitHub and can be implemented on your own system.

Github Repo for XNAT Tools: https://github.com/brown-bnc/xnat-tools
Brown-Specific XNAT Portal: https://bnc.brown.edu/xnat/

Okay, so let’s get started.

First, our group has set up XNAT, a neuroinformatics platform for storing MRI data that was developed at Washington University in St. Louis. If you’ve worked with any of the Human Connectome Project datasets or larger neuroimaging datasets generally, you may have used this interface before (e.g., IntraDB, CNDA, etc.). It’s a great interface that allows the user to click through and look at what data they’ve stored after each scan. As someone who has spent a lot of time with this interface, I highly recommend it and think it’s super easy to use.

Once you’ve set up your project and your data are all stored, you should be all set to run the ‘xnat2bids’ function from the xnat-tools singularity container. The way it was set up for us, we need to extract the subject number and the accession number (see image below).


As a side note, there may be MRI sessions in which you have extra runs you don’t want to use, and there are numerous ways you can deal with that. I tend to prefer to not delete raw data and just remove data at a post-processing step. So for our study, we’re just specifying which blocks to use and not use (relevant for scripts later).


Finally, you want to set up your scripts. Here, I’ve created two loops: one that is more general and calls ‘xnat2bids’ for the "perfect" runs (i.e., the number of BOLD runs collected is as expected), and one for the "imperfect" runs (i.e., the number of BOLD runs collected is more than expected). For the imperfect/irregular runs, I’ve used the "skiplist" argument to skip the scan numbers that we do not want to use (see image above).

To run the script, you’ll want to navigate to the folder where the script is, and then run it using sbatch (if you’re on a supercomputer) or bash (if you’re running locally or don’t have access to a cluster). If the latter, be prepared for your computer to sound like a rocket launch for a period of time.


*NOTE*: By default, this script will prompt you to enter your password for every subject. You can circumvent this by setting your XNAT_USER and XNAT_PASSWORD in your .bashrc file. Some more detailed instructions are here. Here’s a bit of code below.

# Navigate to home directory
cd 

# Edit the .bashrc file
gedit .bashrc

# Source the .bashrc file to implement the changes in the current Terminal window
. ~/.bashrc

Here is what my .bashrc file looks like (without my password):
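
The relevant lines are just two environment variable exports (the values below are obviously placeholders):

# Added to ~/.bashrc so the script doesn't prompt for XNAT credentials on every subject
export XNAT_USER="your_xnat_username"
export XNAT_PASSWORD="your_xnat_password"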


When your scripts are done, you should have your BIDS-formatted data. You can check that your data are formatted correctly. In my dataset, I have T1 mprage anatomical images (in the anat folder), 8 functional task runs (in the func folder), and fieldmap images (in the fmap folder).


A quick note on file naming conventions

What if my DICOM files are named differently across runs, or differently from what I want them to be named after conversion?

Here, we’ve created a bidsmaps file to edit the filenames so they are all the same. If the naming convention is correct/consistent across all studies, you won’t need this file.


Well, that’s it for now. Please feel free to let me know if there’s anything else that should be added in terms of BIDS naming convention.


Running fMRIPrep on your BIDS-Compatible MRI Dataset

In this post, I’ll describe the steps for how I run fMRIPrep. (Sorry that this is coming before some of the earlier steps; I’m currently in the middle of this, so I figured I’d document it now before I forget. I’ll get to the earlier steps later, when I work with some additional data.) As an aside, through some googling, I found this github repo by a doctoral student to be very helpful for articulating some of the finer details.

I tend to prefer to edit my scripts locally, so I will mount the server onto my computer so I have access to the files locally. Here is an example script below. Note that I’m using ‘singularity’ instead of ‘docker’, since I have access to fMRIPrep installed on a cluster. If you decide to run fMRIPrep locally, you can use the docker command instead.

In my script, I plan to run both fMRIPrep and Freesurfer, with fieldmap correction (see also here) and ica-aroma.

singularity run --cleanenv \
   --bind ${bids_root_dir}/${investigator}/study-${study_label}:/data \
   --bind /gpfs/scratch/dyee7:/scratch \
   --bind /gpfs/data/bnc/licenses:/licenses \
   /gpfs/data/bnc/simgs/fmriprep/fmriprep-${fmriprep_version}.sif \
   /data/bids /data/bids/derivatives/fmriprep-${fmriprep_version}-nofs \
   participant \
   --participant-label ${pid} \
   --fs-license-file /licenses/freesurfer-license.txt \
   -w /scratch/fmriprep \
   --stop-on-first-crash \
   --nthreads 64 \
   --write-graph \
   --use-aroma

Here are some arguments that are useful to consider, especially if you don’t have fieldmaps and want to apply some susceptibility distortion correction.
# If you don't want to run freesurfer
   --fs-no-reconall

# If you don't have field maps, or want to do fieldmap-less correction
   --use-syn-sdc
   --force-syn



If you want to do fieldmap correction, you’re going to need to add an "IntendedFor" field in the .json files for your fieldmaps. Here, because I am applying a gradient echo field map, I want to apply the "phasediff" compressed nifti to all of my functional runs, so I added the argument below (in alphabetical order, because I’m only a little neurotic). You’ll want to check that this file also has two echo times, EchoTime1 and EchoTime2.
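
For reference, the relevant fields of the phasediff fieldmap’s .json sidecar end up looking something like this (the echo times and the subject/task/run labels are placeholders; the IntendedFor paths are relative to the subject folder):

{
    "EchoTime1": 0.00492,
    "EchoTime2": 0.00738,
    "IntendedFor": [
        "func/sub-001_task-mytask_run-01_bold.nii.gz",
        "func/sub-001_task-mytask_run-02_bold.nii.gz"
    ]
}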

(*It is worth noting that if you have spin-echo field maps, you ideally would have collected both the A-P and P-A directions; if you are alternating AP and PA directions in your functional acquisition, you need to apply the opposite-direction fieldmap to each functional run. This may be more relevant if you analyze any of the connectome datasets like HCP and ABCD, which collect spin-echo field maps by default, as well as other files like SB-refs.)

(**It is also worth noting that while there has been discussion about whether to automate this process in fMRIPrep, because there is so much variability in how fieldmaps are applied, they have decided (as of right now) not to implement a general-purpose tool (see the forum here). Some folks are creating custom scripts that crawl through their data to automate this process, but it seems reasonable to me that, because of the diversity of how fieldmaps are used in preprocessing, having this type of automation be project-specific or lab-specific makes more sense.)


When you log on to the VNC, you can open a terminal window and navigate to the directory where your scripts are located. I like to use the ‘gedit‘ editor to check that the script is correct.

gedit TCB-fmriprep_fieldmap2_ica.sh


This is what the text file should look like, and it should match the script you were editing outside of the VNC. Don’t worry if the spacing is a bit off; though it may be aesthetically unpleasing, it still works.


To look at your available scripts before running, I always like to use the ‘ls’ command.


Hooray! You are ready to run your script. You can run the script using the sbatch command (or bash if you are not working on a supercomputer).

sbatch TCB-fmriprep_fieldmap2_ica.sh

If you are submitting a job on a cluster, you will also want to check that your script submitted the job and check the status of your job. More details on the Oscar website can be found here.

# check the status or your queued jobs
myq

# check the status of ongoing or recently completed jobs
sacct


Occasionally you will run into ERRORS, in which case the state will say "FAILED" or "INCOMPLETE". When it says failed, it usually means something in your script (or the data) prevented the script from completing without issues. However, when you see this, don’t panic! You can easily investigate what happened by looking at your log and error files. These files live in a folder in the directory where your scripts are located. As you can see at the top of my script, I included an argument to output errors and logs by the job ID (%J), which you can easily see in your status table above. (As you can see, I have a lot of logs and errors lately…)
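
The lines in question at the top of the sbatch script look something like this (the log directory name is a placeholder):

#SBATCH -o logs/fmriprep-%J.out   # stdout log, named by job ID
#SBATCH -e logs/fmriprep-%J.err   # error log, named by job ID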


Navigate to the folder, and you can use "ls -lt" to order your files by date and time. I piped the output to ‘head’ to list only the 20 most recent files, since I don’t want to look at a million of them in my terminal window.
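
That is, something like:

ls -lt | head -20   # newest 20 files first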


You can use ‘gedit’ to open the log and error files, or you can just click them open if you have the server mounted onto your desktop.

Example Log File


Example Error File


Assuming that you don’t have any major errors and that your job has completed, the ‘derivatives’ folder within your BIDS folder will contain the preprocessed data.


There’s a nice HTML file that gives a summary of all of the preprocessing steps, as well as some nice text about how to include these steps in your manuscript.



Happy scripting and debugging! Let me know if there are any key steps that I’ve missed here. 


Getting started with fMRIPrep on a computer cluster

It’s currently day n of shelter-at-home during these COVID-19 pandemic times, and I’ve started to dig myself into a hole with fMRI preprocessing, using some (relatively) fancy new tools on a fancy supercomputer cluster, which I fortunately have access to at Brown. One of the blessings (and/or curses, depending on how you view it) of being a postdoc is that despite these pandemic times, you realize you really have no serious responsibilities and much freedom. In some ways, you can potentially be disposable, but also potentially indispensable depending on the circumstances. That being said, one of the true perks of being a postdoc is that your primary job is to do a lot of data crunching and analysis (and writing), so it’s not too bad of a gig, as long as you have data and a generous PI who is willing to pay for you to analyze it.

Okay, so given the abundance of time I have to dig into the weeds of fMRI preprocessing, I’ve decided it’s not a terrible time to start documenting some of these analysis adventures, in case they may be useful to others. Also, a colleague of mine who blogs her work regularly (check out her blog on MVPA here) encouraged me to do this a while back, when I was fidgeting with using syringe pumps in the scanner. (I highly discourage any PhD candidate from doing this, unless you really enjoy tinkering with equipment.) Basically, I’ve spent the last decade of my life tinkering around with different fMRI analysis software packages, and I recently decided to make the full switch to fMRIPrep to make my life easier. So I suppose these blog posts in the upcoming weeks (months? let’s hope not…) will be most directly useful to folks who know a thing or two (or many) about fMRI and may want to get their feet wet with this cool, innovative wrapper that makes preprocessing less of a headache. That, and I have a terrible memory, so this will hopefully be helpful for me 5 to 10 years down the road. Anyways, here goes nothing.

So, you want to get started with fMRIprep (on a computer cluster)? 

Some basics to get familiar with, if you have limited computer programming background:

Some Excellent Books on getting started with fMRI analyses:

Here are some more Brown-specific resources:

fMRIPrep Resources

Since I’ll be working on our university’s supercomputer, I’ll likely provide more detailed instructions for those types of analyses, but if you plan to run some of these analyses on a local computer, you can apply much of the scripting using "bash" instead of "sbatch" commands, since you won’t have parallel computing available. The only downside is that your computer may sound like a rocket when you’re running such analyses, and it may take longer for your scripts to finish. I will probably have separate blog posts that address each of these steps as I do them, so stay tuned if your appetite for fMRI preprocessing is still growing. Nevertheless, hopefully this will be useful to some of you hypothetical readers *fingers crossed.*

Some general high-level steps to consider

Step 1: Exporting Data and Converting to BIDS

Step 2: Validating that your MRI dataset is BIDS-compatible

(Optional) Step 2a: Quality Control checks for your MRI dataset

Step 4: Run fMRIprep on your MRI dataset using singularity or docker containers

It is worth noting that there are still a few quirks to this seemingly magical tool. Although fMRIPrep is pretty good at automating most of the preprocessing steps, here are some things that it does NOT do automatically. It is still possible to implement these steps; it just requires a bit of tweaking, and such things would be good to consider in the data processing stream in the future.


I think I’ve inundated you with enough links for now. I will probably update this post as I think of more resources. Feel free to leave a comment below for suggestions on resources or anything I may have missed!