Feature/course demo lifespan #120
base: master
The batch script is a lot longer now, and I'm not sure that having 4 different ways of specifying the same argument is a good idea. Also, the parts that the user modifies are scattered all over the script. I prefer all the user-modifiable variables to be in one place. |
I agree that user-editable stuff would be best to keep in as few sections as possible, and closer to the top of the script is better (but not if it requires additional code to be able to, say, move something outside a loop when it uses the loop variable). There is a better way to take multiple flags to do the same thing:
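A minimal sketch of one way to do this in bash, assuming a `case` statement whose patterns are joined with `|` so that several spellings share one branch. The option names (`--subject`, `--subjlist`, `--runlocal`, `--run-local`) and the variables `Subjlist` and `RunLocal` are illustrative assumptions, not necessarily what the batch script in this PR uses:

```shell
#!/bin/bash
# Hypothetical option and variable names, for illustration only.
parse_args() {
    Subjlist=""
    RunLocal="FALSE"
    local arg
    for arg in "$@"; do
        case "$arg" in
            --subject=*|--subjlist=*)
                # Either spelling lands in the same variable.
                Subjlist="${arg#*=}"
                ;;
            --runlocal|--run-local)
                RunLocal="TRUE"
                ;;
            *)
                echo "ERROR: unrecognized option: $arg" >&2
                return 1
                ;;
        esac
    done
}

parse_args --subjlist="100307 100408" --run-local
echo "Subjlist=$Subjlist RunLocal=$RunLocal"
```

Each alternative spelling costs one extra pattern rather than one extra branch, so the handling code is written once.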
On a side note, I have been working on a new set of functions to make this kind of argument parsing (and associated help info) simpler (but that shouldn't hold up pull requests using the existing conventions). |
Based on a quick look, this does appear to have changed more than I was anticipating. It looks like the batch script is itself being turned into another script that would be called with an extensive set of arguments, rather than users just directly editing parameters in the script itself. That wasn't necessarily the direction that I was suggesting we go. @tbbrown Can you give us a little more context as to why you went this route? |
Regarding Tim C.'s functions, these would be very useful if they could simplify the I/O of these scripts. I've found the Tim B. style of I/O quite cumbersome to write, so I just don't use it for new code. If a lot of that could be moved into external functions, that might make things much easier to use. |
As a preview, here is some example code that uses the new parameter functions I wrote (the example will be at the top of the new shlib file):
The usage function manually prints a few lines explaining the overall purpose of the script, then calls I have put the functions through a little testing, and wrote them with mac's ancient bash in mind (requiring a creative hack to avoid readarray and an obscure property of printf -v). |
…lt global variables for run_local and lifespan style options. Added some comments.
I've pushed changes to:
|
I'm afraid that this has turned into something much more complicated than I was envisioning when I proposed that we do something along the lines of just the following:
What @tbbrown has drafted is sophisticated, but importantly, if we go this route, we are fundamentally changing the role/purpose of the Example scripts. Presently, the Example scripts are intended to serve as a didactic introduction to each sub-pipeline, in which users are expected to (1) review and engage with the Example script, and (2) frequently make changes appropriate for the parameters used in their particular study, guided by a fairly extensive set of comments explaining those parameters. Naturally, it makes sense for us to preset the values to something appropriate for HCP-YA or HCP-D/A. But, for me at least, it is dangerous to switch to a mode in which users are simply launching the Example scripts with or without a

While this may be convenient for users whose studies literally acquired their data with the exact HCP-YA or HCP-D/A protocols, or who are trying to process (by themselves) actual HCP-YA or HCP-D/A data, it does not adequately support users who are trying to use the Example scripts to process their own data, collected with their own modified protocols (possibly collected on GE or Philips scanners). For users in that latter situation, the splitting of the default parameter definitions into completely separate

And even for users for whom executing the script in simply

Also, there is the practical matter that extending this type of Example script organization to all the other sub-pipelines is going to entail a lot more work than I was envisioning.

So, initially at least, I'm not a proponent of this particular implementation. That said, perhaps I'm missing some complexities that come to light if you try to implement my "simple" approach proposed above, which will inevitably drive the solution to this sort of more complicated implementation? |
It doesn't seem like a good idea to have two copies/versions of the error checking and job launching code, so if those can be moved outside of the conditional, that would be good. There are indeed a lot of parameters, and a lot of them change from YA to lifespan, so it would be painful to uncomment a large number of lines (and a user might miss one or two, causing weird hard-to-locate issues). However, I agree that having two sections that look inviting to edit is probably confusing. Maybe the YA should be the defaults, and documented for editing (and not inside a conditional), and --lifespan can just trigger a conditional that overrides the defaults, with no per-variable comments, and maybe only a comment like "this is the setup for lifespan processing, please edit the defaults above if you are using your own sequences". |
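The layout suggested above might look roughly like the following sketch: one documented block of defaults, a single override function triggered by `--lifespan`, and the shared error checking and launch code outside any conditional. All variable names and values here are placeholders, not actual HCP-YA or Lifespan protocol parameters:

```shell
#!/bin/bash
# All names and values below are placeholders for illustration,
# not actual HCP-YA or Lifespan protocol parameters.

# Defaults: this is the one block that users edit for their own study.
StudyStyle="YA"
FieldMapMethod="FIELDMAP"
StructuralResolution="EXAMPLE_YA_VALUE"

apply_lifespan_overrides() {
    # This is the setup for lifespan processing; please edit the defaults
    # above if you are using your own sequences.
    StudyStyle="LIFESPAN"
    FieldMapMethod="TOPUP"
    StructuralResolution="EXAMPLE_LIFESPAN_VALUE"
}

for arg in "$@"; do
    case "$arg" in
        --lifespan) apply_lifespan_overrides ;;
    esac
done

# Error checking and job launching would follow here, once, for both cases.
echo "style=$StudyStyle fieldmap=$FieldMapMethod resolution=$StructuralResolution"
```

Because the override block carries no per-variable comments, only the defaults section looks inviting to edit.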
While, in general, I'm not a fan of duplicating content into separate scripts (because then edits to one in the future have to be propagated to both), perhaps this is a situation where we'd be better off starting a separate "Lifespan" directory, and putting lifespan-specific example scripts into that? (Similar to how we have an |
Separate example launch scripts with settings for YA and lifespan does seem like a possibility worth exploring, given the large number of differences in parameters. The user is expected to copy one or the other and then edit it, so as long as we can get them past the confusion of "which file should I copy", having only one section with defaults to edit in the chosen file should reduce any later confusion. |
And we could duplicate all the variable related comments in both scripts, to make it easier to keep the scripts otherwise aligned, except where absolutely necessary to add info specific to only HCP-YA or Lifespan. Along those lines, I'm not sure if referring to the sidecar json files with parameter values is appropriate for the Example scripts. |
My first cut at this was indeed to create a separate The requirement to have something that could successfully be run with a relatively minimal set of changes during the course (and also potentially for users beyond the course) made me reject the idea of just having every single parameter variable set with a bit of code that looks like:
As @coalsont points out, it would be painful to have all participants change every one of these variable settings to get things working, particularly for an exercise during the course. Additionally, not all of these variables are just set to simple values. Look at how the

Matt's requirement is that these batch scripts behave, by default, in a way that is HCP-YA compatible. So all the parameter values must be set, by default, to the values that are appropriate for HCP-YA. That means that having a user modify the script to run LifeSpan data requires that they modify the settings of quite a lot of variable values. That's painful and error-prone.

As for pulling the job launching code out of the conditional, I assume you mean the code that builds the actual

@mharms, I think referring to the JSON sidecar files in these example batch scripts where parameters are set is exactly what should happen. My thought here is that when we set the

The whole point of the JSON sidecar files is that they are where such "meta data" about the image files are stored. So pointing out that you get these values from the sidecar files, and even giving the user some information about what sidecar file to look in and what named values to look for in the sidecar file, is of real value to the users. Telling the user that the value set for the

If the consensus is to leave the original

As @coalsont points out, this leads only to the issue of helping users answer the "which file should I copy?" question and removes the potentially more confusing issue of "These 2 blocks of code look very much the same, and I can't tell which one I should be changing."

Is the plan to create a separate example batch script for LifeSpan acceptable to everyone? |
I can live with a separate example file for lifespan when there are a large number of parameters to change (and in the interest of time). I don't know if that generalizes to duplicating any other example scripts (and when there is more time to think about it). As for building the command to run, yes, that is what I meant, but you can instead do a "case" in the middle of building the string to execute, and for TOPUP add one set of arguments, and for FIELDMAP a different set. This means users wouldn't potentially have to edit the code building the command to run, only the settings, and the command building is generic enough to adapt to any sane set of settings. |
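A minimal sketch of the "case in the middle of building the command" idea, assuming the command is assembled in a bash array. The launcher name `run_pipeline`, the flag names, and the variable names are hypothetical stand-ins, not the pipeline's real interface:

```shell
#!/bin/bash
# Hypothetical settings section; names and values are illustrative only.
Subject="100307"
MagnitudeImage="mag.nii.gz"
PhaseImage="phase.nii.gz"
SpinEchoNeg="SE_neg.nii.gz"
SpinEchoPos="SE_pos.nii.gz"

# Generic command-building code: users edit the settings above, not this.
build_command() {
    local method="$1"
    cmd=( run_pipeline "--subject=${Subject}" "--distortion-method=${method}" )
    case "$method" in
        FIELDMAP)
            cmd+=( "--magnitude=${MagnitudeImage}" "--phase=${PhaseImage}" )
            ;;
        TOPUP)
            cmd+=( "--se-neg=${SpinEchoNeg}" "--se-pos=${SpinEchoPos}" )
            ;;
        NONE)
            # No field map arguments are added at all.
            ;;
        *)
            echo "ERROR: unknown distortion correction method: $method" >&2
            return 1
            ;;
    esac
}

build_command TOPUP
echo "${cmd[@]}"
```

Only the arguments relevant to the chosen method are appended, so switching field map types means changing one setting rather than editing the invocation.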
…LifeSpan/PreFreeSurferPipelineBatch.LifeSpan.sh
Also, I think it may be beneficial to reference JSON sidecars, especially if that is the definitive source of those values for some datasets. I don't follow why @mharms would want that removed. At some point, some effort should probably be put into having both the YA numbers and the potential JSON sidecar source of each applicable setting in the comments documenting them (it seems like the two versions of the example script deserve to be identical except for the settings). But, this is probably a topic for some other time. |
I've reverted the I am about to push this code over to the course machine so I can test it there in the context of the first practical session. |
As for "--Session" and "--session", etc, I am not a fan of supporting mixed capitalization of option flags (I would go all lower case). Other than compatibility concerns, I think one spelling/capitalization of each different word choice of the option is probably the right balance (so "--subject" and "--session" are okay, but adding "--Session" or "--session-id" looks too silly to me). Make the user always use lowercase, and they will stop making that mistake soon enough - support capitals in some scripts, and they will keep trying capitals in others. |
The change to just supporting the lowercase version (and only one version) of the command line options has been made in the |
I make a new copy of the ".bat" or "batch" script for each study and change the variables to be appropriate for that study. These scripts are indeed only supposed to be examples for the users to copy and modify appropriately based on the documentation contained within them. It would be nice to read some of the more esoteric arguments out of a JSON sidecar, as an alternative to having to look them up and specify them. |
Regarding the JSONs: I'm ok with us referring to them in some fashion, but we need to keep in mind that the generation of the sidecar JSONs (from

Also, I'd very much like to keep all the comments regarding the parameters in sync between the two versions. That means that the HCP-YA version should incorporate the same comments regarding where to find the values in the sidecar JSONs if they exist (they don't exist in our HCP-YA "unproc" packages, but if we re-converted that data to NIFTI with a modern version of

Also, I very much support @coalsont's point that in terms of building the command to run, it needs to be flexible enough to work if users modify the parameter values (e.g., change to a different type of field map correction). Other than the two different example scripts having a different set of defaults, users that have their own customized protocol and need to modify the parameter values should be able to start from either example script and get to where they need to be. That is, the HCP-YA version needs to continue to support using SE field maps, and the Lifespan version needs to continue to support using gradient-echo field maps, and in neither case should users ever have to muck with changing the arguments provided to the actual invocation of the pipeline.

Last, if we've settled on an approach, I was hoping that we could get this done for all 3 structural-related example scripts, and both fMRIVolume and fMRISurface (i.e., all the scripts for which we currently know what the variables will look like for the Lifespan data). |
For jsons perhaps there could be optional arguments to specify them? |
That would be a nice feature, but would be a non-trivial addition to the example scripts. Plus, that sort of automated parsing of the jsons and retrieval of parameters will be something that the containerized version handles. |
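For a sense of what even the simplest version involves, here is a rough sketch of pulling one numeric value out of a sidecar with `sed`. It assumes a flat JSON file with one key per line; the helper `json_number` is hypothetical, and the key names `EffectiveEchoSpacing` and `PhaseEncodingDirection` are assumed examples rather than fields confirmed in this thread. It is a text match, not a JSON parser, which is part of why a robust version is non-trivial:

```shell
#!/bin/bash
# Hypothetical helper: print the first numeric value stored under a key
# in a flat JSON file. This is a text match, not a real JSON parser.
json_number() {
    local key="$1" file="$2"
    sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*\([-+0-9.eE]\{1,\}\).*/\1/p' "$file" | head -n 1
}

# Build a small throwaway sidecar so the example runs on its own.
# The key names are assumed examples.
sidecar="$(mktemp)"
cat > "$sidecar" <<'EOF'
{
  "Manufacturer": "Example",
  "EffectiveEchoSpacing": 0.00058,
  "PhaseEncodingDirection": "j-"
}
EOF

EchoSpacing="$(json_number EffectiveEchoSpacing "$sidecar")"
echo "EchoSpacing=${EchoSpacing}"
rm -f "$sidecar"
```

Nested objects, string values, unit conversions, and missing keys would all need extra handling beyond this.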
…SurferPipelineBatch.LifeSpan.sh
I conditionally added the I also changed the comments referring to the JSON files along the lines suggested by @mharms. We've already written Python code to extract values from the JSON files, but adding use of it to these examples would seem to be unnecessary given that this is all being transitioned into the QuNex container. Writing similar example scripts for the rest of the structural preprocessing and functional preprocessing etc. will have to wait until other things are done for the course. This arrangement of the example batch scripts (having separate examples on a per-project basis) calls into question whether there is any value/need for supporting any command line options in these example batch scripts at all. |
I often run locally when editing and testing the pipelines, though I have typically edited the default rather than using the --runlocal option. My new options code doesn't currently accept repeatable options, like --subject in this script. To be supported by my current functions, that would need to be turned into a single option taking a delimiter-separated list, which would also be shorter than repeating the option. It also doesn't accept parameters without an =, like --runlocal in this script. That could be made into --runlocal=TRUE, which would also allow an edited default of true to be overridden ("--help" is an exception that is currently hardcoded into my parsing function). Thoughts on whether I should add support for these cases? |
Usually subjects are delimited by spaces in the Batch file. |
If you use quotes, you can do that on the command line, too. Or, the option can set a different variable and you can test that for nonempty. |
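A small sketch of how a single list-valued option and an explicit `--runlocal=TRUE` could be parsed, assuming the list may be separated by commas or (when quoted) spaces. The variable names are illustrative, and `read -r -a` is used instead of `readarray` so that it also works on older bash versions:

```shell
#!/bin/bash
RunLocal="FALSE"   # editable default; --runlocal=... overrides it either way
Subjects=()

parse_args() {
    local arg list
    for arg in "$@"; do
        case "$arg" in
            --subject=*)
                list="${arg#*=}"
                # Split on commas or spaces into an array.
                IFS=', ' read -r -a Subjects <<< "$list"
                ;;
            --runlocal=*)
                RunLocal="${arg#*=}"
                ;;
        esac
    done
}

parse_args --subject="100307,100408 100610" --runlocal=TRUE
echo "${#Subjects[@]} subjects, RunLocal=$RunLocal"
```

Quoting the value on the command line keeps a space-delimited list together as one argument, matching how subjects are usually written in the batch file.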
The |
I expect once the course is done and things have calmed down, and the command-building code is fully generic, we can copy the missing comments into the generic version, replace the second version with a copy of the first, and change its settings to match lifespan. From then on, it would be two copies of the same file, with only the settings changed. |
I expect that if I had started this set of changes along the lines of copying the original example to create a version for LifeSpan, then it would have ended up just as @coalsont describes. The original goals I had were to try to somehow keep things all in one example file, have that example file work for HCP-YA data by default, have a very minimal set of code changes for course participants (or others) to make to allow it to work on LifeSpan data (I could not have them changing each and every parameter setting), have it done by the time I had to submit my materials for the course, and have it done in time to be on the course master machine before cloning. The modifications to what used to be the Given that we are now going along the path of having a separate example for each project, such things as:
all make perfect sense. I might be able to get some of that done next week, but that will probably not be in time for inclusion on the course machine and may require some minor modifications to the materials for the first practical that just cannot be made at this point. |
At this point, I don't really care much about what ends up on the course machine as long as it works, but hope that we can get this all tidied up before you start your new job. |
@coalsont : Should we close this PR? |
This lifespan example looks slightly more capable than the current example, in that you can change between fieldmap types without editing the command section. But, it would be the only lifespan example. |
Actually, the current one seems to handle it by passing in NONE for the unused options (while the lifespan actually avoids specifying the flags), so maybe it doesn't matter in that respect (though I have never liked the NONE convention). |
Changes to Examples/Scripts/PreFreeSurferPipelineBatch.sh to support processing LifeSpan data while leaving HCP-YA data as the default.