Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add module parameters #51

Merged
merged 33 commits into from
Dec 20, 2024
Merged

add module parameters #51

merged 33 commits into from
Dec 20, 2024

Conversation

alxndrdiaz
Copy link
Member

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/createtaxdb branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@alxndrdiaz alxndrdiaz self-assigned this Oct 15, 2024
@alxndrdiaz alxndrdiaz linked an issue Oct 15, 2024 that may be closed by this pull request
@alxndrdiaz
Copy link
Member Author

For params.malt_sequencetype and malt_build_params decide if they should be used separately or joined in a single set of parameters (all parameters specified using malt_build_params and explaining later in the documentation).

@alxndrdiaz
Copy link
Member Author

Not sure why the following tests are failing:

  1. nf-core CI / nf-test (24.04.2, docker, test, false) (pull_request)
  2. nf-core CI / nf-test (latest-everything, docker, test, false) (pull_request)

@alxndrdiaz alxndrdiaz requested a review from jfy133 October 15, 2024 18:09
@jfy133
Copy link
Member

jfy133 commented Oct 15, 2024

Not sure why the following tests are failing:

1. nf-core CI / nf-test (24.04.2, docker, test, false) (pull_request)

2. nf-core CI / nf-test (latest-everything, docker, test, false) (pull_request)

firstly woohoo thanks for getting t this! I'm a bit swamped with the meta-omics hackathon, but looking closely at the nf-test runs on GHA:

ERROR ~ Error executing process > 'NFCORE_CREATETAXDB:CREATETAXDB:KRAKENUNIQ_BUILD (database)'
  
  Caused by:
    Process `NFCORE_CREATETAXDB:CREATETAXDB:KRAKENUNIQ_BUILD (database)` terminated with an error exit status (64)
  
  
  Command executed:
  
    mkdir database && mv library taxonomy nucl2tax.map database
    
    krakenuniq-build \
        null \
        --threads 4 \
        --db database
    
    cat <<-END_VERSIONS > versions.yml
    "NFCORE_CREATETAXDB:CREATETAXDB:KRAKENUNIQ_BUILD":
        krakenuniq: $(echo $(krakenuniq --version 2>&1) | sed 's/^.*KrakenUniq version //; s/ .*$//')
    END_VERSIONS
  
  Command exit status:
    64
  
  Command output:
    (empty)
  
  Command error:
                                 from NCBI. Specify --taxids-for-genomes and --taxids-for-sequences
                                 separately, if desired.
    
      --help                     Print this message
      --version                  Print version information
    
    Options:
      --db DBDIR                 Kraken DB directory (mandatory except for --help/--version)
      --threads #                Number of threads (def: 1)
      --new-db NAME              New Kraken DB name (shrink task only; mandatory
                                 for shrink task)
      --kmer-len NUM             K-mer length in bp (build/shrink tasks only;
                                 def: 31)
      --minimizer-len NUM        Minimizer length in bp (build/shrink tasks only;
                                 def: 15)
      --jellyfish-hash-size STR  Pass a specific hash size argument to jellyfish
                                 when building database (build task only)
      --jellyfish-bin STR        Use STR as Jellyfish 1 binary.
      --max-db-size SIZE         Shrink the DB before full build, making sure
                                 database and index together use <= SIZE gigabytes
                                 (build task only)
      --shrink-block-offset NUM  When shrinking, select the k-mer that is NUM
                                 positions from the end of a block of k-mers
                                 (default: 1)
      --work-on-disk             Perform most operations on disk rather than in
                                 RAM (will slow down build in most cases)
      --taxids-for-genomes       Add taxonomy IDs (starting with 1 billion) for genomes.
                                 Only works with 3-column seqid2taxid map with third 
                                 column being the name
      --taxids-for-sequences     Add taxonomy IDs for sequences, starting with 1 billion.
                                 Can be useful to resolve classifications with multiple genomes
                                 for one taxonomy ID.
      --min-contig-size NUM      Minimum contig size for inclusion in database.
                                 Use with draft genomes to reduce contamination, e.g. with values between 1000 and 10000.
      --library-dir DIR          Use DIR for reference sequences instead of DBDIR/library.
      --taxonomy-dir DIR         Use DIR for taxonomy instead of DBDIR/taxonomy.
    
    Experimental:
      --uid-database             Build a UID database (default no)
      --lca-database             Build a LCA database (default yes)
      --no-lca-database          Do not build a LCA database
      --lca-order DIR1           Impose a hierarchical order for setting LCAs.
      --lca-order DIR2           The directories must be specified relative to the libary directory
      ...                        (DBDIR/library). When setting the LCAs, k-mers from sequences in
                                 DIR1 will be set first, and only unset k-mers will be set from
                                 DIR2, etc, and final from the whole library.
    							 Use this option when including low-confidence draft genomes,
                                 e.g use --lca-order Complete_Genome --lca-order Chromosome to
                                 prioritize more complete assemblies.
                                 Keep in mind that this option takes considerably longer.
  
  Work dir:
    /home/runner/work/createtaxdb/createtaxdb/.nf-test/tests/aa06ef15b1d2bac6ab77b23519901408/work/a2/bd01a6d087be11da6551b92dbec428
  
  Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
  
   -- Check '/home/runner/work/createtaxdb/createtaxdb/.nf-test/tests/aa06ef15b1d2bac6ab77b23519901408/meta/nextflow.log' file for details
  ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
  
   -- Check '/home/runner/work/createtaxdb/createtaxdb/.nf-test/tests/aa06ef15b1d2bac6ab77b23519901408/meta/nextflow.log' file for details
  Nextflow stderr:
  
  Nextflow 24.04.4 is available - Please consider updating your version to it

I gess something might be wrong with the Krakenuniq args insertion? I've not looked at your changes yet though!

@alxndrdiaz
Copy link
Member Author

It seems that defining the parameters as I did:

   withName: KRAKENUNIQ_BUILD {
        ext.args = { "${params.krakenuniq_build_params}" }
    }

is generating the error, also the null value should not be part of the command executed by each module

krakenuniq-build \
        null \
        --threads 4 \
        --db database

@jfy133
Copy link
Member

jfy133 commented Oct 16, 2024

It seems that defining the parameters as I did:

   withName: KRAKENUNIQ_BUILD {
        ext.args = { "${params.krakenuniq_build_params}" }
    }

is generating the error, also the null value should not be part of the command executed by each module

krakenuniq-build \
        null \
        --threads 4 \
        --db database

Maybe try:

        ext.args = { params.krakenuniq_build_params ? : "${params.krakenuniq_build_params}" : '' }

Or something to insert only if the parameter is actually filled with something

@alxndrdiaz
Copy link
Member Author

alxndrdiaz commented Oct 16, 2024

Using an empty string '' instead of null solves the bug for all modules, except for malt and krakenuniq, in nextflow.config file:

    bracken_build_params     = ''
    kraken2_build_params     = ''
    centrifuge_build_params  = ''
    diamond_build_params     = ''
    kaiju_build_params       = ''

In the case of malt it might work but we still need to decide if params.malt_sequencetype and malt_build_params should be used separately or joined in a single set of parameters (all parameters specified using malt_build_params).

@jfy133
Copy link
Member

jfy133 commented Oct 22, 2024

So for the MALT issue, we can maybe have an input validation check (see the nfcore_utils_createtaxdb<...> file or w/e it's called) something like parsing the --malt_build_params string to detect for the strings we are already defining. Something like:

def malt_build_params = '-m hello --fast'

malt_build_params.tokenize(' ').contains('--fast')

And if it does contain the flag (probably short and long flags, if they ahve them), then give an errror

@alxndrdiaz
Copy link
Member Author

For krakenuniq, the error message seems to be related to a dependency: " db-build error: JELLYFISH_BIN: unbound variable #136 "

@jfy133
Copy link
Member

jfy133 commented Nov 6, 2024

Ahhh, ok I know that error.

I need to double check where I've fixed that error before (either in the classification module, or in a modules conf in taxprofiler), but probably need to copy the same fix to the build module

@jfy133
Copy link
Member

jfy133 commented Nov 28, 2024

Sorry for the delay @alxndrdiaz , travelling and then sick kid since :(

Looking into this now!

@jfy133
Copy link
Member

jfy133 commented Nov 28, 2024

Although @alxndrdiaz it looks like tests are passing now?

But if you stil have the same issue, see the bottom of conf/test.config :)

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there, I think just missing krakenuniq at the moment!

nextflow.config Outdated
@@ -62,11 +62,17 @@ params {
build_diamond = false
build_kaiju = false
build_malt = false
malt_sequencetype = "DNA"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't use this anymore, we should remove this from the nextflow_schema.json file, and references to it elsehwere, (e.g. test profiles etc)

Currently we get the following warning:

WARN: Access to undefined parameter `malt_sequencetype` -- Initialise it to a default value eg. `params.malt_sequencetype = some_value`

@jfy133
Copy link
Member

jfy133 commented Nov 28, 2024

@alxndrdiaz I also just merged in ganon-biuld so will need an entry for this too now, but looks like it should be quite straight forward :)

@jfy133
Copy link
Member

jfy133 commented Dec 11, 2024

@alxndrdiaz if kids don't get sick in the next 12 hours I can resolve the conflicts here, and I may go ahead and add ganon on your behalf tomorrow morning (unless you do it in the meantime)

@alxndrdiaz
Copy link
Member Author

alxndrdiaz commented Dec 11, 2024

@alxndrdiaz if kids don't get sick in the next 12 hours I can resolve the conflicts here, and I may go ahead and add ganon on your behalf tomorrow morning (unless you do it in the meantime)

@jfy133 I checked the files that could be related to the malt warning but didn't find any reference to the parameter. I have time to check that again and do the ganon update.

@nf-core-bot
Copy link
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once template is in, we can merge that into here update tests if necessary and the nwe are good to go!

@jfy133
Copy link
Member

jfy133 commented Dec 19, 2024

@nf-core-bot fix linting

conf/modules.config Outdated Show resolved Hide resolved
conf/modules.config Outdated Show resolved Hide resolved
@jfy133
Copy link
Member

jfy133 commented Dec 20, 2024

OK got the test working and updated your affiliation @alxndrdiaz ! I will review as a last check and then can merge in :)

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohooo! Thanks @alxndrdiaz !!

We can chat on slack what to do next :)

@jfy133 jfy133 merged commit 5b5fcda into nf-core:dev Dec 20, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide way to provide all optional parameter to each module
4 participants