Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade simpleaf modules to 0.18.4 #7335

Merged
merged 47 commits into from
Jan 24, 2025
Merged

Upgrade simpleaf modules to 0.18.4 #7335

merged 47 commits into from
Jan 24, 2025

Conversation

an-altosian
Copy link
Contributor

PR checklist

See discussion at nf-core/scrnaseq#361 (comment). This pull request is the second attempt of updating simpleaf modules after #6319. It's still Dongze here but using my company GitHub account.

In brief, I am updating simpleaf in scrnaseq and end up being here updating the central module. Following are what I did:

  1. I updated the conda and docker image versions of simpleaf.
  2. I added some changes to reflect the latest changes in simpleaf, mainly switching the default mapper from salmon to piscem, which entirely changes the layout of the index directory generated by simpleaf index.
  3. I added a new input option to simpleaf, which is a directory containing existing mapping results.
  4. Because of the stochasticity of the orders of the content of files generated by cuttlefish, a piscem dependency, I have to remove all md5sum tests to pass the github workflow. I have tried very hard to adjust things here and there, but eventually failed because of the internal stochasticity of cuttlefish. If any of you can help or want to discuss the potential solution, I am happy to talk.
  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@DongzeHE DongzeHE requested a review from apeltzer January 20, 2025 16:11
@an-altosian an-altosian changed the title Upgrade simpleaf modules to 0.17.2 Upgrade simpleaf modules to 0.18.5 Jan 20, 2025
@an-altosian an-altosian changed the title Upgrade simpleaf modules to 0.18.5 Upgrade simpleaf modules to 0.18.4 Jan 20, 2025
@grst
Copy link
Member

grst commented Jan 21, 2025

Some tests are still commented out, other than that it looks good to me

@an-altosian
Copy link
Contributor Author

Hi @grst , because of the internal stochasticity caused by parallelization in cuttlefish, I could not figure out how to make the newest simpleaf pass the md5sum tests. If any of you can help, I am happy to discuss potential solutions.

Another caveat is, because piscem, the default indexer & mapper in the latest simpleaf, reads and writes a large number of intermediate files, the simpleaf/index module will run extremely slow if the CPU and storage are not physically connected, like in AWS. In this case, do you think I should add the derivative scratch true at the beginning of SIMPLEAF_INDEX module?

@grst
Copy link
Member

grst commented Jan 22, 2025

Hi @grst , because of the internal stochasticity caused by parallelization in cuttlefish, I could not figure out how to make the newest simpleaf pass the md5sum tests. If any of you can help, I am happy to discuss potential solutions.

It's ok to just check for file existence in such a case.

Another caveat is, because piscem, the default indexer & mapper in the latest simpleaf, reads and writes a large number of intermediate files, the simpleaf/index module will run extremely slow if the CPU and storage are not physically connected, like in AWS. In this case, do you think I should add the derivative scratch true at the beginning of SIMPLEAF_INDEX module?

No, I don't think this should be done at the module level (probably not even on the pipeline level) because it really depends on your architecture if this makes sense. I think it would be something for institutional profiles. For instance, even on AWS if you are using Seqera Fusion, scratch would be detrimental because everything is anyway cached on a local SSD. Or you might have a scratch on your HPC, but it is too small etc.

@DongzeHE
Copy link
Member

DongzeHE commented Jan 23, 2025

Hi @grst ,

I have added assertions for the existence of files and tested locally. This PR is 100% ready from my side. However, some github actions failed. The failed tests all show error like the following. I don't think it is because of the modules.

The self-hosted runner: ubuntu-2204-x64_i-06d6b553d497f449b lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

What should we do with this situation?

@grst
Copy link
Member

grst commented Jan 23, 2025

yeah, maybe retry in a day or two. if that doesn't help, ping @nf-core/a-team

@DongzeHE DongzeHE added this pull request to the merge queue Jan 24, 2025
Merged via the queue into nf-core:master with commit 5b33c12 Jan 24, 2025
31 checks passed
@DongzeHE
Copy link
Member

@grst I reran the failed tasks and they succeeded. Thank you very much for your patient!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants