Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CADD scripts with bundled required environments #546

Merged
merged 4 commits into from
Nov 21, 2023

Conversation

fa2k
Copy link
Contributor

@fa2k fa2k commented Nov 17, 2023

Submitting a Container

Attempt to address #528.

Warning: I'm creating a container with the same name as the existing cadd-scripts from bioconda (https://biocontainers.pro/tools/cadd-scripts). The hope that I can replace it with an improved container image. I hope this is fine, or we could maybe rename this to cadd-scripts-bundle or some other name. See the issue and first checklist item for the rationale.

Check BioContainers' Dockerfile specifications

Checklist

  1. Misc
  • My tool doesn't exist in BioConda
    • cadd-scripts exists in BioConda, but has problems functioning as a container image. (1) it requires the conda command, which is not included in the current quay.io/biocontainers/cadd-scripts:1.6--hdfd78af_1. (2) It requires write-access to the container filesystem when executed.
  • The image can be built
  1. Metadata
  • LABEL base_image
  • LABEL version
  • LABEL software.version
  • LABEL about.summary
  • LABEL about.home
  • LABEL about.license
  • MAINTAINER
  1. Extra (optionals)
  • I have written tests in test-cmds.txt
  • LABEL extra.identifier
  • LABEL about.documentation
  • LABEL about.license_file
  • LABEL about.tags

@fa2k
Copy link
Contributor Author

fa2k commented Nov 17, 2023

One question may be: Why not "fix" the bioconda package upstream instead, and let the container images be generated from the bioconda package? My reason is that the existing cadd-scripts works great as a bioconda package, because the conda command (required by the snakemake pipeline in the CADD scripts) is available. I think that bundling another conda command with the bioconda package could cause some havoc.

Edit to add: I'm not affiliated with CADD in any way, just a user.


ARG CADD_VERSION=1.6

MAINTAINER "biocontainers" <biodocker@gmail.com>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably set your own email here, since it's a custom image.

cadd-scripts/1.6/Dockerfile Outdated Show resolved Hide resolved
cadd-scripts/1.6/Dockerfile Outdated Show resolved Hide resolved
@mboudet
Copy link
Contributor

mboudet commented Nov 21, 2023

Seems ok, to me, but there should be a name difference between the official 'cadd-script' and this image, to avoid confusion I think.

(This means changing the LABEL software=, and the directory name to match)

@fa2k
Copy link
Contributor Author

fa2k commented Nov 21, 2023

Thanks for the review! I have another question: Is there a place I can add some usage documenation for the image itself?

The user has to mount their CADD reference data into /opt/conda/share/cadd-scripts-1.6-1/data/annotations to use it (and that's not really obvious).

@mboudet
Copy link
Contributor

mboudet commented Nov 21, 2023

Hmm, you should be able to add a README.md in the 'cadd-scripts-with-envs' folder, but I'm not sure if this will be enough visibility-wise?

@biocontainers-bot
Copy link
Collaborator

about.license field is not in spdx list: https://spdx.org/licenses/, if it is a typo error, please fix it. If this is not a standard license, please specify Custom License and use about.license_file label to specify license location (in container or url).

@biocontainers-bot
Copy link
Collaborator

No biotools label defined, please check if tool is not already defined in biotools (https://bio.tools) and add extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment.

@biocontainers-bot
Copy link
Collaborator

No test-cmds.txt (test file) present, skipping tests

about.summary="CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletions variants in the human genome" \
about.home="https://cadd.gs.washington.edu/" \
about.documentation="https://github.com/kircherlab/CADD-scripts" \
about.license="SPDX:Custom" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
about.license="SPDX:Custom" \
about.license="Custom License" \

@fa2k
Copy link
Contributor Author

fa2k commented Nov 21, 2023

Thanks for the suggestions about README and micromamba clean! I've now fixed the issues, but it doesn't build on ARM64 arch. I'm troubleshooting, so it can wait a bit.

@fa2k
Copy link
Contributor Author

fa2k commented Nov 21, 2023

I'm troubleshooting, so it can wait a bit.

I could get tabix installed, but mamba is unable to solve CADD's environment in the arm64 arch - so I think this container will only be supported on x86_64.

@mboudet
Copy link
Contributor

mboudet commented Nov 21, 2023

It should be fine to only build x86_64. I believe as a whole, biocontainers does not support arm64 build anyway (for now)
(#425)

@biocontainers-bot
Copy link
Collaborator

No biotools label defined, please check if tool is not already defined in biotools (https://bio.tools) and add extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment.

@biocontainers-bot
Copy link
Collaborator

No test-cmds.txt (test file) present, skipping tests

@mboudet
Copy link
Contributor

mboudet commented Nov 21, 2023

Should be fine to merge on my end.

@mboudet mboudet merged commit dfb7189 into BioContainers:master Nov 21, 2023
1 check passed
@fa2k fa2k mentioned this pull request Nov 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants