Merge pull request #31 from samhorsfield96/docker
Add docker to CI
johnlees authored May 31, 2024
2 parents ab252d6 + 5778152 commit ebcdd26
Showing 9 changed files with 127 additions and 28 deletions.
47 changes: 47 additions & 0 deletions .github/workflows/docker_push.yml
@@ -0,0 +1,47 @@
name: Build and push Docker image

on:
  push:
    branches:
      - 'master'
      - 'docker'
    tags:
      - 'v*'
  pull_request:
    branches:
      - 'master'
  create:
    tags:
      - v*

jobs:
  docker-upload:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: samhorsfield96/ggcaller
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      - name: Login to DockerHub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_REGISTRY_USERNAME }}
          password: ${{ secrets.DOCKER_REGISTRY_PASSWORD }}
      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v3
        with:
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          file: docker/Dockerfile
          provenance: false
      - name: Image digest
        run: echo ${{ steps.docker_build.outputs.digest }}
26 changes: 21 additions & 5 deletions docker/Dockerfile
@@ -4,11 +4,27 @@ USER root

# create a project directory inside user home
ARG MAMBA_DOCKERFILE_ACTIVATE=1
COPY . /src
WORKDIR /src

# create a project directory inside user home
# (this isn't used with a clone running snakemake)
ENV PROJECT_DIR $HOME/app
RUN mkdir $PROJECT_DIR
# copy the code in
COPY . $PROJECT_DIR
WORKDIR $PROJECT_DIR

# build conda env
ENV ENV_PREFIX $PROJECT_DIR/env
COPY --chown=$user:$user docker/environment_docker.yml /tmp/environment_docker.yml

COPY --chown=$user:$user docker/entrypoint.sh /usr/local/bin/
RUN chmod u+x /usr/local/bin/entrypoint.sh

RUN micromamba install -y -n base -f /tmp/environment_docker.yml && \
micromamba clean --all --yes && python -m pip install --no-deps --ignore-installed . \
&& PATH=$PATH:/opt/conda/bin
WORKDIR /workdir
micromamba clean --all --yes && \
python -m pip install --no-deps --ignore-installed . && \
mkdir ggc_db && \
ggcaller --balrog-db ggc_db && \
PATH=$PATH:/opt/conda/bin

ENTRYPOINT [ "/usr/local/bin/entrypoint.sh" ]
4 changes: 4 additions & 0 deletions docker/entrypoint.sh
@@ -0,0 +1,4 @@
#!/bin/bash --login
set -e

exec "$@"
8 changes: 5 additions & 3 deletions docs/installation.rst
@@ -17,11 +17,11 @@ First, install `Docker <https://docs.docker.com/get-docker/>`_ for your OS. If r

To use the latest image, run::

docker pull samhorsfield96/ggcaller:latest
docker pull samhorsfield96/ggcaller:master

To run ggCaller from the Docker Hub image, run::

cd test && docker run --rm -it -v $(pwd):/workdir samhorsfield96/ggcaller:latest ggcaller --refs pneumo_CL_group2.txt
cd test && docker run --rm -it -v $(pwd):/workdir -v $(pwd):/data samhorsfield96/ggcaller:master ggcaller --balrog-db /app/ggc_db --refs /workdir/pneumo_CL_group2_docker.txt --out /workdir/ggc_out

You can also build the image yourself. First download and switch to the ggCaller repository::

@@ -33,7 +33,9 @@ Finally, build with Docker. This should take between 5-10 minutes to fully insta

To run ggCaller from a local Docker build, run::

cd test && docker run --rm -it -v $(pwd):/workdir ggc_env:latest ggcaller --refs pneumo_CL_group2.txt
cd test && docker run --rm -it -v $(pwd):/workdir -v $(pwd):/data ggc_env:latest ggcaller --balrog-db /app/ggc_db --refs /workdir/pneumo_CL_group2_docker.txt --out /workdir/ggc_out

Please ensure you keep ``--balrog-db /app/ggc_db`` and ``/workdir`` paths as specified above.

Installing with singularity
-----------------------------------
12 changes: 8 additions & 4 deletions docs/quickstart.rst
@@ -15,13 +15,17 @@ The easiest way to get up and running is using Docker. To get up and running, pu
Preparing the data
------------------

Place all of you samples to analyse in the same directory. Then navigate inside and run::
Place all of your samples to be analysed in the same directory. Then navigate inside and run::

ls -d -1 $PWD/*.fasta > input.txt

If using Docker, instead navigate to the directory containing the fasta files and run the below command, to ensure file paths are relative (the docker version will not work with absolute paths)::

ls -d -1 *.fasta > input.txt
ls -d -1 *.fasta > input_docker.txt

Then, append the prefix ``/data/`` to each line to enable ggCaller to find the files::

sed -i -e 's|^|/data/|' input_docker.txt
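
The two shell steps above (list the files by relative name, then prefix each line with ``/data/``) can also be sketched in Python. This is a minimal sketch and not part of ggCaller: the function name ``write_docker_input`` and its parameters are hypothetical, and it assumes the directory will be mounted at ``/data`` inside the container.

```python
from pathlib import Path


def write_docker_input(fasta_dir, out_file="input_docker.txt",
                       pattern="*.fasta", mount_point="/data"):
    """Hypothetical helper: list files matching `pattern` in `fasta_dir` and
    write them, one per line, as the paths they will have in the container."""
    # Keep only the file names, so the paths do not depend on the host layout.
    names = sorted(p.name for p in Path(fasta_dir).glob(pattern))
    lines = [f"{mount_point}/{name}" for name in names]
    # Write the list next to the input files, one container path per line.
    out_path = Path(fasta_dir) / out_file
    out_path.write_text("".join(line + "\n" for line in lines))
    return lines
```

Passing ``pattern="*.fa"`` would cover files named like those in ``test/pneumo_CL_group2_docker.txt``.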

Running ggCaller
------------------
@@ -39,9 +43,9 @@ To run ggCaller with just reads::

ggcaller --reads input.txt --out output_path

If using Docker, run with the below command. You must ensure all paths are relative, including in ``input.txt``::
If using Docker, run the command below, keeping the ``--balrog-db /app/ggc_db`` and ``/workdir`` paths exactly as shown. Replace ``<path to files>`` with the absolute path to the directory containing the files listed in ``input_docker.txt``::

docker run --rm -it -v $(pwd):/workdir samhorsfield96/ggcaller:latest ggcaller --refs input.txt --out output_path
docker run --rm -it -v $(pwd):/workdir -v <path to files>:/data samhorsfield96/ggcaller:master ggcaller --balrog-db /app/ggc_db --refs /workdir/input_docker.txt --out /workdir/output_path
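
To make the two volume mounts explicit, the command above can be assembled programmatically. The sketch below only builds the argument list and does not run Docker; ``build_docker_command`` and its parameter names are hypothetical, while the image name, ``--balrog-db /app/ggc_db`` and the ``/workdir`` and ``/data`` mount points are taken from the command above.

```python
import os
import shlex


def build_docker_command(work_dir, data_dir, input_name="input_docker.txt",
                         out_name="output_path",
                         image="samhorsfield96/ggcaller:master"):
    """Hypothetical helper: assemble the docker run invocation as a list."""
    # Docker bind mounts need absolute host paths.
    work_dir = os.path.abspath(work_dir)
    data_dir = os.path.abspath(data_dir)
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{work_dir}:/workdir",   # input list and output directory
        "-v", f"{data_dir}:/data",      # sequence files named in the input list
        image,
        "ggcaller",
        "--balrog-db", "/app/ggc_db",   # database baked into the image
        "--refs", f"/workdir/{input_name}",
        "--out", f"/workdir/{out_name}",
    ]


if __name__ == "__main__":
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(build_docker_command(".", "/path/to/files")))
```

The list form can be handed to ``subprocess.run`` unchanged, which avoids shell quoting problems when paths contain spaces.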

.. important::
We haven't extensively tested calling genes within
2 changes: 1 addition & 1 deletion ggCaller/__init__.py
@@ -2,4 +2,4 @@

'''ggCaller: a gene caller for Bifrost graphs'''

__version__ = '1.3.4'
__version__ = '1.3.5'
16 changes: 13 additions & 3 deletions ggCaller/__main__.py
@@ -326,6 +326,12 @@ def get_options():

# Other options
Misc = parser.add_argument_group('Misc. options')
Misc.add_argument("--balrog-db",
default=None,
dest="balrog_db",
help="Path to save BALROG and default annotation databases. If not specified will download "
"automatically on first run. "
"[Default = None]")
Misc.add_argument("--quiet",
dest="verbose",
help="suppress additional output"
Expand Down Expand Up @@ -420,20 +426,24 @@ def main():
options.reads is not None) and (options.query is None):
graph_tuple = graph.build(options.refs, options.kmer, stop_codons_for, stop_codons_rev, start_codons_for,
start_codons_rev, options.threads, False, options.no_write_graph, options.reads, ref_set)
elif options.balrog_db is not None:
db_dir = download_db(options.balrog_db)
sys.exit(0)
else:
print("Error: incorrect number of input files specified. Please only specify the below combinations:\n"
"- Bifrost GFA and Bifrost colours file (with/without list of reference files)\n"
"- Bifrost GFA, Bifrost colours file and list of query sequences\n"
"- List of reference files\n"
"- List of read files\n"
"- A list of reference files and a list of read files.")
"- A list of reference files and a list of read files.\n"
"- A path to download the balrog gene model files.")
sys.exit(1)

# unpack ORF pair into overlap dictionary and list for gene scoring
input_colours, nb_colours, overlap, ref_list = graph_tuple

# download balrog and annotation files
db_dir = download_db()
db_dir = download_db(options.balrog_db)

# set rest of panaroo arguments
options = set_default_args(options, nb_colours)
Expand Down Expand Up @@ -486,7 +496,7 @@ def main():
# load models if required
if not options.no_filter:
print("Loading gene models...")
ORF_model_file, TIS_model_file = load_balrog_models()
ORF_model_file, TIS_model_file = load_balrog_models(db_dir)

else:
ORF_model_file, TIS_model_file = "NA", "NA"
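
The download-only behaviour added to ``main()`` in this file (``--balrog-db`` given with no input files downloads the databases and exits) can be illustrated with a reduced sketch. It is an assumption-laden simplification: only ``--refs`` stands in for the full set of input combinations, ``fake_download_db`` is a hypothetical stand-in for ``download_db`` that performs no network access, and the return values replace the real ``sys.exit`` calls.

```python
import argparse


def get_options(argv):
    # Reduced parser: the real ggCaller parser has many more options.
    parser = argparse.ArgumentParser(prog="ggcaller-sketch")
    parser.add_argument("--refs", default=None)
    parser.add_argument("--balrog-db", default=None, dest="balrog_db",
                        help="Path to save BALROG and default annotation databases.")
    return parser.parse_args(argv)


def fake_download_db(path):
    """Hypothetical stand-in for download_db: returns the database directory
    that would be used, without downloading anything."""
    return "ggCallerdb" if path is None else f"{path}/ggCallerdb"


def run(argv):
    options = get_options(argv)
    if options.refs is not None:
        # Normal run: inputs were given, so the database is fetched as needed.
        return ("run", fake_download_db(options.balrog_db))
    elif options.balrog_db is not None:
        # No inputs, only a database path: download and stop (exit code 0).
        return ("download-only", fake_download_db(options.balrog_db))
    # Neither inputs nor a database path: the real code prints usage, exits 1.
    return ("error", None)
```

This is the branch the Dockerfile relies on when it runs ``ggcaller --balrog-db ggc_db`` at build time to bake the databases into the image.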
35 changes: 23 additions & 12 deletions models/__main__.py
@@ -4,28 +4,39 @@

""" Get directories for model and seengenes """
module_dir = os.path.dirname(os.path.realpath(__file__))
zipped_db_dir = db_dir = os.path.join(module_dir, "ggCallerdb.tar.bz2")
db_dir = os.path.join(module_dir, "ggCallerdb")
balrog_model_dir = os.path.join(db_dir, "balrog_models")
module_zipped_db_dir = os.path.join(module_dir, "ggCallerdb.tar.bz2")
module_db_dir = os.path.join(module_dir, "ggCallerdb")
module_balrog_model_dir = os.path.join(module_db_dir, "balrog_models")

def download_db():
if not os.path.exists(zipped_db_dir):
def download_db(download_db=None):
if download_db is None:
zipped_db_path = module_zipped_db_dir
db_path = module_db_dir
output_dir = module_dir
else:
zipped_db_path = os.path.join(download_db, "ggCallerdb.tar.bz2")
db_path = os.path.join(download_db, "ggCallerdb")
output_dir = download_db

if not os.path.exists(zipped_db_path):
print("Downloading databases...")
url = "https://ftp.ebi.ac.uk/pub/databases/pp_dbs/ggCallerdb.tar.bz2"
filename = wget.download(url, out=module_dir)
filename = wget.download(url, out=output_dir)
print("")
if not os.path.exists(db_dir):
tar = tarfile.open(db_dir + ".tar.bz2", mode="r:bz2")
tar.extractall(module_dir)
if not os.path.exists(db_path):
tar = tarfile.open(zipped_db_path, mode="r:bz2")
tar.extractall(output_dir)
tar.close()

return db_dir
return db_path

def load_balrog_models():
def load_balrog_models(db_path):
balrog_model_dir = os.path.join(db_path, "balrog_models")

# check if directory exists. If not, unzip file
if not os.path.exists(balrog_model_dir):
tar = tarfile.open(balrog_model_dir + ".tar.gz", mode="r:gz")
tar.extractall(db_dir)
tar.extractall(db_path)
tar.close()

geneTCN = os.path.join(balrog_model_dir, "geneTCN_jit.pt")
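
The path selection introduced in ``download_db`` and ``load_balrog_models`` can be isolated from the download and extraction steps. The following is a minimal sketch, not the module's code: ``resolve_db_paths`` is a hypothetical name, and ``module_dir`` is passed in as a parameter here, whereas the real module derives it from ``__file__``.

```python
import os


def resolve_db_paths(download_dir=None, module_dir="."):
    """Hypothetical helper mirroring the path selection shown above,
    without downloading or extracting anything."""
    # Fall back to the package directory when no target directory is given.
    output_dir = module_dir if download_dir is None else download_dir
    zipped_db_path = os.path.join(output_dir, "ggCallerdb.tar.bz2")
    db_path = os.path.join(output_dir, "ggCallerdb")
    # load_balrog_models looks for the models inside the extracted database.
    balrog_model_dir = os.path.join(db_path, "balrog_models")
    return zipped_db_path, db_path, balrog_model_dir
```

With ``download_dir="/app/ggc_db"`` on a POSIX system this gives ``/app/ggc_db/ggCallerdb`` as the database directory, which is what the Docker instructions depend on when they pass ``--balrog-db /app/ggc_db``.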
5 changes: 5 additions & 0 deletions test/pneumo_CL_group2_docker.txt
@@ -0,0 +1,5 @@
/data/CR931658_Streptococcus_pneumoniae_strain_559_66_serotype_12a.fa
/data/CR931659_Streptococcus_pneumoniae_strain_Gambia_1_81_serotype_12b.fa
/data/CR931660_Streptococcus_pneumoniae_strain_6312_serotype_12f.fa
/data/CR931717_Streptococcus_pneumoniae_strain_Hammer_serotype_44.fa
/data/CR931719_Streptococcus_pneumoniae_strain_Eddy_nr._73_serotype_46.fa
