Commit

Merge pull request nf-core#153 from mirpedrol/review-comments
add suggestions from code review
mirpedrol authored Jun 20, 2024
2 parents 0ce0ed0 + 358ad55 commit 0c53bdc
Showing 5 changed files with 16 additions and 17 deletions.
1 change: 0 additions & 1 deletion .nf-core.yml
@@ -5,6 +5,5 @@ lint:
- conf/test.config
- conf/test_full.config
files_unchanged:
- lib/NfcoreTemplate.groovy # Introduced a change ahead of the nf-core/tools release
- .github/PULL_REQUEST_TEMPLATE.md
nf_core_version: "2.14.1"
14 changes: 7 additions & 7 deletions docs/usage/screening.md
@@ -48,23 +48,23 @@ MAGeCK count which is the main alignment software used is normally able to autom

### bowtie2

-The MAGeCK count module supports bam files, which allows you to align with bowtie2 first. If you wish to do so (for instance to allow library with mismatches or to set the aligner with specific flags) you can provide a fasta file with `--fasta`. Currently, you also still need to provide the library file.
+The MAGeCK count module supports bam files, which allows you to align with bowtie2 first. If you wish to do so (for instance to allow mapping reads to the library with mismatches or to set the aligner with specific flags) you can provide a fasta file with `--fasta` encoding the library. Currently, you also still need to provide the tab-separated library file with `--library`.

### library

If you are running the pipeline with fastq files and wish to obtain a count table, the library parameter is needed. The library table has three mandatory columns : id, target transcript (or gRNA sequence) and gene symbol.
An [example](https://github.com/nf-core/test-datasets/blob/crisprseq/testdata/brunello_target_sequence.txt) has been provided with the pipeline. Many libraries can be found on [addgene](https://www.addgene.org/).

-After the alignment step, if you are performing KO (Knock-Out) screens, you can choose to correction of gene independent cell responses to CRISPR-cas9 targeting using crisprcleanr. If you are performing a CRISPR interference or activation screen, this step is not needed.
+After the alignment step, if you are performing KO (Knock-Out) screens, you can choose to correct gene-independent cell responses to CRISPR-Cas9 targeting using CRISPRcleanR. If you are performing a CRISPR interference or activation screen, this step is not needed.

-The pipeline currently supports 3 algorithms to detect gene essentiality, MAGeCK rra, MAGeCK mle and BAGEL2. MAGeCK MLE (Maximum Likelihood Estimation) and MAGeCK RRA (Robust Ranking Aggregation) are two different methods provided by the MAGeCK software package to analyze CRISPR-Cas9 screens. BAGEL2 identifies gene essentiality through Bayesian Analysis.
+The pipeline currently supports 3 algorithms to detect gene essentiality, MAGeCK RRA, MAGeCK MLE and BAGEL2. MAGeCK MLE (Maximum Likelihood Estimation) and MAGeCK RRA (Robust Ranking Aggregation) are two different methods provided by the MAGeCK software package to analyze CRISPR-Cas9 screens. BAGEL2 identifies gene essentiality through Bayesian Analysis.
We recommend to run MAGeCK MLE and BAGEL2 as these are the most used and most recent algorithms to determine gene essentiality.

### Running CRISPRcleanR

-CRISPRcleanR is used for gene count normalization and the removal of biases for genomic segments for which copy numbers are amplified. Currently, the pipeline supports annotation libraries already present in the R package or a annotation file the user can provide.
+[CRISPRcleanR](https://github.com/francescojm/CRISPRcleanR) is used for gene count normalization and the removal of biases for genomic segments for which copy numbers are amplified. Currently, the pipeline supports annotation libraries already present in the R package or user-provided annotation files.
Most used library already have an annotation dataset which you can find [here](https://github.com/francescojm/CRISPRcleanR/blob/master/Reference_Manual.pdf). To use CRISPRcleanR normalization, use `--crisprcleanr library`, `library` being the exact name as the library in the CRISPRcleanR documentation (e.g: "AVANA_Library").
-Otherwise, if you wish to provide your own file, please provide it in csv form, and make sure it follows the following format, with the comma in front of "CODE" included :
+Otherwise, if you wish to provide your own file, please provide it in CSV format, and make sure it follows the following format (with the comma in front of "CODE" included):

| ,CODE | GENES | EXONE | CHRM | STRAND | STARTpos | ENDpos |
| -------------------- | ----------- | ------------- | ---- | ------ | -------- | -------- |
@@ -89,7 +89,7 @@ Running MAGeCK MLE and BAGEL2 with a contrast file will also output a Venn diagr

### Running MAGeCK RRA only

-MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK rra, you can define the contrasts as previously stated in the last section (with a `.txt` extension) and also specify `--rra` .
+MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK RRA, you can define the contrasts as previously stated in the last section (with a `.txt` extension) and also specify `--rra`.

### Running MAGeCK MLE only

@@ -112,7 +112,7 @@ BAGEL2 uses the same contrasts from `--contrasts`.

### MAGECKFlute

-The downstream analysis involves distinguishing essential, non-essential, and target-associated genes. Additionally, it encompasses conducting biological functional category analysis and pathway enrichment analysis for these genes. Furthermore, the function provides visualization of genes within pathways, enhancing user exploration of screening data. MAGECKFlute is run automatically after MAGeCK MLE and for each MLE design matrice. If you have used the `--day0_label`, MAGeCKFlute will be ran on all the other conditions. Please note that the DepMap data is used for these plots.
+The downstream analysis involves distinguishing essential, non-essential, and target-associated genes. Additionally, it encompasses conducting biological functional category analysis and pathway enrichment analysis for these genes. Furthermore, it provides visualization of genes within pathways, enhancing user exploration of screening data. MAGECKFlute is run automatically after MAGeCK MLE and for each MLE design matrice. If you have used the `--day0_label`, MAGeCKFlute will be ran on all the other conditions. Please note that the DepMap data is used for these plots.

Note that the pipeline will create the following files in your working directory:

8 changes: 4 additions & 4 deletions main.nf
@@ -44,11 +44,11 @@ include { CRISPRSEQ_SCREENING } from './workflows/crisprseq_screening'
workflow NFCORE_CRISPRSEQ {

take:
-reads_targeted // channel: fastqc files read in from --input
+reads_targeted // channel: fastqc files read in from --input
reads_screening // channel: fastqc files read in from --input
-reference // channel: reference sequence read from --input
-protospacer // channel: protospacer sequence read from --input
-template // channel: template sequence read from --input
+reference // channel: reference sequence read from --input
+protospacer // channel: protospacer sequence read from --input
+template // channel: template sequence read from --input

main:
//
2 changes: 1 addition & 1 deletion nextflow_schema.json
@@ -171,7 +171,7 @@
},
"fasta": {
"type": "string",
-"description": "Fasta file in case you want to map with bowtie2 and then MAGeCK count"
+"description": "Library in fasta file format in case you want to map with bowtie2 and then MAGeCK count"
},
"day0_label": {
"type": "string",
8 changes: 4 additions & 4 deletions subworkflows/local/utils_nfcore_crisprseq_pipeline/main.nf
@@ -98,11 +98,11 @@ workflow PIPELINE_INITIALISATION {
} else {
files = [ fastq_1 ]
}
-reads_targeted: [ meta.id, meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], files ]
+reads_targeted: [ meta.id, meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], files ]
reads_screening:[ meta + [ single_end:fastq_2?false:true ], files ]
-reference: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], reference]
-protospacer: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], protospacer]
-template: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], template]
+reference: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], reference]
+protospacer: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], protospacer]
+template: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], template]
}
.set { ch_input }

