Merge pull request #44 from qbicsoftware/release/1.0.1
Patch Release 1.0.1
christopher-mohr authored Mar 26, 2021
2 parents 960a53c + 2fa49af commit 2308d84
Showing 8 changed files with 108 additions and 25 deletions.
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,18 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.0.1 - 2021-03-26

### `Added`

### `Changed`
- Extend usage documentation and add examples
- [#43](https://github.com/qbicsoftware/variantstore-service/pull/43) - Prevent the creation of duplicate gene entries in the database ([#40](https://github.com/qbicsoftware/variantstore-service/issues/40))

### `Fixed`
- [#43](https://github.com/qbicsoftware/variantstore-service/pull/43) - Fix parsing of Ensembl version ([#42](https://github.com/qbicsoftware/variantstore-service/issues/42))
- [#43](https://github.com/qbicsoftware/variantstore-service/pull/43) - Fix `EnsemblParser` bug caused by missing `Gene` constructor ([#41](https://github.com/qbicsoftware/variantstore-service/issues/41))

## v1.0.0 - Valmart - 2021-03-02

Initial release of Variantstore.
4 changes: 2 additions & 2 deletions README.rst
@@ -14,9 +14,9 @@ The **Variantstore** is a Java/Groovy-based service application implemented usin

Features
--------
- Import metadata (JSON files using this `schema <https://github.com/qbicsoftware/mtb-metadata-specs/blob/master/schemes/mtb/variants.metadata.schema.json>`_)
- Import variants (VCF files, annotated using `SnpEff <http://snpeff.sourceforge.net>`_ or `VEP <https://www.ensembl.org/info/docs/tools/vep/index.html>`_)
- Import gene information (Ensembl, GFF3 files)
- Import metadata in JSON together with variants (see `Usage <https://oncostore-proto-project.readthedocs.io/en/latest/usage.html>`_ for details)
- Import gene information (Ensembl, GFF3 files)
- Query information on variants, genes, and cases via (secured) REST endpoints
- Ask Beacon endpoint if a specific variant exists in the store
- Export variants in Variant Call Format (VCF) and `FHIR <https://www.hl7.org/fhir/>`_
60 changes: 59 additions & 1 deletion docs/source/usage.rst
@@ -28,9 +28,67 @@ Create executable jar
This command will create an executable jar in your current working directory under /target.

Import variants
---------------

Variants can be imported to the store by using a `POST` request to the corresponding **/variants** endpoint.
The variants have to be associated with metadata and the following properties have to be specified in `JSON` schema.

.. code-block:: javascript

    "case": {"identifier": "string"},
    "variant_annotation": {"version": "string", "name": "string", "doi": "string"},
    "variant_calling": {"version": "string", "name": "string", "doi": "string"},
    "reference_genome": {"source": "string", "version": "string", "build": "string"},
    "is_somatic": "boolean",
    "samples": [{"identifier": "string", "cancerEntity": "string"}]

For example using `curl` your upload command would look like this:

.. code-block:: bash

    curl -X 'POST' \
      '${host}:${variantstore-port}/variants' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F metadata='{"case": {"identifier": "do1234"}, "variant_annotation": {"version": "bioconda::4.3.1t", "name": "snpeff", "doi": "10.4161/fly.19695"}, "is_somatic": "true", "samples": [{"identifier": "S123456", "cancerEntity": "HCC"}], "reference_genome": {"source": "GATK", "version": "unknown", "build": "hg38"}, "variant_calling": {"version": "bioconda::2.9.10", "name": "Strelka", "doi": "10.1038/s41592-018-0051-x"}}' \
      -F files=@/path/to/variants.vcf.gz

Import additional gene information
----------------------------------

Information on genes, such as `biotype`, `name`, and `description`, can be imported to the store in `gff3` format.

For example using `curl` your upload command would look like this:

.. code-block:: bash

    curl -X 'POST' \
      '${host}:${variantstore-port}/genes' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F 'files=@/path/to/genes.GRCh38.87.gff3'

This feature is currently supported for `gff3` files derived from Ensembl. The Ensembl version (87 in the example above) is expected to be part of the file name; otherwise, the corresponding database field will be empty.

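The file-name convention can be illustrated with a minimal Java sketch. It reuses the two regular expressions that appear in the ``EnsemblParser`` hunk of this commit; the class name ``EnsemblFileName`` and the method ``parse`` are hypothetical and are not part of the Variantstore code base.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: mirrors the file-name parsing idea, not the actual parser class.
final class EnsemblFileName {

    // Patterns copied from the EnsemblParser change; the unescaped '.' matches any character.
    private static final Pattern VERSION = Pattern.compile("(GRCh)\\d+.\\d+|\\w+(v)\\d+");
    private static final Pattern REFERENCE = Pattern.compile("(GRCh|hg)\\d+");

    /** Returns {referenceGenome, ensemblVersion}; the version is null when it is not in the name. */
    static String[] parse(String fileName) {
        Matcher versionMatch = VERSION.matcher(fileName);
        if (versionMatch.find()) {
            // Split "GRCh38.87" on '.' or 'v' and keep the first and last pieces.
            String[] parts = versionMatch.group().split("\\.|v");
            return new String[] {parts[0], parts[parts.length - 1]};
        }
        Matcher referenceMatch = REFERENCE.matcher(fileName);
        if (referenceMatch.find()) {
            // Only the reference genome could be recognised.
            return new String[] {referenceMatch.group(), null};
        }
        return new String[] {"", null};
    }

    public static void main(String[] args) {
        String[] parsed = parse("genes.GRCh38.87.gff3");
        System.out.println("reference=" + parsed[0] + ", version=" + parsed[1]);
    }
}
```

For ``genes.GRCh38.87.gff3`` the sketch yields the reference ``GRCh38`` and the version ``87``; for a name without a version, such as ``Homo_sapiens.GRCh38.gff3``, only the reference is recognised.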
Retrieve data from the store
----------------------------
Stored data can be retrieved from the store by sending `HTTP GET` requests.
For example, to get a variant at a specific genomic position using `curl`, your command would look like this:

.. code-block:: bash

    curl '${host}:${variantstore-port}/variants?startPosition=22310284'

The full list of available endpoints can be seen below.


REST API
--------
The detailed documentation of the REST endpoints provided by the **Variantstore** can be found on `SwaggerHub <https://app.swaggerhub.com/apis/christopher-mohr/variantstore/1.0.0>`_. Additionally, the generated OpenAPI specification is rendered as swagger-ui and rapidoc views. After startup, these views are accessible via ``/swagger-ui`` and ``.../rapidoc``.
The detailed documentation of the REST endpoints provided by the **Variantstore** can be found on `SwaggerHub <https://app.swaggerhub.com/apis/christopher-mohr/variantstore/1.0.1>`_. Additionally, the generated OpenAPI specification is rendered as swagger-ui and rapidoc views. After startup, these views are accessible via ``/swagger-ui`` and ``.../rapidoc``.

| **GET /genes/{id}**
| Request a gene
2 changes: 1 addition & 1 deletion pom.xml
@@ -5,7 +5,7 @@
<artifactId>variantstore</artifactId>
<groupId>life.qbic</groupId>
<name>Variantstore</name>
<version>1.0.0</version>
<version>1.0.1</version>
<description>Variantstore is a service to store and manage genomic variants</description>
<properties>
<jdk.version>1.8</jdk.version>
2 changes: 1 addition & 1 deletion src/main/groovy/life/qbic/variantstore/Variantstore.groovy
@@ -11,7 +11,7 @@ import io.swagger.v3.oas.annotations.info.License
@OpenAPIDefinition(
info = @Info(
title = "Variantstore",
version = "1.0.0",
version = "1.0.1",
description = "Variantstore Restful API",
license = @License(name = "MIT License", url = "https://opensource.org/licenses/mit-license.php"),
contact = @Contact(url = "https://github.com/christopher-mohr", name = "Christopher Mohr", email = "christopher.mohr@uni-tuebingen.de")
@@ -1561,7 +1561,6 @@ annotationsoftware.name='${annotationSoftware}' AND geneid='${geneId}';"""))
annotationsoftware.name='${annotationSoftware}' AND consequence.genesymbol='${geneName}';"""))
}
else {
println(selectVariantsWithConsequences.replace(";", """ WHERE referencegenome.build='${referenceGenome}' AND consequence.genesymbol='${geneName}';"""))
result = sql.rows(selectVariantsWithConsequences.replace(";", """ WHERE referencegenome.build='${referenceGenome}' AND consequence.genesymbol='${geneName}';"""))
}
}
@@ -2024,12 +2023,14 @@ gene.id = consequence_has_gene.gene_id INNER JOIN consequence on consequence_has
private List<Gene> tryToStoreGeneObjects(List<Gene> genes) {
Sql sql = requestNewConnection()
sql.connection.autoCommit = false
sql.withBatch("insert INTO gene (symbol, name, biotype, chr, start, end, synonyms, geneid, description, strand, version) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) ON DUPLICATE KEY UPDATE id=id") {
sql.withBatch("insert INTO gene (symbol, name, biotype, chr, start, end, synonyms, geneid, description, strand, version) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) ON DUPLICATE KEY UPDATE symbol=?, name=?, biotype=?, chr=?, start=?, end=?, synonyms=?, geneid=?, description=?, strand=?, version=?") {
BatchingPreparedStatementWrapper ps ->
genes.each { gene ->
// we have to specify the values twice since we need them for the insert and "on duplicate" part of the sql query
ps.addBatch([gene.symbol, gene.name, gene.bioType, gene.chromosome, gene.geneStart,
gene.geneEnd, gene.synonyms[0], gene.geneId, gene.description, gene
.strand, gene.version])
gene.geneEnd, gene.synonyms[0], gene.geneId, gene.description, gene.strand, gene.version,
gene.symbol, gene.name, gene.bioType, gene.chromosome, gene.geneStart,
gene.geneEnd, gene.synonyms[0], gene.geneId, gene.description, gene.strand, gene.version])
}
}
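
The comment in this hunk notes that every value has to be bound twice: once for the ``INSERT`` part and once for the ``ON DUPLICATE KEY UPDATE`` part. The following is a minimal Java sketch of that bookkeeping, under the assumption that the statement uses positional ``?`` placeholders as shown above. The class ``UpsertSketch`` and its methods are hypothetical, and no database connection is opened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the statement text and the doubled parameter list.
final class UpsertSketch {

    // Column names taken from the statement in the hunk above.
    static final List<String> COLUMNS = Arrays.asList(
            "symbol", "name", "biotype", "chr", "start", "end",
            "synonyms", "geneid", "description", "strand", "version");

    /** Builds "INSERT ... VALUES (?, ...) ON DUPLICATE KEY UPDATE col=?, ...". */
    static String buildUpsert(String table, List<String> columns) {
        String cols = String.join(", ", columns);
        String marks = columns.stream().map(c -> "?").collect(Collectors.joining(", "));
        String updates = columns.stream().map(c -> c + "=?").collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks
                + ") ON DUPLICATE KEY UPDATE " + updates;
    }

    /** Repeats the values so that both halves of the statement receive them. */
    static List<Object> doubled(List<?> values) {
        List<Object> params = new ArrayList<>(values);
        params.addAll(values);
        return params;
    }

    public static void main(String[] args) {
        String sql = buildUpsert("gene", COLUMNS);
        long placeholders = sql.chars().filter(c -> c == '?').count();
        System.out.println(sql);
        System.out.println("placeholders: " + placeholders);
    }
}
```

Generating both halves from one column list keeps the number of placeholders and the number of bound values in step, which is the invariant the doubled ``addBatch`` argument list has to maintain by hand.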

42 changes: 27 additions & 15 deletions src/main/groovy/life/qbic/variantstore/parser/EnsemblParser.groovy
@@ -16,6 +16,8 @@ import life.qbic.variantstore.model.ReferenceGenome
@Log4j2
class EnsemblParser {

private final GENOME_REFERENCE_SOURCE_GRC = "Genome Reference Consortium"

/**
* The genes
*/
@@ -40,11 +42,18 @@ class EnsemblParser {
.absolutePath.toString(), null, codec, false);

// try to extract reference genome and Ensembl version
def referenceMatch = (file.name =~ /(GRCh|hg)\d+/)
def versionMatch = (file.name =~ /(GRCh)\d+.\d+|\w+(v)\d+/)
String referenceGenome = ""
Integer ensemblVersion = 0
if (versionMatch.find()) {
(referenceGenome, ensemblVersion) = versionMatch[0][0].toString().split("\\.|v")
def foundPattern = versionMatch[0][0].toString().split("\\.|v")
referenceGenome = foundPattern.first()
ensemblVersion = Integer.valueOf(foundPattern.last())
}
else if (referenceMatch.find()) {
referenceGenome = referenceMatch[0][0].toString()
ensemblVersion = null
}

def splittedLine = ""
@@ -58,21 +67,22 @@ class EnsemblParser {
for (final Gff3Feature feature : reader.iterator()) {
if(feature.type.contains("gene")) {
numberOfGenes++
def gene = new Gene()
gene.geneId = feature.ID.split(":")[-1]
gene.bioType = feature.getAttribute("biotype")
gene.chromosome = feature.contig
gene.geneStart = feature.start
gene.geneEnd = feature.end
gene.strand = feature.strand
gene.symbol = feature.name
gene.version = feature.getAttribute("version") != null ? feature.getAttribute("version").toInteger(): -1

String geneId = feature.ID.split(":")[-1]
def bioType = feature.getAttribute("biotype")
def chromosome = feature.contig
def geneStart = feature.start as BigInteger
def geneEnd = feature.end as BigInteger
def strand = feature.strand.toString()
def symbol = feature.name
def version = feature.getAttribute("version") != null ? feature.getAttribute("version").toInteger(): -1
def description = (feature.getAttribute("description") != null) ? feature.getAttribute("description") : ''
def synonym = (feature.getAttribute("description") != null) && feature.getAttribute("description").contains("HGNC") ? feature.getAttribute("description").split("\\[").last().split("HGNC:").last().replace("]", "") : ''
gene.name = description.split("\\[").first().trim()
def name = description.split("\\[").first().trim()
def synonyms = [synonym]
def geneDescription = description.trim()

gene.synonyms = [synonym]
gene.description = description.trim()
def gene = new Gene(bioType, chromosome, symbol, name, geneStart, geneEnd, geneId, geneDescription, strand, version, synonyms)
genes.add(gene)
}
}
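
The way this hunk derives ``name`` and ``synonym`` from the GFF3 ``description`` attribute can be traced with a small Java sketch. It follows the same split-and-replace steps as the Groovy code above; the class ``GeneDescription`` is hypothetical, and the sample description string is an illustrative assumption rather than a line taken from a real Ensembl file.

```java
// Hypothetical helper: replays the string handling of the hunk above in plain Java.
final class GeneDescription {

    /** Text before the first '[' with surrounding whitespace removed. */
    static String name(String description) {
        return description.split("\\[")[0].trim();
    }

    /** Identifier that follows the last "HGNC:" marker, or "" when no HGNC reference is present. */
    static String synonym(String description) {
        if (description == null || !description.contains("HGNC")) {
            return "";
        }
        String[] bracketParts = description.split("\\[");
        String[] hgncParts = bracketParts[bracketParts.length - 1].split("HGNC:");
        return hgncParts[hgncParts.length - 1].replace("]", "");
    }

    public static void main(String[] args) {
        // Illustrative input only.
        String description = "alpha-1-B glycoprotein [Source:HGNC Symbol%3BAcc:HGNC:5]";
        System.out.println("name=" + name(description) + ", synonym=" + synonym(description));
    }
}
```

For the illustrative input, the name is ``alpha-1-B glycoprotein`` and the synonym is ``5``; a description without an HGNC reference produces an empty synonym.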
@@ -104,9 +114,11 @@ class EnsemblParser {
log.info("Read $numberOfGenes genes from provided Ensembl file.")

this.genes = genes
this.referenceGenome = new ReferenceGenome("Genome Reference Consortium", referenceGenome,
// if the reference genome is specified in the file under #!genome-build we will use this information
def refernceGenomeToDB = referenceGenomeFromFile ? referenceGenomeFromFile : referenceGenome
this.referenceGenome = new ReferenceGenome(GENOME_REFERENCE_SOURCE_GRC, refernceGenomeToDB,
referenceGenomeVersion as String)
this.version = ensemblVersion.toInteger()
this.version = ensemblVersion ? ensemblVersion.toInteger() : -1
this.date = updateDate
}
}
2 changes: 1 addition & 1 deletion src/test/groovy/life/qbic/controller/SwaggerSpec.groovy
@@ -28,7 +28,7 @@ class SwaggerSpec extends TestContainerSpecification{

def "swagger YAML is exposed"() {
when:
HttpResponse response = httpClient.toBlocking().exchange(HttpRequest.GET("/swagger/variantstore-1.0.0.yml"))
HttpResponse response = httpClient.toBlocking().exchange(HttpRequest.GET("/swagger/variantstore-1.0.1.yml"))

then:
response.status() == HttpStatus.OK