Adding a new Bank Provider

Patrick Durand edited this page Nov 19, 2016 · 20 revisions

DevDoc

This document is intended for the following audience: developers.

Introduction

BioDocumentViewer was originally designed to query and retrieve sequence information from public web services: the NCBI Entrez eUtils service (NCBI, Bethesda, USA) and the EB-eye Search service (EBI, Hinxton, UK).

However, the software uses the Model-View-Controller (MVC) paradigm as well as a simple plugin architecture to enable the addition of new services, even ones that do not serve sequence data.

Software architecture

To do its work, BioDocumentViewer (BDV) connects to these web services through what we call a BankProvider.

In turn, a BankProvider defines the list of banks available for that service. Each such bank is a BankType, which is responsible for providing the MVC implementation:

  • a QueryModel: it defines the fields that are available to query the remote server;
  • a QueryEngine: it defines the controller, i.e. the entity responsible for managing transactions with the remote server. This entity is closely associated with a ServerConfiguration;
  • a Search and a Summary: the data models handled by BDV to present query results to the user. These two data models are augmented by the presentation layer, called SummaryDocPresentationModel.

All these entities are Java interfaces that you have to implement to set up a new data bank provider for BDV.
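The relationships described above can be sketched with simplified stand-in interfaces. The type names follow this page, but every method signature below is an assumption made for illustration, and the Search, Summary and SummaryDocPresentationModel parts are left out. Check the real interfaces in the BDV sources before implementing them.

```java
import java.util.List;

/** Simplified stand-ins for the BDV plugin types; every method signature is hypothetical. */
class ArchitectureSketch {

  /** Declares the fields that can be used to query the remote server. */
  interface QueryModel { List<String> getFieldNames(); }

  /** Holds the connection settings of one remote server. */
  interface ServerConfiguration { String getBaseUrl(); }

  /** Controller: manages transactions with the remote server. */
  interface QueryEngine {
    ServerConfiguration getServerConfiguration();
    List<String> search(String query); // returns matching entry IDs
  }

  /** One bank of a service; it supplies the model and the controller. */
  interface BankType {
    String getName();
    QueryModel getQueryModel();
    QueryEngine getQueryEngine();
  }

  /** Entry point of a service: lists the banks it offers. */
  interface BankProvider {
    String getProviderName();
    List<BankType> getBanks();
  }

  /** Builds a toy provider with one bank and an engine that never contacts a server. */
  static BankProvider demoProvider() {
    ServerConfiguration config = () -> "https://example.org/api"; // placeholder URL
    QueryEngine engine = new QueryEngine() {
      public ServerConfiguration getServerConfiguration() { return config; }
      public List<String> search(String query) { return List.of("ID-1", "ID-2"); }
    };
    BankType bank = new BankType() {
      public String getName() { return "demo-bank"; }
      public QueryModel getQueryModel() { return () -> List.of("id", "organism"); }
      public QueryEngine getQueryEngine() { return engine; }
    };
    return new BankProvider() {
      public String getProviderName() { return "demo"; }
      public List<BankType> getBanks() { return List.of(bank); }
    };
  }

  public static void main(String[] args) {
    BankProvider provider = demoProvider();
    for (BankType bank : provider.getBanks()) {
      System.out.println(provider.getProviderName() + " / " + bank.getName()
          + " fields=" + bank.getQueryModel().getFieldNames()
          + " hits=" + bank.getQueryEngine().search("anything"));
    }
  }
}
```

The point of the layering is that the provider only enumerates banks, while each bank carries its own query model and engine, so two banks of the same service can expose different query fields.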

##Real services

To understand how to implement such services, you can have a look at those already available in BDV to query and retrieve sequence data from NCBI, EBI and Ensembl.

BDV was originally designed to handle the NCBI Entrez databank system (NCBI, Bethesda, USA), and more precisely to query, retrieve and display sequence information from nucleotide and protein banks.

Then, it was extended to use the EBI-Search databank system (EBI, Hinxton, UK), again to deal with sequence information.

More recently, I added an Ensembl service to deal with a different type of data: variations. This shows that BDV is open to managing different flavours of sequence-related data.

Those three services are contained in appropriate packages, as follows:

##How to create a new service?

To illustrate how to add a new BankProvider service, let's take the example of the NCBI one; it is actually the one for which BDV was originally designed.

###Step 1 - analyze the remote API

To access NCBI databanks, one can use the NCBI Entrez eUtils service. This is a URL-based API made of several services, among which:

  • ESearch: to query NCBI given some terms such as sequence ID, gene name, publication date, organism, etc.
  • ESummary: to retrieve short descriptions of these sequences given their IDs.
  • EFetch: to query NCBI with a sequence ID and retrieve the full entry.
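As a rough illustration of what "URL-based" means here, the sketch below builds one request URL per service. The base URL and the `db`, `term`, `retmax`, `id`, `rettype` and `retmode` parameters belong to the public eUtils API; the class and method names are hypothetical and are not taken from the BDV sources. No network call is made.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that only builds eUtils request URLs. */
class EUtilsUrls {
  static final String BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/";

  /** ESearch: query a databank with some terms; the response lists matching IDs. */
  static String esearch(String db, String term, int retmax) {
    return BASE + "esearch.fcgi?db=" + enc(db) + "&term=" + enc(term) + "&retmax=" + retmax;
  }

  /** ESummary: short descriptions for one or more IDs (sent as a comma-separated list). */
  static String esummary(String db, String... ids) {
    return BASE + "esummary.fcgi?db=" + enc(db) + "&id=" + enc(String.join(",", ids));
  }

  /** EFetch: full entry for a single ID, in the requested format (e.g. "fasta"). */
  static String efetch(String db, String id, String rettype) {
    return BASE + "efetch.fcgi?db=" + enc(db) + "&id=" + enc(id)
        + "&rettype=" + enc(rettype) + "&retmode=text";
  }

  /** Percent-encodes a parameter value so brackets and spaces are safe in a URL. */
  private static String enc(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(esearch("nucleotide", "BRCA1[gene] AND human[orgn]", 20));
    System.out.println(esummary("nucleotide", "111", "222"));
    System.out.println(efetch("nucleotide", "111", "fasta"));
  }
}
```

A typical session chains the three calls: ESearch returns IDs, ESummary turns a page of IDs into short descriptions, and EFetch retrieves the full entry the user selects.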

These remote NCBI services return data that BDV then has to handle.

###Step 2 - prepare data models

Experimenting with the three NCBI services mentioned above led us to design the data model used by BDV. This data model is made of concrete implementations of the Search and Summary interfaces. They are served by these two classes:

During the preparation of these classes, we had to deal with the official NCBI DTDs: esearch.dtd and esummary-v1.dtd. They were used to automatically generate Java/XML binding classes with the JAXB framework, part of Oracle's official Java SDK. DTDs are located here and generated classes are in packages bzh.plealog.bioinfo.docviewer.service.ncbi.model.esearch and bzh.plealog.bioinfo.docviewer.service.ncbi.model.esummary.
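To give an idea of what this step produces, the sketch below turns a small ESearch-like response into a simplified search result. It deliberately uses the JDK's DOM parser instead of JAXB-generated classes so that it stays self-contained. `SimpleSearch` and its fields are hypothetical stand-ins for BDV's Search implementation, and the sample XML is hand-written with the element names of the ESearch response format (`Count`, `IdList`, `Id`) and no DOCTYPE declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class ESearchParsingSketch {

  /** Hypothetical stand-in for a Search result: total hit count plus the returned IDs. */
  static final class SimpleSearch {
    final int total;
    final List<String> ids;

    SimpleSearch(int total, List<String> ids) {
      this.total = total;
      this.ids = ids;
    }
  }

  /** Parses an ESearch-like XML string; checked exceptions are wrapped for brevity. */
  static SimpleSearch parse(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      // The first <Count> in document order is read as the total number of hits.
      int total = Integer.parseInt(
          doc.getElementsByTagName("Count").item(0).getTextContent().trim());
      NodeList idNodes = doc.getElementsByTagName("Id");
      List<String> ids = new ArrayList<>();
      for (int i = 0; i < idNodes.getLength(); i++) {
        ids.add(idNodes.item(i).getTextContent().trim());
      }
      return new SimpleSearch(total, ids);
    } catch (Exception e) {
      throw new IllegalStateException("Cannot parse ESearch response", e);
    }
  }

  public static void main(String[] args) {
    String xml = "<eSearchResult><Count>42</Count><RetMax>2</RetMax><RetStart>0</RetStart>"
        + "<IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>";
    SimpleSearch result = parse(xml);
    System.out.println(result.total + " hits, first page: " + result.ids);
  }
}
```

Keeping the total separate from the returned IDs matters because the service returns results page by page: the count tells the viewer how many pages exist, while the ID list only covers the current one.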

###Step 3 - setup the query engine

The QueryEngine is the component used by BDV to handle connections to the remote service. Here, we target NCBI, so we set up these classes:

###Step 4 - setup the query model

As you may have seen when using BDV, it provides a graphical query editor. This editor derives directly from BLAST Filter Tool, which is used to filter BLAST results with a set of constraints called a filter.

In the context of BDV, we made a particular use of BLAST Filter Tool: we just wanted to reuse the graphical filter tool, not the filtering engine. For that purpose, we had to set up a BFilter-based data model and a concrete implementation of the QueryModel interface.

Considering NCBI, we actually designed a specific query data model to target each specific databank we wanted to query, e.g. nucleotides, proteins, structures, etc. The root data model is:

and it is associated with several other ones that serve the business of:

###Step 5 - setup the list of banks

To provide the list of banks available for query for a particular service, we have to implement the BankType interface.

For the NCBI service, we set up: EntrezBank.
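One compact way to declare a fixed list of banks is a Java enum that implements the bank interface. The sketch below is only an assumption about how such a list could look: `BankType` is reduced to two hypothetical methods, `DemoEntrezBank` is not the real EntrezBank class, and only the `nucleotide`, `protein` and `structure` codes are genuine Entrez database names.

```java
import java.util.Arrays;
import java.util.List;

class BankListSketch {

  /** Hypothetical, reduced version of the bank interface. */
  interface BankType {
    String getUserName(); // label shown to the user
    String getCode();     // database name sent to the remote service
  }

  /** Each constant is one queryable bank of the service. */
  enum DemoEntrezBank implements BankType {
    NUCLEOTIDE("Nucleotide", "nucleotide"),
    PROTEIN("Protein", "protein"),
    STRUCTURE("3D Structure", "structure");

    private final String userName;
    private final String code;

    DemoEntrezBank(String userName, String code) {
      this.userName = userName;
      this.code = code;
    }

    public String getUserName() { return userName; }
    public String getCode() { return code; }
  }

  /** Returns the banks in declaration order, typed against the interface. */
  static List<BankType> listBanks() {
    return Arrays.<BankType>asList(DemoEntrezBank.values());
  }

  public static void main(String[] args) {
    for (BankType bank : listBanks()) {
      System.out.println(bank.getUserName() + " -> db=" + bank.getCode());
    }
  }
}
```

With this layout, adding a bank to the service means adding one enum constant; the rest of the application only sees the interface.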

###In short...

All in all, the entire NCBI service is contained in package src.bzh.plealog.bioinfo.docviewer.service.ncbi.

EBI and [Ensembl](https://github.com/pgdurand/BioDocumentViewer/tree/master/src/bzh/plealog/bioinfo/docviewer/service/ensembl) services (contained in their respective packages) follow exactly the same design as the NCBI one. So, follow these principles to implement your own bank service provider.

##Using a plugin

You can set up your own BankProvider as an external JAR, as follows. Among other benefits, this enables you to maintain your code outside the BDV project.

###Step 1 - create your BankProvider

From the BDV project, make the BDV JAR:

```
ant makejar
```

Then copy that JAR, as well as BDV dependencies, to your own project and design the Java code of your own BankProvider.

###Step 2 - package your code

While packaging your code within a JAR file, add the following attribute to your manifest:

```
doc-viewer-bank-provider=com.foo.bar.MyBankProvider
```

Where "com.foo.bar.MyBankProvider" is your implementation of a BankProvider. You can also set a comma-separated list of class names if you have designed several BankProviders.
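To check your packaging, it can help to read the attribute back with the standard `java.util.jar.Manifest` API. The sketch below is not BDV's actual loader: the class and method names are hypothetical, and only the attribute name and the comma-separated convention come from this page. Note that inside a `MANIFEST.MF` file, attributes are written with the `Name: value` syntax.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

class ProviderManifestSketch {
  static final String ATTRIBUTE = "doc-viewer-bank-provider";

  /** Returns the provider class names declared in the manifest's main section. */
  static List<String> providerClassNames(Manifest manifest) {
    String value = manifest.getMainAttributes().getValue(ATTRIBUTE);
    List<String> names = new ArrayList<>();
    if (value != null) {
      for (String name : value.split(",")) {
        String trimmed = name.trim();
        if (!trimmed.isEmpty()) {
          names.add(trimmed);
        }
      }
    }
    return names;
  }

  /** Parses manifest text held in memory; a real check would read it from the JAR. */
  static Manifest parse(String text) {
    try {
      return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Manifest manifest = parse("Manifest-Version: 1.0\n"
        + "doc-viewer-bank-provider: com.foo.bar.MyBankProvider, com.foo.bar.Other\n");
    System.out.println(providerClassNames(manifest));
  }
}
```

For a JAR on disk, `new java.util.jar.JarFile(path).getManifest()` returns the same `Manifest` type, so `providerClassNames` can be reused unchanged.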

###Step 3 - install your plugin

Finally, simply copy your JAR file next to the BDV application and set up the classpath accordingly.

Start BDV with argument:

```
-DV_PROVIDER=your-provider-name
```

Your plugin will be automatically loaded and BDV will display your service.
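When passed as a JVM option, a `-Dname=value` argument becomes a Java system property, so the provider name would be readable through `System.getProperty("V_PROVIDER")`. The sketch below only illustrates that mechanism: the helper class and its fallback behaviour are assumptions, not a description of how BDV actually selects a provider.

```java
/** Hypothetical helper showing how a -DV_PROVIDER=... JVM option can be read. */
class ProviderSelectionSketch {
  static final String PROPERTY = "V_PROVIDER";

  /** Returns the provider name given on the command line, or the supplied default. */
  static String selectedProvider(String defaultName) {
    String value = System.getProperty(PROPERTY);
    if (value == null || value.trim().isEmpty()) {
      return defaultName;
    }
    return value.trim();
  }

  public static void main(String[] args) {
    // Run with: java -DV_PROVIDER=my-provider ProviderSelectionSketch
    System.out.println("Selected provider: " + selectedProvider("none"));
  }
}
```

JVM options such as `-D` must appear before the main class or `-jar` argument on the `java` command line; placed after it, they are passed to the application as ordinary arguments.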