krishnaTO edited this page Jan 6, 2022 · 2 revisions

Tools

3.1. Protege

Protege lets curators work with their ontologies visually. Its key features are best illustrated by the steps of the term modification workflow:

  1. Open Protege with the chosen Ontology

  2. Go to the 'Entities' tab

  3. In the 'Classes' subtab, you'll see the full list of base ontology classes, together with the classes brought in from imported ontologies.

3.2. ROBOT

ROBOT is run through Makefile recipes that carry out routine tasks in the development and maintenance of an ontology. The most important recipes are:

  • all
    • all_imports
      • imports/%_import.owl
    • all_patterns
      • patterns/%.owl
    • test
    • release
      • prepare_release

Import

The imports recipe links terms from other ontologies into the 'release' version of the Agronomy Ontology. It uses ROBOT to import a list of terms (identified by IRI), compiled per ontology in imports/*.txt.

In general, individual ontology imports are run using imports/%_import.owl (where % is the ontology). The src version of the Agronomy Ontology then imports these ontologies locally, until the release recipe makes hard imports.
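
As a rough sketch of how a per-ontology term list might be sanity-checked before running its import recipe: the file name `ro_terms.txt` and the sample entries are illustrative assumptions, not taken from the AgrO repository. The check only flags lines that are neither blank, a `#` comment, nor an http(s) IRI.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a term list before running an import recipe.
# The file name and the entries are hypothetical examples.
set -euo pipefail

workdir="$(mktemp -d)"
term_file="$workdir/ro_terms.txt"

# One IRI per line; the last entry is deliberately malformed.
cat > "$term_file" <<'EOF'
http://purl.obolibrary.org/obo/RO_0000052
http://purl.obolibrary.org/obo/RO_0000057
not-an-iri
EOF

# Print every line that is not blank, not a '#' comment, and not an http(s) IRI.
bad_terms() {
  grep -v -E '^($|#|https?://[^[:space:]]+$)' "$1" || true
}

invalid="$(bad_terms "$term_file")"
echo "Invalid entries: ${invalid:-none}"

# Once the list is clean, the import for that ontology would be rebuilt with
# the recipe named above, for example (run from src/ontology):
#   make imports/ro_import.owl
```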

Patterns

Patterns are non-unique classes that follow predefined definitions, additive labels, and given axioms. The patterns recipe uses dosdp-tools to generate classes from the given templates (src/ontology/patterns/*.yaml) and the linked tables of input variables (src/ontology/patterns/*.tsv).

Creating patterns for NTRs is encouraged wherever such commonalities exist, to reduce human error and to provide a reusable creation schema when additional similar classes are required.

Chemical Area Densities (CAD)

CAD is a pattern for defining the area densities of chemicals. This template removes the repetitive NTR work otherwise needed for each chemical entity.
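
To illustrate how a pattern such as CAD fills one shared template from a table of variables, the sketch below expands a label for each row of a small TSV. It only mimics the labelling step in plain shell; the real classes are generated by dosdp-tools from the YAML template. The `chemical_label` column, the `EX:` identifiers, and the label wording are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: mimic how a pattern derives one label per TSV row.
# Column names, IDs, and the label template are hypothetical.
set -euo pipefail

workdir="$(mktemp -d)"
tsv="$workdir/example_pattern.tsv"

# Header row, then one row per class to generate (tab-separated).
printf 'defined_class\tchemical_label\n'  > "$tsv"
printf 'EX:0000001\tnitrogen\n'          >> "$tsv"
printf 'EX:0000002\tphosphorus\n'        >> "$tsv"

# Fill the template "<chemical> area density" for every data row.
expand_labels() {
  tail -n +2 "$1" | while IFS=$'\t' read -r class_id chemical; do
    printf '%s\t%s area density\n' "$class_id" "$chemical"
  done
}

labels="$(expand_labels "$tsv")"
echo "$labels"
```

Adding a further chemical then only means adding one more row to the table, which is the repetition the pattern is meant to absorb.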

Entity Attribute Location (EAL)

EAL is a pattern for defining a class's attribute within a specific environment. This template is based on the Environment Ontology pattern, which one curator found highly useful.

Releases

To make a new release of the Agronomy Ontology, you must have write access to push changes to the main repo. In general, the release process involves refreshing the imported ontologies and then updating the main ontology file. All the instructions are stored as recipes in a Makefile in src/ontology.

Touch all existing import files, which updates their modification times without deleting them (usually run once on new systems)

touch src/ontology/imports/*.owl

Import all ontologies listed in the Makefile (src/ontology/Makefile)

make all

Touch the {ontology}-edit.owl file, which updates its modification time without deleting it

touch src/ontology/agro-edit.owl

Create the src agro file and copy it to the repo base

make agro.owl
make release
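
The release steps can be collected into one script so they can be previewed before anything is changed. This is a minimal sketch, not part of the AgrO repository: the `DRY_RUN` switch and the `run` helper are hypothetical, and it assumes the script is started from the repo base, which is why `make` is pointed at src/ontology with `-C`.

```shell
#!/usr/bin/env bash
# Sketch: replay the release steps, printing them first by default.
# DRY_RUN and run() are hypothetical helpers introduced for this example.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

release_steps() {
  run touch src/ontology/imports/*.owl    # update timestamps on the import files
  run make -C src/ontology all            # run the 'all' recipe, including imports
  run touch src/ontology/agro-edit.owl    # update the timestamp on the edit file
  run make -C src/ontology agro.owl
  run make -C src/ontology release
}

plan="$(release_steps)"
echo "$plan"
```

Running it with `DRY_RUN=0` would execute the commands instead of printing them.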

3.3. ODK

The Ontology Development Kit (ODK) is a high-level tool for building a new ontology repository from a template, with the features that curators and developers need already in place.

It includes ROBOT and dosdp-tools, which it uses natively to perform ontology-related tasks.

Starting an Ontology

The ODK's 'creating a new repository' guide details how to create a new repository and gives the user a taste of its main function. For the purposes of this brief documentation, here are AgrO's build notes:

Requirements:

  • Computer with >20 GB RAM, or a rented VM from any cloud service with sufficient RAM (approx. $0.50 CAD/hr, build time <1 hr)
  • Docker

  1. Follow ODK's 'creating a new repository' guide:

    a. Install Docker

     ```
     # Refresh the package index and install the prerequisites
     sudo apt-get update
     sudo apt-get install ca-certificates curl gnupg lsb-release
     # Add Docker's signing key and apt repository
     curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
     echo \
     "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
     $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
     # Install Docker itself
     sudo apt-get update
     sudo apt-get install docker-ce docker-ce-cli containerd.io
     ```
     Having git set up is critical.
     (Troubleshooting: if you hit a permission error, prefix each command with 'sudo')
     ```
     git config --global user.name "<USERNAME/ORGANIZATION>"
     git config --global user.email "<EMAIL>"
     ```
    

    b. Download seed-via-docker.sh

    wget https://raw.githubusercontent.com/INCATools/ontology-development-kit/master/seed-via-docker.sh
    

    c. Create a config (YAML; see the available examples in ODK/configs), or use AgrO's built config here with the following changes:

    Update 'github_org' to your own username or organization, as the build will check the local git config

    d. Build (Usually takes ~15-30 mins)

     bash /home/$USER/seed-via-docker.sh -C /home/$USER/agro-odk.yaml
    

    Troubleshooting: permission error - use sudo

    e. Add the src/pattern/*, src/ontology/import/*.txt and src/reference/* files from the main repo
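
The 'github_org' change in step c can also be scripted. The sketch below is one possible way to do it, assuming a config whose `github_org` key sits at the top level. The sample config contents, the account name `my-github-user`, and the `set_github_org` helper are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: point 'github_org' in an ODK config at your own account.
# The config contents and the account name are hypothetical examples.
set -euo pipefail

workdir="$(mktemp -d)"
config="$workdir/agro-odk.yaml"

cat > "$config" <<'EOF'
id: agro
title: "Agronomy Ontology"
github_org: SomeOtherOrg
repo: agro
EOF

# Replace the value of the top-level github_org key in place.
# 'sed -i.bak' is used because it works with both GNU and BSD sed.
set_github_org() {
  local file="$1" org="$2"
  sed -i.bak "s|^github_org:.*|github_org: ${org}|" "$file"
  rm -f "${file}.bak"
}

set_github_org "$config" "my-github-user"
grep '^github_org:' "$config"
```

With a real config, the same function would be called on /home/$USER/agro-odk.yaml before running seed-via-docker.sh.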

3.4. SSSOM

The Simple Standard for Sharing Ontological Mappings (SSSOM) is a schema for writing 1:1 mapped terms between two sources. The SSSOM format is being adopted as a standard mapping schema for term equivalencies within the ontology community.
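
As a small illustration of the tabular form, the sketch below writes a few SSSOM-style rows and checks that each subject is mapped only once. All identifiers are made-up placeholders, the metadata header that a full SSSOM file carries is omitted, and the `duplicate_subjects` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a tiny SSSOM-style TSV and a check that each subject maps to one object.
# All identifiers below are made-up placeholders.
set -euo pipefail

workdir="$(mktemp -d)"
mappings="$workdir/example.sssom.tsv"

row='%s\t%s\t%s\t%s\n'
printf "$row" subject_id predicate_id object_id mapping_justification               > "$mappings"
printf "$row" AGRO:0000001 skos:exactMatch AGROVOC:c_0001 semapv:ManualMappingCuration >> "$mappings"
printf "$row" AGRO:0000002 skos:exactMatch AGROVOC:c_0002 semapv:ManualMappingCuration >> "$mappings"
printf "$row" AGRO:0000002 skos:exactMatch AGROVOC:c_0003 semapv:ManualMappingCuration >> "$mappings"

# List subjects that appear in more than one mapping row.
duplicate_subjects() {
  tail -n +2 "$1" | cut -f1 | sort | uniq -d
}

dups="$(duplicate_subjects "$mappings")"
echo "Subjects with more than one mapping: ${dups:-none}"
```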

3.4.1. Biomappings

To build the mappings, a tool called Biomappings was used to identify mappings between AgrO and AGROVOC. See scripts/generate_agrovoc_mappings.py on the add-agrovoc-mappings branch, which predicts new mappings and feeds them into Biomappings' curation pipeline.

The main tool, Biomappings, launches a local server that presents mapping predictions, which a script such as generate_agrovoc_mappings.py writes to the src/biomappings/resources/predictions.tsv file. The local server lets the curator label each mapping as 'accept', 'incorrect' or 'unsure'. Often, matching rdfs:labels are the largest contributor to the score attached to each mapping, so such matches are easy to accept. Other terms whose labels do not match, but which have still been predicted as potential mappings, may result from synonym matches or another algorithm-based match, and require a curator's judgment. Biomappings offers additional distinctions, such as choosing 'Exact' equivalencies between mappings, or which scoring algorithm to use. However, as this is a developing tool within our field, precise modifications will require time and documentation.
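
Since label matches tend to score highest, one way to plan a curation session is to look at the strongest predictions first. The sketch below sorts a simplified table by its score. The three-column layout, the column names, and the rows are hypothetical, and the real predictions.tsv may be laid out differently.

```shell
#!/usr/bin/env bash
# Sketch: order predictions so the highest-scoring ones are reviewed first.
# The three-column layout and all rows are hypothetical.
set -euo pipefail

workdir="$(mktemp -d)"
predictions="$workdir/predictions_example.tsv"

printf 'source_label\ttarget_label\tconfidence\n'  > "$predictions"
printf 'tillage\tploughing\t0.40\n'               >> "$predictions"
printf 'wheat\twheat\t0.95\n'                     >> "$predictions"
printf 'maize\tcorn\t0.70\n'                      >> "$predictions"

# Sort data rows by the confidence column (third field), highest first.
by_confidence() {
  tail -n +2 "$1" | sort -t $'\t' -k3,3nr
}

ranked="$(by_confidence "$predictions")"
echo "$ranked"
```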

Installation

On Ubuntu 18.04+ [on fresh server]:

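py_36 () {
  # Remember the current directory so we can return to it afterwards
  cur_path=$(pwd)
  cd /usr/bin
  # Re-point the 'python' symlink at Python 3.6
  sudo unlink python
  sudo ln -s /usr/bin/python3.6 python
  python --version
  cd "$cur_path"
}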

On Ubuntu 20.04+:

sudo apt update
sudo apt install python-is-python3

Run on Linux:

git clone https://github.com/biopragmatics/biomappings
cd biomappings
git checkout add-agrovoc-mappings
# py_36
sudo apt update
sudo apt install python3-pip
sudo pip install --upgrade setuptools pystow bioregistry pyobo gilda indra rdflib
sudo pip install -e .[web]
biomappings web

Run on Windows:

git clone https://github.com/biopragmatics/biomappings
cd biomappings
git checkout add-agrovoc-mappings
pip install --upgrade setuptools pystow bioregistry pyobo gilda indra rdflib
pip install -e .[web]
biomappings web

Generate AgrO-AGROVOC predictions

sudo python ./scripts/generate_agrovoc_mappings.py