Master's thesis work by Kai Blumberg, conducted at the MPI Bremen and the University of Bremen under the supervision of Dr. Pier Luigi Buttigieg.
Title: Interconnecting Arctic observatory data through machine-actionable knowledge representation: are ontologies fit for purpose?
Summary:
The scientific community faces the challenge of managing ever-increasing quantities of environmental and genomic data, which are often published with ambiguous or unstated relationships between data types. Ontologies represent expert scientific knowledge in human- and machine-readable formats. Such knowledge can be used to annotate data, giving it the context needed to be linked to other data and to be understood by machine agents tasked with retrieving it. Massive quantities of interdisciplinary environmental and genomic data are forthcoming which, if left unmanaged, will likely be impossible to analyze in combination. Here I ask what can be done to prepare such interdisciplinary data to be interconnected using ontologies, in order to facilitate future computer-automated processing. In this work I demonstrate that ontologies can be used to perform a variety of tasks related to the interconnection of information, data, and knowledge. I demonstrate how environmental and genomic ontology knowledge can be used to interconnect and analyze environmentally annotated genomic data. I also analyze how ontologies can facilitate the discovery and retrieval of information and data about a phenomenon of interest. Additionally, I show that ontologies can be used to track the provenance of information and knowledge, such as the authors contributing to ontology terms. Furthermore, I show that uniform and repetitive annotation patterns can aid in the mobilization of data annotated with ontology terms. Finally, in terms of limitations, I show that an example polar knowledge ontology requires more knowledge to be added before it can better lead researchers to new information based on input information. Overall, my results indicate that ontologies are fit for the purpose of annotating and interconnecting interdisciplinary data.
Future work could involve interfacing the environmental and genomic ontologies with future environmental sensor networks, allowing sensor outputs to directly answer questions relevant to the field of marine microbiology.