Skip to content

athenarc/research-spotlight

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

In this repository you can find code snipets from research-spotlight workflows that were created specifically for educational purposes as part of the tutorial given during 'The Web Conference 2019' in San fransisco. Further information can be found in the following references:

  1. From Research Articles to Knowledge Graphs. Pertsas, Constantopoulos. The World Wide Web Conference (WWW). San Francisco. 2019
  2. Ontology Driven Extraction of Research Processes. Pertsas, Constantopoulos, Androutsopoulos. International Semantic Web Conference (ISWC). Monterey. 2018
  3. Ontology-Driven Information Extraction from Research Publications. Pertsas, Constantopoulos. International Conference on Transactions of Digital Libraries. Porto. 2018

Research Spotlight

Research Spotlight (RS) provides automated workflows that allow for the population of Scholarly Ontology's core entities and relations. To do so, RS provides distance supervision techniques, allowing for training of machine learning models, interconnects with various APIs to harvest (linked) data and information from the web, and uses pretrained ML models along with lexico/semantic rules in order to extract information from text in research articles, associate it with information from article's metadata and other digital repositories, and publish the infered knowledge as linked data. Simply put, Research Spotlight allows for the transformation of text from a research article into queriable knowledge graphs based on the semantics provided by the Scholarly Ontology.

RS employs a modular architecture that allows for flexible expansion and upgrade of its various components. It is writen in Python and makes use of various libraries such as SpaCy for parsing and syntactic analysis of text, Beautiful Soup for parsing the html/xml structure of web pages and scikit-learn for implementing machine learning methodologies in order to extract entities and relations from text.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages