MohamedAliHadjTaieb/WNetSS-API

Introduction

Determining the Semantic Similarity (SS) between word pairs is an important component in several research fields, such as artificial intelligence, information retrieval, natural language processing and the biomedical domain. The majority of SS measures are assessed using the lexical database WordNet.


WNetSS-API

The WNetSS (WordNet Semantic Similarity) API makes it possible to reproduce a wide range of SS measures from different categories, including taxonomic-based, feature-based and IC-based measures. The API extracts the topological parameters of the WordNet “is a” taxonomy, which are used to express the semantics of concepts. It also provides the different ways of expressing the topological parameters depth and hyponyms’ subgraph. In addition, an evaluation module assesses the accuracy of the measures, which can be evaluated and compared against several widely used benchmarks through correlation coefficients.
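The benchmark evaluation mentioned above can be illustrated with a minimal, standalone sketch. It does not use the WNetSS API: the class name `CorrelationSketch`, the method `pearson` and the numbers in `main` are all hypothetical and chosen only for illustration. It computes the Pearson correlation coefficient between human similarity ratings and the scores produced by a measure for the same word pairs.

```java
final class CorrelationSketch {

    // Pearson correlation between two equally long score lists.
    // Returns NaN if either list is constant (zero variance).
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Made-up values, NOT taken from any real benchmark.
        double[] humanRatings = {0.5, 1.5, 3.0, 3.9};
        double[] measureScores = {0.1, 0.3, 0.7, 0.8};
        System.out.printf("Pearson r = %.3f%n", pearson(humanRatings, measureScores));
    }
}
```

A value close to 1 means the measure ranks and scales the word pairs much as the human judges do. Which correlation coefficients the API's evaluation module actually reports is not specified here.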


The WNetSS API can be downloaded: link


Information Content (IC) based semantic similarity measures

The IC-based similarity measure was first introduced by Resnik. The basic idea is that general and abstract entities found in a discourse carry less IC than more concrete and specialized ones. This principle is inspired by the work of Shannon: the more probable the appearance of a concept, the less information it conveys. The approach has since been modified and extended by several authors, although all of these variants rely on IC values assigned to the concepts in the ontology. An IC-based measure is therefore defined by a pair (IC computing method, IC measure). IC computing methods follow two strategies: statistical analysis of corpora, and intrinsic computation, which exploits only the topological parameters of the “is a” taxonomy.
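As an illustration of the corpus-based strategy, the sketch below computes IC(c) = -log p(c) on a tiny hand-made taxonomy and uses it for Resnik's similarity, i.e. the IC of the most informative concept that subsumes both inputs. This is a standalone toy and not the WNetSS API: the class `IcSketch`, its methods, the concepts and the counts are all assumptions made for the example, and the taxonomy is assumed to be a tree with a single parent per concept.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class IcSketch {
    // child -> parent links of a toy "is a" tree (hypothetical data, not WordNet)
    private final Map<String, String> parent = new HashMap<>();
    // occurrences of a concept plus those of all its descendants
    private final Map<String, Double> cumulative = new HashMap<>();
    private final String root;

    IcSketch(String root) {
        this.root = root;
        cumulative.put(root, 0.0);
    }

    void addConcept(String concept, String parentConcept) {
        parent.put(concept, parentConcept);
        cumulative.putIfAbsent(concept, 0.0);
    }

    // Record corpus occurrences; every ancestor subsumes them as well.
    void observe(String concept, double count) {
        for (String c = concept; c != null; c = parent.get(c)) {
            cumulative.merge(c, count, Double::sum);
        }
    }

    // IC(c) = -log p(c); infinite for a concept that was never observed.
    double ic(String concept) {
        double p = cumulative.get(concept) / cumulative.get(root);
        return -Math.log(p);
    }

    // Resnik similarity: highest IC among the common subsumers of a and b.
    double resnik(String a, String b) {
        Set<String> ancestorsOfA = new HashSet<>();
        for (String c = a; c != null; c = parent.get(c)) {
            ancestorsOfA.add(c);
        }
        double best = 0.0;
        for (String c = b; c != null; c = parent.get(c)) {
            if (ancestorsOfA.contains(c)) {
                best = Math.max(best, ic(c));
            }
        }
        return best;
    }

    // Builds the toy taxonomy used in main; counts are made up.
    static IcSketch toy() {
        IcSketch t = new IcSketch("entity");
        t.addConcept("animal", "entity");
        t.addConcept("vehicle", "entity");
        t.addConcept("dog", "animal");
        t.addConcept("cat", "animal");
        t.addConcept("car", "vehicle");
        t.observe("dog", 2);
        t.observe("cat", 2);
        t.observe("car", 4);
        return t;
    }

    public static void main(String[] args) {
        IcSketch t = toy();
        System.out.printf("IC(dog) = %.3f%n", t.ic("dog"));
        System.out.printf("sim(dog, cat) = %.3f%n", t.resnik("dog", "cat"));
        System.out.printf("sim(dog, car) = %.3f%n", t.resnik("dog", "car"));
    }
}
```

In this toy, "dog" and "cat" share the fairly specific subsumer "animal", whereas "dog" and "car" share only the root, whose probability is 1 and whose IC is therefore 0. An intrinsic method would replace the corpus counts with quantities derived from the taxonomy itself, such as the number of hyponyms of each concept.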


Instructions for using the API

To use the API:

  • Install MySQL.
  • Install the English WordNet (version 2.1 and/or 3.0).
  • Copy the file "file_properties.xml", which contains the configuration for accessing the WordNet data, into your working folder.
  • Process the WordNet "is a" taxonomy (verbal or nominal) as shown in Example0.java (creating the WordNet database) and Example1.java (extracting the parameters).
  • Use the semantic similarity measures as shown in the provided examples.
For more information, see the file readme.txt.

Examples

Ten examples using the WNetSS API are provided to help developers.

  • Example0: Creating the database and loading WordNet data.
  • Example1: Extracting Topological Parameters of Nominal WordNet "is a" taxonomy.
  • Example2: WordNet Semantic Similarity Taxonomic Measures.
  • Example3: WordNet Semantic Similarity Information Content Approach.
  • Example4: WordNet Semantic Similarity Features Approach.
  • Example5: Studying the accuracy of semantic measures through the nominal benchmarks.
  • Example6: Extracting Topological Parameters of Verbal WordNet "is a" taxonomy.
  • Example7: Studying the accuracy of semantic measures through the verbal benchmarks.
  • Example8: WordNet "is a" taxonomy - Topological parameters.
  • Example9: WordNet "is a" taxonomy - Length Shortest Path Similarity Measure.


To download all the examples, follow this link: download examples
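For readers unfamiliar with the taxonomic quantities named in Example8 and Example9, the following standalone sketch shows depth and shortest-path length on a toy "is a" tree. It is not code from the provided examples and does not call the WNetSS API: the class `PathSketch`, its methods and the toy concepts are hypothetical. It assumes each concept has a single parent (WordNet itself allows several), and it converts path length to a similarity with 1 / (1 + length), which is one common normalization and may differ from the formula used in Example9.

```java
import java.util.HashMap;
import java.util.Map;

final class PathSketch {
    // child -> parent links of a toy "is a" tree (hypothetical data, not WordNet)
    private final Map<String, String> parent = new HashMap<>();

    void addConcept(String concept, String parentConcept) {
        parent.put(concept, parentConcept);
    }

    // Depth = number of "is a" edges between the concept and the root.
    int depth(String concept) {
        int d = 0;
        for (String p = parent.get(concept); p != null; p = parent.get(p)) {
            d++;
        }
        return d;
    }

    // Number of edges on the path from a to b through their lowest common
    // ancestor, or -1 if the two concepts share no ancestor.
    int shortestPath(String a, String b) {
        Map<String, Integer> distFromA = new HashMap<>();
        int d = 0;
        for (String c = a; c != null; c = parent.get(c)) {
            distFromA.put(c, d++);
        }
        d = 0;
        for (String c = b; c != null; c = parent.get(c), d++) {
            Integer da = distFromA.get(c);
            if (da != null) {
                return da + d;
            }
        }
        return -1;
    }

    // Assumed normalization: identical concepts score 1, distant ones approach 0.
    double pathSimilarity(String a, String b) {
        int len = shortestPath(a, b);
        return len < 0 ? 0.0 : 1.0 / (1.0 + len);
    }

    static PathSketch toy() {
        PathSketch t = new PathSketch();
        t.addConcept("animal", "entity");
        t.addConcept("vehicle", "entity");
        t.addConcept("dog", "animal");
        t.addConcept("cat", "animal");
        t.addConcept("car", "vehicle");
        return t;
    }

    public static void main(String[] args) {
        PathSketch t = toy();
        System.out.println("depth(dog) = " + t.depth("dog"));
        System.out.println("path(dog, cat) = " + t.shortestPath("dog", "cat"));
        System.out.println("path(dog, car) = " + t.shortestPath("dog", "car"));
        System.out.printf("sim(dog, cat) = %.3f%n", t.pathSimilarity("dog", "cat"));
    }
}
```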

Reference the work

Mohamed Ben Aouicha, Mohamed Ali Hadj Taieb, Abdelmajid Ben Hamadou: SISR: System for integrating semantic relatedness and similarity measures. Soft Comput. 22(6): 1855-1879 (2018) LINK
