LM4KG: Improving Common Sense Knowledge Graphs with Language Models

This repository contains the code for the ISWC 2020 paper "LM4KG: Improving Common Sense Knowledge Graphs with Language Models".

(Note: Link to paper and citation will be made available upon release of the conference proceedings)

Usage

Running REWEIGHT on a KG requires the following steps:

1. Download the KG you wish to REWEIGHT.
2. Transform the graph to ConceptNet format. Since every KG has its own format, this step requires some manual effort. Examples for WebChild and YAGO are available in sentence_construction/WebChild_to_sentence.py and sentence_construction/yago_to_sentence.py.
3. Transform the graph triples to sentences: sentence_construction/graph_to_sentence.py
4. (Optional) Run a grammar checker on the sentences (the original paper uses https://github.com/awasthiabhijeet/PIE).
5. Generate a perplexity for each sentence with a language model (the original paper uses https://github.com/xu-song/bert-as-language-model).
6. Transform the perplexities to edge scores and feed them back into the graph: graph_reweighting/perplexities_to_scores.py
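
To make the data flow between the sentence step and the scoring step more concrete, here is a minimal Python sketch. It is not the code from this repository: the `TEMPLATES` table, the helper names, and the scoring rule (min-max scaling of the negative log perplexity) are all assumptions chosen for illustration. The scripts listed above define the actual templates and scoring.

```python
import math

# Hypothetical relation templates; the repository's own templates may differ.
TEMPLATES = {
    "/r/IsA": "{start} is a {end}.",
    "/r/UsedFor": "{start} is used for {end}.",
    "/r/AtLocation": "{start} is at {end}.",
}


def term_to_text(uri):
    """Turn a ConceptNet-style term URI such as /c/en/ice_cream into 'ice cream'."""
    return uri.split("/")[3].replace("_", " ")


def triple_to_sentence(relation, start, end):
    """Render one (relation, start, end) triple using its template."""
    template = TEMPLATES[relation]
    sentence = template.format(start=term_to_text(start), end=term_to_text(end))
    # Capitalize only the first character and leave the rest untouched.
    return sentence[0].upper() + sentence[1:]


def perplexities_to_scores(perplexities):
    """Map perplexities to scores in [0, 1]; lower perplexity gives a higher score.

    Assumption: min-max scaling of -log(perplexity). This is an illustrative
    choice, not the formula used in the paper.
    """
    neg_logs = [-math.log(p) for p in perplexities]
    low, high = min(neg_logs), max(neg_logs)
    if high == low:
        # All sentences are equally plausible, so give every edge the same score.
        return [1.0 for _ in neg_logs]
    return [(value - low) / (high - low) for value in neg_logs]
```

Calling `triple_to_sentence("/r/IsA", "/c/en/dog", "/c/en/animal")` yields "Dog is a animal.", which shows why the optional grammar-checking step can help before the sentences are passed to the language model.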

The weighted KGs and KG-enriched embeddings presented in the paper can be downloaded from the following links:

Weighted Knowledge Graphs:
- ConceptNet REWEIGHT: Link will be provided soon.
- ConceptNet REWEIGHT_light: https://dmir.org/LM4KG/conceptnet_REWEIGHTed_light.csv.bz2
Knowledge Graph enriched word embeddings (through retrofitting):
- ConceptNet NumBERTBatch: Link will be provided soon.
- ConceptNet NumBERTBatch_light: https://dmir.org/LM4KG/embeddings_light.h5.bz2
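
The downloads are bz2-compressed. As a small sketch, the compressed CSV can be inspected from Python without unpacking it first. The helper name `peek_bz2_lines` is hypothetical, and no column layout is assumed here, so check the first lines before parsing the file further.

```python
import bz2
from itertools import islice


def peek_bz2_lines(path, n=5):
    """Return up to the first n text lines of a bz2-compressed file."""
    # bz2.open in text mode decompresses on the fly while reading.
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, `peek_bz2_lines("conceptnet_REWEIGHTed_light.csv.bz2")` would return the first five lines of the downloaded graph file. The embeddings archive is presumably binary (judging by its .h5 extension), so it would need to be decompressed and opened with an HDF5-capable library instead.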
