
# Large-scale Complex Question Answering

LC-QuAD v1.0 and v2.0 are large-scale QA datasets of complex questions over knowledge graphs.

## Table of contents

- [LC-QuAD v1](#lc-quad-v1)
- [LC-QuAD v2](#lc-quad-v2)
- [LC-QuAD v2 + QALD-9](#lc-quad-v2--qald-9)
- [References](#references)

## LC-QuAD v1

The Large-scale Complex Question Answering Dataset 1.0 (LC-QuAD 1.0) [1] is a question answering dataset with 5,000 pairs of questions and their corresponding SPARQL queries. The target knowledge base is DBpedia, specifically the April 2016 version. Please see the original paper for details about the dataset creation process and framework.

This dataset can be downloaded via the link.
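
As a quick orientation, the sketch below loads one pair from the released JSON and runs its gold query against the public DBpedia endpoint. This is a minimal sketch, assuming the field names of the public release (`corrected_question`, `sparql_query`) and a hypothetical local file name; it is not an official loader.

```python
# Minimal sketch: inspect one LC-QuAD 1.0 pair and execute its gold SPARQL.
# Assumes the public JSON release (fields "corrected_question" and
# "sparql_query") saved locally as train-data.json (hypothetical path).
import json
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

with open("train-data.json") as f:
    dataset = json.load(f)  # a list of question/query records

entry = dataset[0]
print(entry["corrected_question"])  # natural-language question
print(entry["sparql_query"])        # gold SPARQL query over DBpedia

# The live endpoint no longer matches the April 2016 dump the dataset
# targets, so returned answers can drift from the gold ones.
endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery(entry["sparql_query"])
endpoint.setReturnFormat(JSON)
result = endpoint.query().convert()
# SELECT queries return variable bindings; ASK queries return a boolean.
for binding in result.get("results", {}).get("bindings", []):
    print(binding)
```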

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Accuracy | Language | Reported by |
|---|---|---|---|---|---|---|---|
| T5-Base | 2022 | - | - | 91 | - | EN | Banerjee et al. |
| T5-Small | 2022 | - | - | 90 | - | EN | Banerjee et al. |
| PGN-BERT-BERT | 2022 | - | - | 88 | - | EN | Banerjee et al. |
| mBERT | 2021 | 73 | - | 85.50 | - | EN | Zhou Y. et al. |
| SubQG | 2019 | - | - | 85 | - | EN | Banerjee et al. |
| ValueNet4SPARQL | 2023 | 86 | 84 | 85 | - | EN | Kosten et al. |
| BART | 2022 | - | - | 84 | - | EN | Banerjee et al. |
| Stage-I No Noise | 2022 | 83.11 | 83.04 | 83.08 | - | EN | Purkayastha et al. |
| mBERT | 2021 | - | - | 82.40 | - | DE | Zhou Y. et al. |
| LAMA | 2019 | - | - | 81.60 | - | EN | Radoev et al. |
| mBERT | 2021 | - | - | 80.90 | - | NL | Zhou Y. et al. |
| CompQA | 2018 | - | - | 77 | - | EN | Banerjee et al. |
| mBERT | 2021 | - | - | 76.10 | - | ES | Zhou Y. et al. |
| AQG-net | 2021 | 76 | 75 | 76 | - | EN | Liu et al. |
| HGNet | 2021 | 75.82 | 75.22 | 75.10 | - | EN | Chen et al. |
| SQG | 2018 | - | - | 75 | - | EN | Banerjee et al. |
| O-Ranking | 2021 | 75.54 | 74.95 | 74.81 | - | EN | Chen et al. |
| AQG-net | 2021 | - | - | 74.80 | - | EN | Chen et al. |
| mBERT | 2021 | - | - | 74.50 | - | RU | Zhou Y. et al. |
| mBERT | 2021 | - | - | 74 | - | PT | Zhou Y. et al. |
| mBERT | 2021 | - | - | 73.20 | - | FR | Zhou Y. et al. |
| mBERT | 2021 | - | - | 72.60 | - | RO | Zhou Y. et al. |
| mBERT | 2021 | - | - | 72.30 | - | IT | Zhou Y. et al. |
| DAM | 2021 | - | - | 72 | - | EN | Chen et al. |
| GSM | 2021 | 71 | 73 | 72 | - | EN | Liu et al. |
| mBERT | 2021 | - | - | 71.90 | - | HI_IN | Zhou Y. et al. |
| mBERT | 2021 | - | - | 71.70 | - | FA | Zhou Y. et al. |
| GGNN | 2022 | 66 | 78 | 71 | - | EN | Liu et al. |
| DAM | 2022 | 65 | 77 | 71 | - | EN | Liu et al. |
| Slot-Matching | 2021 | - | - | 71 | - | EN | Chen et al. |
| G Maheshwari et al. Pairwise | 2019 | 66 | 77 | 71 | - | EN | G Maheshwari et al. |
| G Maheshwari et al. Pointwise | 2019 | 65 | 76 | 70 | - | EN | G Maheshwari et al. |
| HR-BiLSTM | 2021 | - | - | 70 | - | EN | Chen et al. |
| S-Ranking | 2021 | 65.89 | 75.30 | 69.53 | - | EN | Chen et al. |
| STAGG | 2021 | - | - | 69 | - | EN | Chen et al. |
| QA Sparql | 2023 | 88 | 56 | 68 | - | EN | Kosten et al. |
| Liang et al. | 2021 | 88 | 56 | 68 | - | EN | Liang et al. |
| PGN-BERT | 2018 | - | - | 67 | - | EN | Banerjee et al. |
| STaG-QA_pre | 2021 | 74.50 | 54.80 | 53.60 | - | EN | Ravishankar et al. |
| KGQAn | 2023 | 58.07 | 47.12 | 52.03 | - | EN | Omar et al. |
| STaG-QA | 2021 | 76.50 | 52.80 | 51.40 | - | EN | Ravishankar et al. |
| sparql-qa | 2021 | 49.50 | 49.20 | 49.10 | - | EN | M. Borroto et al. |
| NLIWOD | 2018 | - | - | 48 | - | EN | Banerjee et al. |
| BART | 2021 | 48.01 | 49.19 | 47.62 | - | EN | Chen et al. |
| SYGMA | 2021 | 47 | 48 | 47 | - | EN | S Neelam et al. |
| NHGG | 2021 | 46.93 | 48.36 | 46.12 | - | EN | Chen et al. |
| WDAqua-core1 | 2021 | 59 | 38 | 46 | - | EN | Liang et al. |
| NSQA | 2023 | 45 | 46 | 45 | - | EN | Kosten et al. |
| NSQA | 2021 | 44.80 | 45.80 | 44.40 | - | EN | Ravishankar et al. |
| Stage-I Part Noise | 2022 | 42.40 | 42.26 | 42.33 | - | EN | Purkayastha et al. |
| Stage-II w/ type | 2022 | 37.03 | 37.06 | 37.05 | - | EN | Purkayastha et al. |
| QASparql | 2021 | - | - | 34 | - | EN | Orogat et al. |
| DTQA | 2021 | 33.94 | 34.99 | 33.72 | - | EN | Abdelaziz et al. |
| QAmp | 2021 | 25 | 50 | 33.33 | - | EN | Purkayastha et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Steinmetz et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Abdelaziz et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Ravishankar et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Kapanipathi et al. |
| Stage-II w/o type | 2022 | 32.17 | 32.20 | 32.18 | - | EN | Purkayastha et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Abdelaziz et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Purkayastha et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Steinmetz et al. |
| WDAqua-core0 | 2021 | 22 | 38 | 28 | - | EN | Ravishankar et al. |
| Stage-I Full Noise | 2022 | 25.54 | 25.64 | 25.59 | - | EN | Purkayastha et al. |
| SINA | 2015 | - | - | 24 | - | EN | Banerjee et al. |
| Frankenstein | 2021 | 20 | 21 | 20 | - | EN | Liang et al. |
| WDAqua-core0 | 2021 | - | - | 15 | - | EN | Orogat et al. |
| AskNow | 2021 | - | - | 11 | - | EN | Orogat et al. |
| Qanary (TM+DP+QB) | 2021 | - | - | 1 | - | EN | Orogat et al. |
| Entity Type Tags Modified | 2022 | - | - | - | 72 | EN | Lin and Lu |
| SPARQL Generator | 2022 | - | - | - | 71.27 | EN | Lin and Lu |
| Diomedi and Hogan | 2022 | - | - | - | 14 | EN | Lin and Lu |
| Yin et al. | 2022 | - | - | - | 8 | EN | Lin and Lu |
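
Most of the systems above are scored with set-based precision, recall, and F1 over the answers returned by the predicted query versus the gold query, macro-averaged over questions. Conventions differ between papers (e.g. how empty answer sets and ASK queries are counted), so the following is only a generic sketch of that style of evaluation, not any single paper's protocol.

```python
# Generic sketch of macro-averaged answer-set P/R/F1 (one common KGQA
# convention; individual papers differ in edge-case handling).
def prf1(gold: set, pred: set) -> tuple[float, float, float]:
    if not gold and not pred:  # both empty: often counted as a perfect hit
        return 1.0, 1.0, 1.0
    if not gold or not pred:
        return 0.0, 0.0, 0.0
    tp = len(gold & pred)      # answers present in both sets
    p = tp / len(pred)
    r = tp / len(gold)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def macro_prf1(pairs):
    """pairs: iterable of (gold_answers, predicted_answers) sets."""
    scores = [prf1(g, s) for g, s in pairs]
    n = len(scores)
    return tuple(sum(col) / n for col in zip(*scores))

# Example: two questions, one answered perfectly, one half right.
print(macro_prf1([({"A", "B"}, {"A", "B"}), ({"A", "B"}, {"A"})]))
```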

## LC-QuAD v2

The Large-scale Complex Question Answering Dataset 2.0 (LC-QuAD 2.0) [2] is a question answering dataset with 30,000 pairs of questions and their corresponding SPARQL queries. The target knowledge bases are Wikidata and DBpedia, specifically the 2018 version. Please see the original paper for details about the dataset creation process and framework.

This dataset can be downloaded via the link.
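
For reference, each record in the released JSON pairs one question with both target queries. The sketch below shows the approximate shape of a record; the field names follow the public release but should be treated as an assumption and checked against the download.

```python
# Illustrative shape of one LC-QuAD 2.0 record (field names as in the
# public JSON release; values abbreviated with "..." placeholders).
entry = {
    "uid": 0,                        # record id
    "question": "...",               # verbalized natural-language question
    "paraphrased_question": "...",   # crowd-sourced paraphrase
    "sparql_wikidata": "...",        # gold SPARQL against Wikidata
    "sparql_dbpedia18": "...",       # gold SPARQL against DBpedia (2018)
    "template_id": 1,                # id of the generating template
}
```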

### Leaderboard for systems which require gold entity and/or relation as input

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| T5-Small | 2022 | - | - | 92 | EN | Banerjee et al. |
| T5-Base | 2022 | - | - | 91 | EN | Banerjee et al. |
| SGPT_Q,K [1] | 2022 | - | - | 89.04 | EN | Al Hasan Rony et al. |
| PGN-BERT-BERT | 2022 | - | - | 86 | EN | Banerjee et al. |
| PGN-BERT | 2022 | - | - | 77 | EN | Banerjee et al. |
| NSpM [2] | 2022 | - | - | 66.47 | EN | Al Hasan Rony et al. |
| BART | 2022 | - | - | 64 | EN | Banerjee et al. |
| Zou et al. + BERT | 2021 | - | - | 59.30 | EN | Zou et al. |
| CLC | 2021 | - | - | 59 | EN | Banerjee et al. |
| Multi-hop QGG | 2020 | - | - | 53 | EN | Banerjee et al. |
| Zou et al. + Tencent Word | 2021 | - | - | 52.90 | EN | Zou et al. |
| Multi-hop QGG | 2021 | - | - | 52.60 | EN | Zou et al. |
| AQG-net | 2021 | - | - | 44.90 | EN | Zou et al. |
- [1][2]: Token-wise matching of the query string is performed; answers are not fetched from the KG (see the sketch below).
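
In other words, these F1 scores compare the generated query text against the gold query text rather than executed answers. A minimal sketch of one plausible token-wise comparison (not necessarily the authors' exact script):

```python
# One plausible token-wise query match: compare the generated SPARQL
# string with the gold string token by token, without querying the KG.
from collections import Counter

def token_f1(gold_query: str, pred_query: str) -> float:
    gold = Counter(gold_query.split())
    pred = Counter(pred_query.split())
    overlap = sum((gold & pred).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(gold.values())
    return 2 * precision * recall / (precision + recall)
```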

### Leaderboard for systems which do not require gold entity and/or relation as input

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| SGPT_Q [3] | 2022 | - | - | 83.45 | EN | Al Hasan Rony et al. |
| ChatGPT | 2023 | - | - | 42.76 | EN | Tan et al. |
| GPT-3.5v3 | 2023 | - | - | 39.04 | EN | Tan et al. |
| GPT-3.5v2 | 2023 | - | - | 33.77 | EN | Tan et al. |
| GPT-3 | 2023 | - | - | 33.04 | EN | Tan et al. |
| FLAN-T5 | 2023 | - | - | 30.14 | EN | Tan et al. |
| ElNeuQA-ConvS2S [1] | 2021 | 26.90 | 27 | 26.90 | EN | Diomedi and Hogan |
| GETT-QA [4] | 2023 | 40.3 | - | - | EN | Banerjee et al. |
| UNIQORN | 2021 | 33.1 | - | - | EN | Pramanik et al. |
| QAnswer | 2020 | 30.80 | - | - | EN | Pramanik et al. |
| GraftNet | 2018 | 19.7 | - | - | EN | Christmann P. et al. |
| GRAFT-Net + Clocq [2] | 2022 | 19.70 | - | - | EN | Christmann P. et al. |
| Platypus | 2018 | 3.6 | - | - | EN | Pramanik et al. |
| Pullnet | 2019 | 1.1 | - | - | EN | Pramanik et al. |
| UNIK-QA | 2020 | 0.5 | - | - | EN | Pramanik et al. |
- [1]: Diomedi and Hogan discarded 2,502 (8.2%) of the 30,226 instances due to quality issues.
- [2]: Evaluated on 2k dev and 8k test splits of more complex questions drawn from the original LC-QuAD 2.0.
- [3]: Token-wise matching of the query string is performed; answers are not fetched from the KG.
- [4]: With truncated KG embeddings.

## LC-QuAD v2 + QALD-9

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| mBERT | 2021 | - | - | 70 | PT_BR | Zhou Y. et al. |
| mBERT | 2021 | - | - | 66.7 | EN | Zhou Y. et al. |
| mBERT | 2021 | - | - | 65.9 | NL | Zhou Y. et al. |
| mBERT | 2021 | - | - | 63.6 | FR | Zhou Y. et al. |
| mBERT | 2021 | - | - | 63.5 | RU | Zhou Y. et al. |
| mBERT | 2021 | - | - | 63.5 | PT | Zhou Y. et al. |
| mBERT | 2021 | - | - | 62.6 | HI_IN | Zhou Y. et al. |
| mBERT | 2021 | - | - | 62.2 | DE | Zhou Y. et al. |
| mBERT | 2021 | - | - | 62.1 | RO | Zhou Y. et al. |
| mBERT | 2021 | - | - | 60 | FA | Zhou Y. et al. |
| mBERT | 2021 | - | - | 58.8 | ES | Zhou Y. et al. |
| mBERT | 2021 | - | - | 57.7 | IT | Zhou Y. et al. |
- All rows: trained on LC-QuAD 1.0 and tested on data combining QALD-4 through QALD-9, with some out-of-scope questions filtered out.

## References

[1] Trivedi, Priyansh, Gaurav Maheshwari, Mohnish Dubey, and Jens Lehmann. "LC-QuAD: A Corpus for Complex Question Answering over Knowledge Graphs." In International Semantic Web Conference, pp. 210-218. Springer, Cham, 2017.

[2] Dubey, Mohnish, Debayan Banerjee, Abdelrahman Abdelkawi, and Jens Lehmann. "LC-QuAD 2.0: A Large Dataset for Complex Question Answering over Wikidata and DBpedia." In International Semantic Web Conference, pp. 69-78. Springer, Cham, 2019.

Go back to the README