KQA Pro

KQA Pro^[1] is a large-scale dataset for Complex KBQA. It defines a compositional and highly interpretable formal format, named Program, to represent the reasoning process behind complex questions. Compositional strategies generate questions, together with their corresponding SPARQL queries and Programs, from a small number of templates. The generated questions are then paraphrased into natural language questions (NLQ) by crowdsourcing, giving rise to around 120K diverse instances. SPARQL and Program are two complementary ways of answering complex questions, so the dataset can benefit a wide range of QA methods. Besides the QA task, this dataset can also serve the semantic parsing task. In addition, it is currently the largest corpus of NLQ-to-SPARQL and NLQ-to-Program pairs.
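To make the idea of a compositional program more concrete, here is a minimal Python sketch of a question being answered by chaining small functions over a toy knowledge base. Everything in it is an assumption made for illustration: the operation names (`find`, `relate`, `query_attr`), the `(operation, argument)` step format, and the toy data are hypothetical and are not taken from the dataset's actual Program definition.

```python
# Hypothetical toy knowledge base: entity -> {relation or attribute: value}.
# The entities and values are made up for this sketch.
TOY_KB = {
    "Alice": {"born_in": "Springfield"},
    "Springfield": {"population": 1000},
}


def find(kb, _state, name):
    """Start the reasoning chain from a named entity."""
    if name not in kb:
        raise KeyError(f"unknown entity: {name}")
    return name


def relate(kb, entity, relation):
    """Follow a relation from the current entity to another entity."""
    return kb[entity][relation]


def query_attr(kb, entity, attribute):
    """Read an attribute value of the current entity."""
    return kb[entity][attribute]


# Illustrative operation table; not the dataset's real function set.
OPS = {"find": find, "relate": relate, "query_attr": query_attr}


def run_program(kb, program):
    """Execute steps in order, feeding each step's output to the next."""
    state = None
    for op_name, arg in program:
        state = OPS[op_name](kb, state, arg)
    return state


# "What is the population of the place where Alice was born?"
program = [
    ("find", "Alice"),
    ("relate", "born_in"),
    ("query_attr", "population"),
]
answer = run_program(TOY_KB, program)
print(answer)
```

Because every step is a named function with an explicit argument, the intermediate result after each step can be inspected, which is the sense in which such a representation is interpretable.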

This dataset can be downloaded via the link.

Leaderboard

The authors maintain their own leaderboard, which accepts submissions.

References

[1] Shi, Jiaxin, Shulin Cao, Liangming Pan, Yutong Xiang, Lei Hou, Juanzi Li, Hanwang Zhang, and Bin He. KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base. arXiv preprint arXiv:2007.03875 (2020).

Go back to the README
