The E2E Dataset, packaged as a PyTorch Dataset subclass
Updated Jul 12, 2018 - Python
T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Datasets with text data for use in NLP, text analysis, information extraction, and ML research.
RNNLG is an open-source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from the Cambridge Dialogue Systems Group under the Apache License 2.0.
EMNLP 2019: Generating Personalized Recipes from Historical User Preferences
EMNLP 2018. Learning to Describe Differences Between Pairs of Similar Images. Harsh Jhamtani, Taylor Berg-Kirkpatrick.
Collection of text transcripts of speeches delivered by the Prime Minister of India, Mr. Narendra Modi.
Content structuring for NLG with discourse dependency trees.
Dataset associated with the paper "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation".
The French summarization dataset introduced in "BARThez: a Skilled Pretrained French Sequence-to-Sequence Model".
Harsh Jhamtani*, Varun Gangal*, Eduard Hovy, Graham Neubig, Taylor Berg-Kirkpatrick. Learning to Generate Move-by-Move Commentary for Chess Games from Large-Scale Social Forum Data. ACL 2018
PyTorch code for ACL 2022 paper: RoMe: A Robust Metric for Evaluating Natural Language Generation https://aclanthology.org/2022.acl-long.387/
A repository for ConceptFR project files.
A Constrained Text Generation Challenge Towards Generative Commonsense Reasoning
Repository for the LREC-COLING 2024 Paper: Persona-Based Corpus in the Diabetes Mellitus Domain – Applying a Human-Centered Approach to a Low-Resource Context