Fine-tuning Seq2Seq models to generate titles and abstracts using Arxiv data.

pbmstrk/arxiv-generator

Arxiv Generator

Generating titles and abstracts using Seq2Seq Models

Quick Start

The fine-tuned models are available on the Hugging Face Model Hub and can be loaded like any other Hugging Face model:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Title prediction (abstract -> title)
tokenizer = AutoTokenizer.from_pretrained('pbmstrk/t5-large-arxiv-abstract-title')
model = AutoModelForSeq2SeqLM.from_pretrained('pbmstrk/t5-large-arxiv-abstract-title')

# Abstract prediction (title -> abstract).
# This rebinds tokenizer and model, so load only the pair you need.
tokenizer = AutoTokenizer.from_pretrained('pbmstrk/t5-large-arxiv-title-abstract')
model = AutoModelForSeq2SeqLM.from_pretrained('pbmstrk/t5-large-arxiv-title-abstract')
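
Once a model and tokenizer are loaded, generation goes through the standard Transformers `generate` API. The helper below is a minimal sketch, not code from this repository: the name `generate_text`, the 512-token input limit, the beam width, and the output length are all assumptions, and it assumes the models need no task prefix on the input text.

```python
def generate_text(model, tokenizer, text, max_new_tokens=64, num_beams=4):
    """Generate one output string for one input string.

    Hypothetical helper: the defaults are illustrative, not the settings
    used to train or evaluate the published models.
    """
    # Tokenize to PyTorch tensors; inputs longer than 512 tokens are truncated
    # (512 is an assumed limit, adjust it to the model you load).
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Beam search over the decoder; returns a batch of token-id sequences.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    # Decode the first (and only) sequence in the batch back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the title model and its tokenizer loaded as above, `generate_text(model, tokenizer, abstract)` returns a candidate title for the string `abstract`; with the abstract model, pass a title instead.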

Interactive Dashboard

You can also run inference through the interactive dashboard. Make sure Streamlit is installed, then run:

streamlit run app.py

Streamlit then starts a local server and serves the dashboard in your browser.

Fine-tuning your own Models

Use the finetune.py script to fine-tune Seq2Seq models. To install the necessary dependencies, run:

pip install ".[scripts]"

Arguments are handled by Hydra and can be changed in the config file or overridden on the command line.
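
Hydra overrides are passed as `key=value` arguments after the script name, with dots selecting nested keys, and they take precedence over the values in the config file. The sketch below illustrates that precedence in plain Python. It is not Hydra's implementation, and the keys `model.name` and `training.batch_size` are hypothetical stand-ins for whatever the repository's config file defines.

```python
from copy import deepcopy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    merged = deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = merged
        # Walk (or create) the nested dictionaries named by the dotted path.
        for name in parents:
            node = node.setdefault(name, {})
        # Values stay strings in this sketch; Hydra itself parses them into types.
        node[leaf] = raw_value
    return merged


# Hypothetical defaults standing in for the repository's config file.
defaults = {"model": {"name": "t5-small"}, "training": {"batch_size": "8"}}
config = apply_overrides(defaults, ["model.name=t5-large", "training.batch_size=4"])
print(config)
```

On the command line, the equivalent call would take the form `python finetune.py model.name=t5-large`, again with a hypothetical key name.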

To load a model checkpoint:

from arxiv_generator import Seq2SeqGenerator
model = Seq2SeqGenerator.load_from_checkpoint(checkpoint_path="path/to/checkpoint")

Details

The models were fine-tuned on abstract-title pairs extracted from the Arxiv Dataset. The arxiv_generator module includes the ArxivDataset class to enable easier use of the dataset.
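
To make the pair extraction concrete, here is a minimal sketch of pulling abstract-title pairs out of metadata records. It is not the `ArxivDataset` class and makes no claim about its interface. It assumes the dataset is stored as JSON lines with `title` and `abstract` fields, and the function name `iter_abstract_title_pairs` and the sample records are made up for illustration.

```python
import json


def iter_abstract_title_pairs(lines):
    """Yield (abstract, title) pairs from JSON-lines metadata records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        title = record.get("title")
        abstract = record.get("abstract")
        # Skip records that lack either field.
        if not title or not abstract:
            continue
        # Collapse hard line wraps and repeated spaces into single spaces.
        yield " ".join(abstract.split()), " ".join(title.split())


# Made-up records in the assumed format, for illustration only.
sample = [
    '{"title": "A  Toy\\n Title", "abstract": "  First line\\nsecond line. "}',
    '{"title": "No abstract here"}',
]
pairs = list(iter_abstract_title_pairs(sample))
print(pairs)
```

For a file on disk, pass the open file object as `lines`. The generator reads one record at a time, so the full dataset never has to fit in memory.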
