GitHub - spoluan/MEDIUM_pred_next_words: This project aims to predict the next words in a sentence using a language model trained on the Medium dataset, specifically focusing on generating likely sentences based on the initial words of a Medium post title entered in the search bar.

Project Description

The Medium dataset is a collection of post titles from the Medium website, written by many authors and covering topics such as technology, business, and politics. It is publicly available on Kaggle at https://www.kaggle.com/datasets/nulldata/medium-post-titles. This project uses a language model to predict the next words in a sentence. Specifically, given the first few words of a post title, as a user might type them into a search bar, the model generates the most likely completions of that title.
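
To make the task concrete, here is a minimal sketch of title completion using a bigram count model. This is a stand-in technique chosen for brevity, not the model used in the project's notebook. The function names (`train_bigram_model`, `complete_title`) and the toy titles are hypothetical and are not taken from the repository or the Kaggle dataset.

```python
from collections import Counter, defaultdict


def train_bigram_model(titles):
    """Count, for each word, which words follow it across all titles."""
    followers = defaultdict(Counter)
    for title in titles:
        words = title.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            followers[current_word][next_word] += 1
    return followers


def complete_title(followers, prefix, max_new_words=5):
    """Greedily extend the prefix with the most frequent follower of its last word."""
    words = prefix.lower().split()
    for _ in range(max_new_words):
        # .get() avoids creating empty entries in the defaultdict.
        candidates = followers.get(words[-1]) if words else None
        if not candidates:
            break  # No known continuation for this word.
        # On equal counts, most_common returns the word encountered first.
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical titles for illustration only; not taken from the Kaggle dataset.
toy_titles = [
    "how to learn python fast",
    "how to learn machine learning",
    "how to learn python for data science",
]
model = train_bigram_model(toy_titles)
suggestion = complete_title(model, "how to", max_new_words=3)
print(suggestion)
```

With these toy titles, the prefix "how to" is extended to "how to learn python fast". A greedy bigram model only looks at the last word typed, so a model that conditions on the whole prefix would likely produce better suggestions on the real dataset.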

Repository files:
datasets
MEDIUM_pred_next_word.ipynb
README.md
word_cloud.png
