We will use n-gram language modeling to generate poetry, using the spaCy library for text processing. A poem consists of three stanzas, each containing four verses, where each verse consists of 7 to 10 words.
The generative models are trained on the provided Poetry Corpus, which contains poems by Faiz, Ghalib and Iqbal. We will train unigram and bigram models using this corpus.
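As a rough illustration of what training these models amounts to, the sketch below counts unigrams and bigrams over a list of verses. The function names, the `<s>` start marker and the toy verses are assumptions made for this example, not part of the project. The `tokenize` helper uses a spaCy pipeline when one is passed in and falls back to whitespace splitting so that the sketch runs without spaCy installed.

```python
from collections import Counter, defaultdict


def tokenize(line, nlp=None):
    """Split one verse into tokens.

    `nlp` is an optional spaCy pipeline; without it we fall back to
    whitespace splitting (an assumption made to keep the sketch standalone).
    """
    if nlp is None:
        return line.split()
    return [tok.text for tok in nlp(line) if not tok.is_space]


def train_unigram(verses, nlp=None):
    """Count how often each token occurs in the corpus."""
    counts = Counter()
    for verse in verses:
        counts.update(tokenize(verse, nlp))
    return counts


def train_bigram(verses, nlp=None):
    """Map each token to a Counter of the tokens that follow it."""
    counts = defaultdict(Counter)
    for verse in verses:
        # "<s>" is a hypothetical start-of-verse marker.
        tokens = ["<s>"] + tokenize(verse, nlp)
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


# Toy stand-in for the Poetry Corpus (not real corpus data).
toy_verses = [
    "the night is long",
    "the night is quiet",
    "the dawn is near",
]
unigrams = train_unigram(toy_verses)
bigrams = train_bigram(toy_verses)
```

To tokenize with spaCy, a pipeline can be passed as `nlp`, for example one created with `spacy.blank("xx")`, spaCy's multi-language class. Which language class suits the actual corpus is left open here.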
Several solutions for English that use the NLTK library are available online. However, we will use the spaCy library for this task!
Language Models Used
- Unigram Model
- Bigram Model
- Backward Bigram Model
- Trigram Model
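
The models listed above differ mainly in how much context they condition on. The sketch below is one possible way to cover all four with a single helper: `n` sets the order (1, 2 or 3), and `backward=True` trains on reversed verses. It assumes that "backward bigram" means predicting a word from the word that follows it, so a verse is generated from its last word to its first and then reversed. All names, the `<s>` marker and the toy verses are hypothetical. The poem layout (three stanzas of four verses, 7 to 10 words per verse) follows the description at the top.

```python
import random
from collections import Counter, defaultdict


def train_ngram(token_lists, n, backward=False):
    """Map each (n-1)-token context to a Counter of next tokens."""
    model = defaultdict(Counter)
    for tokens in token_lists:
        tokens = list(tokens)
        if backward:
            tokens.reverse()  # learn right-to-left dependencies
        padded = ["<s>"] * (n - 1) + tokens
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            model[context][padded[i]] += 1
    return model


def generate_verse(model, n, rng, min_words=7, max_words=10, backward=False):
    """Sample one verse of min_words..max_words tokens from the model.

    Assumes the model was trained on at least one non-empty verse.
    """
    length = rng.randint(min_words, max_words)
    words = []
    history = ["<s>"] * (n - 1)
    while len(words) < length:
        context = tuple(history[len(history) - (n - 1):])
        options = model.get(context)
        if not options:
            # Unseen context (e.g. the end of a training verse):
            # restart from the start-of-verse context.
            history = ["<s>"] * (n - 1)
            options = model[tuple(history)]
        choices, weights = zip(*options.items())
        word = rng.choices(choices, weights=weights)[0]
        words.append(word)
        history.append(word)
    if backward:
        words.reverse()  # restore left-to-right reading order
    return " ".join(words)


def generate_poem(model, n, stanzas=3, verses=4, seed=None, backward=False):
    """Return a poem as a list of stanzas, each a list of verse strings."""
    rng = random.Random(seed)
    return [
        [generate_verse(model, n, rng, backward=backward) for _ in range(verses)]
        for _ in range(stanzas)
    ]


# Toy stand-in for tokenized corpus verses (not real corpus data).
toy = [v.split() for v in (
    "the night is long and the road is dark",
    "the dawn is near and the heart is light",
    "a song is heard where the river bends",
)]

bigram = train_ngram(toy, 2)
poem = generate_poem(bigram, 2, seed=0)
for stanza in poem:
    print("\n".join(stanza))
    print()
```

Sampling in proportion to raw counts is only one choice. Smoothing or a rhyme constraint on the final word could be layered on top, and the backward model makes the latter easier because the final word is chosen first.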