This project presents a Natural Language Processing (NLP) text-completion model built on a Markov chain. The model is trained on the entire corpus of stories written by Munshi Premchand, one of the most celebrated writers in Hindi literature, and it generates sentences in Hindi. The model is purely probabilistic, and all of the modelling is implemented manually.
- Language Completion: The model is designed to generate natural language completions based on the patterns and structures observed in Premchand's stories.
- Markov Chain Model: Utilizes the Markov Chain model to capture the probabilistic relationships between words, enabling context-aware completion suggestions.
- Corpus: The training corpus consists of a comprehensive collection of Premchand's stories, providing rich linguistic context for accurate language generation.
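
To make the idea of "probabilistic relationships between words" concrete, here is a minimal sketch of a first-order Markov chain over words. It is an illustration only, not the notebook's actual code: the function names `build_transitions` and `transition_probabilities` and the three-sentence toy corpus are assumptions made for this example, and the real model may use a different order or tokenization.

```python
from collections import Counter, defaultdict


def build_transitions(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()  # naive whitespace tokenization (assumption)
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts


def transition_probabilities(counts):
    """Normalize raw follower counts into P(next word | current word)."""
    probs = {}
    for word, followers in counts.items():
        total = sum(followers.values())
        probs[word] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Toy corpus for illustration only; these sentences are not from the dataset.
toy_corpus = ["वह घर गया", "वह घर आया", "वह नगर गया"]
probs = transition_probabilities(build_transitions(toy_corpus))
print(probs["वह"])  # distribution over the words that follow "वह"
```

In this toy corpus "घर" follows "वह" in two of three sentences, so it receives twice the probability of "नगर". A larger corpus yields the same kind of table with many more entries.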
- Training the Model: Run all cells except the last two to train the completion model on the Premchand stories corpus.
- Generating Completions: After training, modify and run the last two cells to generate new sentences that sound similar to Premchand's writing.
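
The generation step can be pictured as a weighted random walk over the learned transition table. The sketch below is hypothetical and does not reproduce the notebook's last two cells: the `generate` function, its `seed_word` and `max_words` parameters, and the tiny hand-written `toy_transitions` table are all assumptions for illustration.

```python
import random


def generate(transitions, seed_word, max_words=10, rng=None):
    """Walk the chain from seed_word, sampling each next word by its count."""
    rng = rng or random.Random()
    words = [seed_word]
    while len(words) < max_words:
        followers = transitions.get(words[-1])
        if not followers:  # no known continuation, so stop early
            break
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)


# Hand-written toy table (word -> follower counts), not learned from the corpus.
toy_transitions = {
    "वह": {"घर": 2, "नगर": 1},
    "घर": {"गया": 1, "आया": 1},
    "नगर": {"गया": 1},
}
sentence = generate(toy_transitions, "वह", rng=random.Random(0))
print(sentence)
```

Changing the seed word or the random seed produces different sentences, which is roughly what modifying the final cells before rerunning them would do.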
This project is licensed under the MIT License.
Special thanks to the Premchand Corpus dataset on Kaggle: https://www.kaggle.com/datasets/amankhandelia/premchand-corpus
Feel free to explore the capabilities of the Premchand NLP Completion Model, and we appreciate your involvement in enhancing and expanding this project!