This repository has been archived by the owner on Dec 19, 2023. It is now read-only.
A library of implementations of published stemming methods for the Bengali language.

banglakit/bengali-stemmer

BanglaKit Bengali Stemmer

A stemmer is a lightweight approach to finding root words that avoids expensive morphological analysis. The BanglaKit Stemmer implements a stepwise approach to removing inflections from Bengali words [1].
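As a rough illustration of what "stepwise" suffix removal means, here is a minimal sketch. The suffix groups, their order, the `MIN_STEM_LENGTH` guard, and the function names are all assumptions made for this example; they are not the rule set from [1] and not the code shipped in this library.

```python
import unicodedata

# Hypothetical suffix groups, applied one step after another.
# These lists are illustrative only, not the rules from [1].
SUFFIX_STEPS = [
    ["দের", "ের", "কে", "তে", "র"],      # step 1: case endings
    ["গুলো", "গুলি", "রা", "টি", "টা"],   # step 2: plural and determiner markers
]

# Assumed guard: never leave a stem shorter than this many code points.
MIN_STEM_LENGTH = 2


def strip_one(word, suffixes):
    """Remove at most one suffix from the group, trying the longest first."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def stem(word):
    """Apply every step in order; each step strips at most one suffix."""
    # Normalize so that composed and decomposed vowel signs compare equal.
    word = unicodedata.normalize("NFC", word)
    for suffixes in SUFFIX_STEPS:
        word = strip_one(word, suffixes)
    return word


if __name__ == "__main__":
    for example in ["বইগুলোর", "ছেলেদের", "ঘর"]:
        print(example, "->", stem(example))
```

The order of the steps matters: in this sketch the case ending is removed before the plural marker, so a word carrying both loses them one layer at a time. The length guard keeps short words that merely end in a suffix-like letter from being truncated.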

Work on the stemmer's algorithms is still in progress, so the implementations may vary significantly from version to version.

Algorithms

Rafi Kamal's Stemmer

Originally developed by Rafi Kamal and ported to Python:

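# Import the Python port of Rafi Kamal's stemmer
from bengali_stemmer.rafikamal2014 import RafiStemmer
stemmer = RafiStemmer()
# Stem a single inflected word and show the result
print(stemmer.stem_word('বাংলায়'))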
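To stem more than one word, the call above can be wrapped in a small helper. The sketch below relies only on the `stem_word` method shown in this README; the `stem_tokens` name and the whitespace tokenization are assumptions made for illustration, and punctuation such as the danda (।) would stay attached to its token.

```python
def stem_tokens(stemmer, text):
    """Split text on whitespace and stem each token via stemmer.stem_word."""
    return [stemmer.stem_word(token) for token in text.split()]


if __name__ == "__main__":
    try:
        from bengali_stemmer.rafikamal2014 import RafiStemmer
    except ImportError:
        # The library is archived and may not be installed in this environment.
        print("bengali_stemmer is not installed")
    else:
        print(stem_tokens(RafiStemmer(), "বাংলায় কথা বলি"))
```

Passing the stemmer in as an argument keeps the helper independent of which algorithm is used, as long as the object exposes a `stem_word` method.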

Mahmud's Stemmer

Originally implemented by M. R. Mahmud, M. Afrin, M. A. Razzaque, E. Miller and J. Iwashige [1], and ported to Python. This stemmer is under development; as of now, only verb stemming has been implemented. It is provided as the mahmud2014 package under banglakit.

References

[1] M. R. Mahmud, M. Afrin, M. A. Razzaque, E. Miller and J. Iwashige, "A rule based bengali stemmer," 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI), New Delhi, 2014, pp. 2750-2756. doi: 10.1109/ICACCI.2014.6968484
