vdragan1993/serbian-stemmer

Serbian Stemmer

A stemmer for the Serbian language, built on the Croatian stemmer from the Natural Language Processing group at the Faculty of Humanities and Social Sciences, University of Zagreb.

The stemmer works on text in the Serbian Latin alphabet. It replaces special characters (č -> c, ć -> c, dž -> dz, đ -> dj, š -> s, ž -> z) and returns lowercase output.
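
The character replacement described above can be illustrated with a minimal sketch. This is not the code from serbian_stemmer.py: the names `REPLACEMENTS` and `normalize` are hypothetical, and the NFC normalization step is an added assumption so that letters typed with combining diacritics are matched as well.

```python
import unicodedata

# Hypothetical mapping that mirrors the replacements listed in this README.
# It is an illustration only, not the table used by serbian_stemmer.py.
REPLACEMENTS = {"č": "c", "ć": "c", "đ": "dj", "š": "s", "ž": "z"}


def normalize(text):
    """Lowercase the text and replace Serbian Latin special characters."""
    # Assumption: compose characters first so "c" + combining caron becomes "č".
    text = unicodedata.normalize("NFC", text).lower()
    # Handle the digraph explicitly, although the ž -> z rule alone
    # would already turn "dž" into "dz".
    text = text.replace("dž", "dz")
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(normalize("Čaša"))
    print(normalize("Đak"))
```

Because the output uses plain ASCII letters only, any stems returned by the stemmer should be compared against text that has been normalized the same way.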

In general, the stemmer works best on adjectives and nouns.

Usage

Download the rules.txt, transformations.txt and serbian_stemmer.py files and put them in the same directory.

from serbian_stemmer import stem

output_text = stem(input_text)

The stemmer is written in Python 3.4.
