Command line option to parse text #46

Open
oggy22 opened this issue Feb 15, 2017 · 0 comments

oggy22 (Owner) commented Feb 15, 2017

There should be a command line option that takes an input file. The input file will contain text in Serbian (or potentially any other supported language). The system will try to parse the text and report the following stats:

  • the number of recognized/unrecognized words (potentially listing them in output files)
  • the longest successfully parsed phrase, measured in characters
  • the longest successfully parsed phrase, measured in words

As new words are added, these stats should gradually improve.
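To make the requested stats concrete, here is a rough Python sketch. It is not the project's parser: `KNOWN_WORDS`, `text_stats`, and the `--parse-file` flag are hypothetical names, and a "successfully parsed phrase" is approximated as a run of consecutive recognized words within one sentence. A real implementation would call the actual grammar instead.

```python
import argparse
import re
from pathlib import Path

# Hypothetical stand-in for the project's lexicon (assumption, not the real word list).
KNOWN_WORDS = {"ja", "sam", "dobar", "čovek"}

WORD_RE = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def text_stats(text, known_words):
    """Count recognized/unrecognized words and find the longest recognized runs.

    Assumption: a "parsed phrase" is a run of recognized words that is not
    interrupted by an unrecognized word or a sentence boundary.
    """
    recognized, unrecognized = [], []
    best_by_words, best_by_chars = [], []
    for sentence in re.split(r"[.!?\n]+", text):
        run = []
        for word in WORD_RE.findall(sentence):
            if word.lower() in known_words:
                recognized.append(word)
                run.append(word)
                if len(run) > len(best_by_words):
                    best_by_words = list(run)
                if len(" ".join(run)) > len(" ".join(best_by_chars)):
                    best_by_chars = list(run)
            else:
                unrecognized.append(word)
                run = []  # an unknown word ends the current phrase
    return {
        "recognized": len(recognized),
        "unrecognized": len(unrecognized),
        "unrecognized_words": sorted(set(unrecognized)),
        "longest_by_words": " ".join(best_by_words),
        "longest_by_chars": " ".join(best_by_chars),
    }


def main(argv=None):
    """Parse the (hypothetical) --parse-file option and print the stats."""
    parser = argparse.ArgumentParser(description="Report parsing stats for a text file")
    parser.add_argument("--parse-file", required=True, type=Path,
                        help="UTF-8 text file to analyze")
    args = parser.parse_args(argv)
    stats = text_stats(args.parse_file.read_text(encoding="utf-8"), KNOWN_WORDS)
    for key, value in stats.items():
        print(f"{key}: {value}")
    return stats
```

Calling `main(["--parse-file", "input.txt"])` would print one line per stat. The unrecognized words are returned as a sorted list so they could be written to an output file as suggested above.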

Extra: create a service that regularly downloads text from http://www.politika.rs, parses it, and stores the results.
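One possible shape for such a service, as a minimal Python sketch only. The names `fetch_text`, `run_once`, and `run_forever`, the JSON-lines results file, and the daily interval are all assumptions, and `analyze` stands for whatever function ends up producing the stats. Extracting article text from the downloaded HTML is left out.

```python
import json
import time
from datetime import datetime, timezone
from urllib.request import urlopen


def fetch_text(url):
    """Download a page and return its body as text (HTML is not stripped here)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def run_once(url, analyze, results_path, fetch=fetch_text):
    """Fetch one text, analyze it, and append a timestamped JSON record."""
    record = {
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "url": url,
        "stats": analyze(fetch(url)),
    }
    with open(results_path, "a", encoding="utf-8") as out:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def run_forever(url, analyze, results_path, interval_seconds=24 * 60 * 60):
    """Repeat run_once at a fixed interval (hypothetical default: once a day)."""
    while True:
        run_once(url, analyze, results_path)
        time.sleep(interval_seconds)
```

Appending one JSON record per run keeps a history, so the gradual change in the stats can be tracked over time. `fetch` is a parameter so the loop can be exercised without network access.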
