- Chat feature for MWE.
- Improve chat UI.
- MWE and text analysis improvements.
- Chat feature for text analysis.
- Support for all Penn Treebank POS tags.
- Bar plots for POS tags (in addition to wordclouds).
- Remove deprecated fasttext model.
- Automatic check and download of NLTK missing resources.
- Remove CI step for downloading NLTK resources.
- Simplify plot configuration for Text & Label Analysis by adding new, clearer arguments.
- Fix minor bugs in bias analysis.
- Improve fonts and minor details in bias analysis plots.
- Add bias detection and analysis feature (based on sentiment analysis).
- Include 3 bias categories: race, religion, and gender.
- Include an initial set of key terms for each bias category.
- Add function to visualize bias analysis in a plot.
- Add function to visualize bias analysis in a table.
- Complete refactoring and upgrading of the MWE module.
- Support for extracting variable-length MWEs given custom user-defined syntactic patterns of POS tags.
- Predefined patterns for extracting Light Verb Constructions (LVCs), 2-3 word Noun Compounds, 2-3 word Adjective-Noun Compounds, 2-3 word Verb-Noun Compounds, and Verb Particle Constructions (VPCs).
- Refactoring of the Association Measure module.
- Move DataFrame reader to a separate preprocessing module so that it can support all modules more easily.
- Add support for extracting ngrams for MWEs, and for ngram analysis.
- Better encapsulation.
- Overall improvement of docstrings and fixes for several inconsistencies in them.
- Allow quite a few plot configurations via kwargs.
- Remove old code from the demo notebook.
- Change cover.
- Optimize figure creation.
- Update the mypy and black versions in the pre-commit hooks.
- Support for extracting variable-length MWEs given a user-defined pattern of POS tags.
- Change newline encoding to support multiline text in the GitHub release body.
- Test description.
- Test description2.
- Test description3.
- Test description4.
- Test description.
- Test description2.
- Update the awk/sed parser to correctly read the release body.
- Fix missing multiline description in GitHub release using printf.
- Fix missing multiline description in GitHub release.
- Add action for CD.
- Publish to PyPI and GitHub Releases on version bump.
- Improve the CD workflow to ensure checks are passed and the merge is successful.
- Check for release notes, otherwise do not publish.
- Improve MWE functionalities.
- Fix fasttext issues.
- Remove support for Python 3.11 (for now).
- Make POS wordclouds configurable.
- Remove POS wordclouds and distplots as fields and expose them through function calls instead, for better data encapsulation.
- Upgrade wordcloud version to latest to avoid build failure.
- Upgrade pandas and scikit-learn versions.
- Major refactoring with semi-stable features (see below) and their documentation.
- Exploratory Data Analysis.
- Doc level Label Analysis.
- Clustering.
- Preprocessing functions.
- Partial MWEs.
- Tests.
- Initial release with major Exploratory Data Analysis, MWEs, and Preprocessing features.
- Initial documentation.