Changelog

Changelog for nlp_profiler.

0.0.1

GitHub branch test-python-3.6-compatibility enable support for Python 3.6

Based on the issue raised on GitHub #1

b2a002a - 4f117a6 @neomatrix369 Wed Sep 16 17:15:29 2020 +0100


GitHub branch create-test-cases write tests to verify implementation and for test coverage

6bdc799 - 4c49ae5 @neomatrix369 Thu Sep 17 17:27:14 2020 +0100


GitHub branch add-progress-bars add progress bars to the various levels of transformation for better UX/UI experience

Based on GitHub issue #3. This implements progress bars only at the first and second levels of iteration; the third level (row/record level) is still pending.

a83bc23 - 7c72b0e @neomatrix369 Thu Sep 17 19:50:30 2020 +0100


GitHub branch add-progress-bars add progress bars to the various levels of transformation for better UX/UI experience

Continuing from the changes above, the third-level progress bar (row-level progress) is now in place.

7c72b0e - c3ada30 @neomatrix369 Fri Sep 18 13:44:48 2020 +0100


GitHub pull request #9 improve performance of the library when used on larger datasets

Branch scale-when-applied-to-larger-datasets

Added parallelisation and some caching to address the initial slow-down in performance.
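As a rough illustration of how caching and parallelisation can combine for row-level work, here is a minimal sketch. It uses only the Python standard library as a stand-in; the changelog does not say which mechanism the library uses, and `count_words` and `profile_rows` are hypothetical names, not part of nlp_profiler.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


# Hypothetical per-row feature: repeated texts are answered from the cache.
@lru_cache(maxsize=None)
def count_words(text: str) -> int:
    return len(text.split())


# Hypothetical helper: apply the feature to every row using a pool of workers.
def profile_rows(rows, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input rows
        return list(pool.map(count_words, rows))


if __name__ == "__main__":
    print(profile_rows(["a quick test", "a quick test", "hello"]))
```

Threads are used here only to keep the sketch short; for CPU-bound text processing a process pool is usually the better fit.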

Verification and tests have been performed, although this is a continuous process.

For performance metrics before and after changes see this comment on GitHub issue #2.

00a68e2 - 1ff5082 @neomatrix369 Fri Sep 18 14:09:12 2020 +0100


0.0.1-dev

GitHub branch create-update-release releasing NLP Profiler on GitHub and PyPI

Released on GitHub under the Releases tab and on PyPI.

d5d0bc1 - 6510131 @neomatrix369 Sun Sep 27 11:56:48 2020 +0100


0.0.2

GitHub branch scale-when-applied-to-larger-datasets Improving performance of Grammar check on large datasets

Tweaking the Grammar check function to perform better than the previous version

81d055f - 2e311f7 @neomatrix369 Sat Oct 3 07:57:39 2020 +0100


GitHub branch ci-cd-github-action Automate CI/CD process on GitHub

Enable running tests with coverage when a new PR is created or commits are pushed to the repo, across Linux and Windows instances.

A code coverage report is produced with each commit, and the artifacts are uploaded to GitHub.

a806716 - 7e4ca87 @neomatrix369 Thu Oct 15 16:50:59 2020 +0100


0.0.3

GitHub branch add-docs-for-developers and add-github-templates Update docs for developers and add GitHub templates for issues and pull requests

To improve communication with developers and also to create a streamlined process for the same, docs and templates have been added and updated to the repo. These do not change the functionality of the library in any form or shape.

6d40570 - 6d40570 @neomatrix369 Sat Oct 17 19:24:30 2020 +0100


GitHub branch addNounPhraseCount Add noun phrase count in text data

Count the number of noun phrases in the text data and return it as part of granular features.

Thanks, @ritikjain51, for your contribution, originally made via PR #13 and then fixed and refactored via PR #47.

f8a22ba - fcd706b @neomatrix369 Wed Oct 21 13:40:20 2020 +0100


GitHub branch ci-cd-github-action Fix GitHub action to run on Windows instances

Now the build and test action runs on Windows instances as well. Fixes issue reported via #21.

5e7f999 @neomatrix369 Sat Oct 24 16:43:49 2020 +0100


GitHub branch add-spacy-version-dependency-for-conda Add spacy related docs info for Conda users

Conda users could not install the library using pip install. This is now possible by following the docs on the README page. Fixes issue #57 via PR #58.

ae91f5c @neomatrix369 Sun Dec 13 10:17:17 2020 +0000


GitHub branch indicate-ease-of-reading-of-text High-level feature: Indicate ease of reading of text

Just like the spelling and grammar checks, this adds a high-level feature that indicates whether a block of text is easy to read, based on the textstat library's flesch_reading_ease().

It nominally returns values between 0 and 100 (I have seen values fall outside that range, depending on how bad or good the text is).
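To show why scores can fall outside the 0 to 100 range, here is a minimal sketch of the standard Flesch Reading Ease formula. It takes pre-computed counts as inputs; the function below is a hypothetical stand-in written for illustration, not textstat's implementation.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease formula applied to pre-computed counts."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    # One short sentence of one single-syllable word: score is above 100
    print(flesch_reading_ease(1, 1, 1))
    # One sentence of ten four-syllable words: score is below 0
    print(flesch_reading_ease(10, 1, 40))
```

Nothing in the formula clamps the result, so very simple text can score above 100 and very dense text can score below 0.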

4919a51 @neomatrix369 Sun Dec 13 18:36:42 2020 +0000


GitHub branch add-granular-features Granular features: Add granular features: count letters, digits, spaces, whitespaces, and punctuations

Implemented functionality via PR #60 - details described in the body of the PR. In short, counting repeated letters, digits, spaces, whitespaces, and punctuations in the text. Counting English and non-English language characters in the text. Also, amending existing functionality of punctuations count, digits count and fixing a bug in ease of reading scoring. Housekeeping: removing duplicates, removing cached folders before running tests.
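For a sense of what character-level counts like these look like, here is a minimal sketch using only the standard library. `count_characters` and its keys are hypothetical and cover plain counts only; the actual feature names, and the repeated-character and language-specific counts, are described in PR #60.

```python
import string


# Hypothetical helper: plain character-class counts for one text.
def count_characters(text):
    return {
        "letters": sum(ch.isalpha() for ch in text),
        "digits": sum(ch.isdigit() for ch in text),
        # "spaces" counts only the space character ...
        "spaces": text.count(" "),
        # ... while "whitespaces" also includes tabs, newlines, etc.
        "whitespaces": sum(ch.isspace() for ch in text),
        # string.punctuation covers ASCII punctuation only
        "punctuations": sum(ch in string.punctuation for ch in text),
    }


if __name__ == "__main__":
    print(count_characters("Hi, 42!\n"))
```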

68bee76 @neomatrix369 Sun Dec 27 12:45:59 2020 +0000


GitHub branch add-syllables-count Add granular feature: syllables count applied to text

Implemented functionality via PR #61 - details described in the body of the PR.

Added new feature(s) to the granular features groups: count syllables extracted from the text provided.
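As an illustration of what a syllable count involves, here is a minimal sketch based on a naive vowel-group heuristic. This is an approximation chosen for brevity, not the method used by the library or in the Kaggle discussion credited for this entry, and the function names are hypothetical.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")
WORD = re.compile(r"[A-Za-z]+")


def count_syllables_in_word(word):
    """Approximate syllables as runs of vowels, discounting a trailing 'e'."""
    word = word.lower()
    count = len(VOWEL_GROUP.findall(word))
    if word.endswith("e") and count > 1:
        count -= 1  # treat a final 'e' as silent, e.g. "code"
    return max(count, 1)  # every word gets at least one syllable


def count_syllables(text):
    """Sum the approximate syllable counts of all alphabetic words in text."""
    return sum(count_syllables_in_word(w) for w in WORD.findall(text))


if __name__ == "__main__":
    print(count_syllables("banana code text"))
```

The heuristic miscounts some words (for example, those ending in "-le"), which is why real implementations typically rely on pronunciation dictionaries or more elaborate rules.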

Credits: Gunes Evitan (https://www.kaggle.com/gunesevitan) -- inspired by the discussion on https://www.kaggle.com/c/commonlitreadabilityprize/discussion/238375

498338e @neomatrix369 Sat May 15 00:50:01 2021 +0100


GitHub branch moving-notebooks-to-github-release Moving notebooks to GitHub releases

Implemented functionality via PR #62 - details described in the body of the PR.

Moved notebooks from the notebooks folder to GitHub releases to prevent GitHub from misclassifying the repo/library.

5ba447e @neomatrix369 Sat Nov 13 21:02:21 2021 +0000


GitHub branch fix-failing-high-level-tests Make the acceptance tests pass, fixing dependency versions

Implemented functionality via PR #63 - details described in the body of the PR.

Fixed dependency issues that caused tests to fail after API changes in the respective libraries, i.e. language_tool_python and pandas.

5b87b03 @neomatrix369 Sat Nov 13 23:35:38 2021 +0000


GitHub branch correct-spelling-of-column-to-noun-phrase Granular features: corrected the name of the new feature column to noun phrase

Implemented functionality via PR #64 - details described in the body of the PR.

Correct the misspelt term "noun phrase" or "noun phrases" across the codebase

a9d1e1a @neomatrix369 Sat Nov 13 21:41:56 2021 +0000


GitHub branch enabled-nightly-builds GitHub Actions: enabled nightly run of build and test

Implemented functionality via PR #65 - details described in the body of the PR.

Enabled a nightly run of build and test via GitHub Actions.

dde3172 @neomatrix369 Sun Nov 14 09:12:33 2021 +0000


GitHub branch grammar_check Grammar_quality_check: language tool replaced with Gingerit

Implemented functionality via PR #69 - details described in the body of the PR.

Replaced language tool with Gingerit for faster calculations.

b5a5dda @bitanb1999 Sun Mar 13 00:31:31 2023 +0000


GitHub branch revert-76-sourcery/revert-71-spelling_check Granular features: reverted change made to spell checks

Implemented functionality via PR #75 - details described in the body of the PR.

Reverted the spell check functionality, as it is not tested and existing tests change or break with the new implementation.

2cddf51 @neomatrix369 Mon Mar 13 02:56:40 2023 +0000


GitHub branch reformating-code-and-minor-fixes Reformatting code, refactoring as per Sourcery, minor fixes and test fixes

Implemented functionality via PR #73 - details described in the body of the PR.

Reformatting code, refactoring as per Sourcery, minor fixes and test fixes. Brings the build system back in order and fixes old regressed tests.

7caeb47 @neomatrix369 Mon Mar 13 11:23:49 2023 +0000


Return to README.md