- Don't include tests in sdist builds.
- Replace charade with chardet for easier packaging.
- Improved decoding of the page into Unicode.
- Quiet more log output by demoting it from WARN to INFO.
- Stop logging at warning level when the message is not a true warning.
- Merge changes from 0.1.14 of breadability with the fork https://github.com/miso-belica/readability.py, and tweak the result to return to the name breadability.
- Fork: Added property Article.main_text for getting text annotated with semantic HTML tags (<em>, <strong>, ...).
- Fork: Join node with 1 child of the same type. From <div><div>...</div></div> we get <div>...</div>.
- Fork: Don't change <div> to <p> if it contains <p> elements.
- Fork: Renamed test generation helper 'readability_newtest' -> 'readability_test'.
- Fork: Renamed package to readability. (Renamed back)
- Fork: Added support for Python >= 3.2.
- Fork: Py3k compatible package 'charade' is used instead of 'chardet'.
- Update sibling append to only happen when sibling doesn't already exist.
- Give images in content body a better chance of survival
- Add tests
- Add a user agent to requests.
- Add argparse to the install requires for Python < 2.7
- Updated scoring bonus and penalty with , and " characters.
- In case of an issue dealing with candidates we need to act like we didn't find any candidates for the article content. #10
- Add code/tests for an empty document.
- Fixes #9 to handle xml parsing issues.
- Change the encode 'replace' kwarg into a normal arg for older Python versions.
- Fix the link removal, add tests and a place to process other bad links.
- Start to look at removing bad links from content in the conditional cleaning state. This was really used for the scripting.com site's garbage.
- Add a test generation helper readability_newtest script.
- Add tests and fixes for the scripting news parse failure.
- Add actual testing of full articles for regression tests.
- Update parser to properly clean after winner doc node is chosen.
- Bugfix: #4 issue with logic of the 100char bonus points in scoring
- Garden with PyLint/PEP8
- Add a bunch of tests to readable/scoring code.
- Fix bugs in scoring to help in getting the right content
- Add a -d option which shows scoring/decisions on nodes
- Update command line client to be able to pipe output to other tools
- Initial release and upload to PyPI
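
The "Fork: Join node with 1 child of the same type" entry above can be illustrated with a minimal sketch. This is not breadability's actual code: the helper name collapse_single_child is hypothetical, and the sketch uses the standard library's xml.etree.ElementTree on well-formed markup rather than the parser the package itself uses.

```python
import xml.etree.ElementTree as ET


def collapse_single_child(node):
    """Collapse <x><x>...</x></x> into <x>...</x>, recursively.

    Hypothetical helper for illustration only; not part of breadability.
    """
    # Unwrap while the node has exactly one child of the same tag and
    # no text of its own before or after that child.
    while (
        len(node) == 1
        and node[0].tag == node.tag
        and not (node.text or "").strip()
        and not (node[0].tail or "").strip()
    ):
        child = node[0]
        node.remove(child)
        # Lift the child's text and children into the parent.
        node.text = child.text
        node.extend(list(child))
        # Assumption: the outer node's attributes win on conflict.
        for key, value in child.attrib.items():
            node.attrib.setdefault(key, value)

    # Apply the same rule to whatever children remain.
    for sub in node:
        collapse_single_child(sub)
    return node


if __name__ == "__main__":
    root = ET.fromstring("<div><div><p>text</p></div></div>")
    print(ET.tostring(collapse_single_child(root), encoding="unicode"))
```

A wrapper that carries its own text next to the child, such as <div>intro<div>...</div></div>, is deliberately left alone in this sketch, since unwrapping it would change the order of the text.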