Combining multiple libraries to build useful tools is a key skill for a software engineer. In today's fast-paced landscape, staying informed is important but difficult!
This terminal application uses Beautiful Soup to scrape the Politico website for the day's top stories. After collecting the news links, it uses natural language processing (NLP) libraries to summarize each article's text and calculate a polarity score to further inform the reader. The results are displayed in the terminal for convenient reading.
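The flow described above can be sketched roughly as follows. This is a minimal illustration, not the application's actual source: the function names, the `POLITICO_URL` constant, the `/news/` link filter, and the `polarity_label` threshold are all assumptions made for the example. The third-party imports sit inside the functions so each step can be tried on its own.

```python
from urllib.parse import urljoin

POLITICO_URL = "https://www.politico.com/"  # assumed entry point


def extract_links(html, base_url=POLITICO_URL):
    """Return unique absolute story links found in the page HTML."""
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        url = urljoin(base_url, anchor["href"])
        # Assumption: story URLs contain "/news/"; the real filter may differ.
        if "/news/" in url and url not in links:
            links.append(url)
    return links


def summarize(url):
    """Download one article and return (title, summary) via Newspaper3k."""
    from newspaper import Article  # third-party: newspaper3k

    article = Article(url)
    article.download()
    article.parse()
    article.nlp()  # builds the summary; needs NLTK's "punkt" tokenizer data
    return article.title, article.summary


def polarity(text):
    """Return TextBlob's polarity score, a float between -1.0 and 1.0."""
    from textblob import TextBlob  # third-party: textblob

    return TextBlob(text).sentiment.polarity


def polarity_label(score, threshold=0.05):
    """Map a polarity score to a short label (hypothetical helper)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def format_story(title, summary, score):
    """Build the text block shown to the reader for one story."""
    return f"{title}\n{summary}\nPolarity: {score:+.2f} ({polarity_label(score)})"
```

Given the front page HTML, `extract_links(html)` yields the links to pass to `summarize`, and `polarity` can then score either the summary or the full text before `format_story` prepares the output.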
If you have Docker installed, you can run this application on your own machine with just 2 steps!
Pull the image from Docker Hub
docker pull smhussain5/politico-python
Then run the image as an interactive Docker container
docker run --rm -it smhussain5/politico-python
- Beautiful Soup
- Newspaper3k
- NLTK
- PyCharm
- Python
- TextBlob
This was a straightforward application, but it required careful organization to keep the code clean. Furthermore, the Newspaper3k library was unable to collect every article, and the NLP, in its current state, provides only adequate summaries.
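Since Newspaper3k could not collect every article, one way to keep the run going is to skip the failures and report them at the end. The sketch below is an assumption about how that could be handled, not the application's actual code; `collect_articles` and its `fetch` parameter are hypothetical names, where `fetch` stands for whatever function downloads and summarizes a single URL.

```python
def collect_articles(urls, fetch):
    """Call fetch(url) for each URL, skipping the ones that raise an error."""
    collected, skipped = [], []
    for url in urls:
        try:
            collected.append((url, fetch(url)))
        except Exception as error:  # broad on purpose: download and parse errors vary
            skipped.append((url, str(error)))
    return collected, skipped
```

The second return value makes it easy to tell the reader how many stories were left out instead of failing silently.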
In fewer than 100 lines of code, I was able to scrape Politico and use NLP techniques to summarize the scraped articles. This is a great feat and demonstrates the power of these Python libraries. Future refactoring may include using more accurate NLP libraries and a web framework like Django for better presentation.