pydatascraper is a Python application that provides web scraping capabilities, including fetching Google and Yelp reviews. The application has a user-friendly graphical user interface (GUI) for easy interaction.
- Web Scraping: Extract information from web pages based on user-provided URLs.
- Google Reviews: Fetch reviews for a given business or location using the Google Maps API.
- Yelp Reviews: Retrieve reviews for a business using the Yelp API.
- OpenStreetMap Data: Extract latitude, longitude, and additional information from OpenStreetMap.
- Python 3.x
- Required Python packages (install using pip install -r requirements.txt):
  - requests
  - beautifulsoup4
  - pandas
  - openpyxl
  - nltk (for text processing)
  - tkinter (GUI toolkit; ships with most Python installations as part of the standard library)
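Before launching the GUI, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of pydatascraper: the function name missing_modules and the module list are illustrative assumptions, and the only non-obvious detail is that beautifulsoup4 is imported under the name bs4.

```python
from importlib.util import find_spec

# Import names for the packages listed above. beautifulsoup4 is imported
# as "bs4"; the other import names match the package names.
REQUIRED_MODULES = ["requests", "bs4", "pandas", "openpyxl", "nltk", "tkinter"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If tkinter is reported missing, it usually has to be installed through the operating system's package manager rather than pip.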
- Clone the repository:
  git clone https://github.com/arjunlimat/pydatascraper.git
- Install the package directly:
  pip install pydatascraper
- Import the web scraper module:
  from pydatascraper.pyscraper import main
- Run the application:
  main()
The GUI will appear, allowing you to choose different services and perform web scraping tasks.
Web Scraping
Enter a URL and click "Search" to explore available data types.
Choose the desired data type, enter a file name, and click "Download" to save the data.
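To illustrate what exploring the data types on a page can involve, here is a minimal sketch that pulls link targets and heading text out of an HTML string. It is not pydatascraper's implementation: the class PageDataParser, the function extract_page_data, and the choice of links and headings as the two data types are all assumptions made for illustration. It uses the standard library's html.parser so it runs without extra installs, whereas the package lists beautifulsoup4 among its requirements.

```python
from html.parser import HTMLParser


class PageDataParser(HTMLParser):
    """Collect link targets and heading text from an HTML document."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.links = []
        self.headings = []
        self._in_heading = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Record the href of every anchor that has one.
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag in self.HEADING_TAGS:
            # Start buffering text until the heading closes.
            self._in_heading = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._in_heading:
            self.headings.append("".join(self._buffer).strip())
            self._in_heading = False


def extract_page_data(html):
    """Return a dict with the links and headings found in `html`."""
    parser = PageDataParser()
    parser.feed(html)
    parser.close()
    return {"links": parser.links, "headings": parser.headings}


if __name__ == "__main__":
    sample = '<h1>Menu</h1><p><a href="https://example.com/a">A</a></p>'
    print(extract_page_data(sample))
```

Each key in the returned dictionary corresponds to one kind of data a user might choose to download.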
Select "Google reviews" from the services dropdown. Enter the business or location name and address. Provide a file name and click "Download" to fetch and save Google reviews.
Select "Yelp reviews" from the services dropdown. Enter the business name and address. Provide a file name and click "Download" to fetch and save Yelp reviews.
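For readers curious about what a Yelp lookup by name and address might look like at the HTTP level, the sketch below builds requests for the Yelp Fusion API's business search and reviews endpoints without sending them. Whether pydatascraper calls these endpoints is an assumption; the helper names and the placeholder API key and business alias are hypothetical. Sending the requests would need a valid Yelp API key.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

YELP_API_BASE = "https://api.yelp.com/v3"


def build_business_search_request(api_key, name, address):
    """Build (but do not send) a Yelp Fusion business search request."""
    query = urlencode({"term": name, "location": address, "limit": 1})
    return Request(
        f"{YELP_API_BASE}/businesses/search?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def build_reviews_request(api_key, business_id):
    """Build (but do not send) a request for one business's reviews."""
    return Request(
        f"{YELP_API_BASE}/businesses/{quote(business_id, safe='')}/reviews",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    # "YOUR_API_KEY" and the business alias are placeholders, not real values.
    search = build_business_search_request("YOUR_API_KEY", "Example Cafe", "Springfield")
    print(search.full_url)
    reviews = build_reviews_request("YOUR_API_KEY", "example-cafe-springfield")
    print(reviews.full_url)
```

A typical flow would search first, take the business id from the search result, and then request that business's reviews.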
Select "Open Street Map" from the services dropdown. Enter the map URL, provide a file name, and click "Download" to extract map data.
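As a rough illustration of how coordinates can be read from a map URL, the sketch below parses the #map=zoom/lat/lon fragment that openstreetmap.org puts in its page URLs. The function name parse_osm_map_url is hypothetical, and whether pydatascraper expects this URL form or extracts coordinates this way is an assumption.

```python
from urllib.parse import urlparse


def parse_osm_map_url(url):
    """Return (zoom, latitude, longitude) from an OpenStreetMap URL.

    Expects a '#map=zoom/lat/lon' fragment; raises ValueError otherwise.
    """
    fragment = urlparse(url).fragment
    # The fragment may carry extra parts, e.g. '&layers=T', so split first.
    for part in fragment.split("&"):
        if part.startswith("map="):
            pieces = part[len("map="):].split("/")
            if len(pieces) != 3:
                break
            zoom, lat, lon = pieces
            return int(zoom), float(lat), float(lon)
    raise ValueError(f"no '#map=zoom/lat/lon' fragment in {url!r}")


if __name__ == "__main__":
    print(parse_osm_map_url("https://www.openstreetmap.org/#map=15/51.5074/-0.1278"))
```

The additional information mentioned above (names, tags, addresses) is not present in the URL itself and would have to come from a separate lookup.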
Contributions are welcome! If you encounter issues or have ideas for improvement, please open an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.