This project automatically scrapes millions of email addresses from thousands of web pages on fredmiranda.com. The goal is to create a dataset of email addresses that can be used for various purposes.
- Selenium
- Python
- IPython Notebook
- CSV library
- Install Selenium and its dependencies: `pip install selenium`
- Clone or download the repository from GitHub.
- Open the `email_scraping.ipynb` file using IPython Notebook.
- Install the required libraries mentioned in the first cell of the notebook.
- Change the URL in the `url` variable to the web page you want to scrape emails from.
- Run all the cells in the notebook. The code will start scraping emails from the web page and will keep running until all the pages are scraped.
- Once the code has finished running, a CSV file named `email_dataset.csv` will be created in the same directory as the notebook. The file will contain the email addresses scraped from the website.
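
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (`extract_emails`, `save_emails`, `scrape`), the regular expression, the single `email` column, and the use of Chrome are all assumptions made for the example.

```python
import csv
import re

# Simple illustrative pattern (an assumption); it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html):
    """Return the unique addresses found in html, lowercased, in order of first appearance."""
    seen = set()
    emails = []
    for match in EMAIL_RE.findall(html):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            emails.append(address)
    return emails


def save_emails(emails, path="email_dataset.csv"):
    """Write one address per row under a hypothetical 'email' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["email"])
        writer.writerows([address] for address in emails)


def scrape(urls, path="email_dataset.csv"):
    """Load each URL in a browser, collect addresses from the rendered page, and save them."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome is available on this machine
    seen = set()
    collected = []
    try:
        for url in urls:
            driver.get(url)
            # page_source holds the HTML after the browser has rendered the page.
            for address in extract_emails(driver.page_source):
                if address not in seen:
                    seen.add(address)
                    collected.append(address)
    finally:
        driver.quit()  # always close the browser, even if a page fails to load
    save_emails(collected, path)
    return collected
```

In this sketch, calling `scrape([url])` with the notebook's `url` value would visit that page and write the addresses found to `email_dataset.csv`. Visiting many pages would require building the list of URLs first, which is not shown here.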
This project demonstrates how to use Selenium and Python to automatically scrape email addresses from thousands of web pages. The resulting dataset can be used for various purposes and can easily be exported to other formats.
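
As one example of exporting to another format, the CSV file could be converted to JSON with the standard library alone. This is a sketch under assumptions: the helper name `csv_to_json` and the output file name are hypothetical, and it assumes the CSV file begins with a header row, which the project does not specify.

```python
import csv
import json


def csv_to_json(csv_path="email_dataset.csv", json_path="email_dataset.json"):
    """Copy the rows of a CSV file (assumed to have a header row) into a JSON array."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # DictReader uses the first row as the keys for every following row.
        rows = list(csv.DictReader(f))
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)
    return rows
```

Each CSV row becomes one JSON object keyed by the column names, so the same approach works if the file has more than one column.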