Dependencies:
proxy-checker | pyfiglet | beautifulsoup4 (Beautiful Soup) | requests | re (regex, ships with Python)
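If any of these are missing, they can usually be installed with pip (the package names below are the common PyPI names, which is an assumption; re is part of the standard library and needs no install):

```shell
pip install proxy-checker pyfiglet beautifulsoup4 requests
```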
Source Code:
BasicProxyScraper.py
All proxies are written to text files inside the directories the script initializes.
Files whose names begin with "All" contain the unchecked (raw) proxy output. These files are populated even if you choose not to use the proxy-checker.
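As a rough sketch of how the "All" output could be produced, assuming the scraper simply writes one proxy per line (the directory, file name, and helper function here are hypothetical, not the script's actual internals):

```python
# Hypothetical sketch: dump every scraped proxy, one per line,
# into an "All" file inside an output directory.
import os

def write_all_proxies(proxies, out_dir="output"):
    """Write the raw (unchecked) proxy list to All_Proxies.txt and return its path."""
    os.makedirs(out_dir, exist_ok=True)  # create the output directory if needed
    path = os.path.join(out_dir, "All_Proxies.txt")
    with open(path, "w") as f:
        f.write("\n".join(proxies))      # one proxy per line
    return path

# Example usage:
write_all_proxies(["1.2.3.4:8080", "5.6.7.8:3128"])
```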
Before running, make sure you populate your "URLS.txt" file with the sites you want to scrape for proxies; ONE URL PER LINE.
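To illustrate the idea, here is a minimal sketch of what scraping those URLs for proxies might look like, assuming pages expose plain ip:port strings; the regex and helper names are illustrative, not the script's actual code:

```python
import re
import requests

# Matches "ip:port" strings, e.g. "103.152.112.162:80".
PROXY_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")

def extract_proxies(text):
    """Return every ip:port match found in a page's text."""
    return PROXY_RE.findall(text)

def scrape(urls_file="URLS.txt"):
    """Fetch each URL listed one-per-line and collect unique proxies."""
    proxies = set()
    with open(urls_file) as f:
        for url in f:
            url = url.strip()
            if not url:
                continue  # skip blank lines
            html = requests.get(url, timeout=10).text
            proxies.update(extract_proxies(html))
    return proxies
```

This is why the one-URL-per-line format matters: each line is fetched as its own page.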
Run the scraper in a terminal or through your IDE.
Using CMD or your terminal of choice, navigate to the directory housing BasicProxyScraper.py and type:
python BasicProxyScraper.py
After hitting Enter, you will see something similar to the image below. Answer "Yes" or "No" when asked whether you would like your proxies checked.
The next question asks how many threads you would like to run. The right number depends on your PC's available power, so you may have to experiment to find what works for you. I usually choose 200-300.
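Conceptually, the thread count controls how many proxies get checked in parallel. A sketch using Python's ThreadPoolExecutor, where check() is a stand-in that only validates the ip:port format (a real check, e.g. via the proxy-checker library, would attempt an actual connection):

```python
import re
from concurrent.futures import ThreadPoolExecutor

PROXY_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}$")

def check(proxy):
    # Stand-in for a real connectivity test: here we only
    # verify the ip:port format, since a genuine check needs network access.
    return proxy if PROXY_RE.match(proxy) else None

def check_all(proxies, threads=200):
    """Check proxies concurrently; return the ones that pass."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return [p for p in pool.map(check, proxies) if p]

print(check_all(["1.2.3.4:8080", "not-a-proxy"], threads=4))
# → ['1.2.3.4:8080']
```

More threads means more simultaneous checks, but also more load on your machine and network, which is why the sweet spot varies per PC.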
The last step is to let it run! Sit back and wait for it to finish. Once done, your proxies will be in the output text files.