Contributions to this repo are welcome. See CONTRIBUTING.md for more info.
A full tutorial walking you through this program is available in the inspirezone.tech blog post: Learn web scraping with python in minutes: The basics using selenium.
The repo source files have gone through major modifications since the tutorial was written. You can see the original tutorial files under the folder blog-tutorial-original-code/. Use the code found in this folder to follow along with the blog tutorial.
Go to a job postings website, perform a search, and export each job title and description link to a file.
This is a web scraper written in Python using the selenium package. It will:
- Launch indeed.com/worldwide
- Perform a search for "machine learning"
- Export each job posting title and link to a file
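The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repo's actual code: the function names `search_jobs` and `export_jobs`, the `q` search-field name, the `h2 a` result selector, the `https://` URL form, and the `jobs.csv` output file are all hypothetical, and the real page markup may differ or change over time.

```python
import csv


def search_jobs(driver, query="machine learning",
                url="https://indeed.com/worldwide",
                search_field_name="q",      # assumed name of the search input
                result_selector="h2 a",     # assumed selector for result links
                timeout=10):
    """Open the site, run a search, and return (title, link) pairs."""
    # Imported lazily so export_jobs() can be used without selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    box = driver.find_element(By.NAME, search_field_name)
    box.send_keys(query, Keys.RETURN)

    # Wait until at least one result link is present before reading them.
    links = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, result_selector))
    )
    return [(link.text, link.get_attribute("href")) for link in links]


def export_jobs(jobs, path="jobs.csv"):
    """Write (title, link) pairs to a CSV file and return its path."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "link"])
        writer.writerows(jobs)
    return path
```

With a driver already created, `export_jobs(search_jobs(driver))` would run the whole flow; remember to call `driver.quit()` afterwards.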
You need to have either Firefox or Chrome installed. You also need the corresponding driver for the browser.
For Firefox download geckodriver: https://github.com/mozilla/geckodriver/releases
For Chrome download chromedriver: https://chromedriver.chromium.org/downloads
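Before running the scraper, it can help to confirm that the driver is discoverable. The sketch below uses only the standard library and assumes the driver executable is expected on your PATH; `locate_driver` and `DRIVER_NAMES` are hypothetical helpers, not part of this repo.

```python
import shutil

# Driver executable expected for each supported browser.
DRIVER_NAMES = {"firefox": "geckodriver", "chrome": "chromedriver"}


def locate_driver(browser):
    """Return the full path of the browser's driver on PATH, or None."""
    try:
        name = DRIVER_NAMES[browser.lower()]
    except KeyError:
        raise ValueError(f"unsupported browser: {browser!r}") from None
    return shutil.which(name)
```

If `locate_driver("firefox")` returns `None`, geckodriver is not on your PATH; either add its folder to PATH or move the executable to a folder that is already listed there.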
Python and the following modules must be installed on the computer running this script.
Install Python and pip (on recent Debian/Ubuntu releases the packages are named python3 and python3-pip):
sudo apt-get install python3
sudo apt-get install python3-pip
Install selenium:
pip install selenium
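To check that the install worked, you can ask Python which version of the package it sees. A small sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="selenium"):
    """Return the installed version string of a package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A return value of `None` means the package is not installed for the interpreter you are running.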
Run the script (use python3 instead if python is not on your PATH):
python job-search-web-scraping.py
You can run the program in headless mode by adding the headless argument:
python job-search-web-scraping.py headless
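One way a script might turn that argument into a headless browser is sketched below. This is an assumption about the approach, not the repo's actual implementation: `wants_headless` and `make_firefox_driver` are hypothetical names, and only Firefox is shown.

```python
def wants_headless(argv):
    """Return True when 'headless' was passed after the script name."""
    return "headless" in argv[1:]


def make_firefox_driver(headless=False):
    """Create a Firefox driver, optionally without a visible window."""
    # Imported lazily so wants_headless() works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    if headless:
        options.add_argument("-headless")  # Firefox's headless switch
    return webdriver.Firefox(options=options)
```

A script would typically call `make_firefox_driver(wants_headless(sys.argv))` after importing `sys`.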
Make sure that you download the browser driver version that matches your OS and browser. On Windows, make sure the driver's file extension is .exe.
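The naming rule can be expressed as a short helper. This is only an illustrative sketch: `driver_filename` is a hypothetical function, and it assumes the default executable names geckodriver and chromedriver.

```python
import platform


def driver_filename(browser, system=None):
    """Return the expected driver file name for a browser on a given OS."""
    base = {"firefox": "geckodriver", "chrome": "chromedriver"}[browser.lower()]
    system = system or platform.system()  # "Windows", "Linux" or "Darwin"
    return base + ".exe" if system == "Windows" else base
```

For example, `driver_filename("chrome", "Windows")` gives `chromedriver.exe`, while `driver_filename("firefox", "Linux")` gives `geckodriver`.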