
Python script that uses APIs from pscraper-lib to schedule and perform daily scraping of EV marketplaces


eneakllomollari/pscraper-tool


pscraper-tool

This project contains the script that schedules the daily scraping jobs.

Overview

It uses the schedule library to schedule daily scrape jobs and the APIs provided by pscraper-lib to perform scraping and send reports upon completion.

The scraping process consists of several scrape jobs, which are configured in a config.yml file. Each job runs concurrently in a separate process. When all processes have finished, the script builds and sends a Slack report. The script also configures logging; the logs reside in the logs directory.
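The process-per-job flow described above could look roughly like the following sketch, which uses only the standard library. The `JOBS` list is a hypothetical stand-in for the parsed contents of config.yml (its real schema is not shown here), the per-job log file naming is an assumption, and both the pscraper-lib scraping call and the Slack delivery are left out.

```python
import logging
import multiprocessing
from pathlib import Path

LOG_DIR = Path("logs")  # the README says logs reside in a "logs" directory

# Hypothetical stand-in for the parsed config.yml; the real schema is unknown.
JOBS = [
    {"name": "marketplace_a", "url": "https://example.com/a"},
    {"name": "marketplace_b", "url": "https://example.com/b"},
]


def scrape_job(job):
    """Run one scrape job in its own process (placeholder body)."""
    LOG_DIR.mkdir(exist_ok=True)
    logging.basicConfig(
        filename=LOG_DIR / f"{job['name']}.log",  # assumed: one log file per job
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    logging.info("scraping %s", job["url"])
    # The real script would call into pscraper-lib here.


def run_jobs(jobs):
    """Start one process per job, wait for all of them, return {name: exit code}."""
    processes = {
        job["name"]: multiprocessing.Process(target=scrape_job, args=(job,))
        for job in jobs
    }
    for process in processes.values():
        process.start()
    for process in processes.values():
        process.join()
    return {name: process.exitcode for name, process in processes.items()}


def build_report(exit_codes):
    """Summarize the run as plain text; sending it to Slack is not shown."""
    lines = [
        f"{name}: {'ok' if code == 0 else 'failed'}"
        for name, code in sorted(exit_codes.items())
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report(run_jobs(JOBS)))
```

A process that exits because of an uncaught exception reports a non-zero exit code, so the report can flag failed jobs without any extra communication channel between the processes.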

The script is set up to run daily without interruption. The command used to run it is:

$ nohup ./scrape.py &

It runs inside a tmux session on a Google Cloud Compute Engine instance, so the process doesn't require any supervision.
