EV scraper library that provides APIs for pscraper-tool

eneakllomollari/pscraper-lib

PHEV Electric Vehicle Scraping Library

Sub-modules

pscraper.api

Classes

Class API

class API(username, password, localhost=False)

Provides the APIs to interact with the database

Ancestors (in MRO)
Methods
Method history_get

def history_get(self, **kwargs)

Method history_post

def history_post(self, **kwargs)

Method seller_get

def seller_get(self, **kwargs)

Method seller_patch

def seller_patch(self, phone_number, **kwargs)

Method seller_post

def seller_post(self, **kwargs)

Method vehicle_get

def vehicle_get(self, **kwargs)

Method vehicle_patch

def vehicle_patch(self, vin, **kwargs)

Method vehicle_post

def vehicle_post(self, **kwargs)

pscraper.scraper

pscraper.scraper.marketplaces

Sub-modules

Functions

Function scrape

def scrape(zip_code, search_radius, target_states, api)

Scrape data about electric vehicles from all supported marketplaces using the specified parameters

Args

zip_code :  str : The zip code to perform the search in

search_radius :  int : The search radius for the specified zip code

target_states :  list : The states to search in (e.g. ['CA', 'NV'])

api :  API : Pscraper API to communicate with the database

Returns

list of tuples : (time, vehicles) per marketplace
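
Because `scrape` returns one `(time, vehicles)` tuple per marketplace, a caller might aggregate the results along these lines. This is a minimal sketch: `summarize` is a hypothetical helper, the sample values are made up, and it assumes `vehicles` is a count (as with the `total` returned by `scrape_cars`). If `vehicles` turns out to be a list, use `len(vehicles)` instead.

```python
from typing import List, Tuple


def summarize(results: List[Tuple[float, int]]) -> Tuple[float, int]:
    """Sum elapsed seconds and vehicle counts across all marketplaces."""
    total_time = sum(elapsed for elapsed, _ in results)
    total_vehicles = sum(count for _, count in results)
    return total_time, total_vehicles


# Made-up sample shaped like the documented return value of scrape()
sample = [(12.5, 340), (8.0, 120), (4.5, 60)]
print(summarize(sample))
```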

pscraper.scraper.marketplaces.autotrader

Sub-modules

Functions

Function scrape_autotrader

def scrape_autotrader(zip_code, search_radius, target_states, api)

pscraper.scraper.marketplaces.autotrader.consts

pscraper.scraper.marketplaces.carmax

Sub-modules

Functions

Function scrape_carmax

def scrape_carmax(zip_code, search_radius, target_states, api)

pscraper.scraper.marketplaces.carmax.consts

pscraper.scraper.marketplaces.cars

Sub-modules

Functions

Function scrape_cars

def scrape_cars(zip_code, search_radius, target_states, api)

Scrape EV data from cars.com filtering with the specified parameters

Args

zip_code :  str : The zip code to perform the search in

search_radius :  int : The search radius for the specified zip code

target_states :  list : The states to search in (e.g. ['CA', 'NV'])

api :  API : Pscraper API to communicate with the backend

Returns

total :  int : Total number of cars scraped

pscraper.scraper.marketplaces.cars.helpers

Functions

Function get_cars_com_response

def get_cars_com_response(url, session)

Scrapes vehicle and page information from url

Args

url :  str : URL to get the response from

session :  requests.sessions.Session : Session to use for sending requests

Returns

dict : Parsed information about the url and the vehicles it contains

Function validate_params

def validate_params(search_radius, target_states)

Validates that target_states are eligible states and search_radius is valid

Args

search_radius :  int : Radius to scrape in

target_states :  list : States provided by the scraper
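
The validation described above could look roughly like the following. This is only a sketch under stated assumptions: the sets `ELIGIBLE_STATES` and `VALID_RADII` are hypothetical placeholders, and raising `ValueError` is an assumed failure mode. The library's actual eligible states, accepted radii, and error handling are not documented here.

```python
# Hypothetical placeholders; the real library defines its own values.
ELIGIBLE_STATES = {"CA", "NV", "OR", "WA"}
VALID_RADII = {10, 25, 50, 100, 250, 500}


def validate_params_sketch(search_radius, target_states):
    """Raise ValueError if the radius or any state is not accepted."""
    if search_radius not in VALID_RADII:
        raise ValueError(f"Invalid search radius: {search_radius}")
    invalid = [state for state in target_states if state not in ELIGIBLE_STATES]
    if invalid:
        raise ValueError(f"Ineligible states: {invalid}")
```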

pscraper.scraper.marketplaces.helpers

Functions

Function get_seller_id

def get_seller_id(vehicle, api, session)

Returns a seller id (primary key). Searches for an existing seller by phone number; if none is found, creates a new seller and returns its id. The seller must have streetAddress, city and state. If any of these is missing, returns -1.

Args

vehicle :  dict : Vehicle whose seller needs to be created/searched

api :  API : Pscraper API that allows retrieval/creation of marketplaces

session :  requests.sessions.Session : Google Maps session to use for geolocating the seller
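
The lookup-or-create flow described for `get_seller_id` can be sketched as follows. Everything here is an assumption made for illustration: `InMemoryAPI` is a hypothetical stand-in for `pscraper.api.API`, the `vehicle["seller"]` layout and the `phone_number` key are guesses, and the geolocation step that uses `session` is left out.

```python
REQUIRED_FIELDS = ("streetAddress", "city", "state")


class InMemoryAPI:
    """Hypothetical stand-in for pscraper.api.API; keeps sellers in a list."""

    def __init__(self):
        self.sellers = []

    def seller_get(self, **kwargs):
        # Return every stored seller matching all keyword filters
        return [s for s in self.sellers
                if all(s.get(k) == v for k, v in kwargs.items())]

    def seller_post(self, **kwargs):
        seller = dict(kwargs, id=len(self.sellers) + 1)
        self.sellers.append(seller)
        return seller


def get_seller_id_sketch(vehicle, api):
    """Return an existing or newly created seller id, or -1 if address data is missing."""
    seller = vehicle.get("seller", {})
    if any(not seller.get(field) for field in REQUIRED_FIELDS):
        return -1
    found = api.seller_get(phone_number=seller.get("phone_number"))
    if found:
        return found[0]["id"]
    return api.seller_post(**seller)["id"]
```

Calling the function twice with the same seller returns the same id and stores only one record, which is the behaviour the description above implies.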

Function update_vehicle

def update_vehicle(vehicle, api, google_maps_session)

Updates vehicle's last date and duration if it exists in the database, creates a new vehicle if it doesn't. Updates vehicle's price/seller if a change is found from the existing price/seller.

Args

vehicle :  dict : vehicle to be created/updated

api :  API : Pscraper API that allows retrieval/creation of marketplaces

google_maps_session :  requests.sessions.Session : Google Maps session to use for geolocating the seller
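
The create-or-update behaviour described for `update_vehicle` might be sketched like this. It is not the library's implementation: `InMemoryVehicleAPI` is a hypothetical stand-in for `pscraper.api.API`, the field names (`first_date`, `last_date`, `duration`, `price`, `seller`) are assumptions, and the Google Maps session is omitted.

```python
from datetime import date


class InMemoryVehicleAPI:
    """Hypothetical stand-in for pscraper.api.API; keeps vehicles keyed by VIN."""

    def __init__(self):
        self.vehicles = {}

    def vehicle_get(self, **kwargs):
        vin = kwargs["vin"]
        return [self.vehicles[vin]] if vin in self.vehicles else []

    def vehicle_post(self, **kwargs):
        self.vehicles[kwargs["vin"]] = dict(kwargs)

    def vehicle_patch(self, vin, **kwargs):
        self.vehicles[vin].update(kwargs)


def update_vehicle_sketch(vehicle, api, today):
    """Create the vehicle if unknown; otherwise refresh dates and changed fields."""
    existing = api.vehicle_get(vin=vehicle["vin"])
    if not existing:
        api.vehicle_post(first_date=today, last_date=today, duration=0, **vehicle)
        return "created"
    current = existing[0]
    # Always refresh the last-seen date and the listing duration in days
    changes = {"last_date": today,
               "duration": (today - current["first_date"]).days}
    # Only patch price/seller when they differ from the stored values
    for field in ("price", "seller"):
        if vehicle.get(field) != current.get(field):
            changes[field] = vehicle.get(field)
    api.vehicle_patch(vehicle["vin"], **changes)
    return "updated"
```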

pscraper.utils

Sub-modules

pscraper.utils.base_api

Functions

Function request_wrapper

def request_wrapper(method, success_codes)

Classes

Class BaseAPI

class BaseAPI(base_url, auth)

Descendants
Methods
Method get_full_url

def get_full_url(self, url)

Method get_request

def get_request(self, url, params)

Method patch_request

def patch_request(self, url, data)

Method post_request

def post_request(self, url, data)

pscraper.utils.misc

Functions

Function get_geolocation

def get_geolocation(address, session)

Finds the latitude and longitude of a human-readable address using the Google Maps API. You need to set the environment variable GCP_API_TOKEN to your Google Maps API token.

Args

address :  str : Human readable address

session :  requests.sessions.Session : Session to use for geolocating

Returns

latitude, longitude :  tuple : Lat, Lng found from Google Maps API
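
A lookup of this kind could be written against the Google Geocoding API roughly as below. This is a sketch rather than the library's code: the function name is hypothetical, returning `None` for an address with no match is an assumption, and `session` only needs a `requests`-style `get` method.

```python
import os

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def get_geolocation_sketch(address, session):
    """Return (lat, lng) for an address, or None if nothing matched (assumed)."""
    params = {"address": address, "key": os.environ["GCP_API_TOKEN"]}
    response = session.get(GEOCODE_URL, params=params)
    response.raise_for_status()
    results = response.json()["results"]
    if not results:
        return None
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]
```

Passing the session in, rather than creating one per call, lets repeated lookups reuse the same connection.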

Function get_traceback

def get_traceback()

Get formatted traceback information after exception

Returns

text :  str : Traceback text
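
One common way to obtain formatted traceback text is the standard library's `traceback.format_exc()`. The sketch below assumes that approach; the function name is hypothetical and the library may format its output differently.

```python
import traceback


def get_traceback_sketch():
    """Return the formatted traceback of the exception currently being handled."""
    return traceback.format_exc()


try:
    1 / 0
except ZeroDivisionError:
    text = get_traceback_sketch()

print(text)
```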

Function measure_time

def measure_time(func)

A decorator to call a function and track the time it takes for the function to finish execution

Args

func : Function to be decorated
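
A timing decorator of this kind can be sketched with `time.perf_counter`. The version below is an assumption about the shape, not the library's code: it returns an `(elapsed, result)` tuple, chosen to match the `(time, vehicles)` tuples that `scrape` is documented to return.

```python
import functools
import time


def measure_time_sketch(func):
    """Wrap func so each call returns (elapsed_seconds, result)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        return elapsed, result
    return wrapper


@measure_time_sketch
def add(a, b):
    return a + b


elapsed, value = add(1, 2)
```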

Function send_slack_message

def send_slack_message(**kwargs)

Sends a message in Slack. If only one argument (channel) is provided, it sends traceback information about the most recent exception. You need to set the SLACK_API_TOKEN environment variable to your Slack workspace API token.

Args

kwargs : Keyword arguments to be used as payload for WebClient
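
The "channel only" rule described above can be illustrated by the payload-building step alone. This is a sketch under assumptions: `build_slack_payload` is a hypothetical helper, and placing the traceback under a `text` key is a guess. The actual call to the Slack WebClient is left out so the example runs without a token.

```python
import traceback


def build_slack_payload(**kwargs):
    """Return the message payload; fall back to the latest traceback as text."""
    payload = dict(kwargs)
    if set(payload) == {"channel"}:
        # Only a channel was given: report the exception being handled
        payload["text"] = traceback.format_exc()
    return payload
```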

Function send_slack_report

def send_slack_report(cars_et, cars_total, at_et, at_total, cm_et, cm_total, states, channel='#daily-job')

Posts a scraping report to a Slack channel (#daily-job by default). You need to set the SLACK_API_TOKEN environment variable to your Slack workspace API token. Uses utils.misc.send_slack_message.

Args

cars_et : Time in seconds it took to scrape cars.com

cars_total : Number of vehicles scraped from cars.com

at_et : Time in seconds it took to scrape autotrader

at_total : Number of vehicles scraped from autotrader

cm_et : Time in seconds it took to scrape carmax

cm_total : Number of vehicles scraped from carmax

states : Scraped states to include in the report

channel : Slack channel to send the report to, default: #daily-job
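
To make the argument order concrete, the report text could be assembled as in the sketch below. The `format_report` helper and the message layout are hypothetical; only the parameter names and their order come from the `send_slack_report` signature, reading each `*_et` value as elapsed seconds and each `*_total` value as a vehicle count.

```python
def format_report(cars_et, cars_total, at_et, at_total, cm_et, cm_total, states):
    """Build a plain-text summary with one line per marketplace."""
    rows = [
        ("cars.com", cars_et, cars_total),
        ("autotrader", at_et, at_total),
        ("carmax", cm_et, cm_total),
    ]
    lines = ["States: " + ", ".join(states)]
    for name, elapsed, total in rows:
        lines.append(f"{name}: {total} vehicles in {elapsed:.1f}s")
    return "\n".join(lines)


# Made-up sample values
report = format_report(12.5, 340, 8.0, 120, 4.5, 60, ["CA", "NV"])
print(report)
```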