crypto-scraper

Scrapes data for the top cryptocurrencies and turns it into easy-to-read charts and graphs with additional technical indicators.

project-crypto-scraper-uml


1. GCP Deployment

  • 1.1 Trigger

    The data-collector Cloud Function is triggered at a fixed interval (default: every 6 hours). Cloud Scheduler publishes a message to the Pub/Sub topic script-trigger, which in turn triggers the data-collector Cloud Function.
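The trigger step can be sketched as a Pub/Sub background function. This is a minimal sketch, not the project's actual code: it assumes the Node.js background-function signature `(message, context)`, and the `collect` parameter is a hypothetical stand-in for the scraping routine. A Cloud Scheduler cron expression such as `0 */6 * * *` would match the default 6-hour period, though the schedule actually used is not stated here.

```javascript
// Pub/Sub delivers the message payload base64-encoded in `message.data`.
function decodePubSubMessage(message) {
  if (!message || !message.data) return '';
  return Buffer.from(message.data, 'base64').toString('utf8');
}

// Hypothetical entry point for the data-collector function.
// `collect` is an assumed placeholder for the scraping routine; it is
// injected so the handler can be exercised without a real browser.
async function dataCollector(message, context, collect = async () => {}) {
  const payload = decodePubSubMessage(message);
  console.log(`Triggered via script-trigger, payload: "${payload}"`);
  await collect(payload);
  return payload;
}

// Cloud Functions looks up the handler by its exported name.
exports.dataCollector = dataCollector;
```

Returning the promise from the handler (here via `async`) lets the platform wait for the scrape to finish before the instance is frozen.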

  • 1.2 Run

    data-collector is then built by Cloud Functions and its Docker image is pushed to the Artifact Registry repository gcs-artifacts. If the data-collector function has not changed, the latest existing image is pulled instead.

    data-collector is a single-file JavaScript application that launches a headless browser with Puppeteer, connects to the configured websites, scrapes the specified data, and saves it to the Google Cloud Storage bucket crypto-data-storage-bucket. Logs are written to Google Cloud Logging.
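The scraping step described above might look roughly like the following. It is a sketch under assumptions: the URL and CSS selector are hypothetical, and the real data-collector may structure this differently. The browser is passed in as an argument so the function does not depend on how Puppeteer is launched.

```javascript
// `browser` is expected to be a Puppeteer Browser, for example:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch({ headless: true });
async function scrapeText(browser, url, selector) {
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    // $eval runs the callback inside the page and returns its result.
    const text = await page.$eval(selector, (el) => el.textContent);
    return text.trim();
  } finally {
    // Always release the tab, even if navigation or the selector fails.
    await page.close();
  }
}

// Normalises a scraped price such as "$27,899.84" to the string "27899.84",
// matching the string-valued prices in the stored JSON files.
function normalizePrice(raw) {
  const cleaned = raw.replace(/[^0-9.]/g, '');
  if (cleaned === '' || Number.isNaN(Number(cleaned))) {
    throw new Error(`Unparseable price: ${raw}`);
  }
  return cleaned;
}
```

A call such as `normalizePrice(await scrapeText(browser, url, '.price'))` would then yield one price sample, where `url` and `'.price'` are placeholders for the real site and selector.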

  • 1.3 Save

    Scraped data is saved to the specified bucket. Each cryptocurrency has its own file in .json format. If a file already exists for a given cryptocurrency, its contents are overwritten so that the newest data is appended to the end of each array in the JSON.
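The save step amounts to a read-modify-write on the per-crypto file. The sketch below assumes the `timestamp`/`price` layout of the stored files and a `bucket` object that behaves like a `@google-cloud/storage` Bucket; the function names `appendSample` and `saveSample` are illustrative, not taken from the project.

```javascript
// Pure step: returns a new object with the sample appended to each array.
function appendSample(existing, sample) {
  const base = existing || { timestamp: [], price: [] };
  return {
    timestamp: [...base.timestamp, sample.timestamp],
    price: [...base.price, sample.price],
  };
}

// `bucket` is expected to behave like a @google-cloud/storage Bucket, e.g.
//   const { Storage } = require('@google-cloud/storage');
//   const bucket = new Storage().bucket('crypto-data-storage-bucket');
async function saveSample(bucket, symbol, sample) {
  const file = bucket.file(`${symbol}.json`);
  const [exists] = await file.exists();
  let existing = null;
  if (exists) {
    // download() resolves to an array whose first element is a Buffer.
    const [contents] = await file.download();
    existing = JSON.parse(contents.toString('utf8'));
  }
  const updated = appendSample(existing, sample);
  await file.save(JSON.stringify(updated), { contentType: 'application/json' });
  return updated;
}
```

This read-modify-write is not atomic, so it relies on only one data-collector run writing to a given file at a time.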

    Example data BTC.json:

    • {"timestamp":["2023-05-29-2","2023-05-29-3","2023-05-29-4","2023-05-30-1","2023-05-30-2","2023-05-30-3","2023-05-30-4","2023-05-31-1","2023-05-31-2","2023-05-31-3","2023-05-31-4","2023-06-01-1","2023-06-01-2","2023-06-01-3","2023-06-01-4","2023-06-02-1","2023-06-02-2","2023-06-02-3","2023-06-02-4","2023-06-03-1","2023-06-03-2","2023-06-03-3","2023-06-03-4","2023-06-04-1","2023-06-04-2","2023-06-04-3","2023-06-04-4","2023-06-05-1","2023-06-05-2","2023-06-05-3","2023-06-05-4","2023-06-06-1","2023-06-06-2","2023-06-06-3","2023-06-06-4","2023-06-07-1","2023-06-07-3","2023-06-07-4","2023-06-08-1","2023-06-08-2","2023-06-08-3","2023-06-08-4","2023-06-09-1","2023-06-09-2","2023-06-09-3","2023-06-09-4","2023-06-10-1","2023-06-10-2","2023-06-10-3","2023-06-10-4","2023-06-11-1","2023-06-11-2","2023-06-11-3","2023-06-11-4","2023-06-12-1","2023-06-12-2","2023-06-12-3","2023-06-12-4","2023-06-13-1","2023-06-13-2","2023-06-13-3","2023-06-14-1","2023-06-14-2","2023-06-14-3","2023-06-14-4","2023-06-15-1","2023-06-15-2","2023-06-15-3","2023-06-15-4","2023-06-16-2","2023-06-16-3","2023-06-16-4","2023-06-17-1","2023-06-17-2","2023-06-17-3","2023-06-17-4","2023-06-18-1","2023-06-18-2","2023-06-18-3","2023-06-18-4","2023-06-19-1","2023-06-19-2","2023-06-19-3","2023-06-19-4","2023-06-20-1","2023-06-20-2","2023-06-20-3","2023-06-20-4","2023-06-21-1","2023-06-21-2","2023-06-21-3","2023-06-22-1","2023-06-22-2","2023-06-22-3","2023-06-22-4","2023-06-23-1","2023-06-23-2","2023-06-23-3","2023-06-23-4","2023-06-24-1","2023-06-24-2","2023-06-24-3","2023-06-24-4","2023-06-25-1","2023-06-25-2","2023-06-25-3","2023-06-25-4","2023-06-26-1","2023-06-26-3","2023-06-26-4","2023-06-27-1","2023-06-27-2","2023-06-27-3","2023-06-27-4","2023-06-28-1","2023-06-28-2","2023-06-28-3","2023-06-28-4","2023-06-29-1","2023-06-29-2","2023-06-29-3","2023-06-29-4","2023-06-30-1","2023-06-30-3","2023-06-30-4","2023-07-01-1","2023-07-01-2","2023-07-01-3","2023-07-01-4","2023-07-02-1","2023-07-02-2","2023-07-02-3",
"2023-07-02-4","2023-07-03-1","2023-07-03-2","2023-07-03-3","2023-07-03-4","2023-07-04-1","2023-07-04-2","2023-07-04-3","2023-07-04-4","2023-07-05-1","2023-07-05-2","2023-07-05-3","2023-07-05-4","2023-07-06-1","2023-07-06-2","2023-07-06-3","2023-07-06-4","2023-07-07-1","2023-07-07-3","2023-07-08-1"],"price":["27899.84","27626.64","27676.59","27844.83","27870.75","27681.15","27727.16","27662.30","27143.48","26926.20","27101.11","26780.23","26916.69","26898.13","26897.98","27004.50","27082.69","27104.03","27218.79","27151.27","27160.97","27317.91","27018.52","27069.02","27225.32","27190.45","27202.92","26846.73","26785.83","26167.17","25677.63","25740.65","25769.62","26084.32","27074.08","26952.89","26412.15","26204.86","26347.66","26386.47","26736.86","26573.79","26485.69","26649.24","26469.42","26474.99","26316.75","25667.06","25644.83","25782.06","25743.26","25720.23","25762.49","26063.71","25799.30","25969.72","25825.75","25932.40","26060.90","26158.35","25740.06","25984.06","25982.70","25975.46","25108.98","25039.90","24883.24","24935.08","25551.99","25567.00","25792.45","26293.92","26248.28","26577.38","26397.32","26515.51","26532.38","26520.67","26553.24","26399.82","26390.61","26388.10","26434.81","26783.17","26890.59","26764.70","27096.25","28132.04","28715.02","28924.19","29861.21","30287.99","30134.75","29863.41","30055.27","30017.93","30147.94","31201.34","30684.60","30744.90","30638.97","30413.37","30528.72","30764.43","30694.49","30593.15","30446.06","30302.19","30419.36","30156.44","30376.65","30380.68","30525.17","30716.85","30448.26","30302.07","30421.03","30141.51","30165.78","30434.68","30483.47","30416.39","30740.92","30063.62","30469.97","30395.08","30441.32","30572.17","30596.71","30513.69","30528.03","30501.88","30572.54","30768.47","30612.55","31078.81","31050.10","31202.10","31024.24","30964.86","30808.04","30871.90","30684.18","30354.29","30476.29","30504.96","31071.46","30403.57","30226.97","30143.31","30402.50","30283.91"]}
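The timestamp labels in the example appear to combine a date with a run number from 1 to 4, which would fit the default 6-hour period. That reading is an assumption, as is the use of UTC. Under it, a label could be derived and the two parallel arrays zipped into records like this:

```javascript
// Assumed scheme: "YYYY-MM-DD-N", where N is the 1-based slot of the day
// for the given period (4 slots per day at the default 6 hours), in UTC.
function slotLabel(date, periodHours = 6) {
  const day = date.toISOString().slice(0, 10);
  const slot = Math.floor(date.getUTCHours() / periodHours) + 1;
  return `${day}-${slot}`;
}

// Splits a label back into its date and slot number.
function parseSlotLabel(label) {
  const match = /^(\d{4}-\d{2}-\d{2})-(\d+)$/.exec(label);
  if (!match) throw new Error(`Unexpected label: ${label}`);
  return { day: match[1], slot: Number(match[2]) };
}

// Zips the parallel `timestamp` and `price` arrays into one record per sample.
function toRecords(data) {
  if (data.timestamp.length !== data.price.length) {
    throw new Error('timestamp and price arrays differ in length');
  }
  return data.timestamp.map((label, i) => ({
    ...parseSlotLabel(label),
    price: Number(data.price[i]),
  }));
}
```

Records of this shape are easier to feed into charting code than the raw parallel arrays, and the length check catches a file where one array was updated without the other.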

    *Currently no external environment is connected to the GCS bucket. This part is still a work in progress while data is being collected.


2. Charts and Graphs

Work in progress

Example chart BTC.png:

BTC

3. AI model training