
A McGrawHill (MGH) textbook scraper written in Python.


Sendeky/MGHTextbookScraper


MGHTextbookScraper


Requirements:

  • Python (Tested on 3.9, should work on anything recent)
  • McGrawHill Cookies

Getting Started:

Clone project

user@User-Machine~$ git clone https://github.com/Sendeky/MGHTextbookScraper
user@User-Machine~$ cd MGHTextbookScraper

Install requirements

user@User-Machine~$ pip install -r requirements.txt

Getting cookies from McGrawHill:

  • Navigate and open your textbook
  • Open your browser's Web Inspector (Ctrl+Shift+I for Chrome and most browsers)
  • Find a link that starts with "epub-factory-cdn.mheducation.com" (Ctrl+F to open find menu)
  • Open the link

Three cookies are required for the scraper to work.

  • Click on the Cookies tab of the Web Inspector
  • Copy the values of these 3 cookies:
    • CloudFront-Policy
    • CloudFront-Signature
    • CloudFront-Key-Pair-Id

Congrats! Now put all 3 cookies into the cookies.txt file
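
To show how the three cookies fit together, here is a minimal sketch that checks all three are present and attaches them to a request as a single Cookie header. It is not the scraper's own code: the helper names `missing_cookies` and `build_request`, the dictionary layout, and the example URL path are assumptions made for illustration, and the sketch does not read cookies.txt because this README does not describe that file's format.

```python
import urllib.request

# The three cookie names listed above.
REQUIRED_COOKIES = (
    "CloudFront-Policy",
    "CloudFront-Signature",
    "CloudFront-Key-Pair-Id",
)


def missing_cookies(cookies):
    """Return the required cookie names that are absent or empty (hypothetical helper)."""
    return [name for name in REQUIRED_COOKIES if not cookies.get(name)]


def build_request(url, cookies):
    """Build, but do not send, a GET request carrying the three cookies (hypothetical helper)."""
    missing = missing_cookies(cookies)
    if missing:
        raise ValueError("missing cookies: " + ", ".join(missing))
    # Cookies travel in one header as "name=value" pairs joined by "; ".
    header = "; ".join(f"{name}={cookies[name]}" for name in REQUIRED_COOKIES)
    return urllib.request.Request(url, headers={"Cookie": header})


if __name__ == "__main__":
    # Placeholder values; paste the real ones copied from the Web Inspector.
    example = {
        "CloudFront-Policy": "policy-value",
        "CloudFront-Signature": "signature-value",
        "CloudFront-Key-Pair-Id": "key-pair-id-value",
    }
    # The path "/example" is a made-up placeholder, not a real textbook URL.
    request = build_request("https://epub-factory-cdn.mheducation.com/example", example)
    print(request.get_header("Cookie"))
```

Checking for missing names before building the request gives a clear error when a cookie was not copied, which is easier to diagnose than a rejected download.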

Running the Scraper:

It's super simple! Just run the main file like this:

user@User-Machine~$ python TextbookScraperProject.py

The retrieved text is written to data.json in the project folder.
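
Once the scraper has finished, the output can be inspected with a few lines of Python. The sketch below assumes only that data.json is valid JSON; its internal structure is not documented here, so the code reports the top-level shape and nothing more. The function names `load_scraped_data` and `describe` are hypothetical.

```python
import json
from pathlib import Path


def load_scraped_data(path="data.json"):
    """Load the scraper's output file as a Python object (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(data):
    """Summarize the top-level shape without assuming any particular schema."""
    if isinstance(data, dict):
        return f"object with {len(data)} keys"
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return type(data).__name__


if __name__ == "__main__":
    # Run from the project folder after the scraper has written data.json.
    print(describe(load_scraped_data()))
```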
