This program scrapes Mars data from the websites below to build a Mars fact sheet, with images, served as a local web page.
- NASA Mars News
- Jet Propulsion Laboratory (JPL)
- A Twitter account with Mars weather
- Mars Facts
- USGS Astrogeology site for images of Mars' hemispheres
Note: Twitter disabled access to tweet text, so I used Tweepy and the Twitter API to get the weather, which is technically not scraping (see the sketch below).
(Written in Python)
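As a minimal sketch of the Tweepy approach, assuming the credential names live in config.py; the account handle, function name, and "sol" filter are illustrative assumptions, not the exact code in scrape_mars.py:

```python
# Minimal sketch: fetch the latest Mars weather tweet with Tweepy.
# Credential names are assumed to come from config.py; the account handle
# and the "sol" filter are illustrative assumptions.
import tweepy
from config import consumer_key, consumer_secret, access_token, access_token_secret

def get_mars_weather():
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    # Look at the most recent tweets from the Mars weather account and
    # return the first one that looks like a weather report.
    for tweet in api.user_timeline(screen_name="MarsWxReport",
                                   count=10, tweet_mode="extended"):
        text = tweet.full_text
        if "sol" in text.lower():
            return text
    return None
```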
- app.py is the Flask application server (see the sketch after this list)
- scrape_mars.py is the scraping code, converted from the Jupyter notebook
- mission_to_mars.ipynb is the Jupyter notebook
- template\index.html is the home page for the server
- chromedriver.exe -- required by Splinter to drive Chrome when scraping
- MongoDB -- the database is called mission_to_mars and the collection is called mars_info. Every time new data is scraped, the old data is dropped and replaced with the new results.
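A minimal sketch of how app.py can tie Flask, PyMongo, and scrape_mars.py together. The route names and the scrape_mars.scrape() entry point are assumptions; the database and collection names follow the description above, and Flask's default templates folder is assumed:

```python
# Minimal sketch of app.py: serve the stored Mars data and re-scrape on demand.
# Route names and the scrape_mars.scrape() entry point are assumptions.
from flask import Flask, render_template, redirect
import pymongo
import scrape_mars

app = Flask(__name__)
client = pymongo.MongoClient("mongodb://localhost:27017")
db = client.mission_to_mars

@app.route("/")
def home():
    # Render the latest stored document (None before the first scrape).
    mars_info = db.mars_info.find_one()
    return render_template("index.html", mars=mars_info)

@app.route("/scrape")
def scrape():
    # Drop the old data and insert a freshly scraped document.
    data = scrape_mars.scrape()
    db.mars_info.drop()
    db.mars_info.insert_one(data)
    return redirect("/", code=302)

if __name__ == "__main__":
    app.run(debug=True)
```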
Prerequisites:
- All of the files listed above, downloaded to a new subdirectory
- Python
- ChromeDriver (drives the Chrome browser to the target websites for scraping)
- Tweepy (to get data from the Twitter API)
- Pandas
- Beautiful Soup (tool for parsing scraped data)
- requests
- Splinter (browser automation, used with ChromeDriver; see the sketch after this list)
- re
- A config.py file with your Twitter authorization credentials
- PyMongo (to interact with MongoDB)
- Jupyter Notebook (if you want to run pieces of code in mission_to_mars.ipynb to see how things work)
- Chrome browser
- A Twitter account with access credentials for the tweets you want
- Flask (to create a web server to host your web page)
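A minimal sketch of how Splinter, ChromeDriver, and Beautiful Soup work together in the scraping code. The URL, CSS selectors, and function name are illustrative assumptions and may not match the current NASA page layout, and an older Splinter/Selenium combination is assumed for the executable_path argument (newer versions locate the driver differently):

```python
# Minimal sketch: drive Chrome with Splinter, then parse the page with Beautiful Soup.
# The URL and selectors are illustrative assumptions.
from splinter import Browser
from bs4 import BeautifulSoup

def scrape_latest_news():
    # chromedriver.exe is assumed to be in the working directory (or on PATH).
    browser = Browser("chrome", executable_path="chromedriver.exe", headless=True)
    try:
        browser.visit("https://mars.nasa.gov/news/")
        soup = BeautifulSoup(browser.html, "html.parser")

        # Pull the first headline and its teaser paragraph.
        title = soup.find("div", class_="content_title").get_text(strip=True)
        teaser = soup.find("div", class_="article_teaser_body").get_text(strip=True)
        return {"news_title": title, "news_teaser": teaser}
    finally:
        browser.quit()
```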
Run the application:
- Open a terminal application, such as Git Bash
- Go to the subdirectory containing your Mars scraping files
- At the command line, run the Flask application: `python app.py`
- In the Chrome browser, go to http://127.0.0.1:5000/ and the Mars Facts page appears.
- Click "Scrape New Data" and the program will fetch the latest information from the Mars websites.