instascrape is a powerful, lightweight Python library for scraping Instagram data with no configuration necessary! It is designed with flexibility and developer productivity in mind, so you can stop wasting valuable time figuring out how to collect data and just start analyzing 💪
- 💪 Powerful, object-oriented scraping tools as well as a variety of useful functions
- 💃 Flexibly works with whatever source you give it: scrape from HTML, JSON, or a BeautifulSoup object, or let it request and scrape the URL itself
- 💾 Download content to your computer as png, jpg, mp4, and mp3
- 🎨 Dynamically retrieve HTML embed code for posts
- 🎼 Expressive and consistent API for concise and elegant code
- 📊 Designed for seamless integration with Selenium, Pandas, and other industry-standard tools for data collection and analysis
- 🔨 Lightweight: you don't have to build a hammer factory when all you need is the hammer
- 🕸️ The only hard dependencies are Requests and Beautiful Soup; no more worrying about configurations or webdrivers
- ⌚ Proven to work as of December 2020
- 💻 Installation
- 🔎 Sample Usage
- 📚 Documentation
- 📰 Blog Posts
- 🙏 Contributing
- 🕸️ Dependencies
- 💳 License
- ❔ Support
This library currently requires Python 3.7 or higher.
Install from PyPI using:
$ pip3 install insta-scrape
WARNING: make sure you install insta-scrape and not a package with a similar name!
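If you want to confirm that your environment is ready before scraping, a quick check along these lines may help. This is only a sketch: the helper names `python_is_supported` and `module_is_importable` are made up for illustration, and it assumes the package is imported under the name `instascrape`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum Python version stated in this README


def python_is_supported(version_info=sys.version_info):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def module_is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    # Assumption: the installed package exposes the import name "instascrape"
    print("instascrape importable:", module_is_importable("instascrape"))
```

If the second check prints `False` right after installing, it is worth double-checking that the package you installed was insta-scrape and that `pip3` points at the same interpreter you are running.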
All top-level, ready-to-use features can be imported using:
from instascrape import *
instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.
# Instantiate the scraper objects
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')
# Scrape their respective data
google.scrape()
google_post.scrape()
google_hashtag.scrape()
After being scraped, relevant attributes can be accessed with dot (.) or bracket ([]) notation:
print(google.followers)
print(google_post['hashtags'])
print(google_hashtag.amount_of_posts)
>>> 12262794
>>> ['growwithgoogle']
>>> 9053408
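If you are curious how one object can support both access styles, here is a minimal sketch of the general pattern. `AttributeBag` is a hypothetical stand-in written for this example; it is not instascrape's actual implementation, and the follower count is just the sample value shown above.

```python
class AttributeBag:
    """Hypothetical stand-in for a scraped object (not instascrape's class)."""

    def __init__(self, **scraped):
        # Store each scraped value as an ordinary instance attribute
        for name, value in scraped.items():
            setattr(self, name, value)

    def __getitem__(self, key):
        # Bracket access delegates to normal attribute lookup,
        # raising KeyError for unknown names as a dict would
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None


profile = AttributeBag(followers=12262794)
print(profile.followers)     # dot notation
print(profile["followers"])  # bracket notation, same value
```

Bracket notation is handy when the attribute name is only known at runtime, for example when looping over a list of field names to build a table row.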
The official documentation can be found on Read The Docs 📰
Check out blog posts on DEV for ideas and tutorials!
- Scrape data from Instagram with instascrape
- Visualizing Instagram engagement with instascrape
- Exploratory data analysis of Instagram using instascrape and Python
- Creating a scatter matrix of Instagram data using Python
- Downloading an Instagram profile's recent photos using Python
- Scraping 25,000 data points from Joe Biden's Instagram using instascrape
- Compare major tech Instagram page's with instascrape
- Tracking an Instagram posts engagement in real time with instascrape
- Dynamically generate embeddable Instagram HTML with instascrape
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!
Feel free to open an Issue or look at existing Issues to get a dialogue going on what you want to see added/changed/fixed.
Beginners to open source are highly encouraged to participate and ask questions ❤️
instascrape primarily relies on two third-party libraries for requesting and scraping Instagram HTML content:
- Requests: HTTP requests
- Beautiful Soup: scraping and parsing HTML data
The rest of its functionality comes directly from Python 3's standard library, which keeps the code unobtrusive under the hood with little to no overhead.
Reach out to me if you have questions or ideas!
- Email:
- Twitter: