Web Scraping Tools

This is intended to grow into a library of useful data scraping tools scripted in Python.

Index

Images

Video and audio

These scripts are mainly based on the ffmpeg and youtube-dl libraries. Note that youtube-dl can be very slow for downloading (~50 KB/s). The newer yt-dlp library is a fork of youtube-dl that offers improved performance (~5 MiB/s downloads) and additional tools. More on these libraries in the links below.
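
As a rough illustration of how yt-dlp and ffmpeg fit together, the sketch below downloads the best available audio stream and lets yt-dlp call ffmpeg to convert it to MP3. It is not the code of the scripts in this repo: the function names, the output template and the 192 kbps quality are assumptions made for the example. It needs the yt-dlp package installed and an ffmpeg binary on the PATH.

```python
from pathlib import Path


def build_audio_options(output_dir="downloads"):
    """Return yt-dlp options for MP3 extraction (hypothetical helper)."""
    return {
        # Prefer the best audio-only stream, fall back to the best combined one.
        "format": "bestaudio/best",
        # Name each file after the video title inside output_dir.
        "outtmpl": str(Path(output_dir) / "%(title)s.%(ext)s"),
        # Ask yt-dlp to run ffmpeg after the download to produce an MP3.
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",
                "preferredcodec": "mp3",
                "preferredquality": "192",  # assumed quality, adjust as needed
            }
        ],
    }


def download_audio(url, output_dir="downloads"):
    """Download one URL as MP3; requires yt-dlp and ffmpeg to be installed."""
    import yt_dlp  # imported here so the options helper works without yt-dlp

    with yt_dlp.YoutubeDL(build_audio_options(output_dir)) as ydl:
        # download() expects a list of URLs.
        return ydl.download([url])
```

Calling download_audio with a video URL should leave an .mp3 file in the downloads folder, provided both tools are available.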

Prerequisites

python
pip

Requirements

git
- You'll know you did it right if you can run git --version and you see a response like git version x.x.x
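
If you prefer to check all the tools in one go, a small script along these lines can help. It is only a sketch: the helper names are made up for this example, and the command names (python3, pip, git) are assumptions that may differ on your system.

```python
import re
import shutil
import subprocess


def parse_version(output):
    """Extract the first dotted version number from a --version output."""
    match = re.search(r"\d+(?:\.\d+)+", output)
    return match.group(0) if match else None


def tool_version(command):
    """Return the version reported by `command --version`, or None if missing."""
    if shutil.which(command) is None:
        return None  # the command is not on the PATH
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print their version on stderr instead of stdout.
    return parse_version(result.stdout or result.stderr)


if __name__ == "__main__":
    for tool in ("python3", "pip", "git"):
        print(f"{tool}: {tool_version(tool) or 'not found'}")
```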

Setup

Clone this repo

git clone https://github.com/VidiHawk/web-scraping-tools

cd web-scraping-tools

Then install dependencies

pip install -r requirements.txt
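
To confirm that the install worked, you can compare the entries of requirements.txt with what is installed in the current environment. The sketch below assumes simple "name==version" style lines (no URLs or editable installs), and the helper names are hypothetical.

```python
import re
from importlib import metadata  # part of the standard library since Python 3.8


def requirement_names(lines):
    """Yield the package name from each requirements.txt line."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name ends at the first version specifier, extra, or marker.
        yield re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]


def missing_packages(lines):
    """Return the names from `lines` that are not installed here."""
    missing = []
    for name in requirement_names(lines):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, opening requirements.txt and passing the file object to missing_packages should return an empty list once every dependency is installed.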

Adding your own tools

If you want to add packages to the requirements.txt file, I recommend using the pipreqs package. To install it:

pip install pipreqs

To build your requirements.txt automatically, run the following command in the project directory:

pipreqs . --force

The --force flag will overwrite the existing requirements.txt file.
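
To get a feel for the idea behind this kind of tool, the sketch below scans the .py files of a project and lists the top-level modules they import. It is a simplified illustration and not how pipreqs is actually implemented: it does not filter out standard library modules, does not map import names to PyPI package names, and does not pin versions. The function names are made up for the example.

```python
import ast
from pathlib import Path


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b, c" contributes a and c.
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import c" contributes a; relative imports are skipped.
            names.add(node.module.split(".")[0])
    return names


def project_imports(project_dir="."):
    """Collect imported top-level modules from every .py file under project_dir."""
    found = set()
    for path in Path(project_dir).rglob("*.py"):
        found |= imported_modules(path.read_text(encoding="utf-8"))
    return sorted(found)
```

Running project_imports() from the project directory gives a quick list to compare against the generated requirements.txt.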

Notes

These scripts have been created and tested on Ubuntu 20.04.4 LTS with Python 3.8.10.

