An image crawler/indexer for Lightshot/prntscrn.
- generate a random string
- append it to the prntscrn URL
- fetch the image if the response is not a 404
- save the image to the database
- apply modules to the entry
  - TextAnalyzer
    - Checks for text and saves it
  - NSFWAnalyzer
    - Detects NSFW content and saves bounding boxes with labels
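
The first three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the base URL `https://prnt.sc/`, the lowercase-plus-digits alphabet, the default suffix length of 6, and the helper names `random_suffix`, `build_url`, and `fetch_if_found` are all assumptions made for the example. The fetch helper returns the raw response body and does not cover locating the image within it.

```python
import random
import string
import urllib.error
import urllib.request
from typing import Optional

# Assumption: Lightshot's public share host; the README only says "prntscrn url".
BASE_URL = "https://prnt.sc/"
# Assumption: suffixes are drawn from lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def random_suffix(length: int = 6) -> str:
    """Generate a random URL suffix of the given length."""
    return "".join(random.choices(ALPHABET, k=length))


def build_url(suffix: str, base_url: str = BASE_URL) -> str:
    """Append the suffix to the base URL."""
    return base_url.rstrip("/") + "/" + suffix


def fetch_if_found(url: str, timeout: float = 10.0) -> Optional[bytes]:
    """Return the response body, or None if the server answers 404."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise  # other HTTP errors are left to the caller


# Build a candidate URL; pass it to fetch_if_found() to actually download.
candidate_url = build_url(random_suffix(6))
```

Keeping suffix generation and URL building separate from the network call makes the random part easy to test without touching the remote host.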
git clone https://github.com/nbdy/prntscrngrb
cd prntscrngrb
./dependencies.sh
pip3 install .
    usage: __main__.py [-h] [-l LANGUAGES [LANGUAGES ...]] [-d DIRECTORY] [-sl SUFFIX_LENGTH] [-co] [-db DATABASE] [--skip-indexing]

    options:
      -h, --help            show this help message and exit
      -l LANGUAGES [LANGUAGES ...], --languages LANGUAGES [LANGUAGES ...]
                            TextDetector languages
      -d DIRECTORY, --directory DIRECTORY
                            Where to put them images
      -sl SUFFIX_LENGTH, --suffix_length SUFFIX_LENGTH
                            URL suffix length
      -co, --crawl-only     Only download images
      -db DATABASE, --database DATABASE
                            Database name
      --skip-indexing       Skip the indexing step
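
For readers who want to see how these options combine, here is a hedged `argparse` sketch that mirrors the help text above. It is a reconstruction from the usage output, not the project's source: the `build_parser` name and the `int` type for the suffix length are assumptions, and no defaults are set because the help text does not state any.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild a parser matching the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="__main__.py")
    parser.add_argument("-l", "--languages", nargs="+",
                        help="TextDetector languages")
    parser.add_argument("-d", "--directory",
                        help="Where to put them images")
    # Assumption: the suffix length is parsed as an integer.
    parser.add_argument("-sl", "--suffix_length", type=int,
                        help="URL suffix length")
    parser.add_argument("-co", "--crawl-only", action="store_true",
                        help="Only download images")
    parser.add_argument("-db", "--database",
                        help="Database name")
    parser.add_argument("--skip-indexing", action="store_true",
                        help="Skip the indexing step")
    return parser


# Example: crawl only, two OCR languages, 6-character suffixes.
args = build_parser().parse_args(
    ["-l", "en", "de", "-d", "images", "-sl", "6", "--crawl-only"]
)
```

In this example `args.languages` holds both language codes, `args.crawl_only` is set, and `args.skip_indexing` stays off because the flag was not passed.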