Extracting text from memes

Most memes use the same font, so you can train Google's Tesseract OCR on that font and then run it on meme images to extract their text.

Sample text

$ python ocr.py http://cf.chucklesnetwork.com/items/7/7/4/7/6/original/meme-text-impact-font-with-outline.jpg
MEME TEXT

IMPACT FONT WITH
OUTLINE

Setup

First, install the dependencies.

$ source setup.sh # This may take a very long time
$ source app.sh

Then move the trained data into the tessdata directory:

$ mv tessdata/eng.traineddata /usr/local/share/tessdata/

Usage

You can pass either a path to a local image or a URL to an image as the first argument to ocr.py on the command line. The resulting image is saved to result.png.

$ python ocr.py https://i.imgur.com/YzMXGdQ.jpg
WALKER TOLD US

 

WE HAVE AIDS
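
Since the first argument can be either a local file or a URL, a script like this has to decide which one it was given before loading the image. The snippet below is a minimal sketch of one way to do that using only the Python standard library. It is not taken from ocr.py; the function names `is_url` and `load_image_bytes` are hypothetical and chosen purely for illustration.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source: str) -> bool:
    """Return True when the argument looks like an http(s) URL."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def load_image_bytes(source: str) -> bytes:
    """Return raw image bytes from a URL or from a local file path."""
    if is_url(source):
        # Assumption: a plain GET with a 30 second timeout is sufficient.
        with urlopen(source, timeout=30) as response:
            return response.read()
    # Anything that is not an http(s) URL is treated as a local path.
    with open(source, "rb") as handle:
        return handle.read()
```

With this approach, `is_url("https://i.imgur.com/YzMXGdQ.jpg")` is true, while a plain filename such as `meme.jpg` is treated as a local path. The returned bytes could then be handed to whatever image library the OCR step uses.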

You can also run a Flask server on localhost.

$ python __init__.py

Resources