Data-Extraction-from-PDF-files-of-Resume

Author:

A. N. M. Sajedul Alam

Goal:

Data Extraction from PDF files of Resume

Important Details:

a) Applied to 6 different resumes collected at random from the web
b) Used Poppler utilities for manipulating PDF files and converting them to other formats
c) Used pdf2image to convert PDF pages to images and EasyOCR for optical character recognition
d) Used spaCy for natural language processing tasks and Pillow's ImageDraw module for drawing simple 2D graphics on image objects
e) Worked only with resumes written in English
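
The following is a minimal sketch of how the tools listed above could fit together, not the author's actual code. The function names (`ocr_pdf`, `bounding_rect`, `detections_to_text`, `draw_boxes`, `extract_entities`), the `dpi` and `min_conf` defaults, and the `en_core_web_sm` spaCy model are assumptions chosen for illustration. Third-party imports are placed inside the functions that need them, so the pure helpers work without those packages installed.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four corner points, recognized text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def ocr_pdf(pdf_path: str, dpi: int = 200) -> list:
    """Render each PDF page to an image and run English OCR on it.

    Returns a list of (page_image, detections) pairs, one per page.
    """
    import numpy as np
    import easyocr
    from pdf2image import convert_from_path  # requires Poppler on the system

    reader = easyocr.Reader(["en"])  # English only, matching item (e)
    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images
    return [(page, reader.readtext(np.array(page))) for page in pages]


def bounding_rect(points: Sequence[Sequence[float]]) -> Tuple[float, float, float, float]:
    """Collapse four corner points into an axis-aligned (x0, y0, x1, y1) box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def detections_to_text(detections: List[Detection], min_conf: float = 0.3) -> str:
    """Join recognized strings, dropping detections below min_conf."""
    return "\n".join(text for _, text, conf in detections if conf >= min_conf)


def draw_boxes(page, detections: List[Detection]):
    """Outline every detected text region on the page image (modified in place)."""
    from PIL import ImageDraw

    draw = ImageDraw.Draw(page)
    for points, _, _ in detections:
        draw.rectangle(bounding_rect(points), outline="red", width=2)
    return page


def extract_entities(text: str, model: str = "en_core_web_sm") -> List[Tuple[str, str]]:
    """Run spaCy named-entity recognition and return (text, label) pairs."""
    import spacy

    nlp = spacy.load(model)  # assumed model; it must be downloaded beforehand
    return [(ent.text, ent.label_) for ent in nlp(text).ents]
```

A typical run would call `ocr_pdf` on a resume file, pass each page's detections to `detections_to_text`, and feed the resulting text to `extract_entities`. `draw_boxes` is only useful for visually checking what the OCR step found.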

Attention:

All the resumes were scraped at random from the web. The author will not be liable for any future misuse.