rin7/Wine_catalogs

Wine Vintage Catalogs

We developed an automated method to extract key features, such as wine names and bottle and case prices, from scanned Sherry Lehmann wine catalogs dating from the 1930s to the 1980s.
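As a rough illustration of the extraction step, the sketch below parses one line of OCR text into a wine name, a bottle price, and a case price. It is not the code from the notebooks: the line layout (name, optional dot leaders, then two prices), the `WineEntry` type, and the `parse_catalog_line` helper are all assumptions made for this example, and real catalog pages will likely need more tolerant rules.

```python
import re
from typing import NamedTuple, Optional


class WineEntry(NamedTuple):
    """One catalog row (hypothetical structure, for illustration only)."""
    name: str
    bottle_price: float
    case_price: float


# Assumed layout: wine name, optional dot leaders, bottle price, case price.
LINE_PATTERN = re.compile(
    r"^\s*(?P<name>.+?)"           # wine name, matched lazily
    r"[\s.]*\s"                    # dot leaders and/or whitespace
    r"(?P<bottle>\d+\.\d{2})\s+"   # bottle price, e.g. 4.99
    r"(?P<case>\d+\.\d{2})\s*$"    # case price, e.g. 53.90
)


def parse_catalog_line(line: str) -> Optional[WineEntry]:
    """Return a WineEntry if the line matches the assumed layout, else None."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return WineEntry(
        name=match.group("name").strip(" ."),
        bottle_price=float(match.group("bottle")),
        case_price=float(match.group("case")),
    )


if __name__ == "__main__":
    # Made-up sample line, not taken from an actual catalog.
    print(parse_catalog_line("Chateau Margaux 1959 ........ 4.99 53.90"))
    print(parse_catalog_line("WHITE BURGUNDIES"))
```

Lines that do not end in two prices, such as section headings, return `None` and can be skipped or handled separately.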

Website:

https://nirvolo.wixsite.com/wine-catalog/

Results:

Pytesseract v3.ipynb -- Final version of the code for our approach.

Accuracy Test.ipynb -- Code that computes the error rate for an image and stores it in a dataframe (one image per output).

df_ave_error_revised.csv -- CSV file of the average error rates for our test set.
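For readers who want a concrete picture of an OCR error rate, here is a minimal sketch. It assumes the error rate is the Levenshtein (edit) distance between the OCR output and a hand-typed reference, divided by the reference length. This README does not state the exact metric used in Accuracy Test.ipynb, so treat the definition and the function names `levenshtein` and `error_rate` as illustrative assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def error_rate(ocr_text: str, truth: str) -> float:
    """Assumed metric: edit distance divided by the reference length."""
    if not truth:
        return 0.0 if not ocr_text else 1.0
    return levenshtein(ocr_text, truth) / len(truth)


if __name__ == "__main__":
    # Made-up strings for illustration; not results from the test set.
    print(error_rate("Chateau Margeux", "Chateau Margaux"))
```

One such value per image could then be collected into rows and averaged across the test set.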

Experiments:

Practice.ipynb -- Outputs from PDFpenPro.

Pytesseract.ipynb -- Code for the different approaches we considered in the early stages of the project.

Pytesseract v2.ipynb -- Code to test one of the best approaches we had.

Wine Tesseract test .R -- Code that uses Tesseract in R.

adistance.R -- R code that uses Levenshtein distance.

docextractor.R -- R code to extract text from docx files.
