Fraud-Detection-Project

An application for fraud detection in medicine packages and tablets.

Steps

Performed web scraping with Beautiful Soup to gather a large dataset of images and information about medicine packages and tablets.
Transformed the data with Power Query Editor and loaded it into an Excel file, giving a local dataset with the name and information of each package.
Extracted text from images of medicine packages and tablets with OCR tools such as EasyOCR and Pytesseract, as input for fraud detection. Applied Named Entity Recognition tagging to the extracted text to label each medicine by name, dosage, type and size, then trained a custom spaCy model on the processed data to predict labels on new text.
Exported the labeled text to a CSV file and used Jaccard similarity scores to detect fraud by comparing the information on the packaging with the information on the tablets.
Built a user-friendly dashboard with Streamlit.
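
The Jaccard comparison in the fourth step can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the word-level tokenization, and the 0.5 threshold are assumptions chosen for illustration. The score is the size of the intersection of the two token sets divided by the size of their union, so identical label text scores 1.0 and completely different text scores 0.0.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Return |A & B| / |A | B| over the token sets of the two texts."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        # Convention chosen here: two empty texts count as identical.
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical cut-off; a real value would have to be tuned on labeled data.
FRAUD_THRESHOLD = 0.5


def is_suspicious(package_text: str, tablet_text: str,
                  threshold: float = FRAUD_THRESHOLD) -> bool:
    """Flag a package/tablet pair whose label texts overlap too little."""
    return jaccard_similarity(package_text, tablet_text) < threshold


if __name__ == "__main__":
    package = "Paracetamol 500 mg tablet"
    tablet = "Ibuprofen 200 mg tablet"
    print(jaccard_similarity(package, tablet))
    print(is_suspicious(package, tablet))
```

Word-level tokens keep the sketch simple, but they treat "500 mg" and "500mg" as different; character n-grams or a normalization pass would be one way to soften that.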
