The aim of this repository is to create an OCR model (image to text) for Old Persian cuneiform.
This repository is part of the Electronic Old Persian Library organization and is inspired by the eBL project.
yolo_cnn_old_persian
tesseract_old_persian
easyocr_old_persian
yolo_cnn_old_persian: not completed yet.
tesseract_old_persian: completed.
easyocr_old_persian: completed, but needs more optimization and real data.
This model is a custom model based on the EasyOCR repository. If you run into an error, please check the issues.
Trainer notebook:
Using saved model:
To use the saved model, create the directory structure below under the root of your machine and place the custum_example.pth, custom_example.py, and custom_example.yaml files there. For more detail, please watch this tutorial on YouTube.
/root/
    /EasyOCR/
        /model/
            custum_example.pth
        /user_network/
            custom_example.py
            custom_example.yaml
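The layout above can be checked and then loaded with a short script. This is only a minimal sketch, not the repository's own loader: it assumes the `easyocr` package is installed, that the recognition network is named `custom_example` (matching the .py and .yaml files), and that `["en"]` is a suitable language list for a Latin-character transcription. The helper names `missing_files` and `load_reader` are hypothetical. EasyOCR is expected to look for weights named after the network, so the .pth file name may need to match the other two files.

```python
from pathlib import Path

# Relative paths taken from the directory layout above.
EXPECTED_FILES = (
    "model/custum_example.pth",
    "user_network/custom_example.py",
    "user_network/custom_example.yaml",
)


def missing_files(base_dir):
    """Return the expected files that are not present under base_dir."""
    base = Path(base_dir)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


def load_reader(base_dir="/root/EasyOCR", network="custom_example"):
    """Create an EasyOCR reader for the custom model (assumes easyocr is installed)."""
    import easyocr  # imported lazily so the layout check works without it

    base = Path(base_dir)
    return easyocr.Reader(
        ["en"],  # assumption: the transcription uses Latin characters
        recog_network=network,
        model_storage_directory=str(base / "model"),
        user_network_directory=str(base / "user_network"),
        gpu=False,
    )
```

Calling `missing_files("/root/EasyOCR")` first shows which files still need to be copied. Once the list is empty, `load_reader().readtext(image_path, detail=0)` should return the recognized strings, where `image_path` is a placeholder for any input image.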
This pre-trained Tesseract OCR model converts Old Persian cuneiform to an English transcription. It was developed by S. Muhammad Hossein Mousavi.
Please place the peo.traineddata file in this directory: /usr/share/tesseract-ocr/4.00/tessdata
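Once the file is in place, recognition can be run from Python. The sketch below assumes that the `tesseract` command-line tool is on the PATH and that the model's language code is `peo`, inferred from the file name. `build_command` and `run_ocr` are illustrative helper names, not part of this repository.

```python
import subprocess

TESSDATA_DIR = "/usr/share/tesseract-ocr/4.00/tessdata"


def build_command(image_path, lang="peo", tessdata_dir=TESSDATA_DIR):
    """Build the tesseract CLI call that prints the recognized text to stdout."""
    return [
        "tesseract",
        image_path,
        "stdout",  # write the result to standard output instead of a file
        "-l", lang,
        "--tessdata-dir", tessdata_dir,
    ]


def run_ocr(image_path):
    """Run tesseract on one image and return the transcription as a string."""
    result = subprocess.run(
        build_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

For example, `run_ocr("inscription.png")` would return the transcription of a placeholder image named `inscription.png`.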
An example:
The last 12 lines of Darius the Great's DPd inscription at Persepolis:
Input:
Output:
Zittiy ; iaryvuS ; xrSayZiy;
mnc;aurmzia;upstam; rlauv;
hia ; ViZiriS ; rgiriS ; uta;
im am ; i h yaum ; au lm z i a ;
pitTucs;hca;hinaya; hca;
QuSiyala ; hca;iruga;ariy;
imam ;ihyaum;ma; ajMiya; ait;
aim ;yanm;jDiyaMiy;
aitmiy ; iiaTuv
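Before the transcription is passed on, it can help to normalize the raw OCR text. The sketch below assumes that `;` stands for the word divider and that spaces inside a word are recognition noise. Both are assumptions drawn from the sample output above, and `split_words` is a hypothetical helper.

```python
def split_words(ocr_text):
    """Split raw OCR output into words, treating ';' as the word divider."""
    words = []
    for chunk in ocr_text.split(";"):
        word = "".join(chunk.split())  # drop spaces and line breaks inside a word
        if word:
            words.append(word)
    return words


# Two lines copied from the sample output above.
sample = "im am ; i h yaum ; au lm z i a ;\npitTucs;hca;hinaya; hca;"
print(split_words(sample))
```

Joining the result with `" ; ".join(...)` gives a single cleaned line that can be pasted into a translation prompt.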
In the next stage, we can translate the Old Persian transcription into modern languages with ChatGPT:
Prompt: Can you translate this Old Persian (achaemenid) English transcription to modern English in one paragraph?
“This is me, Dariush king; By the grace of Ahura Mazda, I have built this; I founded this empire and made it strong. May Ahuramazda protect me and my kingdom; may it last forever; and it would be safe from lies; that is what I did; That is what I am saying.”
این منم داریوش شاهنشاه؛ به لطف اهورامزدا، من این را بنا کردم؛ من این امپراتوری را بنیان نهادم و آن را نیرومند ساختم. باشد که اهورامزدا من و پادشاهی مرا محافظت کند؛ باشد که برای همیشه پایدار بماند؛ و باشد که از دروغ در امان باشد؛ این است آنچه من انجام دادم؛
این است آنچه من میگویم.
I wrote an article as a short report on what I have done for this project so far.
This repository is released under the CC-BY-NC license; any commercial use is prohibited.
This repository is still under development. To contribute, contact me by email: melaneepython@gmail.com
- Shaghayegh Rahmani (Melanee)
- Dr. Zohre Akbari
- Dr. Hassan Ghaed
- Amirhossein Khajehpour
- 4 unknown sponsors
If you would like to support this project financially, you can use this link:
https://github.com/Melanee-Melanee/Old-Persian-Cuneiform-OCR/blob/master/other/Financial.md