Translate bangla to english. This model is train based on encoder decoder with attention mechanism. This repository may be a starting point to approaching bangla machine translation
problem. If this repository helps others people who are working on bangla machine translation then it would be very greatfull for me.
I use dataset provide in http://www.manythings.org/anki/ben-eng.zip . This dataset contain english bangla sentence pair in the following format,
I'm counting on you. আমি আপনার উপর নির্ভর করে আছি।
I want your opinion. আমি আপনার মতামত চাই।
How is your daughter? আপনার মেয়ে কেমন আছে?
BanglaTranslator
├── assets
│ └── banglafonts
│ └── Siyamrupali.ttf
├── data
│ ├── ben-eng
│ │ ├── _about.txt
│ │ └── ben.txt
├── docs
│ └── U0980.pdf
├── models
│ ├── input_language_tokenizer.json
│ ├── target_language_tokenizer.json
├── translator
│ ├── config.py
│ ├── datasets.py
│ ├── infer.py
│ ├── __init__.py
│ ├── models.py
│ ├── train.py
│ └── utils.py
├── infer-example.ipynb
├── README.md
└── training-example.ipynb
assets
contain bangla font that used in plottingdata
contain english bangla pair datasetdocs
contrain documeantaion bangla unicode poins and it's char mapingmodels
contrain saved tokenize and training checkpoints if you do trainingtranslator
is the core of the project that contrain all the required scripts for this project.infer-example.ipynb
An example notebook that shows how predict on single sentence using saved checkpointstraining-example.ipynb
you can use this notebook to train bangla to english translator model
python 3.7
tensorflow 2.x
matplotlib
sklearn
tqdm
jupyter notebook
If you want to just test the model then you need to download pretrain model from from google drive link
and extract training_checkpoints.zip
file under models
directory
I test pre-train model and got result like bellow.
- If you want to test it yourself please check
infer-example.ipynb
and also download pre-train model