
training not giving the expected results #1321

Open
RamadanHussein opened this issue Oct 15, 2024 · 4 comments

Comments

@RamadanHussein

RamadanHussein commented Oct 15, 2024

Hi,
I'm trying to train EasyOCR on new Arabic fonts. We created a dataset of 1,200 images with labels, but after training, the new model gives very poor results when I use it to check some images. The YAML file and a sample of the images and labels are attached.
Can anyone help with this, or guide me if I'm doing something wrong?

log_dataset.txt
log_train.txt
opt.txt

easyOCr.zip

@romanvelichkin

romanvelichkin commented Oct 18, 2024

Your config doesn't contain saved_model - are you training from scratch?

There can be many reasons why it doesn't work for you:

  1. Why num_class: 103? Isn't that the number of symbols you're going to train on? You have far fewer than 103 symbols.
  2. Your lr (learning rate) is low.
  3. It's not clear how much validation data you have.
  4. 1,200 images may not be enough: for a small, specific task I have a train dataset of 8,000 images and a validation dataset of 4,000 images, and I'm still far from perfect.
  5. Train and validation data can differ, so you may need to increase the train data by adding more relevant data.
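To make the first two points concrete, here is an illustrative fragment of the fields involved. The field names follow EasyOCR's published trainer example configs (e.g. en_filtered_config.yaml); the values are placeholders, not taken from the attached files:

```yaml
# Illustrative only - adjust paths and values to your setup
saved_model: 'saved_models/arabic.pth'  # pretrained weights to fine-tune; omit to train from scratch
FT: True                                # fine-tune the loaded weights instead of reinitializing
lr: 1.0                                 # the example configs use 1.0 with Adadelta; a much lower value slows learning
```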

I've found that the data generation used in EasyOCR can be inefficient. Using the images I'm actually working with is a much more effective way to increase performance.
I automated creating train data from my images: I run EasyOCR over them, then cut them up according to the results. Then I can fix the images or the detected text as needed, and all the failure cases become obvious.

My config file, which works for me:
ru_filtered_config.txt

@RamadanHussein
Author

Appreciate your reply and help. I increased the sample to 10K images for training and ~4K for validation, also updated the configuration, and I'm starting to see better results. Do you think there's any other room for enhancement to get better results?
I also want to ask about "I automated creation of train data from my images. I run easyocr over my images and then cut it according to results. Then I can fix images or detected text the way I need, also all fail cases are becoming obvious."
How can this be done? Any support?

ar_filtered_config.txt

@romanvelichkin

romanvelichkin commented Oct 22, 2024

I can't really tell how much room for improvement there is. It depends on the model: how much it can learn and memorize. But from my experience, I think it can handle tens of thousands of images, if not hundreds of thousands.

I automated creation of train data from my images

  1. This method helps if you're not training the model from scratch, so it already recognizes something.
    1.1. Feed a bunch of images with text to EasyOCR.
    1.2. You get a scan result for each image containing the detected and recognized data: bounding box coords, text, and confidence.
    1.3. Extract the bounding box coords and text from the scan result.
    1.4. Cut a small piece from the original image according to the bounding box, using OpenCV or Pillow.
    1.5. Write the text data into a text file, so it has the cut file's name and the text data for that file.
    1.6. Now you can look for fail cases: see what was recognized wrong and fix it.
    1.7. Augment the data if needed and add it to the train or test data.

  2. You can teach the model to understand different fonts. Create a .doc file with as many words, symbols, and fonts as you need. Save each page as an image. Then use the method I provided above.

  3. You can use data generators: https://github.com/Belval/TextRecognitionDataGenerator.
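Steps 1.1-1.5 can be sketched roughly like this. The function names, the `['ar']` reader language, and the directory layout are illustrative, not part of the method above; the CSV columns mirror the '0'/'1'/'2' names that the cutting code in the next comment renames:

```python
# Sketch: run EasyOCR over an image and save the scan result as a CSV
# with columns 0=box, 1=value, 2=accuracy.
import csv
import os


def results_to_rows(results):
    """Convert EasyOCR readtext output - a list of
    (bounding_box, text, confidence) tuples - into CSV rows,
    stringifying the box so it can be parsed back later."""
    return [[str(box), text, conf] for box, text, conf in results]


def scan_image(img_path, out_dir):
    import easyocr  # imported here so the helper above stays stdlib-only
    reader = easyocr.Reader(['ar'])  # language list is an example
    results = reader.readtext(img_path)
    name = os.path.splitext(os.path.basename(img_path))[0]
    out_path = os.path.join(out_dir, name + '.csv')
    with open(out_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['0', '1', '2'])  # header the cutting script expects
        writer.writerows(results_to_rows(results))
    return out_path

# Usage (assumes easyocr is installed):
# scan_image('pages/doc_0.jpg', 'scan_results/doc.pdf')
```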

@romanvelichkin

romanvelichkin commented Oct 22, 2024

Here's the code I wrote to cut images as described in my method. You'll have to modify it a bit to make it work, because it's adapted to my project structure.

import ast
import os

import pandas as pd
from PIL import Image

# Project-specific paths - set these to your own directory layout
PATH_IMAGE = 'images'              # page images, grouped per source PDF
PATH_EXTRACT = 'extract'           # where the cut crops and label files go
PATH_SCAN_RESULT = 'scan_results'  # EasyOCR scan results as CSVs, per PDF


def get_coords(box_string):
    # Parse the stringified bounding box back into a list of corner points
    box_coords = ast.literal_eval(box_string)
    xs = [round(corner[0]) for corner in box_coords]
    ys = [round(corner[1]) for corner in box_coords]
    # Return the axis-aligned box enclosing all corners
    return min(xs), max(xs), min(ys), max(ys)


def extract_images(scan_result_filepath, img_dir_path=PATH_IMAGE, extract_dir=PATH_EXTRACT):
    scan_result_filename = os.path.basename(scan_result_filepath)
    scan_result_name = os.path.splitext(scan_result_filename)[0]

    # Scan results are named "<pdf name>_<page>", so everything before
    # the first underscore is the source PDF's name
    pdf_name_length = scan_result_name.find('_')
    pdf_name = scan_result_name[:pdf_name_length]

    img_name = scan_result_name

    img_filepath = os.path.join(img_dir_path, pdf_name + '.pdf', img_name + '.jpg')
    print(img_filepath)

    df = pd.read_csv(scan_result_filepath)
    df = df.rename(columns={'0': 'box', '1': 'value', '2': 'accuracy'})
    print(len(df))
    print()

    text_lines = []

    with Image.open(img_filepath) as im:
        for i in range(len(df)):
            try:
                coords = df.iloc[i, 0]
                x_left, x_right, y_top, y_bottom = get_coords(coords)
                value = df.iloc[i, 1]
                print(x_left, x_right, y_top, y_bottom, value)

                im_crop = im.crop((x_left, y_top, x_right, y_bottom))
                im_crop_path = os.path.join(extract_dir, scan_result_name + '_' + str(i) + '.jpg')
                im_crop.save(im_crop_path, quality=90, optimize=True)

                text_lines.append(scan_result_name + '_' + str(i) + '.jpg,' + value)
            except Exception:
                # Skip rows with malformed boxes or values; inspect them later
                pass

    # Labels file: one "<crop filename>,<text>" line per successful crop
    extract_txt_filepath = os.path.join(extract_dir, scan_result_name + '.txt')
    with open(extract_txt_filepath, 'w', encoding='utf-8') as file:
        for line in text_lines:
            file.write(line + '\n')


for pdf_filename in os.listdir(PATH_SCAN_RESULT):
    pdf_dir = os.path.join(PATH_SCAN_RESULT, pdf_filename)
    for csv_filename in os.listdir(pdf_dir):
        scan_result_filepath = os.path.join(pdf_dir, csv_filename)
        extract_images(scan_result_filepath)
