Speed up post-processing with C++ bindings and add char detection boxes #137

Open. gfickel wants to merge 1 commit into master.

@gfickel gfickel commented Jan 8, 2021

The main goal of this PR is to add a C++ binding that greatly improves the runtime of the post-processing step, which gets pretty slow as the number of words in the image increases. Since it was quite easy to do, I also added a function that returns the character boxes; it was really important for our team and has been requested a couple of times in the issues.

The problem was in the getDetBoxes_core(...) function in craft_utils.py. To be more specific, the bottleneck is the following for loop: https://github.com/clovaai/CRAFT-pytorch/blob/master/craft_utils.py#L34

The C++ implementation is in the findWordBoxes function in find_components.cpp. I've tried to keep it as simple as possible, so there are no threads, SIMD, etc.

An important point is that this PR does not break the interface, so every existing integration should keep working without problems. The program detects whether the C++ bindings are available and falls back to the original Python implementation if they are not, or if the caller explicitly requests it in the function call.
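
A minimal sketch of this kind of fallback, not the PR's actual code. The module name craft_cpp_bindings and the stand-in find_boxes_python are hypothetical, and exposing findWordBoxes as a Python attribute is an assumption; only the use_cpp_bindings argument name is taken from the traceback later in this thread.

```python
import importlib


def load_backend(module_name="craft_cpp_bindings"):
    """Return the compiled module if it can be imported, else None.

    The default module name is a placeholder, not the PR's real module name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def find_boxes_python(textmap, linkmap):
    # Stand-in for the original pure-Python post-processing.
    return "python"


def find_boxes(textmap, linkmap, use_cpp_bindings=True, backend=None):
    """Use the compiled backend when requested and available, else fall back."""
    if use_cpp_bindings and backend is None:
        backend = load_backend()
    if use_cpp_bindings and backend is not None:
        # Assumed attribute name, mirroring the C++ function mentioned above.
        return backend.findWordBoxes(textmap, linkmap)
    return find_boxes_python(textmap, linkmap)
```

With this shape, callers that never pass use_cpp_bindings keep working whether or not the bindings were compiled.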

Compiling the C++ bindings only needs CMake and a compiler; no external library is used. That's why the diff is a bit large: I had to take some code from OpenCV for a couple of functions. In doing so I tried to strip out most of the code paths that are not going to be used, to keep the line count as low as possible. There is also a simple shell script that compiles the bindings on Unix using gcc.

Regarding the character detection, it is a very small and simple function, with both a Python and a C++ implementation. I've added it to this PR since the C++ code is also used in the word box detection, and the Python version is really small. I can open a separate PR with only this feature if necessary.

Finally, I've added a mode that I call "Fast Mode". It doesn't perform the dilation for every word; instead it simply grows the rotated bounding box's width and height according to the dilation size that would have been applied. I haven't noticed big differences in the generated boxes, and it is much faster. I've added a flag (fastMode) so users can decide whether to use this new post-processing method or the original one.
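
The idea behind Fast Mode can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the function name grow_rotated_box and the per-side margin parameter are hypothetical, and the exact amount the PR adds for a given dilation size may differ. The box layout ((cx, cy), (w, h), angle) is the one cv2.minAreaRect returns.

```python
def grow_rotated_box(box, margin):
    """Enlarge a rotated box instead of dilating the word's pixel mask.

    box    -- ((cx, cy), (w, h), angle), as returned by cv2.minAreaRect
    margin -- pixels to add on each side (assumed to be derived from the
              dilation size the original post-processing would have used)
    """
    (cx, cy), (w, h), angle = box
    # Centre and angle are untouched; only the extent grows, by `margin`
    # on each of the two opposite sides along both axes.
    return ((cx, cy), (w + 2 * margin, h + 2 * margin), angle)
```

Because this is a constant-time update per word, it avoids running a morphological dilation over every word region, which is where the time goes when there are many words.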

In my tests, on a notebook with an i7-8550U and an image with 532 words, the hotspot of the post-processing function went from 572.2 ms to 6 ms. I also compared the bounding boxes from the original Python implementation and the C++ bindings, and they matched on all of my test images.
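
For anyone who wants to reproduce this kind of measurement on their own images, here is a small timing helper. It is a generic sketch (the name time_call is hypothetical) and is not the harness used for the numbers above.

```python
import time


def time_call(fn, *args, repeats=10, **kwargs):
    """Return the best wall-clock time in milliseconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best
```

Timing the post-processing call once with the bindings enabled and once with them disabled, on the same score maps, should show whether an image has enough words for the difference to matter.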

Oh, and by the way, CRAFT is amazing! Congrats :D

@keshavoct98

keshavoct98 commented May 5, 2021

Hello Gfickel,

I tested your code on license plate images for text detection, but I can't see any observable speed boost. I compiled the C++ bindings and even tried Fast Mode. Can you tell me if I am doing something wrong?
Image I tested on: 110KR43BY9378_crop

@Yashjain0333

This is only faster when the image contains a large number of words, like the one gfickel tested on, which had 532 words.

@keshavoct98

Okay, thanks for the clarification.

@Yashjain0333

Does getting character-level bounding boxes require additional changes to the code? I don't get any character boxes even after building the C++ bindings.

@keshavoct98

There's a function named "getWordAndCharBoxes" in craft_utils.py that you can call instead of "getDetBoxes" in test.py. It returns the character boxes as well.

@Yashjain0333

Great, thanks a lot! I needed to confirm that.

@zoldaten

How do I use getWordAndCharBoxes?
I've just changed this in test.py:

# Post-processing
    #boxes, polys = craft_utils.getDetBoxes(score_text, score_link, text_threshold, link_threshold, low_text, poly)
    boxes, polys = craft_utils.getWordAndCharBoxes(score_text, score_link, text_threshold, link_threshold, low_text, poly)

and got an error:

Traceback (most recent call last):
  File "/home/pi/Desktop/CRAFT-pytorch-improvement-cpp_bindings/test.py", line 166, in <module>
    bboxes, polys, score_text = test_net(net, image, args.text_threshold, args.link_threshold, args.low_text, args.cuda, args.poly, refine_net)
  File "/home/pi/Desktop/CRAFT-pytorch-improvement-cpp_bindings/test.py", line 104, in test_net
    boxes, polys = craft_utils.getWordAndCharBoxes(score_text, score_link, text_threshold, link_threshold, low_text, poly)
  File "/home/pi/Desktop/CRAFT-pytorch-improvement-cpp_bindings/craft_utils.py", line 283, in getWordAndCharBoxes
    low_text, poly, use_cpp_bindings, fast_mode, rotated_box)
  File "/home/pi/Desktop/CRAFT-pytorch-improvement-cpp_bindings/craft_utils.py", line 269, in getDetBoxes
    low_text, use_cpp_bindings, fast_mode, rotated_box)
  File "/home/pi/Desktop/CRAFT-pytorch-improvement-cpp_bindings/craft_utils.py", line 54, in getDetBoxes_core
    linkmap = linkmap.copy()
AttributeError: 'float' object has no attribute 'copy'

@TINGWEIJING

You need to pass img_resized as the first argument, and the function returns three values, for example:

    boxes, polys, char_boxes = craft_utils.getWordAndCharBoxes(
        img_resized,
        score_text,
        score_link,
        text_threshold,
        link_threshold,
        low_text,
        poly
    )

After that, add this line below it to rescale the character box coordinates:

char_boxes = craft_utils.adjustResultCoordinates(char_boxes, ratio_w, ratio_h)
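
To see why leaving out the image produces that particular AttributeError, here is a small stand-alone sketch. The function below is a hypothetical stand-in that only mimics the parameter order suggested by the call above; it is not the code in craft_utils.py, and plain lists stand in for the NumPy score maps.

```python
def get_word_and_char_boxes(image, textmap, linkmap, text_threshold,
                            link_threshold, low_text, poly=False):
    # Hypothetical stand-in: only the parameter order matters here.
    linkmap = linkmap.copy()  # fails if a float landed in this slot
    return [], [], []


img_resized = [[0]]      # placeholder for the resized image
score_text = [[0.1]]     # placeholder for the text score map
score_link = [[0.2]]     # placeholder for the link score map

# Without the image, every argument shifts one slot to the left, so the
# float text_threshold (0.7) is received as `linkmap`.
try:
    get_word_and_char_boxes(score_text, score_link, 0.7, 0.4, 0.4, False)
    error = None
except AttributeError as exc:
    error = exc

# With the image first, each argument lands in its intended slot.
boxes, polys, char_boxes = get_word_and_char_boxes(
    img_resized, score_text, score_link, 0.7, 0.4, 0.4, False
)
```

Passing the arguments by keyword would turn this kind of shift into an immediate "missing required argument" error rather than a failure deep inside the post-processing.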
