Speed up post-processing with C++ bindings and add char detection boxes #137
Okay, thanks for the clarification.
Does getting character-level bounding boxes require additional changes to the code? I still don't get the character boxes, even after building the C++ bindings.
There's a function named getWordAndCharBoxes in craft_utils.py that you can call instead of getDetBoxes in test.py. That function returns the character boxes as well.
Great, thanks a lot, I needed to confirm that!
How do I use getWordAndCharBoxes? I got an error:
You need to call it like this:
    boxes, polys, char_boxes = craft_utils.getWordAndCharBoxes(
        img_resized,
        score_text,
        score_link,
        text_threshold,
        link_threshold,
        low_text,
        poly
    )
After that, add this line below it to get the resized character boxes:
    char_boxes = craft_utils.adjustResultCoordinates(char_boxes, ratio_w, ratio_h)
The main goal of this PR is to add a C++ binding that greatly improves the runtime of the post-processing step, which gets quite slow as the number of words in the image increases. Since it was easy to do, I also added a function that returns the character boxes: it was really important for our team, and it has been requested a couple of times in the issues.
The problem was in the getDetBoxes_core(...) function in craft_utils.py. More specifically, the bottleneck is the following for loop: https://github.com/clovaai/CRAFT-pytorch/blob/master/craft_utils.py#L34
You can find the C++ implementation in the findWordBoxes function in find_components.cpp. I've tried to keep the implementation as simple as possible, so there are no threads, SIMD, etc.
An important point is that this PR does not break the interface, so every existing integration should keep working without problems. The program detects whether the C++ bindings are available and falls back to the original Python implementation if they are not, or if the caller explicitly requests it in the function call.
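The fallback behaviour described above could look roughly like the sketch below. This is only an illustration of the pattern, not the code in this PR: the module name craft_cpp_bindings, the find_word_boxes entry point, the use_cpp argument and the stand-in Python routine are all hypothetical names chosen for the example.

```python
import importlib


def load_backend(module_name="craft_cpp_bindings"):
    """Return the compiled module if it can be imported, else None.

    The default module name is a hypothetical placeholder.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def find_word_boxes_python(score_map, threshold):
    # Stand-in for the original pure-Python post-processing:
    # it simply lists the (row, col) cells above the threshold.
    return [
        (r, c)
        for r, row in enumerate(score_map)
        for c, value in enumerate(row)
        if value > threshold
    ]


def get_det_boxes(score_map, threshold, use_cpp=True, backend=None):
    """Use the native backend when requested and available, else fall back."""
    if use_cpp and backend is None:
        backend = load_backend()
    if use_cpp and backend is not None:
        # Hypothetical entry point exposed by the compiled module.
        return backend.find_word_boxes(score_map, threshold)
    return find_word_boxes_python(score_map, threshold)
```

With this shape, callers that never pass use_cpp keep the same call signature, and a missing build of the bindings silently selects the Python path.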
Compiling the C++ bindings only requires CMake and a compiler; no external library is used. That is why the diff is a little big: I had to take some code from OpenCV to provide a couple of features. In doing so, I tried to strip out most of the code paths that are not going to be used, to keep the line count as low as possible. There is also a simple shell script that compiles the bindings on Unix using gcc.
Regarding character detection, it is a very small and simple function with both a Python and a C++ implementation. I've added it to this PR because the C++ code is also used for word box detection, and the Python version is really small. I can create a separate PR with only this feature if necessary.
Finally, I've added a mode that I called "Fast Mode". It doesn't perform the dilation for every word; instead, it simply grows the width and height of the rotated bounding box according to the dilation size that would have been applied. I haven't noticed big differences in the generated boxes, but it is much faster. I've added a flag (fastMode) to let the user decide whether to use this new post-processing method or the original one.
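To illustrate the idea behind Fast Mode, here is a minimal sketch of growing a rotated box instead of dilating a mask. It is not the code from this PR: the function name grow_rotated_box, the (center, size, angle) representation and the choice to add pad pixels on every side are assumptions made for the example, and the result only approximates what a real dilation would produce.

```python
import math


def grow_rotated_box(center, size, angle_deg, pad):
    """Return the four corners of a rotated box grown by `pad` on every side.

    center: (cx, cy), size: (width, height), angle_deg: rotation in degrees.
    `pad` stands in for the dilation size (an assumption for this sketch).
    """
    cx, cy = center
    width, height = size
    half_w = width / 2.0 + pad
    half_h = height / 2.0 + pad
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    corners = []
    # Walk the corners of the unrotated box, then rotate each around the center.
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners
```

The cost here is a handful of multiplications per word, whereas a per-word dilation touches every pixel in the word's region, which is why a shortcut of this kind can be much cheaper.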
In my tests, on a laptop with an i7-8550U and an image with 532 words, the hotspot of the post-processing function went from 572.2 ms to 6 ms (roughly a 95x speedup). I also compared the bounding boxes produced by the original Python code and the C++ bindings, and they matched on all of my test images.
Oh, and by the way, CRAFT is amazing! Congrats :D