Describe the bug
To continue improving the performance of the recently changed DataPack, using the profiling code for typical scenarios described in issue 805 (#805), and to find further places to improve beyond the areas being fixed in #904 (PR907), an additional profiling test was run on the PR907 code base. The new analysis shows that get_class is now called in much higher volume than in version 0.2.0. After discussion, this is by design: we no longer use the old (0.2.0) way of creating entry classes, so we rely on get_class to load the correct class instead. However, the current code is not optimized for high-volume calls. Either the way related classes are loaded needs to change, or some optimization needs to be added to make class loading more efficient.
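One possible optimization is to memoize the name-to-class lookup so that repeated calls become a dictionary hit. The sketch below is only an illustration: `load_class` and `cached_load_class` are hypothetical stand-ins, not Forte's actual get_class, and it assumes the lookup depends only on a fully qualified class name.

```python
import importlib
from functools import lru_cache


def load_class(full_name: str) -> type:
    """Resolve a dotted path such as 'collections.OrderedDict' to a class.

    Hypothetical stand-in for an uncached class lookup.
    """
    module_name, _, class_name = full_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


@lru_cache(maxsize=None)
def cached_load_class(full_name: str) -> type:
    """Same lookup, memoized by name: only the first call does the import work."""
    return load_class(full_name)


# Repeated lookups of the same name return the cached class object.
first = cached_load_class("collections.OrderedDict")
second = cached_load_class("collections.OrderedDict")
assert first is second
```

Whether a cache like this is safe depends on whether the real lookup takes other inputs (for example, extra module search paths) and whether classes can be redefined at runtime; both would need to be checked against the actual code.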
To Reproduce
Steps to reproduce the behavior:
Download the 0.2.0 code base to a new folder.
Write a profiling (unit) test that uses a typical Forte pipeline (for example, one that processes annotations), or simply check out "data_pack_profiling_test.py" from the branch implementing the issue 805 profiling task (https://github.com/J007X/forte/tree/implementation_805).
Run the same profiling test on both the current code base (post PR907 changes) and the 0.2.0 code base, then compare the running speed of the two runs.
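For comparing call counts between the two code bases, a small helper around `cProfile` can be enough. This is a minimal sketch, not the contents of "data_pack_profiling_test.py": `count_calls`, the stand-in `get_class`, and `workload` are all hypothetical, and in practice `workload` would run the pipeline under test.

```python
import cProfile
import pstats


def count_calls(func, target_name: str) -> int:
    """Run func under cProfile and return the total calls to functions named target_name."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) -> (primitive calls, total calls, ...)
    return sum(
        entry[1]
        for (_, _, name), entry in stats.stats.items()
        if name == target_name
    )


def get_class(name: str) -> str:
    """Stand-in for the function being counted (hypothetical)."""
    return name


def workload() -> None:
    """Stand-in for a typical pipeline run (hypothetical)."""
    for _ in range(1000):
        get_class("x")


print(count_calls(workload, "get_class"))
```

Running the same helper against each code base gives a direct count to compare alongside the wall-clock timings.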
Expected behavior
Notice the performance difference (even with the post PR907 changes, the current code is still slower than 0.2.0) and the excessive calls (such as to entry_getter and get_class) that appear in the current code base's profiling results compared with the 0.2.0 results.
Expected desirable result (goal): the current version should be on par with or faster than 0.2.0 in those typical scenarios.
Screenshots
Environment (please complete the following information):
OS: MacOS
Version 12.0.1
Python version: 3.8
Additional context
Per discussion prior to creating this issue, the large number of calls to get_class (and entry_getter) is expected, by-design behavior, so the focus for fixing the current performance issue is on optimizing the methods and processes involved.