Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance decode_str #471

Open
dwjbosman opened this issue Dec 2, 2024 · 2 comments
Open

performance decode_str #471

dwjbosman opened this issue Dec 2, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@dwjbosman
Copy link

dwjbosman commented Dec 2, 2024

Using decode_str to create a Python object from a CBOR message that contains a list with 1000 int elements takes more than 1.5 seconds!

Using


                profiler = cProfile.Profile()
                profiler.enable()
                json_obj = cmd_spec.str_to_json(raw_message, as_json_obj=True)
                profiler.disable()
                with open('profile.txt', 'w') as f:
                    stats = pstats.Stats(profiler, stream=f).sort_stats('cumtime')
                    stats.print_stats()

         338763 function calls (273791 primitive calls) in 1.492 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.492    1.492 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1730(decode_str)
        1    0.000    0.000    1.491    1.491 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1710(decode_obj)
   3060/1    0.013    0.000    1.491    1.491 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1687(_decode_full)
   6063/1    0.015    0.000    1.491    1.491 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1638(_decode_obj)
   3116/1    0.010    0.000    1.491    1.491 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1587(_decode_single_obj)
    39/11    0.001    0.000    1.423    0.129 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1628(_handle_key)
      112    0.000    0.000    1.147    0.010 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1411(var_name)
      112    0.001    0.000    1.147    0.010 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1403(id)
      112    0.674    0.006    0.674    0.006 {method 'sub' of '_regex.Pattern' objects}
     3116    0.009    0.000    0.541    0.000 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1466(_check_value)
    57/56    0.002    0.000    0.481    0.009 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1529(_add_if)
      112    0.001    0.000    0.472    0.004 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:341(generate_base_name)
      112    0.471    0.004    0.471    0.004 {method 'fullmatch' of '_regex.Pattern' objects}

It seems the use of regular expressions is one of the culprits. The one with the 'id' function is easy to replace, the other one I just disabled. This already saved 1 second.

         338315 function calls (273343 primitive calls) in 0.513 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.513    0.513 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1736(decode_str)
        1    0.000    0.000    0.512    0.512 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1716(decode_obj)
   3060/1    0.012    0.000    0.512    0.512 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1693(_decode_full)
   6063/1    0.012    0.000    0.512    0.512 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1644(_decode_obj)
   3116/1    0.010    0.000    0.512    0.512 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1593(_decode_single_obj)
    39/11    0.000    0.000    0.506    0.046 /usr/local/lib/python3.10/dist-packages/zcbor/zcbor/zcbor.py:1634(_handle_key)
  328/176    0.001    0.000    0.286    0.002 /usr/lib/python3.10/pprint.py:57(pformat)
  328/176    0.001    0.000    0.285    0.002 /usr/lib/python3.10/pprint.py:155(pformat)
 6447/176    0.025    0.000    0.284    0.002 /usr/lib/python3.10/pprint.py:167(_format)
6478/6314    0.010    0.000    0.243    0.000 /usr/lib/python3.10/pprint.py:453(_repr)
25030/6314    0.076    0.000    0.232    0.000 /usr/lib/python3.10/pprint.py:462(format)

But still decoding should not take hundreds of ms. More investigation and fixes are needed.

@oyvindronningstad
Copy link
Collaborator

That's very interesting, thanks for the investigation! Was your CDDL big as well, or just the CBOR data?

@oyvindronningstad
Copy link
Collaborator

oyvindronningstad commented Dec 2, 2024

I'm trying to replicate now, and my setup gives an execution time of 0.018 seconds for a list of 1000 ints. Can you share some more details, so I can try to get the same results?

Here's my code:

import zcbor
import cbor2
import cProfile, pstats

cddl_contents = """
perf_int = [0*1000int]
"""
raw_message = cbor2.dumps(list(range(1000)))
cmd_spec = zcbor.DataTranslator.from_cddl(cddl_contents, 3).my_types["perf_int"]

profiler = cProfile.Profile()
profiler.enable()
json_obj = cmd_spec.str_to_json(raw_message)
profiler.disable()

with open('profile.txt', 'w') as f:
	stats = pstats.Stats(profiler, stream=f).sort_stats('cumtime')
	stats.print_stats()

I'm using zcbor 0.9.1 and Python 3.13.0

@oyvindronningstad oyvindronningstad added the enhancement New feature or request label Dec 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants