Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String concatenation in a loop slow compared to Python #559

Open
willstudy opened this issue May 21, 2024 · 4 comments
Open

String concatenation in a loop slow compared to Python #559

willstudy opened this issue May 21, 2024 · 4 comments
Assignees

Comments

@willstudy
Copy link

willstudy commented May 21, 2024

test code

import time

def test_f_string_format_performance(num_entries: int) -> float:
    start_time = time.time()
    for i in range(num_entries):
        formatted_string = f"Number {i}: {i * 2}"
    return time.time() - start_time

def test_string_concatenation_performance(num_entries: int) -> float:
    start_time = time.time()
    result = ""
    for i in range(num_entries):
        result += f"Number {i}: {i * 2}\n"
    return time.time() - start_time


num_entries = 200000 

format_time = test_f_string_format_performance(num_entries)
print(f"f-string format operation time: {format_time:.6f} seconds")

concatenation_time = test_string_concatenation_performance(num_entries)
print(f"String concatenation operation time: {concatenation_time:.6f} seconds")

codon test run

(base) Y6RC26NQM4:press bytedance$ ./build/codon build -release -exe string.py
(base) Y6RC26NQM4:press bytedance$ ./string
f-string format operation time: 0.134792 seconds
String concatenation operation time: 25.759953 seconds

python3 test run

(base) Y6RC26NQM4:press bytedance$ python3 string.py
f-string format operation time: 0.021931 seconds
String concatenation operation time: 0.032574 seconds
@willstudy
Copy link
Author

@arshajii thanks for help

@arshajii
Copy link
Contributor

After some researching it seems that Python string objects have an optimization where they check if their reference count is 1 before concatenation, and if so they do a realloc in-place instead of creating a new object, which effectively skips the long memcpy operation.

It's not as simple in Codon since we don't keep reference counts (instead we use a GC). It should be possible to still determine if an object has just a single reference via the GC, but we would need to do benchmarking to make sure it doesn't introduce performance problems in other scenarios.

In the meantime, there is an internal string buffer type _strbuf that is used in the standard library, which can be used here. It makes the runtime 25% faster than CPython (3.11) on my machine:

def test_string_concatenation_performance(num_entries: int) -> float:
    start_time = time.time()
    result = _strbuf()    # <-- use strbuf object
    for i in range(num_entries):
        result.append(f"Number {i}: {i * 2}\n")
    result = str(result)  # <-- convert strbuf to string
    return time.time() - start_time

I'll keep this issue open until we figure out a general solution.

@arshajii arshajii changed the title Why does the performance of string operations become very poor after being statically compiled by Codon? String concatenation in a loop slow compared to Python May 21, 2024
@willstudy
Copy link
Author

willstudy commented May 22, 2024

@arshajii add __iadd__ function for str class?

img_v3_02b4_a88a02e9-8050-4c83-a78c-0b226780a70g

for example:

def __iadd__(self, other: str) -> str:
        len1 = self.len
        len2 = other.len
        len3 = len1 + len2

       self.len = len3
       self.ptr = realloc(self.ptr, len3, len1)
       str.memcpy(self.ptr + len1, other.ptr, len2)
       return self

How about this solution? but str is tuple and cannot be modified

@inumanag
Copy link
Contributor

Does it work if you do:

@extend 
class str:
   def __iadd__(self, other: str):
       len1 = self.len
        len2 = other.len
        len3 = len1 + len2

       self.len = len3
       self.ptr = realloc(self.ptr, len3, len1)
       str.memcpy(self.ptr + len1, other.ptr, len2)
       return self

I think it should work. Let me know if it does not.

@inumanag inumanag self-assigned this Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants