Improve performance in CharOperation #3386

Merged 1 commit on Dec 11, 2024
@@ -3750,7 +3750,7 @@ public static final char[] replace(
next : for (int i = 0; i < max;) {
int index = indexOf(toBeReplaced, array, true, i);
if (index == -1) {
i++;
Contributor

The actual bug is here: if index == -1, there is no hit anywhere in the rest of the array, so there is no need to scan again from the next offset.

What I like about this impl here: it's garbage-free whereas the proposed solution needs to encode the char arrays as ISO or UTF-8.

Do you have time to measure the pressure on the GC if your version of replace is used in tight loops?
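One way to approach the measurement asked for above is to compare per-thread allocated bytes around each variant. The sketch below is only an illustration under stated assumptions: `AllocationProbe`, `allocatedBytes`, and `replaceViaString` are hypothetical names, the String-based replacement is a stand-in for the proposed solution (whose code is not shown in this thread), and `com.sun.management.ThreadMXBean` is a HotSpot-specific extension that may be unavailable on other JVMs.

```java
import java.lang.management.ManagementFactory;

public class AllocationProbe {

    // Returns the bytes allocated by the current thread while running the task,
    // or -1 if this JVM does not expose per-thread allocation counters.
    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean hotspot = (com.sun.management.ThreadMXBean) bean;
        if (!hotspot.isThreadAllocatedMemorySupported() || !hotspot.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = hotspot.getThreadAllocatedBytes(id);
        task.run();
        return hotspot.getThreadAllocatedBytes(id) - before;
    }

    // Assumption: a String-based stand-in for a replacement that converts
    // the char arrays to Strings. It is not the code proposed in the PR.
    static char[] replaceViaString(char[] array, char[] from, char[] to) {
        return new String(array).replace(new String(from), new String(to)).toCharArray();
    }

    public static void main(String[] args) {
        char[] input = "This is one line with no matches\n".repeat(512).toCharArray();
        char[] from = {'/'};
        char[] to = {'z'};
        long bytes = allocatedBytes(() -> {
            for (int i = 0; i < 1_000; i++) {
                replaceViaString(input, from, to);
            }
        });
        System.out.println("bytes allocated by the String-based variant: " + bytes);
    }
}
```

Wrapping a call to `CharOperation.replace` in the same `allocatedBytes` probe would give the number to compare against. A proper benchmark harness would be more reliable than a single run like this.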

Contributor

Agreed, that sounds like a better fix. One should compare the measurements for both solutions.

Contributor Author

If you guys are happier with a smaller change, then I'm fine with it. I'll just abort the loop at that point and be done with it.

Contributor

Yes, please

Contributor

Using "break" instead of "continue" and removing the "next" label would be prettier, but I am OK with this fix.
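For illustration, the `break` variant suggested here could look roughly like the following. This is a simplified, self-contained sketch and not the JDT implementation: `ReplaceSketch` and its three-argument `indexOf` helper are hypothetical, and the real `CharOperation.indexOf` takes an additional case-sensitivity flag.

```java
import java.util.Arrays;

public class ReplaceSketch {

    // Hypothetical helper: first index >= start at which needle occurs in haystack, or -1.
    static int indexOf(char[] needle, char[] haystack, int start) {
        outer:
        for (int i = start; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    static char[] replace(char[] array, char[] toBeReplaced, char[] replacement) {
        int max = array.length;
        int needleLength = toBeReplaced.length;
        if (needleLength == 0 || max < needleLength) {
            return array;
        }
        int[] starts = new int[4];
        int occurrenceCount = 0;
        for (int i = 0; i < max;) {
            int index = indexOf(toBeReplaced, array, i);
            if (index == -1) {
                break; // no hit from i onwards, so nothing is left to find
            }
            if (occurrenceCount == starts.length) {
                starts = Arrays.copyOf(starts, occurrenceCount * 2);
            }
            starts[occurrenceCount++] = index;
            i = index + needleLength;
        }
        if (occurrenceCount == 0) {
            return array;
        }
        // Copy the untouched stretches and the replacement into one result array.
        char[] result = new char[max + occurrenceCount * (replacement.length - needleLength)];
        int inStart = 0;
        int outStart = 0;
        for (int k = 0; k < occurrenceCount; k++) {
            int offset = starts[k] - inStart;
            System.arraycopy(array, inStart, result, outStart, offset);
            inStart += offset + needleLength;
            outStart += offset;
            System.arraycopy(replacement, 0, result, outStart, replacement.length);
            outStart += replacement.length;
        }
        System.arraycopy(array, inStart, result, outStart, max - inStart);
        return result;
    }

    public static void main(String[] args) {
        char[] out = replace("a/b/c".toCharArray(), new char[] {'/'}, new char[] {'z', 'z'});
        System.out.println(new String(out)); // prints "azzbzzc"
    }
}
```

With `break`, the loop needs neither the `next` label nor the `i = max` assignment, and the only allocations are the `starts` array and the result.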

i = max; // end
continue next;
}
if (occurrenceCount == starts.length) {
@@ -170,4 +170,26 @@ public void test012() {
4,
true));
}

public void testReplacePerformance001() {
// This is the test that takes an excessively long time
// The improvement here is drastic, 99% reduction in time
testReplaceImpl("This is one line with no matches\n", 9, 0.01);
}

private void testReplaceImpl(String oneLine, int power, double multiplier) {
String total = oneLine;
for( int i = 0; i < power; i++ ) {
total = total + total; // Double the length
}
total = oneLine + total;

// Now replace
long start = System.currentTimeMillis();
char[] found = CharOperation.replace(total.toCharArray(), new char[]{'/'}, new char[] {'z'});
assertNotNull(found);
long end = System.currentTimeMillis();
long spent = end - start;
assertTrue(spent < 10);
}
}
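To see why the test input above was so slow before the fix, it helps to count character comparisons instead of measuring wall-clock time. The sketch below rebuilds the same input (33 characters doubled 9 times plus one extra line, 16929 characters in total) and compares a loop that advances by one after a miss with one that stops. `RescanCost` and its single-character `indexOf` are hypothetical simplifications, not JDT code.

```java
public class RescanCost {

    static long comparisons;

    // Simplified single-character search that counts every comparison it makes.
    static int indexOf(char needle, char[] haystack, int start) {
        for (int i = start; i < haystack.length; i++) {
            comparisons++;
            if (haystack[i] == needle) {
                return i;
            }
        }
        return -1;
    }

    // Runs the search loop and returns the number of comparisons performed.
    static long countComparisons(char[] array, char needle, boolean stopOnMiss) {
        comparisons = 0;
        for (int i = 0; i < array.length;) {
            int index = indexOf(needle, array, i);
            if (index == -1) {
                if (stopOnMiss) {
                    break; // behaviour after the fix
                }
                i++; // behaviour before the fix: rescan from the next offset
                continue;
            }
            i = index + 1;
        }
        return comparisons;
    }

    // Same construction as testReplaceImpl: double the line `power` times, then add one more line.
    static char[] buildInput(String oneLine, int power) {
        String total = oneLine;
        for (int i = 0; i < power; i++) {
            total = total + total;
        }
        return (oneLine + total).toCharArray();
    }

    public static void main(String[] args) {
        char[] input = buildInput("This is one line with no matches\n", 9);
        System.out.println("length:       " + input.length);
        System.out.println("rescan:       " + countComparisons(input, '/', false));
        System.out.println("stop on miss: " + countComparisons(input, '/', true));
    }
}
```

For an input of length n with no match, the rescanning loop performs n(n+1)/2 comparisons (roughly 143 million here), while stopping on the first miss performs n. A count like this is deterministic, unlike the `spent < 10` millisecond assertion, which depends on the machine running the test.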