-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] std::string::replace("aa", "aa", "") returns "a" #1408
Comments
The definition of std::string::replace has the following line at the end:
This should be:
Might also want to add a check against |
This error occurs only the length of the string to be matched is longer than the length of the string obtained by removing all matching string patterns. While it is true that the proposed code change fixes that case, it does so in a rather obscure way. The issue us that the size of the pattern to match advances index i by the matched pattern size which is used to check the condition to terminate the loop but not to calculate the position and length of the remaining characters in input string. To fix it define a new variable called remaining_len and initialize it using the length of the string. As index i moves forward, remaining_len moves backwards and then used to calculate the right position and size of the remaining chars. The full code fn replace(str string, str pattern, str replace) {
u32 string_len, pattern_len, replace_len, remaining_len;
string_len = std::string::length(string);
pattern_len = std::string::length(pattern);
replace_len = std::string::length(replace);
remaining_len = string_len;
if (pattern_len > string_len)
return string;
str result;
u32 i;
while (i <= (string_len - pattern_len)) {
if (std::string::substr(string, i, pattern_len) == pattern) {
result = result + replace;
i = i + pattern_len;
remaining_len = remaining_len - pattern_len;
} else {
result = result + std::string::at(string, i);
i = i + 1;
remaining_len = remaining_len - 1;
}
}
result = result + std::string::substr(string,string_len-remaining_len,remaining_len);
return result;
}; I think this makes the code look more intentional than just using i. I tried your examples and some more using multiple match targets and they all seem to work, so I'll submit a pr if you agree. |
The proposed code calculates the starting position based on the remaining length, which is rather obscure... :D Joking aside, I don't really mind how this function is implemented, as long as it's not too inefficient and does its job. Just please also add a check on |
Yes, thank you for the comment. I didnt intend to duplicate i but simply refactoring it as the remaining len. So I added the new variable and as I am writing about it i am already forgetting to remove the old one... Not trying to be obtuse on purpose, but I hate off by one errors and this was the best way i could think to make sure I don't make one again in this function. In my mind using the remaining length is a lot more clear than using string index which is why I didn't see this bug when I fixed the last issue about this ( #1279) pattern_len ==0 , yes, thanks for that too. It makes sense that naive code will search forever for a pattern that doesn't exist. The latest version fn replace( str string, str pattern, str replace ) {
u32 string_len, pattern_len, replace_len, remaining_len;
string_len = std::string::length(string);
pattern_len = std::string::length(pattern);
replace_len = std::string::length(replace);
remaining_len = string_len;
if ( pattern_len > string_len || pattern_len == 0 )
return string;
str result;
while ( remaining_len >= pattern_len ) {
if ( std::string::substr( string, string_len - remaining_len, pattern_len ) == pattern ) {
result = result + replace;
remaining_len = remaining_len - pattern_len;
} else {
result = result + std::string::at(string, string_len - remaining_len);
remaining_len = remaining_len - 1;
}
}
result = result + std::string::substr( string, string_len - remaining_len, remaining_len );
return result;
}; |
I changed my mind in order to avoid code duplication. I also renamed all variables to sensible names and removed unused ones. Finally I rewrote the empty strings check to be more comprehensive. Most of the changes are based on the suggestions you made. Please take a look at the resulting code which should be as efficient as possible. If you spot something you find that can be improved or any questions/comments please let me know. This is the PR candidate fn replace(str string, str pattern, str replace) {
s32 string_len = std::string::length(string);
s32 pattern_len = std::string::length(pattern);
if (pattern_len > string_len || pattern_len * string_len == 0 )
return string;
str result;
s32 string_index;
s32 remaining_len = string_len;
while ( pattern_len <= remaining_len ) {
if ( std::string::substr(string, string_index, pattern_len) == pattern ) {
result += replace;
string_index += pattern_len;
} else {
result += std::string::at(string, string_index);
string_index += 1;
}
remaining_len = string_len - string_index;
}
result += std::string::substr(string, string_index, remaining_len );
return result;
}; These are the tests I am checking the code with and the results std::print("1: {}",replace("aa", "aa", ""));
std::print("2: {}",replace("ResourceType::TEXTURE", "ResourceType::", ""));
std::print("3: {}",replace("aaahijkaauvwxaaa", "aa", "bbb"));
std::print("4: {}", replace("bb", "bb", "z"));
std::print("5: {}", replace("bcd", "bc", "z"));
std::print("6: {}", replace("bcdef", "bcd", "z"));
std::print("7: {}", replace("", "", ""));
std::print("8: {}", replace("", "", "z"));
std::print("9: {}", replace(" ", " ", "z"));
std::print("10: {}", replace("abcdefg", "b", "z"));
std::print("11: {}", replace("abcdefg", "bc", "z"));
std::print("12: {}", replace("abcdefg", "bcd", "z"));
std::print("13: {}", replace("abc?*?efg", "?*?", "_z_"));
std::print("14: {}", replace("abc d efg", " ", "-"));
std::print("15: {}", replace("abc**d**efg", "**", "-+-"));
I: 1:
I: 2: TEXTURE
I: 3: bbbahijkbbbuvwxbbba
I: 4: z
I: 5: zd
I: 6: zef
I: 7:
I: 8:
I: 9: zzz
I: 10: azcdefg
I: 11: azdefg
I: 12: azefg
I: 13: abc_z_efg
I: 14: abc-d-efg
I: 15: abc-+-d-+-efg
I: Pattern exited with code: 0
I: Evaluation took 0.108065s If you can think of some tests I missed also please post a note about it. |
Looks good to me! |
@paxcut Can this issue be closed ? |
OP seemed happy and I can't think of a test that's not checked already so I am fairly sure it can. |
Operating System
Windows
What's the issue you encountered?
std::string::replace() doesn't give the expected result even in quite simple cases.
How can the issue be reproduced?
ImHex Version
1.31.0 d12f501
ImHex Build Type
Installation type
Portable
Additional context?
No response
The text was updated successfully, but these errors were encountered: