levenshtein bug #13823
Comments
This could be solved by changing the computation of optimal cost to use deltas. I'd have to check what the performance impact of that is, and whether there is a better way, though. I'll have a better look this evening.
It's always going to be possible to choose costs that result in an overflow one way or another.
Hmm, I wonder if there are practical applications of high (absolute) cost values at all. I mean, usually(?) you'd pass very small numbers; so if we restricted these values, that might not break any reasonable code using the
Another option might be to do the internal calculations with floats/doubles. That would be less precise (and probably slower), but at least you wouldn't get completely wrong results.
If we want to avoid any breakage, we may instead choose to just document the issue. Frankly, I see few (if any) cases where
I made two possible suggestions on how to mitigate it here: #14809 (comment) I also thought about using floats / doubles but didn't pursue that due to the precision loss.
That's a possibility too, not sure though. One interesting use case which I have seen for levenshtein is sequence alignment cost for DNA or proteins.
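To make the "restrict these values" idea from the comments concrete, here is a minimal userland sketch. The wrapper name `guarded_levenshtein` and its rejection rule are hypothetical, not part of PHP and not what the maintainers settled on. It assumes the internal computation is the standard dynamic-programming recurrence, in which no intermediate value exceeds (length of first string + length of second string + 1) times the largest cost, and it refuses any call where that bound would not fit in a PHP integer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical guard around levenshtein() (an illustration, not a PHP API).
 * Assumption: every intermediate value in the DP table is at most
 * (strlen($a) + strlen($b) + 1) * max(cost), so bounding that product
 * by PHP_INT_MAX rules out integer overflow.
 */
function guarded_levenshtein(
    string $a,
    string $b,
    int $ins = 1,
    int $rep = 1,
    int $del = 1
): int {
    if ($ins < 0 || $rep < 0 || $del < 0) {
        // Negative costs are outside the scope of this sketch.
        throw new InvalidArgumentException('Costs must be non-negative.');
    }

    $maxCost = max($ins, $rep, $del);
    $steps = strlen($a) + strlen($b) + 1;

    // Compare via division so the check itself cannot overflow.
    if ($maxCost > intdiv(PHP_INT_MAX, $steps)) {
        throw new OverflowException('Cost values are too large for these string lengths.');
    }

    return levenshtein($a, $b, $ins, $rep, $del);
}
```

The bound is deliberately conservative: it rejects some inputs that would not actually overflow, in exchange for a check that costs almost nothing. The DNA and protein alignment use case mentioned above would normally stay well inside it, since such cost values tend to be small.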
Description
The following code:
Resulted in this output:
But I expected this output instead:
Also, levenshtein(str_repeat('a', 100000), str_repeat('b', 1000000), 1, PHP_INT_MAX - 500000, 1)
unexpectedly returns -9223372036854625811.
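For reference, the mathematically correct result for this call can be worked out by hand, because the two strings share no characters. The sketch below is an illustration under the usual weighted edit distance definition, not something stated in the report. The helper name `disjoint_cost` is hypothetical. With `k` replacements, the total is `k * rep + (m - k) * del + (n - k) * ins`, which is linear in `k`, so the minimum sits at `k = 0` or `k = min(m, n)`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: weighted edit distance between two strings of
 * lengths $m and $n that have no characters in common.
 * Assumption: results stay small enough to fit in a PHP integer.
 */
function disjoint_cost(int $m, int $n, int $ins, int $rep, int $del): int
{
    // A replacement pays off only if it is cheaper than delete + insert.
    $k = ($rep < $ins + $del) ? min($m, $n) : 0;

    return $k * $rep + ($m - $k) * $del + ($n - $k) * $ins;
}

// The call from the report: 100000 deletions plus 1000000 insertions.
echo disjoint_cost(100000, 1000000, 1, PHP_INT_MAX - 500000, 1), "\n"; // prints 1100000
```

Under these assumptions, the replacement cost should never be used here, so a correct implementation would return 1100000 rather than a negative number.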
PHP Version
any
Operating System
any 64-bit