Dramatically optimize algorithm in the common case by excluding match… #89
…ing heads and tails before using LCS. For example, in the case of a single insert, the algorithm changes from O(m*n) to O(m+n). When the arrays contain 1,000 entries, this change reduces the number of comparisons from ~1,000,000 to ~2,000 and the size of the table used by the algorithm from ~1,000,000 to 2.
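To make the numbers above concrete, here is a minimal sketch of the single-insert case. `commonEndCounts` is a hypothetical helper written for this illustration, not code from the PR, and the table shape of (m + 1) x (n + 1) entries is an assumption about a typical LCS implementation.

```swift
/// Hypothetical helper (not from the PR): how many elements two arrays share
/// at the start and at the end, never counting an element twice.
func commonEndCounts<T: Equatable>(_ a: [T], _ b: [T]) -> (head: Int, tail: Int) {
    let limit = min(a.count, b.count)
    var head = 0
    while head < limit && a[head] == b[head] { head += 1 }
    var tail = 0
    // Stop before the tail overlaps the head that was already matched.
    while tail < limit - head && a[a.count - 1 - tail] == b[b.count - 1 - tail] { tail += 1 }
    return (head, tail)
}

let before = Array(0..<1_000)
var after = before
after.insert(-1, at: 500)           // a single insert in the middle

let (head, tail) = commonEndCounts(before, after)
let m = before.count - head - tail  // elements of `before` left for LCS
let n = after.count - head - tail   // elements of `after` left for LCS

// Assumed table shape: (m + 1) x (n + 1) entries.
let fullTable = (before.count + 1) * (after.count + 1)
let trimmedTable = (m + 1) * (n + 1)
print(head, tail, m, n, fullTable, trimmedTable)
// prints "500 500 0 1 1003002 2"
```

Only the one inserted element is left for LCS, which is where the table size of 2 comes from under the assumed shape.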
I haven't had a chance to actually measure the perf difference, but you could replace
with something like:
if it makes a perf difference.
So I did the perf test. After I added the proper I still don't have an instinct around what operations drop you out of the lazy domain.
@epatey - left a couple of comments. Overall this is really nice, thank you!
/// Namespace for the `diff` and `apply` functions.
public enum Dwifft {

    internal static func matchingEndsInfo<Value: Equatable>(_ lhs: [Value], _ rhs: [Value]) -> (Int, ArraySlice<Value>, ArraySlice<Value>) {
        let minTotalCount = min(lhs.count, rhs.count)
        let matchingHeadCount = zip(lhs, rhs).lazy.prefix() { $0.0 == $0.1 }.count()
These two lines feel a bit too clever for their own good - it's hard for me to understand what they're doing at a quick glance. I think using a couple of plain old for loops would be preferable here.
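For reference, a plain-loop version might look roughly like the sketch below. It reuses the name and signature from the diff above, but the body is an assumption about the intended behavior (the head count plus the two middle slices that remain once the shared head and tail are removed), not the code that was merged. It is written as a free function rather than a static member so that it runs on its own.

```swift
/// Sketch only: a plain-loop take on the trimming step. Returns the number of
/// leading elements the arrays share, plus what is left of each array once the
/// shared head and shared tail are removed.
func matchingEndsInfo<Value: Equatable>(_ lhs: [Value], _ rhs: [Value]) -> (Int, ArraySlice<Value>, ArraySlice<Value>) {
    let minTotalCount = min(lhs.count, rhs.count)

    var matchingHeadCount = 0
    for i in 0..<minTotalCount {
        if lhs[i] != rhs[i] { break }
        matchingHeadCount += 1
    }

    // Only look at elements the head scan has not already consumed.
    var matchingTailCount = 0
    for i in 0..<(minTotalCount - matchingHeadCount) {
        if lhs[lhs.count - 1 - i] != rhs[rhs.count - 1 - i] { break }
        matchingTailCount += 1
    }

    let lhsMiddle = lhs[matchingHeadCount..<(lhs.count - matchingTailCount)]
    let rhsMiddle = rhs[matchingHeadCount..<(rhs.count - matchingTailCount)]
    return (matchingHeadCount, lhsMiddle, rhsMiddle)
}

let (headCount, lhsRest, rhsRest) = matchingEndsInfo([1, 2, 3, 4, 5], [1, 2, 9, 9, 4, 5])
print(headCount, Array(lhsRest), Array(rhsRest))
// prints "2 [3] [9, 9]"
```

Bounding the tail loop by `minTotalCount - matchingHeadCount` keeps the two scans from overlapping when one array is a prefix or suffix of the other.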
/// Namespace for the `diff` and `apply` functions.
public enum Dwifft {

    internal static func matchingEndsInfo<Value: Equatable>(_ lhs: [Value], _ rhs: [Value]) -> (Int, ArraySlice<Value>, ArraySlice<Value>) {
Even though it's an internal function, can you please add a docstring to this method to help with future debugging etc?
    /// Returns the sequence of `DiffStep`s required to transform one array into another.
    ///
    /// - Parameters:
    ///   - lhs: an array
    ///   - rhs: another, uh, array
    /// - Returns: the series of transformations that, when applied to `lhs`, will yield `rhs`.
    public static func diff<Value: Equatable>(_ lhs: [Value], _ rhs: [Value]) -> [DiffStep<Value>] {
        let (matchingHeadCount, lhs, rhs) = matchingEndsInfo(lhs, rhs)
Nitpicking, but can you not shadow the `lhs` and `rhs` variable names here? Made it harder for me to understand how this compiled at first glance...