Replies: 2 comments 2 replies
-
I'll defer to @leops and @xunilrj . I have little context on our text diff implementation. |
Beta Was this translation helpful? Give feedback.
-
This part is actually quite important:
It turns out diffing is actually one of the slower parts of the toolchain: it can take up as much as 50% of the overall runtime of the CLI in profiling runs (although this number might be biased as we run the profiling on large repos with many diagnostics). For linter diagnostics we should be able to implement a form of mutation-based diffing (this is what the |
Beta Was this translation helpful? Give feedback.
-
I encountered this while building
noDupeKeys
, but very likely this can also happen with some other lint rules including ones that are onmain
.Linting
with
noDupeKeys
suggests removinga: 1,
.However, since
record_diff
does a plain text diff on the input code vs the code after applying the suggested fix, the console output looks like removing1, a:
This is confusing.
Solutions
Tree-based diffing
We could print diffs based on the CST instead of plain text. A CST diffing algorithm can understand that
a: 2
is the same in both input and output code, and marks thea: 1
member as the difference.Disadvantage
Tree-based diffing cannot find out which member is removed in this object:
It will thus still print slightly confusing diffs in some cases, for example for
noDupeKeys
it may show removal of the other second member even though the first one is the one that the lint is highlighting as a problem:Mutation tracking diffing
We could print a diff based on the previous tree and the mutation batch we apply to it, so that we know the exact node being mutated. The disadvantage of naive tree-based diffing does not apply to mutation tracking diffing for this reason.
Disadvantage
Mutation tracking diffing is less versatile.
Formatter diffs exhibit similar problems, although they are usually less confusing because formatter changes tend to be less invasive:
Tree-based diffing could improve formatter diffs as well, but we cannot trivially apply mutation tracking diffing to formatter diffs.
Notes
Both diffing algorithms are likely slower than text diffing. However, I wouldn't consider the lint/format check failure case, in which this code would run, as performance-sensitive as the success case.
This is probably a low priority bug since the impact on users isn't dramatic (the diff is still technically correct and users should be able to figure out what is happening) but fixing it by eschewing code text diffing in favor of something better is not a quick fix.
So it's likely more of a thing for in the future when Rome has more users. But I thought it's worth tracking either way.
cc @leops @MichaReiser
Beta Was this translation helpful? Give feedback.
All reactions