Fix verification of tracked struct from high-durability query #550
✅ Deploy Preview for salsa-rs canceled.
CodSpeed Performance Report: Merging #550 will not alter performance.
924b53d
to
b30e9a7
Compare
Nice! I don't think I understand salsa well enough at this point to say with high confidence that this is correct. We may have to wait for @nikomatsakis's opinion (but can merge the fix in our fork to unblock)
src/tracked_struct/struct_map.rs
Outdated
@@ -201,6 +209,50 @@ where
unsafe { C::struct_from_raw(data.as_raw()) }
}

fn get_and_validate_last_changed<'db>(
I would find a comment describing how this differs from get_from_map useful, because the difference is very subtle.
Added the comment.
One of my questions here for @nikomatsakis is whether, instead of adding a new get_and_validate_last_changed, the implementation of get_from_map should just always do this. It's possible that other code paths that just call .get are still susceptible to this bug?
But it's also a little weird if a method called get can mutate the stored data.
src/tracked_struct/tracked_field.rs
Outdated
let id = input.unwrap();
let data = self.struct_map.get(current_revision, id);
let data = self
I don't think I understand salsa well enough to make a real assessment, but changing the last-changed revision in maybe_changed_after seems slightly off to me.
I also wonder if we'll run into problems with it if the setup is slightly different because, as far as I understand, maybe_changed_after is only called from deep verify memo. That means it would get bypassed if all queries only perform a shallow compare. I'm not sure if that matters for tracked fields or if some other mechanism would take over.
But maybe it is all right, because it is kind of similar to what FunctionIngredient::maybe_changed_after does.
I also wonder if we'll run into problems with it if the setup is slightly different because, as far as I understand, maybe_changed_after is only called from deep verify memo. That means it would get bypassed if all queries only perform a shallow compare.
Yes, this relates to my question above about whether it's possible for other code paths besides maybe_changed_after which call get to run into this bug.
But I think it is safe as-is, because any query of low enough durability to possibly re-run in the current revision will also go through deep verify, which will check maybe_changed_after on all of its inputs. So a struct that never gets maybe_changed_after called on it in a given revision can't be accessed in that revision. I think?
But maybe it is all right, because it is kind of similar to what FunctionIngredient::maybe_changed_after does.
Yes, it seems like it is already part of the contract of maybe_changed_after (and verification in general) that it can validate things to the current revision.
c8f6317
to
b2ec4b9
Compare
Will attempt to read this tomorrow-ish.
b2ec4b9
to
1ac16e2
Compare
I don't have a good repro yet but I just ran into the following error.
Which may indicate that we need to validate all tracked structs?
Yeah, that traceback does suggest that there's still a path to accessing an un-validated struct :/ When you say you don't have a good repro, do you mean that you can't reproduce it consistently at all? That's going to make things difficult...
1ac16e2
to
986fddf
Compare
@carljm I haven't spent time on finding a consistent repro. I ran into it when I switched between different files in the LSP.
OK, I get the overall bug. Let me think on this.
My thought is that we should just update to the most recent revision during the get operation. I think the only way to have an issue is to cheat and leak the data through thread-local storage. We do need to update to prevent later writes from occurring in that scenario, however.
…On Wed, Aug 7, 2024, at 9:02 AM, Niko Matsakis wrote:
OK, I get the overall bug. Let me think on this.
Always updating on the get path makes sense to me and should simplify this PR. I'll update to that approach.
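For readers following along, here is a rough sketch of what "always validate on get" could look like, and of how a read-only `get` can still bump the stored revision through interior mutability. Everything here (`Slot`, `validated_at`, the plain `u32` revision) is hypothetical and only illustrates the idea; it is not Salsa's actual struct map.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for one entry of a tracked-struct map.
struct Slot {
    /// Revision in which this entry was last created or validated.
    validated_at: AtomicU32,
    value: &'static str,
}

impl Slot {
    fn new(value: &'static str, revision: u32) -> Self {
        Slot { validated_at: AtomicU32::new(revision), value }
    }

    /// Reading the value also marks the entry as valid in `current_revision`.
    /// `fetch_max` never moves the stored revision backwards.
    fn get(&self, current_revision: u32) -> &'static str {
        self.validated_at.fetch_max(current_revision, Ordering::AcqRel);
        self.value
    }

    fn validated_at(&self) -> u32 {
        self.validated_at.load(Ordering::Acquire)
    }
}

fn main() {
    // Created in revision 1, then read in revision 2.
    let slot = Slot::new("Definition(1)", 1);
    let value = slot.get(2);
    println!("{value} validated at revision {}", slot.validated_at());
}
```

The atomic is one way to address the "a method called get can mutate the stored data" concern: the signature stays `&self`, and the only thing that changes is the validation stamp.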
986fddf
to
8d553da
Compare
Updated to always validate on get. Some things I'm not sure about here:
Test failure looks like a compiler improvement in nightly catching an unmatchable pattern, not related to this PR.
I've ignored it; it should be fixed soonish: rust-lang/rust#129031. If you rebase, CI will pass.
8d553da
to
5c541dc
Compare
I THINK this is fixed by #564
I can confirm this is fixed (https://github.com/astral-sh/ruff/pull/13016/files). Did we add a test for it?
@MichaReiser I don't think I added a specific test. I'm going to close this PR, but a PR with a test would be great.
This adds a test for, and proposes one possible fix for, a bug we ran into with durabilities where we were hitting the "access struct from previous revision" assert.
The test models a situation where we have two File inputs (0, 1), where File(0) has LOW durability and File(1) has HIGH durability. We can query an index for each file, and a definitions from that index (just a sub-part of the index), and we can infer each file. The index and definitions queries depend only on the File they operate on, but the infer query has some other dependencies: infer(0) depends on infer(1), and infer(1) also depends directly on File(0). The dependency graph is shown below: light color represents low durability, black represents high durability, dark gray are structs and struct fields; I didn't get around to adding their durability to the graph yet.
(The graph is automatically generated using some hacky printlns I added to Salsa that create a mermaid.live graph; if I end up debugging another issue where it's useful, I'll probably clean this code up and try to make it generally reusable.)
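As a rough aid to reading the scenario, the stated edges can be written down as a small standalone model. This is only a sketch: the string node names, the `scenario` and `durability` helpers, and the two-level `Durability` enum are made up for illustration and are not Salsa types. It assumes a derived query takes the minimum durability of its dependencies and includes only the edges stated above.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Durability {
    Low,
    High,
}

type Inputs = HashMap<&'static str, Durability>;
type Deps = HashMap<&'static str, Vec<&'static str>>;

/// Inputs and dependency edges as described in the text (hypothetical encoding).
fn scenario() -> (Inputs, Deps) {
    let inputs = HashMap::from([
        ("File(0)", Durability::Low),
        ("File(1)", Durability::High),
    ]);
    let deps = HashMap::from([
        ("index(0)", vec!["File(0)"]),
        ("index(1)", vec!["File(1)"]),
        ("definitions(0)", vec!["index(0)"]),
        ("definitions(1)", vec!["index(1)"]),
        ("infer(0)", vec!["infer(1)"]),
        ("infer(1)", vec!["File(0)"]),
    ]);
    (inputs, deps)
}

/// Durability of a node: its own value for an input, otherwise the
/// minimum over its dependencies (assumed propagation rule).
fn durability(node: &str, inputs: &Inputs, deps: &Deps) -> Durability {
    if let Some(d) = inputs.get(node) {
        return *d;
    }
    deps[node]
        .iter()
        .map(|&dep| durability(dep, inputs, deps))
        .min()
        .unwrap_or(Durability::High)
}

fn main() {
    let (inputs, deps) = scenario();
    for query in ["index(1)", "definitions(1)", "infer(1)", "infer(0)"] {
        println!("{query}: {:?}", durability(query, &inputs, &deps));
    }
}
```

Under these assumptions index(1) and definitions(1) come out high durability while infer(1) and infer(0) come out low, which is the mix that matters for the bug.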
The panic occurs because definitions(1) is high durability, and depends on index(1), which is also high durability. index(1) creates the tracked struct Definition(1), and infer(1) (which is low durability) depends on Definition.file(1).
After a change to file 0 (low durability), we only shallowly verify definitions(1) -- it doesn't need deep-verify because it is high durability. So that means we never verify index(1) at all (deeply or shallowly), which means we never mark Definition(1) validated. So when we deep-verify infer(1), we try to access its dependency Definition.file(1), and hit the panic because we are accessing a tracked struct that has never been re-validated or re-created in R2.
The proposed fix is that in maybe_changed_after for a tracked struct field, rather than checking that the struct has been created/validated at the current revision, check that it has been created/validated at least as recently as its durability last changed, and if that is older than the current revision, validate it. The logic here is that this tracked struct can't have changed unless some input at or higher than its durability has changed.
The other possible fix I considered (but haven't tried implementing yet) is that when we short-circuit a shallow verify due to durability, we need to not only mark our own outputs validated, but recursively find all our inputs that also verify due to durability, and validate all of their outputs as well. That fix seems more complex and expensive than what I have in this PR.
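The proposed check in maybe_changed_after can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in this PR: `Runtime`, `TrackedStruct`, `strict_check`, and `get_and_validate` are hypothetical names, and the model assumes that changing an input of durability d bumps the "last changed" revision for d and every lower level.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Revision(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Durability {
    Low = 0,
    Medium = 1,
    High = 2,
}

/// Hypothetical runtime: tracks the current revision and, per durability
/// level, the last revision in which an input at or above that level changed.
struct Runtime {
    current: Revision,
    last_changed: [Revision; 3],
}

impl Runtime {
    fn new() -> Self {
        Runtime { current: Revision(1), last_changed: [Revision(1); 3] }
    }

    /// Changing an input starts a new revision and records it for the
    /// input's durability level and all lower levels.
    fn set_input(&mut self, d: Durability) {
        self.current = Revision(self.current.0 + 1);
        for level in 0..=(d as usize) {
            self.last_changed[level] = self.current;
        }
    }

    fn last_changed(&self, d: Durability) -> Revision {
        self.last_changed[d as usize]
    }
}

/// Hypothetical tracked struct entry.
struct TrackedStruct {
    created_at: Revision,
    durability: Durability,
}

impl TrackedStruct {
    /// The check that panics in the bug: valid only in the current revision.
    fn strict_check(&self, rt: &Runtime) -> bool {
        self.created_at == rt.current
    }

    /// The proposed check: accept the struct if it was created or validated
    /// at least as recently as its durability last changed, then validate it
    /// to the current revision.
    fn get_and_validate(&mut self, rt: &Runtime) -> bool {
        if self.created_at >= rt.last_changed(self.durability) {
            self.created_at = rt.current;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut rt = Runtime::new();
    let mut def1 = TrackedStruct { created_at: rt.current, durability: Durability::High };

    // R2: only the low-durability input (file 0) changes.
    rt.set_input(Durability::Low);
    println!("strict check passes: {}", def1.strict_check(&rt));
    println!("durability-aware check passes: {}", def1.get_and_validate(&rt));
}
```

In this model a low-durability change leaves the high-durability struct acceptable, while a later high-durability change would make the same check fail until the struct is re-created.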