-
Notifications
You must be signed in to change notification settings - Fork 17.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unique: "run out of hash bits" bug in internal HashTrieMap #69534
Comments
Related Issues and Documentation
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.) |
cc @mknyszek |
There's no internal limit, this is just a bug. ("Run out of hash bits" means that the implementation would've expected a hash collision given that it just walked over (what it thinks is) the maximum depth of the tree and ran out of hash bits. Hash collisions are still handled, though.) Looking into it. |
I'll try to add an atomic counter to see how many calls it takes to trigger this panic |
My suspicion is an off-by-one error in creating new nodes. Like, the hash matches down to the last 4 bits, but the last 4 don't and we give up too early. It occurs to me that this codepath isn't well-tested because actual collisions (which are well-tested) short-circuit to appending onto the overflow list for a node. I'll try to write a test for this case and see if that's the problem. |
Yeah, actually, I'm feeling more confident in this diagnosis. If it's the case, I should have a fix and a test shortly. EDIT: No, that doesn't seem to be it. |
I took a brief look at planetscale/vtprotobuf#140 and I noticed that you're using It already looks like you already tried at tip though, past that commit, and ran into the same issue. Can you confirm that? |
Yeah, that's why I tried the tip version. Without that commit the panic occurs within a few seconds of me triggering that code path. With that commit it's a bit harder and occurs in 30-60 seconds. |
If you have any instructions to reproduce, or if you're willing to share a core file or something (so that I can inspect the state of the HashTrieMap on crash) that would help me iterate on this faster. I can also write a small patch to dump the state of the HashMapTrie on crash, if you're willing to run that and provide the output. (EDIT: If it helps, I can remove the actual key/value output, I think I just need the hash anyway.) |
Change https://go.dev/cl/614475 mentions this issue: |
Let me try to work on reproducing this and creating a core dump file. |
One is a test of unsafe.String usage, which was broken before CL 610738 was merged. The other is trying to improve coverage of "near collision" scenarios in the HashTrieMap where only the last few bits differ. This is intended to catch off-by-one errors in iterating down the tree. For #69534. Change-Id: I3f302e148e81269a50e93b5edf83cafc2d291098 Reviewed-on: https://go-review.googlesource.com/c/go/+/614475 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
You are correct, I couldn't reproduce this anymore with the newest version after that fix. I probably accidentally ran a build with an older version. Sorry for the noise and thank you so much for all your hard work. In case this resurfaces, I will reopen this issue. |
Go version
go version devel go1.24-3d33437c45 Sun Sep 15 02:05:37 2024 +0000 linux/amd64
Output of
go env
in your module/workspace:What did you do?
Trying to intern all strings coming through gRPC using the new unique.Make(): new vtprotobuf extension. It seems like it is very easy to break some limits? The machine has 128 cores and 256GiB+ of RAM so resources shouldn't be a problem.
I tried creating a toy program that interns random strings and couldn't reproduce it, unfortunately.
What did you see happen?
I see this after some time:
Same happens with 1.23.1.
What did you expect to see?
I expected no panic.
The text was updated successfully, but these errors were encountered: