vl-convert-python crashes with ARM64 PointerAuthentication error #67
Update: I just ran the same codebase on an x86 Mac and it worked fine there.
Searching and reading a bit more, the error message ("Check failed: Deoptimizer::IsValidReturnAddress(PointerAuthentication::StripPAC(pc), isolate_)") seems to come from the V8 engine. Some deep nuance about the way vl-convert is interfacing to V8?
I wrote a very small standalone test program to try to reproduce this issue. It just bundles all the Vega-Lite specs generated by the application where I see the problem and passes them straight into vl-convert-python. But no dice: the test program runs just fine (it takes about 45 seconds on my Mac). The issue must lie in the interaction either between the Rust interface and the V8 engine, or between the Rust library and the Python interpreter it is embedded in.
Success! I changed the test program a bit to run multiple rounds of the whole set of files, and now it does reproduce the issue.
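For readers who want to try something similar, here is a minimal sketch of what a multi-round repro loop might look like. It is not the actual test program from this thread. The helper name `run_rounds` and its parameters are hypothetical, and it assumes the specs are available as Vega-Lite JSON strings and that `vl_convert.vegalite_to_svg` is the conversion entry point being exercised.

```python
from typing import Callable, Iterable, Optional


def run_rounds(
    specs: Iterable[str],
    rounds: int,
    convert: Optional[Callable[[str], str]] = None,
) -> int:
    """Convert every spec to SVG `rounds` times in one process.

    Returns the total number of conversions performed. `run_rounds` is a
    hypothetical helper, not part of vl-convert-python.
    """
    if convert is None:
        # Imported lazily so the loop can also be exercised with a stub.
        import vl_convert as vlc

        convert = vlc.vegalite_to_svg

    specs = list(specs)
    conversions = 0
    for round_number in range(1, rounds + 1):
        for spec in specs:
            svg = convert(spec)
            # An empty result would indicate a problem other than the crash.
            assert svg, "converter returned an empty result"
            conversions += 1
        print(f"round {round_number}/{rounds} done, {conversions} conversions so far")
    return conversions
```

Calling `run_rounds(specs, rounds=6)` with the bundled specs keeps all conversions in a single process, which is the condition under which the crash appeared in this report. A single pass over the same files did not trigger it.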
Thanks for the report, investigation, and repro @sacundim. This is really helpful. I don't have any ideas off the top of my head, but one thing that comes to mind is that we previously ran into an issue with the arm64 Linux wheels that cropped up only when cross-compiling from Linux on x86. That issue went away when I compiled locally inside an arm64 Docker container (running on an M1 Mac). In the end we worked around it by downgrading deno/v8 and continuing to cross-compile with GitHub Actions. So the first thing I want to try is to see whether wheels compiled inside an arm64 Docker container behave differently from cross-compiled ones.
Ok, I was able to reproduce the crash (thanks again for the easy to use repro @sacundim). I rebuilt the Could you give this wheel a try and see if it fixes the issue for you as well? The takeaway is that it looks like we can't trust cross-compilation for building aarch64 images. I'll start looking into options for building on aarch64 directly on GitHub Actions.
I'm trying the hand-compiled one now. A 6-round run of the test case just succeeded, and now I'm trying 12 rounds to really stress it. I will try the original program later. I wonder whether, if you get the compilation process to dump all the exact compiler options it uses and compare them between the cross-compilation and native environments, you will see that they differ. I could be wrong, but my inclination is to suspect that cross-compilation should produce the same object code given the same source files and compiler options (modulo nondeterministic factors like random seeds affecting hash table orders and such).
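As a rough illustration of the comparison suggested above, the sketch below diffs the flags of two compiler command lines once they have been captured from the native and cross-compilation build logs. The function name `diff_flags` and the example command lines are made up for illustration and are not taken from the vl-convert build. The comparison is token-based, so a flag and its separate argument (for example `-C opt-level=3`) are treated as two independent tokens.

```python
import shlex
from typing import List, Tuple


def diff_flags(native_cmd: str, cross_cmd: str) -> Tuple[List[str], List[str]]:
    """Return (tokens only in native_cmd, tokens only in cross_cmd).

    Both arguments are full compiler command lines as they might appear in a
    verbose build log. Order and repetition of flags are ignored.
    """
    native = set(shlex.split(native_cmd))
    cross = set(shlex.split(cross_cmd))
    return sorted(native - cross), sorted(cross - native)


# Hypothetical command lines, for illustration only.
only_native, only_cross = diff_flags(
    "cc -O2 -DNDEBUG -c foo.c",
    "cc -O2 -g -c foo.c",
)
print("only in native build:", only_native)
print("only in cross build:", only_cross)
```

If both lists come back empty for every compilation unit, differing flags are unlikely to explain the behavior, and the toolchain or linked libraries become the next things to compare.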
The 12-round run with the recompiled package succeeds, and two runs of the original app do so as well.
Ok, sounds good! I've had some luck cross compiling working packages on GitHub Actions using QEMU. I'll ping you again when I have another wheel to try out. Thanks!
Here's a new wheel that I built on GitHub Actions using QEMU (in #68). It's also built with manylinux2014 support, so it should be just as compatible as the current cross-compiled wheels. vl_convert_python-0.10.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl This is running the repro successfully for me, so hopefully this takes care of the issue!
Should be fixed in the just released 0.11.0. Thanks again for the repro, this wouldn't have gotten fixed otherwise!
I tried 0.11.0 and just confirmed that my app runs successfully. Thanks for the quick turnaround!
Environment:
I am porting a Python app of mine from Vega-Altair 4.2.x to 5.0.1, and altair_saver to vl-convert-python, and I get errors like this after I do a certain number (that I haven't measured) of SVG exports in the same process.
I don't have much of a clue how to troubleshoot this; any suggestions are welcome. Searching the web a bit, this `PointerAuthentication` and `StripPAC` stuff is related to ARM64 security features to detect if a pointer has been modified, but I have no clue if this would be a stack overflow issue vs. e.g. the JavaScript interpreter doing some business with JIT code generation or garbage collection.
Two things that I have managed to establish are:
I'm still investigating whether, e.g., a fixed number of SVG export invocations triggers the failure, or whether one subtask succeeds but somehow leaves the process in a bad state that causes the subsequent ones to fail. I don't know if there is a more direct way of figuring out the cause, so I thought I'd file this ticket right away.