Stackless 3.x segfaults on MacOS #173
Comments
This could be related to the problem reported in pull request #163: "In fact it no longer works for gcc 5.4." |
Hi Anselm, does it make sense for me to try? |
Test Stackless right after compiling. Might help to fix Stackless on MacOS (stackless-dev/stackless#173).
@ctismer Hi Christian, your help is more than welcome. Could you try master-slp too? |
@ctismer You can use the instructions for Linux https://github.com/stackless-dev/stackless/wiki/BuildForConda#building-on-linux to reproduce the faulty build on your Mac. I'm fairly sure that the problem is caused by LTO. You could try the following patch:
With this code, |
I just wrote a small test program
Then I compiled it using clang 6.0.0 and gcc 6.3.0 with -O3 and looked at the generated assembly code:
Obviously the only reliable way to guarantee that the function from the same translation unit gets called without optimisations is to use a volatile function pointer. Therefore I'll apply the patch from my previous post. |
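The volatile-function-pointer idea can be sketched as follows. This is only a minimal illustration, not the actual patch: the names `probe`, `probe_ptr` and `call_via_volatile_pointer` are hypothetical and do not come from the Stackless sources. Because the pointer is `volatile`, the compiler has to load it at run time and cannot assume which function it points to, so it has to emit a real indirect call instead of inlining or cloning the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a function that must really be called. */
int call_count = 0;

int probe(int x)
{
    call_count++;
    return x + 1;
}

/* The pointer object itself is volatile: every use is a real memory load,
 * so the optimiser cannot prove that it still points to probe(). */
int (*volatile probe_ptr)(int) = probe;

int call_via_volatile_pointer(int x)
{
    int (*target)(int) = probe_ptr;  /* one volatile load */
    assert(target != NULL);
    return target(x);                /* indirect call through the loaded value */
}
```

Compiling this with `-O3` and inspecting the assembly should show an indirect call in `call_via_volatile_pointer`, whereas a direct call to `probe` would usually be inlined.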
I can report that compiling stackless on OSX 10.14.2 (Mojave) with stable OSX (Apple LLVM version 10.0.0 (clang-1000.10.44.4)) works successfully. Further I can also report that the
Other than this, the other failed tests seem to be minor naming inconveniences with Python 3. |
@Fohlen Thank you very much for your feedback. Which version or branch did you compile? |
I think I've encountered a similar issue, and I've experienced it with both 2.7 and 3.7. At least it seems to trigger right after I run the following code:
And the debug session looks like this:
I tried to reproduce the same issue in the 3.7 version now, but after much testing in many directions it seems I've provoked another kind of issue, which I've pasted below. However, I recall that the 3.7 Python crashed with a "bus error" as well at some point. From a user's perspective, at least, the same thing happened with both versions in both cases: once the execution of the code above is done, it crashes.
|
@mikalv Thank you for your report. Unfortunately I still can't debug this issue (no Mac), but probably the Stack switching code is not correct for MacOS. For now the fix is to add a |
@akruis No problem, I'm happy to give it a try to fix this, however I'm not that familiar with assembly and Stackless (warning: possibly stupid questions upcoming). MacOS's kernel is a fork and merge of Mach 3 and the FreeBSD kernel, where most of the BSD API is available for POSIX applications. So in general software that runs on FreeBSD runs fine on MacOS, but sometimes that is not the case. The document you linked seems correct from what I know. The little I know of assembly is that MacOS on 64-bit architectures adopts the System V AMD64 ABI, so at least a syscall would use the registers rdi, rsi, rdx, r10, r8 and r9 as arguments just as on Linux, if I'm correct. I'm currently reading up on stmxcsr and fstcw, which were totally new to me. By the way, check out http://www.darlinghq.org/ - if I recall correctly, they now have a kernel module for Linux to simulate the MacOS userspace. |
OK, I think I understand the code now: it stores and loads the register state. I did a new build from current master, and it seems I provoked it to crash at startup. However, it seems closely related.
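To make "stores and loads the register state" concrete, here is a minimal sketch of saving and restoring the two floating-point control registers mentioned above (MXCSR and the x87 control word) with GCC/clang inline assembly. It is not the actual slp_switch code: the type `fp_control_state` and both function names are hypothetical, the sketch uses `fnstcw` (the no-wait form of `fstcw`), and on non-x86-64 targets it falls back to no-ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the floating-point control state. */
typedef struct {
    unsigned int   mxcsr;   /* SSE control/status register (32 bit) */
    unsigned short x87_cw;  /* x87 FPU control word (16 bit) */
} fp_control_state;

#if defined(__x86_64__)
void fp_control_save(fp_control_state *s)
{
    assert(s != NULL);
    __asm__ volatile ("stmxcsr %0" : "=m" (s->mxcsr));
    __asm__ volatile ("fnstcw %0"  : "=m" (s->x87_cw));
}

void fp_control_restore(const fp_control_state *s)
{
    assert(s != NULL);
    __asm__ volatile ("ldmxcsr %0" : : "m" (s->mxcsr));
    __asm__ volatile ("fldcw %0"   : : "m" (s->x87_cw));
}
#else
/* Assumption: on other architectures this sketch does nothing useful. */
void fp_control_save(fp_control_state *s)
{
    assert(s != NULL);
    s->mxcsr = 0;
    s->x87_cw = 0;
}

void fp_control_restore(const fp_control_state *s)
{
    assert(s != NULL);
}
#endif
```

A stack-switching routine would typically call something like `fp_control_save` before switching and `fp_control_restore` afterwards, so that rounding mode and exception masks survive the switch.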
|
This is the slp_switch function after it's compiled by the default Apple compiler.
|
@mikalv Two suggestions
And many thanks for the hint to the Darling Project. |
That was the result of your first suggestion, using the master branch of this repo. I'm not totally sure how to "fix" that error yet. I'll report on suggestion two soon. |
This was odd, but good news. Pickle failed the test. However, Stackless passed my manual test (the snippet from my first comment). I posted both below. Earlier I've tested these: However, running make clean, and then
|
Whatever issues 3.7 and 2.7 had, it seems the master-slp branch has resolved it. |
The unused full clobber list must contain "st" too.
I just pushed a fix for this error: d29d047 The assembly code of the C-function |
What's the output of |
No, the "Stackless code" in master-slp and 3.7-slp is the same. It is more likely an effect of the disabled optimisation in debug builds. |
I cut the text where everything was OK. Below is where it started failing:
|
@mikalv Thank you very much for providing this output. All failures have the same root cause, which could be related to the other Mac failures. About the root cause of the pickling failures: when Stackless switches from one tasklet to another, it uses one of two mechanisms, hard switching or soft switching. Hard switching is always possible. It manipulates the stack (slp_switch). Soft switching is only possible if the interpreter has not been invoked recursively by an extension function written in C. If soft switching is possible, it has advantages:
The error message "frame had a C state that can't be restored." means that a pickled tasklet could not be restored, because Stackless used hard switching. But why? |
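The rule described above can be modelled in a few lines of C. This is a toy model only, not Stackless code: every name (`toy_interp_state`, `c_recursion_depth`, `toy_choose_switch`, `toy_can_restore_after_pickle`) is hypothetical, and the real interpreter tracks considerably more state.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_SWITCH_SOFT, TOY_SWITCH_HARD } toy_switch_kind;

/* Hypothetical summary of the interpreter state at the moment of a switch. */
typedef struct {
    /* How many times the interpreter has been re-entered from C extension code. */
    int c_recursion_depth;
} toy_interp_state;

/* Soft switching is only possible without recursive invocation from C;
 * hard switching (stack manipulation) is always available as a fallback. */
toy_switch_kind toy_choose_switch(const toy_interp_state *st)
{
    assert(st != NULL);
    return st->c_recursion_depth == 0 ? TOY_SWITCH_SOFT : TOY_SWITCH_HARD;
}

/* A hard-switched tasklet carries C stack state, which is what the
 * "frame had a C state that can't be restored" message refers to. */
int toy_can_restore_after_pickle(toy_switch_kind kind)
{
    return kind == TOY_SWITCH_SOFT;
}
```

In this model, the pickling failures correspond to tasklets for which `toy_choose_switch` returned the hard variant although the soft variant was expected.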
I have no Mac. Therefore I can't debug or test on MacOS. I can use Travis CI to compile Stackless 2.7.15 and 3.6.6. Stackless 3.5.6 fails to compile.
Unfortunately Stackless 3.6.6 does not work. Example::