Fix: setup signal handlers in interpreted code #14766
The interpreter itself no longer handles signals: the interpreted code receives them, and the interpreter's own signal loop fiber is never resumed. Unblocks the default handler for SIGCHLD in the interpreted code, so Process.run and waiting for a sub-process work as expected.
Hypothesis A: the signal handler is running on the current stack, and that's messing with the interpreter. Maybe registering an alternate stack with sigaltstack() and pass

Hypothesis B: the signal handler is replaced by interpreted code, which may do things that aren't thread safe, for example call into the GC from the signal handler (not the signal loop). We might want the interpreted code to change the writer fd of the interpreter, so that the signal handler runs in the interpreter but the signal is received by the interpreted code.
The interpreted code cannot register signal handlers itself, because the handler would run interpreted code, which isn't safe. Even registering SIG_IGN must happen in the interpreter, otherwise the handler tries to run some interpreted code and segfaults trying to allocate memory.

Introduces a couple of interpreter intrinsics to:
1. replace the interpreter's signal writer to write to the interpreted code's signal loop;
2. register the signals to the OS, so the process can receive them.

C signal handlers will thus receive the signals and handle them in the interpreter (minimal, non-interpreted code that writes to a file descriptor), while the Crystal signal handlers will be interpreted normally.
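The division of labor described above resembles the classic self-pipe pattern. The following C sketch is an illustration only, not the interpreter's actual code: every name (`forward_signal`, `set_signal_writer`, `register_signal`, `next_signal`, `demo_forwarding`) is hypothetical, and the two setter functions merely stand in for the two intrinsics. The handler does nothing but `write()` the signal number to a swappable file descriptor, and a separate loop reads the number back and would dispatch to the high-level handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout; a sketch of the self-pipe pattern. */

/* Write end that the C handler forwards to; can be swapped at runtime. */
static volatile sig_atomic_t signal_writer_fd = -1;

/* Minimal handler: only async-signal-safe work, no allocation. */
static void forward_signal(int signo) {
    int saved_errno = errno;
    int fd = signal_writer_fd;
    if (fd >= 0) {
        ssize_t ignored = write(fd, &signo, sizeof signo);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Stand-in for intrinsic 1: point the handler at another signal pipe. */
void set_signal_writer(int fd) { signal_writer_fd = fd; }

/* Stand-in for intrinsic 2: register the signal with the OS. */
int register_signal(int signo) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = forward_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* One iteration of a signal loop: blocks until a signal number arrives.
 * A real loop would then invoke the user-level handler for that signal. */
int next_signal(int reader_fd) {
    int signo = 0;
    ssize_t n;
    do {
        n = read(reader_fd, &signo, sizeof signo);
    } while (n < 0 && errno == EINTR);
    return n == (ssize_t)sizeof signo ? signo : -1;
}

/* Sends SIGUSR1 to ourselves and returns the number seen by the loop. */
int demo_forwarding(void) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    set_signal_writer(fds[1]);
    if (register_signal(SIGUSR1) != 0) return -1;

    raise(SIGUSR1);                 /* handler writes into the pipe */
    int got = next_signal(fds[0]);  /* loop side picks it up */

    set_signal_writer(-1);          /* stop forwarding before closing */
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

Calling `demo_forwarding()` from a `main` function should return the value of `SIGUSR1`. Keeping the handler down to a single `write()` is what makes it safe to run at an arbitrary point of the interpreter's execution.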
We can re-enable a few signal specs, but a couple of them, which detect or handle segfaults, take forever to complete and have been disabled when interpreted.
Hypothesis B was the issue. We needed a couple of interpreter intrinsics after all. The interpreter registers the handlers (we can't run interpreted code in C signal handlers), so it receives the signals and forwards them to the interpreted signal pipe, and the signal loop of the interpreted code invokes the interpreted handlers 👍

Note: the interpreter won't handle signals anymore, but it already didn't handle them after starting to interpret code: the interpreter bypasses the fiber scheduler (it directly resumes fibers), so the signal loop fiber was never resumed.
With this patch, I can run the full minitest.cr test suite using the interpreter ❤️
This reverts commit 7b2c96c.
Running the std specs with the interpreter was painfully slow, so I compiled crystal in release mode and then tried to run the specs, but it segfaulted after a bunch of forks, and before reporting any progress. I tried running:
$ SPEC_SPLIT="3%4" bin/crystal i spec/std_spec.cr
A gdb session shows that the interpreter eventually segfaults:
Program received signal SIGSEGV, Segmentation fault.
0x0000555555e2181e in interpret ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/interpreter.cr:354
354 {% begin %}
(gdb) bt
#0 0x0000555555e2181e in interpret ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/interpreter.cr:354
#1 0x0000555555c4a753 in interpret ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/interpreter.cr:252
#2 0x00005555561ca274 in interpret ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/repl.cr:92
#3 interpret_and_exit_on_error ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/repl.cr:96
#4 0x00005555561c6587 in run_file ()
at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/interpreter/repl.cr:65
#5 repl () at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/command/repl.cr:47
#6 0x0000555556126ea4 in run () at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/command.cr:104
#7 0x000055555574671c in run () at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/command.cr:55
#8 run () at /home/julien/work/crystal-lang/crystal/src/compiler/crystal/command.cr:54
#9 __crystal_main () at /home/julien/work/crystal-lang/crystal/src/compiler/crystal.cr:11
#10 0x000055555574cf59 in main_user_code () at /home/julien/work/crystal-lang/crystal/src/crystal/main.cr:118
#11 main () at /home/julien/work/crystal-lang/crystal/src/crystal/main.cr:104
#12 main () at /home/julien/work/crystal-lang/crystal/src/crystal/main.cr:130
Interestingly the specs that failed when calling out to
I think the problem, now, is support for
It's likely unsafe to fork/exec in the interpreted code, and we should
The interpreter itself no longer handles signals: the interpreted code receives them, and the interpreter's own signal loop fiber is never resumed.
Fixes the default handler for SIGCHLD in the interpreted code, so Process.run now works as expected and doesn't hang forever.
closes #12241
Alternative to / closes #14019
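To illustrate why SIGCHLD delivery matters for waiting on a sub-process, here is a minimal C sketch under stated assumptions. It is not how Crystal's Process.run is implemented; `on_sigchld` and `run_child_and_wait` are hypothetical names and the 5 second timeout is arbitrary. The parent forks a child, blocks until the SIGCHLD handler writes to a pipe, then reaps the child. If the notification never arrived, the wait would time out, which is a bounded version of the hang described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names; a sketch of SIGCHLD notification through a pipe. */
static volatile sig_atomic_t chld_writer_fd = -1;

/* Minimal handler: one byte written to the pipe, nothing else. */
static void on_sigchld(int signo) {
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    int fd = chld_writer_fd;
    if (fd >= 0) {
        ssize_t ignored = write(fd, &byte, 1);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Forks a child that exits with exit_code, waits for the SIGCHLD
 * notification, reaps the child and returns its exit status.
 * Returns -1 on error or if no notification arrives within 5 seconds. */
int run_child_and_wait(int exit_code) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    chld_writer_fd = fds[1];

    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0) return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) _exit(exit_code); /* child: exit immediately */

    if (pid > 0) {
        /* Parent: wait until the handler reports that a child changed state. */
        struct pollfd pfd = { .fd = fds[0], .events = POLLIN, .revents = 0 };
        int ready;
        do {
            ready = poll(&pfd, 1, 5000);
        } while (ready < 0 && errno == EINTR);

        if (ready > 0) {
            int status = 0;
            pid_t reaped;
            do {
                reaped = waitpid(pid, &status, 0);
            } while (reaped < 0 && errno == EINTR);
            if (reaped == pid && WIFEXITED(status)) {
                result = WEXITSTATUS(status);
            }
        }
    }

    /* Restore the previous SIGCHLD disposition and release the pipe. */
    sigaction(SIGCHLD, &old_sa, NULL);
    chld_writer_fd = -1;
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

For example, `run_child_and_wait(7)` called from a `main` function should return 7 once the child has exited and been reaped.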
TODO:
I already marked some specs as pending (mostly those about segfaults), but there seems to be another one failing, or maybe it's a more common occurrence.--release
Maybe the new segfault is related to the support of fork in the interpreter?