cc-wrapper: add support for shadowstack hardening flag #326819

Merged
merged 9 commits into from
Jul 28, 2024

Contributor

@risicle risicle commented Jul 13, 2024

Description of changes

Some background: https://discourse.nixos.org/t/future-design-of-hardening-flags/38826

This, along with #324429, is a replacement for #320597, abandoning the idea of combining multiple technologies into a single hardbackedgecfi flag.

This also adds an optional reversion of a glibc commit that default-disabled shadow-stack support at runtime, presumably because of distributions that started building with these flags enabled before they were able to test them properly. We don't have that problem, and for package sets like pkgsExtraHardening we probably want that on by default so that these packages get more testing.

Again, I do not have access to a shadowstack-capable machine, so I would greatly appreciate someone who does building as many packages as possible on this branch and telling me which packages need to have hardeningDisable set for them.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

Contributor

alois31 commented Jul 13, 2024

Something doesn't look right, I don't see x86_Thread_features: shstk unless I manually set GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK.

Contributor

alois31 commented Jul 13, 2024

Somehow the patch does not get applied:

nix-repl> pkgsExtraHardening.glibc.patches
[
  …/pkgs/development/libraries/glibc/2.39-master.patch
  …/pkgs/development/libraries/glibc/nix-locale-archive.patch
  …/pkgs/development/libraries/glibc/dont-use-system-ld-so-cache.patch
  …/pkgs/development/libraries/glibc/dont-use-system-ld-so-preload.patch
  …/pkgs/development/libraries/glibc/fix_path_attribute_in_getconf.patch
  …/pkgs/development/libraries/glibc/fix-x64-abi.patch
  …/pkgs/development/libraries/glibc/nix-nss-open-files.patch
  …/pkgs/development/libraries/glibc/0001-Revert-Remove-all-usage-of-BASH-or-BASH-in-installed.patch
  …/pkgs/development/libraries/glibc/reenable_DT_HASH.patch
]

But the correct flag is applied:

nix-repl> pkgsExtraHardening.glibc.override (old: builtins.trace old.enableCETRuntimeDefault old)
trace: true
«derivation /nix/store/y3d837gcqkqa685018y809wgxh832i3k-glibc-2.39-52.drv»

Contributor Author

risicle commented Jul 13, 2024

Well caught - think I know what I've forgotten...

Contributor Author

risicle commented Jul 13, 2024

Forgot to pass the enableCETRuntimeDefault value through from default.nix to common.nix
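The symptom reported above (the override argument is true, yet the patch is missing from the list) can be reproduced in a few lines of plain Nix. This is only a minimal model, not the actual glibc expressions: the functions merely stand in for default.nix and common.nix, and the patch file names are hypothetical.

```nix
# Minimal, self-contained model of the forgotten pass-through.
# Evaluate with: nix-instantiate --eval --strict ./model.nix
# All names below are illustrative stand-ins, not the real glibc files.
let
  # Stand-in for common.nix: adds a patch only when the flag is set.
  common = { enableCETRuntimeDefault ? false }: {
    patches = [ "some-unconditional.patch" ]
      ++ (if enableCETRuntimeDefault
          then [ "hypothetical-cet-runtime-revert.patch" ]
          else [ ]);
  };

  # Stand-in for default.nix that accepts the flag but never forwards it.
  buggyDefault = { enableCETRuntimeDefault ? false }: common { };

  # Same wrapper with the flag passed through.
  fixedDefault = { enableCETRuntimeDefault ? false }:
    common { inherit enableCETRuntimeDefault; };
in
{
  buggy = (buggyDefault { enableCETRuntimeDefault = true; }).patches;
  fixed = (fixedDefault { enableCETRuntimeDefault = true; }).patches;
}
```

In this model, `buggy` contains only the unconditional patch even though the flag was set to true, while `fixed` also contains the conditional one.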

Contributor

alois31 commented Jul 14, 2024

I tried the same packages as in #320597 (comment) again. libxcrypt, pcre and llvm fail the same way as they did there, and I worked around that by disabling shadowstack. Unfortunately, LLVM still does not build because ld.gold segfaults (no "control protection" messages in dmesg, but the build succeeds outside of pkgsExtraHardening…).

Contributor Author

risicle commented Jul 14, 2024

Have disabled it for libxcrypt and pcre (let's hope they solve that at some point in the future, as they're heavily used dependencies). llvm already has an ugly use of hardeningDisable, so let's see if we can get that tidied up in staging ahead of this (#327093).

…fault

this appears to have been added to glibc because of the number
of packages in some distributions that were built with CET enabled
before a CET enabled machine was available to test for breakage
with.

we don't have that problem to such an extent and users of hardened
systems will likely want to enable this by default.
@risicle
Contributor Author

risicle commented Jul 14, 2024

Apologies for the rebase - had to put it on top of #327093

Contributor

alois31 commented Jul 15, 2024

The ld.gold LLVM issue turns out to be related to the shadow stack after all (not sure why it segfaults afterwards):

FAIL: LLVM :: tools/gold/X86/bcsection.ll (46409 of 52661)
******************** TEST 'LLVM :: tools/gold/X86/bcsection.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: rm -rf /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp && mkdir -p /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp
+ rm -rf /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp
+ mkdir -p /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp
RUN: at line 2: /build/llvm-src-18.1.8/llvm/build/bin/llvm-as -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bc /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/bcsection.ll
+ /build/llvm-src-18.1.8/llvm/build/bin/llvm-as -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bc /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/bcsection.ll
RUN: at line 4: /build/llvm-src-18.1.8/llvm/build/bin/llvm-mc -I=/build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp -filetype=obj -triple=x86_64-unknown-unknown -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bco /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/Inputs/bcsection.s
+ /build/llvm-src-18.1.8/llvm/build/bin/llvm-mc -I=/build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp -filetype=obj -triple=x86_64-unknown-unknown -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bco /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/Inputs/bcsection.s
RUN: at line 5: /build/llvm-src-18.1.8/llvm/build/bin/llc -filetype=obj -mtriple=x86_64-unknown-unknown -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection-lib.o /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/Inputs/bcsection-lib.ll
+ /build/llvm-src-18.1.8/llvm/build/bin/llc -filetype=obj -mtriple=x86_64-unknown-unknown -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection-lib.o /build/llvm-src-18.1.8/llvm/test/tools/gold/X86/Inputs/bcsection-lib.ll
RUN: at line 7: /nix/store/nibzckhjmpaa8iq8hyp8qvg2vxdaw89a-gcc-wrapper-13.3.0/bin/ld.gold -shared --no-undefined -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.so -m elf_x86_64 -plugin /build/llvm-src-18.1.8/llvm/build/lib/LLVMgold.so /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bco /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection-lib.o
+ /nix/store/nibzckhjmpaa8iq8hyp8qvg2vxdaw89a-gcc-wrapper-13.3.0/bin/ld.gold -shared --no-undefined -o /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.so -m elf_x86_64 -plugin /build/llvm-src-18.1.8/llvm/build/lib/LLVMgold.so /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection.bco /build/llvm-src-18.1.8/llvm/build/test/tools/gold/X86/Output/bcsection.ll.tmp/bcsection-lib.o
/nix/store/j6sbda9i71w9dbsvrkzpyjixmjwlvbzf-binutils-2.42/bin/ld.gold: error: /build/llvm-src-18.1.8/llvm/build/lib/LLVMgold.so: could not load plugin library: /build/llvm-src-18.1.8/llvm/build/lib/libLLVM.so.18.1: rebuild shared object with SHSTK support enabled

--

********************

Doesn't work whether you build it with or without shadow stack.

Contributor Author

risicle commented Jul 15, 2024

Ah - this sounds like it might be resolved with glibc's enableCET permissive mode. Perhaps I shouldn't force it to strict mode.

Out of interest, does it give you the same error with/without shadowstack enabled for llvm?

Contributor

alois31 commented Jul 15, 2024

No, with shadowstack enabled the crash is the same as in #320597 (comment) .

Contributor Author

risicle commented Jul 15, 2024

Let's try with permissive mode? (sorry, big rebuild)

Contributor Author

risicle commented Jul 15, 2024

Oh... but before you go for the rebuild you could also try adding export GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive to the preCheck phase of LLVM's build?
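For anyone wanting to try this without editing the package expression, the suggestion could be sketched as a small helper that appends the export to a package's preCheck. The helper and its file name are hypothetical, and whether the variable actually reaches the test processes depends on the package's test harness.

```nix
# permissive-shstk.nix: hypothetical helper, not part of nixpkgs.
# Takes a package and returns a variant whose checkPhase runs with
# glibc's shadow stack tunable set to permissive.
pkg:
pkg.overrideAttrs (old: {
  # Keep any existing preCheck and append the export on its own line.
  preCheck = (old.preCheck or "") + "\n" + ''
    export GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive
  '';
})
```

It would be applied as `(import ./permissive-shstk.nix) somePackage`, where `somePackage` is whichever derivation is under test.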

Contributor

alois31 commented Jul 16, 2024

> Oh... but before you go for the rebuild you could also try adding export GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive to the preCheck phase of LLVM's build?

This does not help, for some reason (same "rebuild shared object with SHSTK support enabled" errors).

Contributor Author

risicle commented Jul 16, 2024

It's possible the env var gets stripped by the test harness machinery. (I'm only guessing this is during the checkPhase because I saw the word FAIL there)

Any good if you do a rebuild all the way from glibc with the new branch?

Contributor

alois31 commented Jul 17, 2024

All packages I mentioned in the last PR as well as the ones you suggested on Matrix (in total: ffmpeg, fish, lix, nginx, nsncd, openssh, polkit, postgresql_16, python3Packages.scipy, tmux) have been built now on the latest state. LLVM builds now, and the following failures have been observed:

  • lix still fails the tests because they segfault; disabling shadowstack hardening works around it.
  • postgresql_16 has a very non-informative test suite failure that is not worked around by disabling shadowstack:
    not ok 199   + xml                                      1339 ms
    
  • The python3Packages.pytest-timeout and python3Packages.pytest-xdist tests have exhibited flakiness, but the root cause may be race conditions exposed by high system load instead of something shadow stack-related (as they did build successfully on a second try).

No other build failures have been observed, in particular python3Packages.execnet succeeds now without further changes.

Contributor Author

risicle commented Jul 17, 2024

Awesome. That's extremely useful.

In the short term we can simply mark these with shadowstack disabled to try and avoid these problems.
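As a rough sketch of what such a marking looks like from outside the package expression, the flag can be appended to hardeningDisable with overrideAttrs. The helper and its file name are hypothetical; inside a package definition the equivalent is simply `hardeningDisable = [ "shadowstack" ];` in the mkDerivation arguments.

```nix
# no-shadowstack.nix: hypothetical helper, not part of nixpkgs.
# Returns a variant of the given package built without the
# shadowstack hardening flag, preserving any flags already disabled.
pkg:
pkg.overrideAttrs (old: {
  hardeningDisable = (old.hardeningDisable or [ ]) ++ [ "shadowstack" ];
})
```

Appending to the existing list rather than replacing it avoids silently re-enabling other flags the package had already opted out of.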

I'm wondering though whether, when we have more time to investigate them individually, we'll find that it's simply the tests doing things that shadow-stack doesn't like. For example - digging into libxcrypt my guess is that it's this test https://github.com/besser82/libxcrypt/blob/72f75aa370ae96ccd2cc44ea3cf4182d8679ffbe/test/explicit-bzero.c#L43 that's causing the problem. Shadow-stack allegedly doesn't like swapcontext(), which is why they removed it from the library several years ago besser82/libxcrypt@c3f01c7. But it seems it remains in this test. In this particular case, setting export GLIBC_TUNABLES=glibc.cpu.x86_shstk=off in preCheck may address the issue and allow us to build libxcrypt with shadowstack enabled.

But on the other hand I wouldn't want to barge in and "fix" the postgres tests by runtime-disabling shadow-stack, as I've no reason to believe it's not a genuine problem being exhibited. And come to think of it, runtime-disabling shadow-stack for libxcrypt's tests would surely prevent it catching a hypothetical future genuine problem with the actual library.

These runtime-optional security features are a bit of a minefield.

FWIW the postgres failure not responding to the shadowstack disablement is probably a sign that one of its linked libraries is not completely ok with shadow-stack. Hopefully at some point we'll find it.

Contributor

alois31 commented Jul 18, 2024

I regret that I have to inform you that the libxcrypt test failure was caused by a seccomp filter blocking access to the map_shadow_stack system call. Since this filter is not present in upstream software, the hardeningDisable on libxcrypt can be removed again (and it's stdenv rebuild time once again).

Contributor

alois31 commented Jul 19, 2024

Regarding PCRE, it seems that the only thing not compatible with shadow stack is the JIT in the EOL version (PCRE2 works fine). We might consider disabling the JIT, or identifying and backporting the shadow stack support, instead.

Contributor

alois31 commented Jul 19, 2024

Another day of big rebuilds later, I am back with the following results (relative to before all the "disable shadowstack hardening flag" commits):

  • The pcre failure remains, which I have worked around by disabling the JIT.
  • The llvm failure remains and, in accordance with prior reports, is worked around by disabling shadow stack.
  • It seems that the lix failure is caused by Boost coroutines, as a trivial example using them crashes too, and is worked around by disabling shadow stack. (This may not warrant testing again after the 2.91 release, since Boost coroutine usage was replaced by C++20 coroutines.)
  • The postgresql_16 mystery failure reproduces outside of pkgsExtraHardening, so it probably can be ignored for now here.

Given the nature of the observed failures (Boost coroutines and JITs), I have performed some additional builds, which are summarized below.

  • nix predictably crashes during its tests too (interestingly at a later phase than lix), and works when disabling shadow stack.
  • luajit, nodejs, openjdk, pypy and spidermonkey all build successfully without additional workarounds.

Furthermore, I have observed binaries in the following packages to start without shadow stack, even though it has not been explicitly disabled: ffmpeg, luajit, nodejs, nsncd, openjdk, spidermonkey. I'm not sure to what extent this is fully intentional (for example, ffmpeg lacks the required note on a random assortment of libraries, nsncd itself doesn't have it (possibly because rustc doesn't set it?), openjdk links against gnutls which doesn't have the note; luajit, nodejs and spidermonkey lacking the note I assume to be intentional).
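One way to look for the note mentioned above is to run readelf -n over a file and search for the SHSTK marker in the GNU property note. The following is a sketch under several assumptions: `<nixpkgs>` resolves through NIX_PATH, `binutils-unwrapped` ships `bin/readelf` in its default output, and `hello` with its `bin/hello` path is only a placeholder target.

```nix
# shstk-note-report.nix: hypothetical check, not part of nixpkgs.
# Build with: nix-build ./shstk-note-report.nix && cat result
let
  pkgs = import <nixpkgs> { };
  # Placeholder target; substitute the binary or library to inspect.
  target = "${pkgs.hello}/bin/hello";
in
pkgs.runCommand "shstk-note-report" { } ''
  # readelf -n lists ELF notes, including the x86 feature properties.
  if ${pkgs.binutils-unwrapped}/bin/readelf -n ${target} | grep -q SHSTK; then
    echo "${target}: SHSTK marker present" > $out
  else
    echo "${target}: SHSTK marker missing" > $out
  fi
''
```

This inspects a single file only. As the ffmpeg and openjdk observations suggest, a missing note on any linked library may be enough for a process to start without shadow stack, so each dependency would need the same check.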

Contributor Author

risicle commented Jul 21, 2024

How does the current HEAD look for you?

Contributor

alois31 commented Jul 21, 2024

Basically the same as in the previous comment. It seems that LLVM (and by extension, rustc) does not support generating shadow stack-compatible code on x86_64. I also built a Go program (lazygit), and have seen what I expected (works, but no shadow stack). I think this is good for the initial iteration now.

Comment on lines +305 to +308
# causes shadowstack disablement
pcre = super'.pcre.override { enableJit = false; };
pcre-cpp = super'.pcre-cpp.override { enableJit = false; };
pcre16 = super'.pcre16.override { enableJit = false; };
Member

These are the first packages for which pkgsExtraHardening makes an explicit domain‐specific hardening vs. functionality/performance trade‐off, rather than just blanket‐defaulting a hardening switch on and letting individual packages decide what to do, right?

I’m not opposed necessarily, just want to check that this is something new that’s happening. I do worry a little about bit rot as package options drift without anyone checking pkgsExtraHardening. And it might be hard to make judgement calls in future when you’re faced with whether to turn all JITs ever off for security reasons.

Contributor

Not sure if this answers the question, but here I think is the starting point that led to this: #326819 (comment) . TLDR: legacy PCRE JIT is broken with shadow stack, so either JIT or shadow stack has to be disabled, and the latter has cascade effects.

Member

Right, okay. The fact that it would require disabling it for all the downstream packages makes this make sense to me.

Contributor Author

Yes, this is the first time we're overriding an actual package.

@emilazy emilazy merged commit 8a837af into NixOS:staging Jul 28, 2024
9 of 10 checks passed