Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow NIX_* environment variables to be passed to rustc #466

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jazzdan
Copy link
Contributor

@jazzdan jazzdan commented Oct 28, 2023

Otherwise if you are running buck2, and by extension rustc, from within nix any C(++) builds will fail.

All credit to @fnichol who pointed me to this fix in their prelude.

Let me know if this fix is better applied somewhere else.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 28, 2023
@thoughtpolice
Copy link
Contributor

thoughtpolice commented Oct 28, 2023

This is, IMO, not the right place to put this. It is far too tied to the specifics of the Nixpkgs/Rust toolchain you are using (nixpkgs? rust-overlay? mozilla's overlay? your own custom binaries? and what git versions are those?) and may become fragile if the version of Buck2's prelude and nixpkgs get desynchronized somehow, e.g. after a merge from staging and you run something like nix flake update.

Besides all of that, nobody at Meta uses Nix for anything like this, so it will inevitably bitrot and nobody there will be able to maintain it. They won't be in a position to fix or migrate it or know when to add or remove things, or understand why and now those options may break (or help enforce!) hermeticity.

It is probably something better attached to the actual toolchain definition, i.e. system_rust_toolchain should take an inherited_envs parameter or something that lists environment variables that it should inherit from the buck2 daemon directly. Maybe all toolchains should take that parameter; I don't know.

That's just one possible fix, though. I'm not even sure that's the best one. Really, in an ideal world, system_rust_toolchain would be able to discover this on the fly somehow. But you need some form of dynamic dependencies for that. I'd have to think more about it.

@jazzdan
Copy link
Contributor Author

jazzdan commented Oct 28, 2023

Yeah I am not attached to this solution at all. Something like inherited_envs sounds good to me, but curious to hear from folks more experienced than I.

@ndmitchell
Copy link
Contributor

I can think of three ways to go:

  1. Have inherited env vars as @thoughtpolice suggest. I think the issue with that is really the NIX_ENV_VAR_PREFIXES field. That seems a bit grim. It also requires quite a lot of configuration.
  2. Accept that this is a bit grim whichever way you slice it, and just hardcode that anything with a NIX_ prefix is preserved. Only a few lines of code. No configuration. Not beautiful, but not horrible, and works out of the box for most people.
  3. Make Nix users specify a rustc binary that has these environment variables prebaked in. I think that's actually not too hard, just do something like (untested):
genrule(
    name = "rustc",
    cmd = "echo \$NIX_WHATEVER=$NIX_WHATEVER rustc > $OUTPUT && chmod +x $OUTPUT",
    output = "rustc",
    executable = True,
)

Now you have produced a rustc that captures the environment variables, so even if the Rust toolchain omits them, users get it back.

That said, option 2 seems simple and effective? Although I'd be curious what the Rust rule authors think, e.g. @zertosh @davidbarsky

@thoughtpolice
Copy link
Contributor

thoughtpolice commented Oct 28, 2023

I think that solution 3 actually leads to a more complete solution 4 which should be the real way for people to use Nix for Buck2: there should, instead of a system_rust_toolchain, be a nix_rust_toolchain, and this rule should invoke nix itself in order to:

  1. Get access to the rustc binary, i.e. by copying a closure from https://cache.nixos.org
  2. Interrogate the list of shell variables needed from the resulting shellHook
  3. Create a new derivation that wraps rustc as noted above and captures those variables explicitly,
  4. Returns the rustc created from step 3 as the compiler to use.

There are two complications:

  1. We need to also write a nix_cxx_toolchain too as a hard dependency.
  2. We also need to express what other Nix expressions should be included in the closure (i.e. you may want zlib available so you use the one from nixpkgs.)

This would mean that you do not need direnv or any other tool like nix shell to populate your terminal environment. You could also specify a version of nixpkgs this way, and it would be independent of anything being hard coded.

I haven't really needed this myself (yet), but maybe I can try to get a working nix_cxx_toolchain going or something.

@ndmitchell
Copy link
Contributor

Is it likely that some people would want the whole Nix hermetic toolchain (which should certainly be available) and some other Nix users would want to just use direnv? If so, is doing both the proper toolchain and a hack to let NIX_ variables through worthwhile?

@davidbarsky
Copy link
Contributor

That said, option 2 seems simple and effective? Although I'd be curious what the Rust rule authors think, e.g. @zertosh @davidbarsky

As a preface, I'm on a 9 hour flight, so the amount of oxygen I have to think isn't particularly high, nor have I had a decent amount for minute. Anyways:

  • I'm a (bad) nix user who uses it as a homebrew replacement, but I also have fbsource. I may or may or may not notice breakage, but it'd be crummy if this messed with toolchains bundled within the monorepos. That said, this might be a self-inflicted issue.
  • I'm not sure I'm an author as so much I occasionally poke at the implementation details. That being said, if the use case I mentioned in the first point continues to work, I don't really have objections. even if it doesn't, it's arguable this is a self-inflicted problem.

@cormacrelf
Copy link
Contributor

There is a rules_nixpkgs for Bazel. They don't seem to be using the NIX_ environment variables to do linking, at all: rather, they construct symlinks into /nix/store and pass those through the normal bazel rules. E.g. here, here, here. You can definitely use this to make a "hermetic" (big caveat) local-only toolchain. If more of Buck's rules supported uploading to the action cache, then this would be viable, as you could allow developer machines to push and pull from the cache.

The problem is with remote execution. The hermetic seal is drawn outside buck-out. Making nix-based toolchains = symlinking to /nix/store, but if you try to use such a symlink (buck-out/v2/gen/.../__glibc__/out/lib/libc.so.6 -> /nix/store/... on a remote executor, there is no /nix/store. So remote execution is very difficult. Your toolchain's binaries are all symlinks to outside buck-out. The executables' RUNPATHs point at /nix/store paths. So do the RUNPATHs of all the shared libraries. There are bash scripts etc that call other scripts and read files in /nix/store. None of these things can work.

The rules_nixpkgs project has been discussing how to get remote execution working since early 2022. Most people are building a custom docker image for RE using nix, that has everything you can possibly need already available in /nix/store, in advance. Executing an action like "link (the linker is a symlink into /nix/store) main.o with glibc (which happens to symlink to /nix/store)" just uses whatever's been pre-populated in the docker image. Any new nix dependency => create a new docker image, push it, update the RE configuration at the same time.

I think that's the only idea that's not insane for anyone who doesn't have 10,000 developers to throw at the problem.

There are 3 main alternatives, and all of them sound like pretty big undertakings.
  • Add support for nix store remote copying tweag/rules_nixpkgs#404 -- You deploy an RBE server that is also a Nix remote store. You configure the RBE action runner or a local machine with an environment variable/host address, it connects to the remote, runs the nix build there, and copies the results back. On the RBE itself, copying the results back is a no-op, I think. The Nix remote store stores everything at /nix/store on the RBE. This would surely only work if you deployed your own RBE, with management of chroots etc done yourself.

  • https://skillsmatter.com/skillscasts/17673-remote-execution-with-rules-nixpkgs -- mount an NFS server at /nix/store in all the RBE executors. A separate nix remote store manages the contents (r+w). On the developer machine, a local-only action runs in the rule that writes /nix/store symlinks, that tells the the remote store to build and save a binary for the nix derivation and save it in NFS. Then once an RE action is executing, it's already there on the readonly NFS mount. This is apparently quite efficient as NFS only sends the blocks you actually try to read. But it's obviously a pretty involved setup.

  • You can always... rewrite all the /nix/store paths to point to ./buck-out/.... This sounds absolutely horrifying, but it could work. There is a lot of existing nix tooling around rewriting store paths, like patchelf and nix bundle with e.g. relocatable.nix. A difficulty is the lack of sharing. If you want glibc, you could get the closure (about 4 or 5 /nix/store directories) and rewrite all the nix paths in them to refer to ./buck-out/..., writing the output to buck-out. Cool. But almost every package needs glibc, so they will all do that independently. Once your dependencies are large... you add clang, and then lldb, both of which rewrite and save their own copy of llvm etc... now you have 3GB of stuff, when it should be 1GB. Sounds pretty slow in every way.

    • My best idea for Buck is to do store path rewriting, but use some combo of anon targets (to dedupe multiple glibcs, llvms, etc) and dynamic dependencies (to see what nix packages are direct inputs to clang (nix-store --query --references /nix/store/...clang-15.0.7 => llvm, glibc, libgcc, etc) that also need to be called as anon targets, in order to do the rewriting in for clang itself). Then buck would basically mirror /nix/store in buck-out/v2/anon, deduplicated just as /nix/store is, but with all the paths rewritten, built on-demand, and available in the cache and on RBE. Alas that is not how dynamic dependencies work. You have to run actions, I don't think you can bind an output using an anon target. So this is a pipe dream.

Using a docker image with the required nix dependencies in it seems doable and doesn't require deploying and administering a bunch of servers. At least it's possible to have consistent versions and bust caches when they change. I have an improved version of buck2-nix that does basically what rules_nixpkgs does, which I'm polishing up. I made a clang toolchain with it, not even using Nix's clang wrapper script, that has the deps and exec deps set correctly such that it will happily cross-compile to linux and link glibc, by using --target-platforms on a Mac. So it's promising, at least.

I think it will also need symlinks, even dangling ones, to be preserved in buck2's re_upload. At the moment it just looks straight through symlinks and recursively copies /nix/store/some-path to the remote executor.

@ndmitchell
Copy link
Contributor

OK, so this seems a complicated issue. It would be great to make some progress on it.

  • @davidbarsky - are you saying that if we land a variant of this patch we break stuff for you? And is it OK to break stuff?
  • @cormacrelf - I think dynamic actions could bind existing outputs - if you symlink to the actions is that enough?
  • @cormacrelf - "If more of Buck's rules supported uploading to the action cache" - this seems quite feasible. You could imagine that the upload to cache could be controlled so even if False you can specify a config to make them all true.

In the meantime, is some variant of whitelist all NIX_ variables a reasonable way to go? Or is the genrule trick I mentioned better?

fnichol added a commit to systeminit/si that referenced this pull request Nov 23, 2023
Update to patch required after change in
facebook/buck2-prelude@9e1cfaf
resulted in a merge conflict (trivial to resolve but there nonetheless).

References: #2266
References: facebook/buck2#466

Signed-off-by: Fletcher Nichol <fletcher@systeminit.com>
@cormacrelf
Copy link
Contributor

@ndmitchell

The genrule approach can't work with system_rust_toolchain as-is, because it doesn't allow configuring RustToolchainInfo.compiler. If it did allow that, the genrule would look SOMETHING like this, but even this doesn't work, because using nix is sometimes like bashing your head against a brick wall: nix flakes only work if every single file it references is checked into git... (side note why on earth should nix care? but it does.) srcs in a genrule are copied to buck-out, not in git.

non-working concept

You need to use some kind of nix command to capture the environment, because the NIX_ env vars are huge (multiple kilobytes each) and quite unwieldy to paste into a genrule.

# root BUCK file
genrule(
    name = "nix-rustc-script",
    srcs = glob([ "**/*.nix" ]) + ["flake.lock"],
    bash = """
    (
        echo '#!/usr/bin/env bash';
        nix develop . --command bash -c set \
         | grep '^PATH=\|^NIX_\|^PKG_CONFIG_\|^LD_' \
         | sed 's/^/export /';
        echo 'exec rustc \"$@\"';
    ) > $OUT
    chmod +x $OUT
    """,
    out = "rustc.sh",
)

command_alias(
    name = "nix-rustc",
    exe = ":nix-rustc-script",
    visibility = ["toolchains//..."],
)

Given that, it would be nicer to say to Nix folks (a core audience for buck) that there's a flag you can set to make it work. I would add a hack via:

  • Flag in rustc_action.py to pass through any variable starting with a prefix
  • Field on RustToolchainInfo: env_prefix_whitelist, default [].
  • Field on system_rust_toolchain to enable it, so as not to interfere if you use nix but don't want it in buck2. Set it to `env_prefix_whitelist = ["NIX_"].
  • No need to whitelist individual variables because it's going to be too fragile.

@davidbarsky
Copy link
Contributor

  • @davidbarsky - are you saying that if we land a variant of this patch we break stuff for you? And is it OK to break stuff?

I would not be surprised if things break, but I also know I'm not doing standard stuff. There's a small group of nix users that I'll ask at work about this, but I have the sudden, sinking feeling that I might have the most expertise on interactions between the buck rules, fbsource, and nix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants