john: Missing AVX2 support #328226
Comments
Yes, but upon a closer look at the commit you referenced, apparently they only build up to AVX/XOP at best, not including AVX2 and AVX512BW. Something they could want to fix. cc: @anthraxx
No, so that our automatic CPU fallback feature works (progressively invoking a less capable binary until the CPU requirement check passes, which is transparent to the user).
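The fallback behaviour described above can be modelled in a few lines. This is only an illustrative sketch in Python, not upstream's implementation (John the Ripper does this in C inside each binary). The binary names such as `john-avx2` and the `Build` and `resolve` helpers are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Build:
    requires: Optional[str]  # SIMD feature this binary was compiled for (None: no requirement)
    fallback: Optional[str]  # next, less capable binary to invoke (None: end of the chain)


def resolve(builds: Dict[str, Build], start: str, cpu_features: Set[str]) -> List[str]:
    """Return every binary that gets invoked, ending with the one that actually runs."""
    trail: List[str] = []
    name: Optional[str] = start
    while name is not None:
        trail.append(name)
        build = builds[name]
        # The CPU requirement check: run here if the CPU has the feature.
        if build.requires is None or build.requires in cpu_features:
            return trail
        # Otherwise hand over to the next, less capable binary.
        name = build.fallback
    raise RuntimeError("no binary in the chain can run on this CPU")


# Hypothetical chain: the main binary targets AVX512BW, the last one SSE2.
BUILDS = {
    "john": Build("avx512bw", "john-avx2"),
    "john-avx2": Build("avx2", "john-xop"),
    "john-xop": Build("xop", "john-avx"),
    "john-avx": Build("avx", "john-sse2"),
    "john-sse2": Build("sse2", None),
}

# A CPU with AVX2 but no AVX-512: the main binary hands over once.
print(resolve(BUILDS, "john", {"sse2", "avx", "avx2"}))
```

The user only ever starts `john`; which link in the chain ends up running depends on the CPU, which is what makes the mechanism transparent.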
Hi!
Who is "our" and where is that feature, please? Can we (nixpkgs) use it too?
How does nixpkgs usually deal with this? Can we expose it via something like |
Sorry I was unclear. By "our" I meant John the Ripper upstream project. The feature is in the upstream source tree and intended for use by distros. Unfortunately, it is currently tricky to use as we do not provide a build script of our own that would perform the multiple builds and renames. Other than that, sure, you (nixpkgs) are welcome to use it, like Arch does (but complete the fallback chain to include AVX2 and start with non-fallback AVX512BW main binary, please).
Some more examples are referenced from openwall/john#5233 and some are in https://github.com/search?q=repo%3Aopenwall%2Fjohn-packages%20CPU_FALLBACK&type=code One aspect we're inconsistent on (between different packaged builds that use this feature) is whether to call the fallback binaries "positively" (e.g., |
Thanks! I also found |
Right. One thing to note from there is the |
Does nixpkgs need fallbacks on the OMP / non-OMP axis as well?
This is optional, but nice to have. Non-OpenMP builds are usually very slightly faster in the special case of running 1 thread, so that's when they're used. Running 1 thread per process can happen in a VM or when using
OK, thanks! Here is what I have so far:

diff --git a/pkgs/tools/security/john/default.nix b/pkgs/tools/security/john/default.nix
index aeefcaa0bbef..6834594c79d0 100644
--- a/pkgs/tools/security/john/default.nix
+++ b/pkgs/tools/security/john/default.nix
@@ -19,8 +19,34 @@
ocl-icd,
substituteAll,
makeWrapper,
-}:
+ simdChain ?
+ if stdenv.buildPlatform.isx86 then
+ [
+ "avx512bw"
+ "avx512f"
+ "avx2"
+ "xop"
+ "avx"
+ "sse2"
+ ]
+ else
+ [ ],
+ withOpenMP ? true,
+ callPackage,
+}@args:
+# john has a "fallback chain" mechanism; whenever the john binary
+# encounters that it is built for a SIMD target that is not supported
+# by the current CPU, it can fall back to another binary that is not
+# built to expect that feature, continuing until it eventually reaches
+# a compatible binary. See:
+# https://github.com/openwall/john/blob/bleeding-jumbo/src/packaging/build.sh
+# https://github.com/openwall/john/blob/bleeding-jumbo/doc/README-DISTROS
+# https://github.com/NixOS/nixpkgs/issues/328226
+let
+ simdFallback = (callPackage ./default.nix (args // { simdChain = lib.tail simdChain; })).john;
+ ompFallback = (callPackage ./default.nix (args // { withOpenMP = false; })).john;
+in
stdenv.mkDerivation rec {
pname = "john";
version = "rolling-2404";
@@ -61,10 +87,30 @@ stdenv.mkDerivation rec {
+ lib.optionalString withOpenCL ''
python ./opencl_generate_dynamic_loader.py # Update opencl_dynamic_loader.c
'';
- configureFlags = [
- "--disable-native-tests"
- "--with-systemwide"
- ];
+ __structuredAttrs = true;
+ configureFlags =
+ [
+ "--disable-native-tests"
+ "--with-systemwide"
+ ]
+ ++ (lib.optionals (simdChain != [ ]) [
+ "--enable-simd=${lib.head simdChain}"
+ "CPPFLAGS=${
+ builtins.concatStringsSep " " [
+ "-DCPU_FALLBACK"
+ "-DCPU_FALLBACK_BINARY=${lib.escapeShellArg "\"${simdFallback}/bin/john\""}"
+ ]
+ }"
+ ])
+ ++ (lib.optionals (!withOpenMP) [ "--disable-openmp" ])
+ ++ (lib.optionals withOpenMP [
+ "CPPFLAGS=${
+ builtins.concatStringsSep " " [
+ "-DOMP_FALLBACK"
+ "-DOMP_FALLBACK_BINARY=${lib.escapeShellArg "\"${ompFallback}/bin/john\""}"
+ ]
+ }"
+ ]);
buildInputs =
[
@@ -106,9 +152,10 @@ stdenv.mkDerivation rec {
]);
# TODO: Get dependencies for radius2john.pl and lion2john-alt.pl
- # gcc -DAC_BUILT -Wall vncpcap2john.o memdbg.o -g -lpcap -fopenmp -o ../run/vncpcap2john
- # gcc: error: memdbg.o: No such file or directory
- enableParallelBuilding = false;
+ outputs = [
+ "out" # full package
+ "john" # just the binary - for the SIMD fallback chain
+ ];
postInstall = ''
mkdir -p "$out/bin" "$out/etc/john" "$out/share/john" "$out/share/doc/john" "$out/share/john/rules" "$out/${perlPackages.perl.libPrefix}"
@@ -119,6 +166,9 @@ stdenv.mkDerivation rec {
cp -vt "$out/share/john/rules" ../run/rules/*.rule
cp -vrt "$out/share/doc/john" ../doc/*
cp -vt "$out/${perlPackages.perl.libPrefix}" ../run/lib/*
+
+ mkdir -p "$john/bin"
+ cp -vt "$john/bin" "$out/bin/john"
'';
postFixup = ''

I'll find out if it works tomorrow, because this does look like it'll take a while to build...

On that note... this will require building John 14 times (7 x 2). There isn't really a short-cut around that, is there? Each build takes like 10 minutes on my threadripper even if I re-enable parallel building (which looks like it works fine now?). I have no idea if the NixOS org wants to take on all of this compute.
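One thing to check in the diff above: with the default arguments (a non-empty `simdChain` and `withOpenMP = true`), `configureFlags` ends up with two separate `CPPFLAGS=...` entries. Assuming the usual autoconf behaviour, where a later `VAR=value` argument replaces an earlier one, only the OMP defines would reach the compiler. The sketch below, in Python for illustration only, collects all defines first and emits a single `CPPFLAGS` argument. The function name and the example paths are hypothetical; the flags and macro names are the ones used in the diff.

```python
from typing import List, Optional


def configure_flags(simd: Optional[str],
                    simd_fallback: Optional[str],
                    with_openmp: bool,
                    omp_fallback: Optional[str]) -> List[str]:
    """Build the configure argument list with at most one CPPFLAGS= entry."""
    flags = ["--disable-native-tests", "--with-systemwide"]
    defines: List[str] = []

    if simd is not None:
        flags.append(f"--enable-simd={simd}")
        if simd_fallback is not None:
            # The macro has to expand to a C string literal, hence the inner quotes.
            defines += ["-DCPU_FALLBACK", f'-DCPU_FALLBACK_BINARY="{simd_fallback}"']

    if with_openmp:
        if omp_fallback is not None:
            defines += ["-DOMP_FALLBACK", f'-DOMP_FALLBACK_BINARY="{omp_fallback}"']
    else:
        flags.append("--disable-openmp")

    # Merge every define into one argument so that none is overridden.
    if defines:
        flags.append("CPPFLAGS=" + " ".join(defines))
    return flags


# Hypothetical store paths, for illustration only.
for flag in configure_flags("avx2", "/example/simd/bin/john", True, "/example/omp/bin/john"):
    print(flag)
```

The list is meant to be passed as separate arguments (no shell in between), so no additional shell quoting is applied here; paths containing spaces would need more care.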
No shortcut other than deciding not to build some of these. For example, you could omit
That's weird. It takes under 1 minute on a quad-core laptop for me. Of course, parallel building. I'm not aware of it ever having been broken - we've been using it all the time.
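To make the build count concrete, here is a small sketch that enumerates the variants implied by the diff: one build per SIMD level in `simdChain`, plus a final build with an empty chain, each with and without OpenMP. It is written in Python purely for illustration; `build_matrix` is a hypothetical helper, not part of nixpkgs or John the Ripper.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# Default chain from the diff above.
SIMD_CHAIN = ["avx512bw", "avx512f", "avx2", "xop", "avx", "sse2"]


def build_matrix(chain: Sequence[str],
                 openmp_axis: bool = True) -> List[Tuple[Optional[str], bool]]:
    """List every (simd_level, with_openmp) combination that has to be built."""
    # None stands for the last link: a build with no --enable-simd flag.
    simd_levels: List[Optional[str]] = list(chain) + [None]
    omp_variants = [True, False] if openmp_axis else [True]
    return list(product(simd_levels, omp_variants))


print(len(build_matrix(SIMD_CHAIN)))                      # full matrix
print(len(build_matrix(SIMD_CHAIN, openmp_axis=False)))   # without non-OpenMP fallbacks
print(len(build_matrix(["avx512bw", "avx2", "avx", "sse2"])))  # shorter, hypothetical chain
```

Dropping the OpenMP axis halves the number of builds, and each SIMD level removed from the chain saves two more.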
Thanks. Quick question, is there anything in the package that invokes the It was nice and easy to make the nixpkgs package recurse into itself to build the fallback chain, though that has the slight downside that we're also packaging all the other files for each link in the chain. That shouldn't be too much of an issue in practice with store optimization. We could improve it by splitting the two (into subpackages or multiple outputs), however if we were to do that, we need to be careful so that 1) all john binaries look in the correct place for things like dictionaries, and 2) any other programs or scripts that invoke john should know where to look for it.
Ah, thanks for that - it was my bad, looks like I need to explicitly enable parallel building for autoconf in nixpkgs.
Here's the commit that added it; it has a link to a log of a failed build - that was a while ago though: e7ce27f
We have some built-in programs that are part of It's not that SIMD matters much for their performance (they do not perform password hashing), but rather that they use shared object files in Normally, you would not need to handle these specially - they would just invoke a
I don't see why you'd be running into such problems. Normally, it's just one package that has multiple
Oh, right. I see we did have this bug in 1.8.0-jumbo-1 released in December 2014. We fixed it in git in December 2015. This build log you reference is from 2017, but it's reasonable you were still building our latest release of the time. Not everyone ran into this bug because it'd only be reached when the libpcap development package was installed, which was/is an optional dependency - but it's reasonable that in a package you try and include the complete functionality.
It was brought to my attention we do not compile john with AVX2 support, which might have significant negative performance effects.

Additional context

Apparently, john uses auto-detection, which we disabled for compatibility and reproducibility.

Arch builds the project multiple times with different levels of CPU features so that the user can choose between performance and compatibility: https://gitlab.archlinux.org/archlinux/packaging/packages/john/-/blob/66e5e2c28aaaed440029c73190802cd26e7440ad/PKGBUILD#L75-80

Notify maintainers

cc @offlinehacker @matthewbauer @CherryKitten @CyberShadow