Nemo Galois fields decrease performance compared to AbstractAlgebra #1693

Ktrompfl · 2024-02-29T15:47:45Z

Using Galois fields from Nemo instead of AbstractAlgebra heavily impacts the performance of basic arithmetic operations due to allocations for every single operation:

julia> using AbstractAlgebra, BenchmarkTools, Nemo

Welcome to Nemo version 0.43.1

Nemo comes with absolutely no warranty whatsoever

julia> @btime begin
           K = AbstractAlgebra.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a *= a
           end
       end

  247.656 μs (2 allocations: 40 bytes)

julia> @btime begin
           K = Nemo.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a *= a
           end
       end


  66.673 ms (1000006 allocations: 76.29 MiB)

julia> @btime begin
           K = AbstractAlgebra.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a += a
           end
       end
  78.806 ns (2 allocations: 40 bytes)

julia> @btime begin
           K = Nemo.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a += a
           end
       end

  68.983 ms (1000006 allocations: 76.29 MiB)

I assume this is a byproduct of calling native libraries, but still very unfortunate.

The text was updated successfully, but these errors were encountered:

thofma · 2024-02-29T16:02:26Z

Yes and no. Try using Native.GF(2) (once Nemo is loaded).

Ktrompfl · 2024-02-29T16:47:33Z

Thanks for your response. Apart from being slightly faster on addition and not causing any allocations at all, Native.GF still appears to be a magnitude slower on multiplication when compared to AbstractAlgebra.GF:

julia> @btime begin
           K = Native.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a *= a
           end
       end

  5.737 ms (0 allocations: 0 bytes)

julia> @btime begin
           K = Native.GF(2)
           a = one(K)

           for _ = 1:1_000_000
               a += a
           end
       end

  44.278 ns (0 allocations: 0 bytes)

thofma · 2024-02-29T17:50:24Z

Hm, yes, it is quite bad, but a bit puzzling. @fieker: The AbstractAlgebra code is not that sophisticated in comparison to Nemo/flint, which goes through n_mulmod_preinv. Is this just the overhead of the function call or is there something else going on?

fieker · 2024-02-29T21:48:01Z

my guess is 2 things:

preinv (but that is minor)
memory management: GF(2) in AA will use machine stuff, Int64 for operations. This is immutable and stack allocated and such. In contrast, Native.GF(2) is expensive

My guess is that Native.GF(2) will outperform AA on tasks taking place in C (in flint), however "naive" native julia code will favour AA by a wide margin.
Furthermore, each "+" in Flint is a function call, while AA is handled directly by the compiler....
I guess we can/ should implement the Native.GF(p) arithmetic directly ourselves

fieker · 2024-02-29T21:54:34Z

I withdraw my comments. fpFieldElem is usng preinv and is immutable julia. * is calling into C however, so the call mihght be
much more expensive (at least it should!!) thay checking if a*b is even or odd. Has anyone done the timings on 30 bit or 60 bit primes as well? The Native code is not at all optimized for small p

thofma · 2024-02-29T23:28:30Z

I tried 30 bits and got the same timings. We could in fact just copy the code from AA to Nemo.

thofma · 2024-03-29T10:02:27Z

@albinahlback since you interested in flint being fast, do you have an idea what could be going on? How long does

julia> @btime begin
           K = Native.GF(1099511627791)
           a = one(K)

           for _ = 1:1_000_000
               a *= a
           end
       end

  5.737 ms (0 allocations: 0 bytes)

take for you and how long does that take for you with an equivalent C program using nmod? It is clear that the ccall on the julia side is not for free, but I wonder how expensive it is.

albinahlback · 2024-03-29T11:27:16Z

Do you know if AbstractAlgebra-stuff is being inlined or not?

albinahlback · 2024-03-29T11:49:20Z

If someone would be able to give me the generated assembly for each case, it would be pretty easy to see why the difference is so big.

thofma · 2024-03-29T12:38:42Z

Hm, I tried to produce the assembly code, but I noticed something strange. It seems that there is something dodgy going one due to some problems in the benchmark setup.

@Ktrompfl can you reproduce the following?

function f_with_return(K)
   a = one(K)
  for _ = 1:1_000_000
    a *= a
  end
  return a
end

function f_without_return(K)
   a = one(K)
  for _ = 1:1_000_000
    a *= a
  end
end

K_AA = AbstractAlgebra.GF(1099511627791)
K_Nemo = Nemo.Native.GF(1099511627791)

I then get

julia> t_AA = @belapsed(f_with_return(K_AA)), @belapsed(f_without_return(K_AA))
(0.005825292, 0.000312042)

julia> t_Nemo = @belapsed(f_with_return(K_Nemo)), @belapsed(f_without_return(K_Nemo))
(0.012017084, 0.012015667)

So clearly the compiler is doing something "clever" in the AA case, since the function always returns nothing.

Looks like it is 2x slower, which is not "too bad".

Ktrompfl · 2024-04-14T11:38:03Z

With the setup above I get

julia> t_AA = @belapsed(f_with_return(K_AA)), @belapsed(f_without_return(K_AA))
(0.003119424, 0.000221444)

julia> t_Nemo = @belapsed(f_with_return(K_Nemo)), @belapsed(f_without_return(K_Nemo))
(0.005218629, 0.005237672)

which should be fine, whereas with K_Nemo2 = Nemo.GF(1099511627791) I get

julia> t_Nemo2 = @belapsed(f_with_return(K_Nemo2)), @belapsed(f_without_return(K_Nemo2))
(0.05128647, 0.051381936)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nemo Galois fields decrease performance compared to AbstractAlgebra #1693

Nemo Galois fields decrease performance compared to AbstractAlgebra #1693

Ktrompfl commented Feb 29, 2024

thofma commented Feb 29, 2024

Ktrompfl commented Feb 29, 2024

thofma commented Feb 29, 2024

fieker commented Feb 29, 2024

fieker commented Feb 29, 2024

thofma commented Feb 29, 2024

thofma commented Mar 29, 2024 •

edited

Loading

albinahlback commented Mar 29, 2024

albinahlback commented Mar 29, 2024

thofma commented Mar 29, 2024

Ktrompfl commented Apr 14, 2024

Nemo Galois fields decrease performance compared to AbstractAlgebra #1693

Nemo Galois fields decrease performance compared to AbstractAlgebra #1693

Comments

Ktrompfl commented Feb 29, 2024

thofma commented Feb 29, 2024

Ktrompfl commented Feb 29, 2024

thofma commented Feb 29, 2024

fieker commented Feb 29, 2024

fieker commented Feb 29, 2024

thofma commented Feb 29, 2024

thofma commented Mar 29, 2024 • edited Loading

albinahlback commented Mar 29, 2024

albinahlback commented Mar 29, 2024

thofma commented Mar 29, 2024

Ktrompfl commented Apr 14, 2024

thofma commented Mar 29, 2024 •

edited

Loading