MASPRNG - Intel MIC (vector) accelerated SPRNG library

General requirements:
    GNU compilers, version 4.8 or greater (required for __builtin_cpu_supports())
    C++98 - function overloading, object-oriented design
    C++11 - non-static data member initialization in struct/class

SIMD requirements:
    SSE - at least SSE4.1 for _mm_mullo_epi32() and _mm_mul_epi32()
    AVX - at least AVX2 for integer instructions such as _mm256_mullo_epi32() and _mm256_mul_epi32()
          at least SSE2 for _mm_shuffle_epi32() and _mm_or_si128()
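
The requirements above can be checked at run time with the GCC builtin mentioned
in the general requirements. The following is a minimal sketch, not MASPRNG code:
the demo_* function names are hypothetical, and on non-x86 or non-GNU compilers
the helpers simply report that the feature is unavailable.

```cpp
// Run-time CPU feature checks using GCC's __builtin_cpu_supports().
// All demo_* names are hypothetical and are not part of MASPRNG.

// True when the running CPU reports SSE4.1 support.
static bool demo_cpu_has_sse41() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initializes the CPU feature data; safe to repeat
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return false;          // assumption: treat other targets as unsupported
#endif
}

// True when the running CPU reports AVX2 support.
static bool demo_cpu_has_avx2() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Checks the most recent technology first, then falls back to older ones.
static const char *demo_best_simd() {
    if (demo_cpu_has_avx2())  return "AVX2";
    if (demo_cpu_has_sse41()) return "SSE4.1";
    return "scalar";
}
```

Calling demo_best_simd() returns a short label for the best instruction set the
current machine reports.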

Each SPRNG object uses a single SIMD technology.
All SIMD macro checks are ordered from the most recent technology to the oldest,
so the best available technology is used without the user having to specify it.
The user can still select a specific technology with a macro definition.
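
A newest-to-oldest macro chain of this kind might look like the sketch below. It
relies on the compiler-predefined macros __AVX2__ and __SSE4_1__. The override
macros DEMO_FORCE_SCALAR and DEMO_FORCE_SSE and the demo_* functions are
hypothetical names chosen for illustration. They are not the macros MASPRNG uses.

```cpp
// Compile-time SIMD selection, ordered from most recent technology to oldest.
// DEMO_* macros are hypothetical stand-ins for a user override.

#if defined(DEMO_FORCE_SCALAR)
    // User explicitly asked for the non-vector path.
    #define DEMO_SIMD_NAME  "scalar"
    #define DEMO_SIMD_BYTES 4
#elif defined(__AVX2__) && !defined(DEMO_FORCE_SSE)
    // Newest technology considered in this sketch.
    #define DEMO_SIMD_NAME  "AVX2"
    #define DEMO_SIMD_BYTES 32
#elif defined(__SSE4_1__)
    #define DEMO_SIMD_NAME  "SSE4.1"
    #define DEMO_SIMD_BYTES 16
#else
    // No supported vector technology was enabled at compile time.
    #define DEMO_SIMD_NAME  "scalar"
    #define DEMO_SIMD_BYTES 4
#endif

// Name of the technology selected when this file was compiled.
static inline const char *demo_simd_name() { return DEMO_SIMD_NAME; }

// Register width in bytes for the selected technology (4 for scalar).
static inline int demo_simd_bytes() { return DEMO_SIMD_BYTES; }
```

Compiling with, for example, -mavx2 selects the first vector branch, while
adding -DDEMO_FORCE_SCALAR overrides the automatic choice.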

Each VSPRNG object generates random numbers from multiple streams. The number of streams
depends on the width of the vector unit and the type of random number considered.
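
As a rough illustration of this relationship, the sketch below assumes one
stream per vector lane, so the stream count is the register width divided by the
element size. That one-to-one mapping and the demo_num_streams name are
assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Number of lanes (and, by assumption, streams) in a vector register of
// vector_bytes bytes holding elements of type T.
template <typename T>
constexpr std::size_t demo_num_streams(std::size_t vector_bytes) {
    return vector_bytes / sizeof(T);
}

// 128-bit SSE register with 32-bit integers: 16 / 4 lanes.
static_assert(demo_num_streams<std::int32_t>(16) == 4, "SSE, 32-bit int");
// 256-bit AVX register with 64-bit doubles: 32 / 8 lanes.
static_assert(demo_num_streams<double>(32) == 4, "AVX, double");
```

Under this assumption a wider register or a narrower element type both increase
the number of streams per object.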

SPRNG can perform 64-bit operations using either 64-bit registers (LONG_SPRNG) or 32-bit registers.
LONG_SPRNG mode is generally faster. In the current implementation, the two modes generate
different sets of random numbers.
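
To illustrate what a 64-bit operation built from 32-bit pieces involves, the
sketch below computes the low 64 bits of a product from 32-bit halves and one
widening 32x32->64 multiply. It is a generic arithmetic example and not the
SPRNG or MASPRNG implementation. DemoU64 and demo_mul64 are hypothetical names.

```cpp
#include <cstdint>

// A 64-bit value stored as two 32-bit halves.
struct DemoU64 {
    std::uint32_t hi;
    std::uint32_t lo;
};

// Returns (a * b) mod 2^64 using 32-bit operands.
// With a = ah*2^32 + al and b = bh*2^32 + bl:
//   a*b mod 2^64 = al*bl + 2^32 * ((al*bh + ah*bl) mod 2^32)
static DemoU64 demo_mul64(DemoU64 a, DemoU64 b) {
    // Widening multiply of the low halves (32x32 -> 64 bits).
    std::uint64_t lo_lo = static_cast<std::uint64_t>(a.lo) * b.lo;
    // Cross terms only affect the upper half, so 32-bit wraparound suffices.
    std::uint32_t cross = a.lo * b.hi + a.hi * b.lo;
    DemoU64 r;
    r.lo = static_cast<std::uint32_t>(lo_lo);
    r.hi = static_cast<std::uint32_t>(lo_lo >> 32) + cross;
    return r;
}
```

The result can be checked against a native 64-bit multiplication, which is the
single operation a 64-bit register path would use instead.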

The scalar SPRNG objects are supported via the same interface as SPRNG v5.0. The vector SPRNG
objects are supported via an interface very similar to that of SPRNG v5.0.

As part of this project, a C interface for Intel SIMD intrinsics was developed. This interface
allows applications to easily and systematically access the available vector technologies.
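
The general idea of such a wrapper layer can be sketched as follows: one set of
function names maps to SSE2 intrinsics when they are available and to a plain
loop otherwise. The demo_* names are hypothetical and do not reflect the actual
MASPRNG interface. Only the intrinsics _mm_set1_epi32(), _mm_add_epi32() and
_mm_storeu_si128() are real.

```cpp
#include <cstdint>

#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics

typedef __m128i demo_vint;  // four 32-bit integer lanes

static inline demo_vint demo_set1_32(std::int32_t x) {
    return _mm_set1_epi32(x);  // broadcast x to all lanes
}
static inline demo_vint demo_add_32(demo_vint a, demo_vint b) {
    return _mm_add_epi32(a, b);  // lane-wise 32-bit addition
}
static inline void demo_storeu_32(std::int32_t *out, demo_vint v) {
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out), v);  // unaligned store
}
#else
// Scalar fallback with the same names and the same lane count.
struct demo_vint { std::uint32_t lane[4]; };

static inline demo_vint demo_set1_32(std::int32_t x) {
    demo_vint v;
    for (int i = 0; i < 4; ++i) v.lane[i] = static_cast<std::uint32_t>(x);
    return v;
}
static inline demo_vint demo_add_32(demo_vint a, demo_vint b) {
    demo_vint v;
    for (int i = 0; i < 4; ++i) v.lane[i] = a.lane[i] + b.lane[i];  // wraps
    return v;
}
static inline void demo_storeu_32(std::int32_t *out, demo_vint v) {
    for (int i = 0; i < 4; ++i) out[i] = static_cast<std::int32_t>(v.lane[i]);
}
#endif

enum { DEMO_LANES = 4 };  // lanes per demo_vint in both variants
```

Application code written against the demo_* names compiles unchanged on either
path, which is the property a wrapper layer of this kind is meant to provide.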

Arrays of SIMD vector types are allocated using posix_memalign(). Fixed-size arrays
worked for SSE but not for AVX. This seems to be because the compiler (GCC) does not
provide automatic support for large vector data types, most likely their alignment.
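
An aligned allocation with posix_memalign() might look like the sketch below.
The helper names are hypothetical. The buffer is requested with an alignment
equal to the vector width (32 bytes for a 256-bit AVX register), which is the
alignment that aligned AVX loads and stores expect.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign(), free()

// Allocates room for `count` vectors of `vector_bytes` bytes each, aligned to
// vector_bytes. The alignment must be a power of two and a multiple of
// sizeof(void *). Returns nullptr on failure. Release with free().
static void *demo_alloc_vectors(std::size_t count, std::size_t vector_bytes) {
    void *ptr = nullptr;
    if (posix_memalign(&ptr, vector_bytes, count * vector_bytes) != 0) {
        return nullptr;
    }
    return ptr;
}

// True when p is a multiple of `alignment` bytes.
static bool demo_is_aligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A buffer obtained from demo_alloc_vectors(8, 32) could then be cast to a pointer
to a 256-bit vector type in code compiled with AVX enabled, and must be released
with free().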