Core library

Core library
- Data structures
  - Array
  - Array Handler
- Debug
- IO
- Math
- Memory
- Meta
- Types

Data structures

Array

The Array data structure is a dynamically allocated array whose elements have template type T. Allocation, copy and set operations are handled according to the memory specification dictated by the template type Allocator. It provides the following features:

It can copy data between different types of allocators, which makes it suitable for copies between different kinds of devices, e.g.:

  Array<T,cpuAllocator> arrayCPU(size,dev);
  ...
  Array<T,gpuAllocator> arrayGPU(arrayCPU);

It is independent of the memory model, which enables portability and extensibility, e.g.:

  Array<T,amdAllocator> arrayAMD(size,dev);
  Array<T,nvidiaAllocator> arrayNVIDIA(size,dev);
  Array<T,mpiAllocator> arrayMPI(size,dev);

Its interface is somewhat similar to the one found in the C++ standard library.
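
The cross-allocator copy described above can be illustrated with a minimal host-only sketch. It is not the library's implementation: SketchArray, mallocAllocator, callocAllocator and the allocate/release interface are hypothetical names chosen for this example, and both allocators use plain host memory so the code runs without a GPU. A real device allocator would replace std::memcpy with the appropriate device copy.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical host-only allocators; stand-ins for cpuAllocator, gpuAllocator, etc.
// Allocation failure handling is omitted to keep the sketch short.
struct mallocAllocator {
    static void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    static void release(void* p) { std::free(p); }
};
struct callocAllocator {  // same interface, but memory starts zeroed
    static void* allocate(std::size_t bytes) { return std::calloc(1, bytes); }
    static void release(void* p) { std::free(p); }
};

template <typename T, typename Allocator>
class SketchArray {
    static_assert(std::is_trivially_copyable<T>::value, "sketch copies raw bytes");
public:
    explicit SketchArray(std::size_t n)
        : size_(n), data_(static_cast<T*>(Allocator::allocate(n * sizeof(T)))) {}

    // Same-allocator copy: allocate through the sizing constructor, then copy bytes.
    SketchArray(const SketchArray& other) : SketchArray(other.size()) {
        std::memcpy(data_, other.data(), size_ * sizeof(T));
    }

    // Cross-allocator copy: the source may use any other allocator type.
    template <typename OtherAllocator>
    SketchArray(const SketchArray<T, OtherAllocator>& other) : SketchArray(other.size()) {
        std::memcpy(data_, other.data(), size_ * sizeof(T));
    }

    SketchArray& operator=(const SketchArray&) = delete;  // left out of the sketch
    ~SketchArray() { Allocator::release(data_); }

    std::size_t size() const { return size_; }
    T* data() { return data_; }
    const T* data() const { return data_; }
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    T* data_;
};
```

With these definitions, `SketchArray<int, callocAllocator> b(a);` makes a deep copy of a `SketchArray<int, mallocAllocator> a`. The element type is shared while the memory policy differs.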

Array Handler

Array Handler is a lightweight version of the Array data structure, intended solely for places where the Array data structure is too complicated, such as device code.
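
One common way to build such a lightweight companion type is a non-owning view that stores only a pointer and a size, so it can be passed by value. The sketch below assumes that design; SketchArrayHandler and scale are hypothetical names and are not taken from the library. In CUDA code the member function would additionally be marked `__host__ __device__`.

```cpp
#include <cstddef>

// Hypothetical non-owning view: no allocation, no destructor, trivially copyable.
template <typename T>
struct SketchArrayHandler {
    T* data;
    std::size_t size;

    // Element access; the view is const but the elements it points to are not.
    T& operator[](std::size_t i) const { return data[i]; }
};

// Takes the handler by value, the way a kernel would receive its arguments.
template <typename T>
void scale(SketchArrayHandler<T> h, T factor) {
    for (std::size_t i = 0; i < h.size; ++i) {
        h[i] *= factor;
    }
}
```

Because the handler owns nothing, copying it is cheap and the underlying storage must outlive every copy.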

Debug

Provides a series of functions that make it easy to add debugging operations to the code. Specifically, it provides:

The error macro, which generates a notification of the chosen type when its condition evaluates to false. It is possible to select which types of errors are emitted, e.g.:

  #define DEBUG_LEVEL api_debug|runtime_debug
  ...
  error(condition,"Error message",API_ERROR,stderr_error) //May get evaluated
  error(condition,"Error message",RUNTIME_ERROR,throw_error) //May get evaluated
  error(condition,"Error message",MEMORY_ERROR,assert_error) //Never gets evaluated
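
The level filtering shown above can be sketched with a bitmask test that short-circuits before the condition is evaluated. Everything here is hypothetical and only illustrates the idea: SKETCH_ERROR, SKETCH_DEBUG_LEVEL, the SK_* kinds and sketch_report are made-up names, and the sketch only reports to stderr rather than supporting the throw and assert modes.

```cpp
#include <cstdio>

// Hypothetical error kinds, one bit each so they can be combined with '|'.
enum SketchErrorKind : unsigned { SK_API = 1u, SK_RUNTIME = 2u, SK_MEMORY = 4u };

// Enabled kinds; memory checks are disabled in this configuration.
#ifndef SKETCH_DEBUG_LEVEL
#define SKETCH_DEBUG_LEVEL (SK_API | SK_RUNTIME)
#endif

// Prints the message to stderr and returns true so the macro can be used as an expression.
inline bool sketch_report(const char* msg) {
    std::fprintf(stderr, "error: %s\n", msg);
    return true;
}

// Yields true when an enabled check fails. Thanks to '&&' short-circuiting,
// the condition is not evaluated at all when its kind is disabled.
#define SKETCH_ERROR(condition, msg, kind) \
    (((((kind) & SKETCH_DEBUG_LEVEL) != 0u) && !(condition)) ? sketch_report(msg) : false)
```

Since the kind and the level are compile-time constants here, an optimizing compiler can typically drop disabled checks entirely.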

cuda_error checks for errors in the last CUDA call.
The classes cpu_timer and gpu_timer provide accurate, easy-to-use timers:

  cpu_timer timer;
  timer.start();
  timer.stop();
  double elapsed_time;
  elapsed_time<<timer;
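
For readers curious how a timer with this interface could be built, here is a minimal host-side sketch based on std::chrono::steady_clock. SketchCpuTimer and its seconds() member are hypothetical, and reporting the result in seconds is an assumption; the source does not state the unit used by cpu_timer.

```cpp
#include <chrono>

// Hypothetical CPU timer built on a monotonic clock.
class SketchCpuTimer {
    using clock_type = std::chrono::steady_clock;
public:
    void start() { begin_ = clock_type::now(); }
    void stop() { end_ = clock_type::now(); }

    // Elapsed time between start() and stop(), in seconds (assumed unit).
    double seconds() const {
        return std::chrono::duration<double>(end_ - begin_).count();
    }

private:
    clock_type::time_point begin_{};
    clock_type::time_point end_{};
};

// Mirrors the `elapsed_time << timer;` syntax used above.
inline double& operator<<(double& out, const SketchCpuTimer& timer) {
    out = timer.seconds();
    return out;
}
```

A GPU timer would typically rely on device-side events instead of a host clock, because kernel launches are asynchronous with respect to the host.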

IO

Provides a series of functions for IO operations, such as printing Array, std::vector and other data structures.

Math

Provides a series of interfaces for elementary math operations specialized for CUDA, for example:

  __add__<RN>(1.1,0.9); // Add the numbers, rounding to nearest
  __add__<RU>(1.1,0.9); // Add the numbers, rounding toward +infinity
  __div__<FAST_MATH>(2.1f,1.2f); // Divide the numbers using the fast CUDA intrinsic __fdividef
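
The idea of selecting the rounding mode through a template tag can be mimicked on the host with the standard <cfenv> facilities. This is only an analogy under stated assumptions: sketch_add, SketchRN and SketchRU are hypothetical names, and on the device the same effect would come from CUDA intrinsics such as __dadd_rn and __dadd_ru rather than from changing the floating-point environment. Some compilers also need an option such as -frounding-math to honour runtime rounding-mode changes strictly.

```cpp
#include <cfenv>

// Hypothetical rounding tags mapping to the standard <cfenv> rounding modes.
struct SketchRN { static constexpr int mode = FE_TONEAREST; };
struct SketchRU { static constexpr int mode = FE_UPWARD; };

template <typename Rounding>
double sketch_add(double a, double b) {
    const int saved = std::fegetround();   // remember the caller's mode
    std::fesetround(Rounding::mode);
    // volatile keeps the addition at run time, between the two mode switches.
    volatile double x = a;
    volatile double y = b;
    volatile double sum = x + y;
    std::fesetround(saved);                // restore the caller's mode
    return sum;
}
```

Rounding upward can never produce a smaller result than rounding to nearest, so `sketch_add<SketchRU>(a, b) >= sketch_add<SketchRN>(a, b)` holds for any finite inputs.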

Memory

Provides a series of interfaces to allocate, set, copy and free memory on both CUDA devices and the CPU. Specifically, it provides the cpuAllocator, gpuAllocator, managedAllocator and pinnedAllocator allocators, which are compatible with the Array data structure.

Types

Provides statically allocated vector types compatible with device code, with template type T, size N and alignment A:

  Vector<double,2,16> v({3.14,2.71});
  v[0]+=v[1]; // Now v[0] holds 3.14+2.71

It is important to understand that the alignment A plays a vital role in how the compiler loads the values from RAM.
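
To make the role of the alignment parameter concrete, the sketch below shows one way a fixed-size vector with a compile-time alignment could be declared using alignas. SketchVector is a hypothetical name and is not the library's Vector. The practical point is that a 16-byte aligned pair of doubles has the same layout as CUDA's built-in double2, which allows the compiler to fetch both values with a single wide load instead of two separate ones.

```cpp
#include <cstddef>

// Hypothetical fixed-size vector: T is the element type, N the length,
// A the alignment in bytes (must be at least alignof(T) and a power of two).
template <typename T, std::size_t N, std::size_t A>
struct alignas(A) SketchVector {
    T data[N];

    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

// The alignment is part of the type, so it can be checked at compile time.
static_assert(alignof(SketchVector<double, 2, 16>) == 16, "16-byte aligned");
static_assert(sizeof(SketchVector<double, 2, 16>) == 16, "no padding for two doubles");
```

Note that the alignment can also increase the size: three doubles aligned to 16 bytes occupy 32 bytes, because the size is always rounded up to a multiple of the alignment.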
