Merge pull request #309 from Felixiose/documentation
Added Guide for Memory Profiling into the IPPL documentation in its o…
s-mayani authored Sep 25, 2024
2 parents f8ce0fc + 106a750 commit 0572628
Showing 2 changed files with 168 additions and 0 deletions.
2 changes: 2 additions & 0 deletions doc/DoxygenLayout.xml
@@ -6,6 +6,8 @@
<tab type="mainpage" visible="yes" title=""/>
<tab type="user" visible="yes" title="Basics Usage" url="@ref Basics" intro=""/>
<tab type="user" visible="yes" title="Installation" url="@ref Installation" intro=""/>
<tab type="user" visible="yes" title="Profiling" url="@ref Profiling" intro=""/>

<tab type="pages" visible="no" title="" intro=""/>
<tab type="topics" visible="yes" title="" intro=""/>
<tab type="modules" visible="yes" title="" intro="">
166 changes: 166 additions & 0 deletions doc/extras/Profiling.md
@@ -0,0 +1,166 @@
# Profiling in IPPL {#Profiling}

In certain applications, you might want to use profiling tools for debugging and testing. Since IPPL uses **Kokkos** as a backend, you can use the profiling tools from the Kokkos Tools project, which attach to Kokkos' built-in profiling hooks.

This guide explains how to use Kokkos' profiling tools, using the **MemoryEvents** tool as an example.

### Description of MemoryEvents

MemoryEvents tracks a timeline of allocation and deallocation events in Kokkos Memory Spaces. For each event it records the time, pointer, size, memory space name, and allocation name. This is particularly useful when debugging, because it shows which allocations account for the memory your application uses.

Additionally, the tool provides a timeline of memory usage for each individual Kokkos Memory Space.
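
As an illustration of the record layout described above, the following sketch parses one line of a `.mem_events` file into a struct. It assumes whitespace-separated columns in the order given by the header line the tool writes (`Time Ptr Size MemSpace Op Name`). The names `MemEvent` and `parseMemEvent` are hypothetical helpers and are not part of Kokkos Tools.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// One row of a .mem_events file (hypothetical struct, not part of Kokkos Tools).
struct MemEvent {
    double time = 0.0;      // time stamp in seconds
    std::string ptr;        // address, kept as text (e.g. "0x23060a0080")
    std::int64_t size = 0;  // bytes; negative for deallocations
    std::string memSpace;   // e.g. "Cuda", "CudaUVM", "CudaHostPinned"
    std::string op;         // "Allocate" or "DeAllocate"
    std::string name;       // label passed to the Kokkos::View constructor
};

// Parses one line; returns std::nullopt for comment, blank, or malformed lines.
std::optional<MemEvent> parseMemEvent(const std::string& line) {
    const auto first = line.find_first_not_of(" \t");
    if (first == std::string::npos || line[first] == '#') return std::nullopt;

    std::istringstream in(line);
    MemEvent e;
    if (!(in >> e.time >> e.ptr >> e.size >> e.memSpace >> e.op)) return std::nullopt;
    std::getline(in >> std::ws, e.name);  // the rest of the line; labels may contain spaces
    return e;
}
```

Calling `parseMemEvent` on every line read with `std::getline` from the events file yields a sequence of events that can then be filtered by memory space or allocation name.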

The tool is located at: https://github.com/kokkos/kokkos-tools/tree/develop/profiling/memory-events


---

## Steps to Use Kokkos Profiling Tools

### 1. Clone the Kokkos Tools Repository

First, clone the Kokkos tools repository, which contains a variety of profiling tools:
```
git clone https://github.com/kokkos/kokkos-tools
```

### 2. Build and Install the Tools

Navigate into the repository, create a separate build directory, and build the tools using CMake:

```
cd kokkos-tools
mkdir build && cd build
cmake ..
make -j
sudo make install
```

### 3. Set Up the Profiling Tool

Before running your application, export the Kokkos Tools environment variable to point to the `kp_memory_events.so` tool:
```
export KOKKOS_TOOLS_LIBS={PATH_TO_TOOL_DIRECTORY}/kp_memory_events.so
```
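Replace `{PATH_TO_TOOL_DIRECTORY}` with the actual path where the tool is located. Depending on the Kokkos Tools version and how it was built, the library file may instead be named `libkp_memory_events.so`, so check the file name in your build or install directory.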
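
A misspelled path is easy to miss, because the application still runs when the tool cannot be found. The sketch below is one possible sanity check to call at program start. `describeToolsLib` is a hypothetical helper, not a Kokkos function, and it assumes `KOKKOS_TOOLS_LIBS` holds a single library path.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: describes what a given KOKKOS_TOOLS_LIBS value refers to.
// Pass it the result of std::getenv("KOKKOS_TOOLS_LIBS") (declared in <cstdlib>).
// Assumption: the variable contains exactly one library path.
std::string describeToolsLib(const char* value) {
    if (value == nullptr || *value == '\0') {
        return "KOKKOS_TOOLS_LIBS is not set: no tool will be loaded";
    }
    std::error_code ec;  // non-throwing overload of exists()
    if (!std::filesystem::exists(value, ec)) {
        return std::string("KOKKOS_TOOLS_LIBS points to a missing file: ") + value;
    }
    return std::string("KOKKOS_TOOLS_LIBS points to an existing file: ") + value;
}
```

For example, printing `describeToolsLib(std::getenv("KOKKOS_TOOLS_LIBS"))` to `std::cerr` before `Kokkos::initialize()` makes the active setting visible in the job log.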


### 4. Run Your Application

Execute your application normally. The MemoryEvents tool will automatically collect data during execution. For example:

```
./application COMMANDS
```

### 5. Output Files

The MemoryEvents tool will generate the following files:

- `HOSTNAME-PROCESSID.mem_events`: Lists memory events.
- `HOSTNAME-PROCESSID-MEMSPACE.memspace_usage`: Provides a utilization timeline for each active memory space.
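
When a run with several MPI ranks leaves many such files behind, it can help to group them by memory space. The following sketch recovers the `MEMSPACE` part from a file name that follows the pattern above. `memSpaceFromFileName` is a hypothetical helper, and it assumes memory space names contain no `-` character.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: extracts MEMSPACE from
// "HOSTNAME-PROCESSID-MEMSPACE.memspace_usage".
// Returns std::nullopt for any other file name, including ".mem_events" files.
std::optional<std::string> memSpaceFromFileName(const std::string& fileName) {
    const std::string ext = ".memspace_usage";
    if (fileName.size() <= ext.size() ||
        fileName.compare(fileName.size() - ext.size(), ext.size(), ext) != 0) {
        return std::nullopt;
    }
    const std::string stem = fileName.substr(0, fileName.size() - ext.size());
    // Search from the right, because host names may themselves contain '-'.
    const auto dash = stem.rfind('-');
    if (dash == std::string::npos || dash + 1 == stem.size()) return std::nullopt;
    return stem.substr(dash + 1);
}
```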

### 6. Example with SLURM

Here’s an example of how to run the profiling on a SLURM system using `sbatch`:
```
sbatch -n 2 --wrap="export KOKKOS_TOOLS_LIBS=$HOME/kokkos-tools/kp_memory_events.so; \
mpirun -n 2 LandauDamping 128 128 128 10000 10 FFT 0.01 LeapFrog --overallocate 2.0 --info 10"
```

In this example:

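- `sbatch -n 2` requests 2 tasks (`-n` is short for `--ntasks`; the number of nodes would be set with `-N`), and `mpirun -n 2` launches 2 MPI ranks.
- `KOKKOS_TOOLS_LIBS` is exported inside the job, so the MemoryEvents tool is loaded when the `LandauDamping` application runs.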

This guide provides the basic steps for integrating Kokkos profiling tools into your IPPL-based projects. You can adjust the commands as needed depending on your specific application and environment.


## Example

Consider the following code:

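```
#include <Kokkos_Core.hpp>

// One view type per CUDA memory space, so that each allocation is
// reported under a different MemSpace in the tool output.
typedef Kokkos::View<int*, Kokkos::CudaSpace> a_type;
typedef Kokkos::View<int*, Kokkos::CudaUVMSpace> b_type;
typedef Kokkos::View<int*, Kokkos::CudaHostPinnedSpace> c_type;

int main() {
    Kokkos::initialize();
    {
        int N = 10000000;
        for (int i = 0; i < 2; i++) {
            a_type a("A", N);  // the label "A" appears in the Name column
            {
                b_type b("B", N);
                c_type c("C", N);
                // Host loop: UVM and host-pinned memory are accessible from the host.
                for (int j = 0; j < N; j++) {
                    b(j) = 2 * j;
                    c(j) = 3 * j;
                }
            }  // c, then b, are deallocated here
        }      // a is deallocated at the end of each iteration
    }
    Kokkos::finalize();
}
```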

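Running this program with the MemoryEvents tool enabled produces output similar to the following (addresses and time stamps will differ between runs):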

**HOSTNAME-PROCESSID.mem_events**

```
# Memory Events
# Time Ptr Size MemSpace Op Name
0.311749 0x2048a0080 128 CudaHostPinned Allocate InternalScratchUnified
0.311913 0x2305ca0080 2048 Cuda Allocate InternalScratchFlags
0.312108 0x2305da0080 16384 Cuda Allocate InternalScratchSpace
0.312667 0x23060a0080 40000000 Cuda Allocate A
0.317260 0x23086e0080 40000000 CudaUVM Allocate B
0.335289 0x2049a0080 40000000 CudaHostPinned Allocate C
0.368485 0x2049a0080 -40000000 CudaHostPinned DeAllocate C
0.377285 0x23086e0080 -40000000 CudaUVM DeAllocate B
0.379795 0x23060a0080 -40000000 Cuda DeAllocate A
0.380185 0x23060a0080 40000000 Cuda Allocate A
0.384785 0x23086e0080 40000000 CudaUVM Allocate B
0.400073 0x2049a0080 40000000 CudaHostPinned Allocate C
0.433218 0x2049a0080 -40000000 CudaHostPinned DeAllocate C
0.441988 0x23086e0080 -40000000 CudaUVM DeAllocate B
0.444391 0x23060a0080 -40000000 Cuda DeAllocate A
```
**HOSTNAME-PROCESSID-Cuda.memspace_usage**

```
# Space Cuda
# Time(s) Size(MB) HighWater(MB) HighWater-Process(MB)
0.311913 0.0 0.0 81.8
0.312108 0.0 0.0 81.8
0.312667 38.2 38.2 81.8
0.379795 0.0 38.2 158.1
0.380185 38.2 38.2 158.1
0.444391 0.0 38.2 158.1
```
**HOSTNAME-PROCESSID-CudaUVM.memspace_usage**

```
# Space CudaUVM
# Time(s) Size(MB) HighWater(MB) HighWater-Process(MB)
0.317260 38.1 38.1 81.8
0.377285 0.0 38.1 158.1
0.384785 38.1 38.1 158.1
0.441988 0.0 38.1 158.1
```

**HOSTNAME-PROCESSID-CudaHostPinned.memspace_usage**

```
# Space CudaHostPinned
# Time(s) Size(MB) HighWater(MB) HighWater-Process(MB)
0.311749 0.0 0.0 81.8
0.335289 38.1 38.1 120.0
0.368485 0.0 38.1 158.1
0.400073 38.1 38.1 158.1
0.433218 0.0 38.1 158.1
```
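
The `HighWater(MB)` values can be cross-checked against the event list. The sketch below accumulates the signed sizes per memory space and tracks the peak. `SpaceDelta`, `highWaterBytes`, and `toMB` are hypothetical names, and the conversion factor of 2^20 bytes per MB is an assumption inferred from the sample numbers, not taken from the tool's documentation.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Signed size change of one event (hypothetical type for this sketch).
struct SpaceDelta {
    std::string memSpace;  // e.g. "Cuda"
    std::int64_t bytes;    // positive for Allocate, negative for DeAllocate
};

// Returns the peak number of simultaneously allocated bytes per memory space.
std::map<std::string, std::int64_t> highWaterBytes(const std::vector<SpaceDelta>& events) {
    std::map<std::string, std::int64_t> current, peak;
    for (const auto& e : events) {
        current[e.memSpace] += e.bytes;
        peak[e.memSpace] = std::max(peak[e.memSpace], current[e.memSpace]);
    }
    return peak;
}

// Assumption: 1 MB = 2^20 bytes, which reproduces the sample values above.
double toMB(std::int64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

For the sample events, the `Cuda` space peaks at 2048 + 16384 + 40000000 = 40018432 bytes, or about 38.16 MB, which matches the reported 38.2 after rounding. The `CudaUVM` and `CudaHostPinned` spaces peak at about 38.15 MB, reported as 38.1.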
