Merge branch 'OpenCV_WoA' of https://github.com/madeline-underwood/arm-learning-paths into OpenCV_WoA
madeline-underwood committed Dec 13, 2024
2 parents 66d1de1 + 459005e commit 82f5bb7
Showing 24 changed files with 420 additions and 245 deletions.
42 changes: 41 additions & 1 deletion .wordlist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3380,4 +3380,44 @@ wiseeye
wlcsp
xB
xmodem
yolov
yolov
Dsouza
FGCT
GCT
GCs
GC’s
HNso
HeapRegionSize
HugePages
InitiatingHeapOccupancyPercent
JDKs
JVMs
LZMA
Lau
LuaJIT
NGFW
ParallelGCThreads
Preema
Roesch
Sourcefire
TPACKET
WebGPU’s
Whitepaper
YGCT
axion
callstack
et
gc
grubfile
jstat
mqF
netresec
parallelizing
profileable
profilers
ruleset
snortrules
techmahindra
unreferenced
uptime
wC
10 changes: 5 additions & 5 deletions content/learning-paths/laptops-and-desktops/_index.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,12 @@ operatingsystems_filter:
- ChromeOS: 1
- Linux: 29
- macOS: 7
- Windows: 37
- Windows: 38
subjects_filter:
- CI-CD: 3
- Containers and Virtualization: 6
- Migration to Arm: 26
- Performance and Architecture: 20
- Performance and Architecture: 21
subtitle: Create and migrate apps for power efficient performance
title: Laptops and Desktops
tools_software_languages_filter:
Expand Down Expand Up @@ -57,8 +57,8 @@ tools_software_languages_filter:
- Neovim: 1
- Node.js: 3
- OpenCV: 1
- perf: 2
- Python: 2
- perf: 3
- Python: 3
- Qt: 2
- Remote.It: 1
- RME: 1
Expand All @@ -73,7 +73,7 @@ tools_software_languages_filter:
- Windows Performance Analyzer: 1
- Windows Presentation Foundation: 1
- Windows Sandbox: 1
- WindowsPerf: 3
- WindowsPerf: 4
- WinUI 3: 1
- WSL: 1
- Xamarin Forms: 1
Expand Down
21 changes: 12 additions & 9 deletions content/learning-paths/servers-and-cloud-computing/_index.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,9 @@ maintopic: true
operatingsystems_filter:
- Android: 2
- Baremetal: 1
- Linux: 109
- Linux: 111
- macOS: 9
- Windows: 12
- Windows: 13
pinned_modules:
- module:
name: Recommended getting started learning paths
Expand All @@ -22,9 +22,9 @@ subjects_filter:
- CI-CD: 4
- Containers and Virtualization: 25
- Databases: 15
- Libraries: 6
- Libraries: 7
- ML: 14
- Performance and Architecture: 38
- Performance and Architecture: 40
- Storage: 1
- Web: 10
subtitle: Optimize cloud native apps on Arm for performance and cost
Expand All @@ -44,9 +44,10 @@ tools_software_languages_filter:
- Assembly: 4
- assembly: 1
- AWS CodeBuild: 1
- AWS EC2: 1
- AWS EC2: 2
- AWS Elastic Container Service (ECS): 1
- AWS Elastic Kubernetes Service (EKS): 2
- Bash: 1
- Bastion: 3
- BOLT: 1
- bpftool: 1
Expand All @@ -69,7 +70,7 @@ tools_software_languages_filter:
- Flink: 1
- Fortran: 1
- FVP: 3
- GCC: 18
- GCC: 19
- gdb: 1
- Geekbench: 1
- GenAI: 5
Expand All @@ -83,7 +84,7 @@ tools_software_languages_filter:
- InnoDB: 1
- Intrinsics: 1
- JAVA: 1
- Java: 1
- Java: 2
- JAX: 1
- Kafka: 1
- Keras: 1
Expand All @@ -105,9 +106,9 @@ tools_software_languages_filter:
- Nginx: 3
- Node.js: 3
- PAPI: 1
- perf: 3
- perf: 4
- PostgreSQL: 4
- Python: 12
- Python: 13
- PyTorch: 5
- RAG: 1
- Redis: 3
Expand All @@ -116,6 +117,7 @@ tools_software_languages_filter:
- Rust: 2
- snappy: 1
- Snort: 1
- Snort3: 1
- SQL: 7
- Streamline CLI: 1
- Supervisor: 1
Expand All @@ -130,6 +132,7 @@ tools_software_languages_filter:
- TypeScript: 1
- Vectorscan: 1
- Visual Studio Code: 3
- WindowsPerf: 1
- WordPress: 3
- x265: 1
- zlib: 1
Expand Down
Original file line number Diff line number Diff line change
@@ -1,14 +1,16 @@
---
title: Example Application
weight: 4
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Example Application.
## Example Application

Using a file editor of your choice, copy the Java snippet below into a file named `HeapUsageExample.java`. This code example allocates 1 million string objects to fill up the heap. You can use this example to easily observe the effects of different GC tuning parameters.
Using a file editor of your choice, copy the Java snippet below into a file named `HeapUsageExample.java`.

This code example allocates 1 million string objects to fill up the heap. You can use this example to easily observe the effects of different GC tuning parameters.

```java
public class HeapUsageExample {
Expand All @@ -32,9 +34,13 @@ public class HeapUsageExample {
}
```

### Enable GC logging
### Enable Garbage Collector logging

To observe what the Garbage Collector is doing, one option is to enable logging while the JVM is running.

To enable this, you need to pass in some command-line arguments. The `gc` option logs the GC information. The `filecount` option creates a rolling log to prevent uncontrolled growth of logs with the drawback that historical logs might be rewritten and lost.

To observe what the GC is doing, one option is to enabling logging while the JVM is running. To enable this, you need to pass in some command-line arguments. The `gc` option logs the GC information. The `filecount` option creates a rolling log to prevent uncontrolled growth of logs with the drawback that historical logs may be rewritten and lost. Run the following command to enable logging with JDK 11 and higher:
Run the following command to enable logging with JDK 11 and higher:

```bash
java -Xms512m -Xmx1024m -XX:+UseSerialGC -Xlog:gc:file=gc.log:tags,uptime,time,level:filecount=10,filesize=16m HeapUsageExample.java
Expand All @@ -46,7 +52,7 @@ If you are using JDK8, use the following command instead:
java -Xms512m -Xmx1024m -XX:+UseSerialGC -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation HeapUsageExample.java
```

The `-Xms512m` and `-Xmx1024` options create a minimum and maximum heap size of 512 MiB and 1GiB respectively. This is simply to avoid waiting too long to see activity within the GC. Additionally, you will force the JVM to use the serial garbage collector with the `-XX:+UseSerialGC` flag.
The `-Xms512m` and `-Xmx1024m` options set the initial and maximum heap sizes to 512 MiB and 1 GiB respectively. This is to avoid waiting too long to see activity within the GC. Additionally, you can force the JVM to use the serial garbage collector with the `-XX:+UseSerialGC` flag.
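
To check that the heap flags took effect, you can ask the JVM for its own view of the heap. The following is a minimal sketch using the standard `Runtime` API. The class name `HeapSettingsCheck` and the helper `toMiB` are hypothetical names chosen for illustration, and the reported maximum can be slightly lower than the `-Xmx` value depending on the collector in use.

```java
public class HeapSettingsCheck {
    // Convert a byte count to whole mebibytes (hypothetical helper).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap can grow to (roughly the -Xmx value).
        System.out.println("Max heap:       " + toMiB(rt.maxMemory()) + " MiB");
        // Heap currently committed by the JVM (starts near the -Xms value).
        System.out.println("Committed heap: " + toMiB(rt.totalMemory()) + " MiB");
        // Unused part of the committed heap.
        System.out.println("Free heap:      " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Assuming JDK 11 or higher, you can run it with the same flags, for example `java -Xms512m -Xmx1024m -XX:+UseSerialGC HeapSettingsCheck.java`.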

You will now see a log file named `gc.log` created in the same directory.

Expand All @@ -58,17 +64,19 @@ Open `gc.log` and the contents should look similar to:
[2024-11-08T15:04:54.350+0000][0.759s][info][gc] GC(3) Pause Young (Allocation Failure) 139M->3M(494M) 3.699ms
```

These logs provide insights into the frequency, duration, and impact of Young garbage collection events. The results may vary depending on your system.
These logs provide insights into the frequency, duration, and impact of Young garbage collection events. The results can vary depending on your system.

- Frequency: ~ every 46 ms
- Pause duration: ~ 3.6 ms
- Reduction size: ~ 136 MB (heap usage falls from 139 MB to 3 MB)
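
The figures above can also be extracted programmatically. The following is a minimal sketch that parses the `before->after(capacity) pause` summary at the end of a log line with a regular expression. The class name `GcLogLineParser` and the method `parse` are hypothetical, and the pattern assumes sizes are reported in `M` as in the sample log line shown earlier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogLineParser {
    // Matches a summary such as "139M->3M(494M) 3.699ms" (assumes sizes in M).
    private static final Pattern SUMMARY =
        Pattern.compile("(\\d+)M->(\\d+)M\\((\\d+)M\\)\\s+([0-9.]+)ms");

    // Returns {usedBefore, usedAfter, heapCapacity, pauseMs}, or null if the line has no summary.
    static double[] parse(String line) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2)),
            Double.parseDouble(m.group(3)),
            Double.parseDouble(m.group(4))
        };
    }

    public static void main(String[] args) {
        String line = "[2024-11-08T15:04:54.350+0000][0.759s][info][gc] "
            + "GC(3) Pause Young (Allocation Failure) 139M->3M(494M) 3.699ms";
        double[] v = parse(line);
        if (v != null) {
            // Memory reclaimed is the difference between usage before and after the collection.
            System.out.println("Reclaimed: " + (long) (v[0] - v[1]) + "M of " + (long) v[2] + "M");
            System.out.println("Pause:     " + v[3] + " ms");
        }
    }
}
```

For the sample line, the collection reclaims 139 - 3 = 136 M of a 494 M heap.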

This logging method can be quite verbose. Also, this method isn't suitable for a running process which makes debugging a live running application slightly more challenging.
This logging method can be quite verbose, and makes it challenging to debug a live running application.

### Use jstat to observe real-time GC statistics

Using a file editor of your choice, copy the java code below into a file named `WhileLoopExample.java`. This java code snippet is a long-running example that prints out a random integer and double precision floating point number 4 times a second.
Using a file editor of your choice, copy the Java code below into a file named `WhileLoopExample.java`.

This Java code snippet is a long-running example that prints out a random integer and a double-precision floating-point number four times a second:

```java
import java.util.Random;
Expand All @@ -91,7 +99,7 @@ public class GenerateRandom {
// Print random double
System.out.println("Random Doubles: " + rand_dub1);

// Sleep for 1 second (1000 milliseconds)
// Sleep for 1/4 second (250 milliseconds)
try {
Thread.sleep(250);
} catch (InterruptedException e) {
Expand All @@ -107,13 +115,15 @@ Start the Java program with the command below. This will use the default paramet
```bash
java WhileLoopExample.java
```
While the program running, open another terminal session. In the new terminal use the `jstat` command to print out the JVM statistics specifically related to the GC using the `-gcutil` flag:
While the program is running, open another terminal session.

In the new terminal, use the `jstat` command to print out the JVM statistics specifically related to the GC using the `-gcutil` flag:

```bash
jstat -gcutil $(pgrep java) 1000
```

You will observe output like the following until `ctl+c` is pressed.
You will observe output like the following until you press `Ctrl+C`:

```output
S0 S1 E O M CCS YGC YGCT FGC FGCT CGC CGCT GCT
Expand All @@ -125,10 +135,10 @@ You will observe output like the following until `ctl+c` is pressed.
```

The columns of interest are:
- **E (Eden Space Utilization)**: The percentage of the Eden space that is currently used. High utilization indicates frequent allocations and can trigger minor GCs.
- **O (Old Generation Utilization)**: The percentage of the Old (Tenured) generation that is currently used. High utilization can lead to Full GCs, which are more expensive.
- **YGCT (Young Generation GC Time)**: The total time (in seconds) spent in Young Generation (minor) GC events. High values indicate frequent minor GCs, which can impact performance.
- **FGCT (Full GC Time)**: The total time (in seconds) spent in Full GC events. High values indicate frequent Full GCs, which can significantly impact performance.
- **GCT (Total GC Time)**: The total time (in seconds) spent in all GC events (Young, Full, and Concurrent). This provides an overall view of the time spent in GC, helping to assess the impact on application performance.
- **E (Eden Space Utilization)**: The percentage of the Eden space that is being used. High utilization indicates frequent allocations and can trigger minor GCs.
- **O (Old Generation Utilization)**: The percentage of the Old (Tenured) generation that is being used. High utilization can lead to Full GCs, which are more expensive.
- **YGCT (Young Generation GC Time)**: The total time in seconds spent in Young Generation (minor) GC events. High values indicate frequent minor GCs, which can impact performance.
- **FGCT (Full GC Time)**: The total time in seconds spent in Full GC events. High values indicate frequent Full GCs, which can significantly impact performance.
- **GCT (Total GC Time)**: The total time in seconds spent in all GC events (Young, Full, and Concurrent). This provides an overall view of the time spent in GC, helping to assess the impact on application performance.
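
Similar counters are also available from inside the application through the standard `java.lang.management` API. The following is a minimal sketch, not a replacement for `jstat`. The class name `GcMxBeanSnapshot` and the helper `totalGcTimeMillis` are hypothetical, the collector names printed depend on which GC the JVM is running, and times are in milliseconds rather than the seconds that `jstat` reports.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcMxBeanSnapshot {
    // Sum the accumulated collection time (ms) across all collectors.
    // A collector reports -1 when the value is undefined, so those are skipped.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and time per collector since JVM start.
            System.out.println(gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", time=" + gc.getCollectionTime() + " ms");
        }
        // Roughly comparable to the GCT column, but in milliseconds.
        System.out.println("Total GC time: " + totalGcTimeMillis() + " ms");
    }
}
```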


Original file line number Diff line number Diff line change
@@ -1,14 +1,16 @@
---
title: Basic GC Tuning Options
weight: 5
weight: 6

### FIXED, DO NOT MODIFY
layout: learningpathall
---

### Update the JDK version

If you are on an older version of JDK, a sensible first step is to use one of the latest long-term-support (LTS) releases of JDK. This is because the GC versions included with recent JDKs offer improvements. For example, the G1GC included with JDK 11 offers improvements in the pause time compared to JDK 8. As shown earlier, you can use the `java --version` command to check the version currently in use.
If you are on an older version of JDK, a sensible first step is to use one of the latest long-term-support (LTS) releases of JDK. This is because the GC versions included with recent JDKs offer improvements on previous releases. For example, the G1GC included with JDK 11 offers improvements in the pause time compared to JDK 8.

As shown earlier, you can use the `java --version` command to check the version currently in use:

```output
$ java --version
Expand All @@ -22,25 +24,27 @@ OpenJDK 64-Bit Server VM Corretto-21.0.4.7.1 (build 21.0.4+7-LTS, mixed mode, sh

In this section, you will use the `HeapUsageExample.java` file you created earlier.

The G1 GC (Garbage-First Garbage Collector) is designed to handle large heaps and aims to provide low pause times by dividing the heap into regions and performing incremental garbage collection. This makes it suitable for applications with high allocation rates and large memory footprints.
The Garbage-First Garbage Collector (G1GC) is designed to handle large heaps and aims to provide low pause times by dividing the heap into regions and performing incremental garbage collection. This makes it suitable for applications with high allocation rates and large memory footprints.

You can run the following command to generate the GC logs using a different GC and compare the two.

You can run the following command to generate the GC logs using a different GC and compare. You just need to change the GC from `Serial` to `G1GC` using the `-XX:+UseG1GC` option as shown:
To make this comparison, change the Garbage Collector from `Serial` to `G1GC` using the `-XX:+UseG1GC` option:

```bash
java -Xms512m -Xmx1024m -XX:+UseG1GC -Xlog:gc:file=gc.log:tags,uptime,time,level:filecount=10,filesize=16m HeapUsageExample.java
```
From the created log file `gc.log`, you can observe that at a very similar time after start up (~0.75s), the Pause Young time reduced from ~3.6ms to ~1.9ms. Further, the time between GC pauses has improved from ~46ms to every ~98ms.
From the created log file `gc.log`, you can see that at a similar time after startup (~0.75s), the Pause Young time reduced from ~3.6ms to ~1.9ms. Further, the time between GC pauses has improved from ~46ms to every ~98ms.

```output
[2024-11-08T16:13:53.088+0000][0.790s][info][gc ] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 307M->3M(514M) 1.976ms
...
[2024-11-08T16:13:53.186+0000][0.888s][info][gc ] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 307M->3M(514M) 1.703ms
```
As discussed in the previous section, the performance improvement from moving to a G1GC will depend on the CPU overhead of your system. The performance may vary depending on the cloud instance size and available CPU resources.
As described in the previous section, the performance improvement from moving to G1GC depends on the CPU overhead of your system. The performance can vary depending on the cloud instance size and available CPU resources.
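
One way to compare the two runs is to estimate the share of wall-clock time spent in Young pauses, which is the pause duration divided by the interval between pauses. The following is a minimal sketch using the approximate figures quoted above. The class name `PauseOverhead` and the method `pausedFraction` are hypothetical, and the estimate ignores concurrent GC work that does not pause the application.

```java
public class PauseOverhead {
    // Fraction of time paused = pause duration / interval between pauses.
    static double pausedFraction(double pauseMs, double intervalMs) {
        return pauseMs / intervalMs;
    }

    public static void main(String[] args) {
        // Approximate values taken from the two example logs.
        double serial = pausedFraction(3.6, 46.0);
        double g1 = pausedFraction(1.9, 98.0);

        // Round to one decimal place of a percent for display.
        System.out.println("Serial GC paused: " + Math.round(serial * 1000) / 10.0 + " %");
        System.out.println("G1GC paused:      " + Math.round(g1 * 1000) / 10.0 + " %");
    }
}
```

With these inputs, the Serial run spends roughly 8% of the time paused and the G1GC run roughly 2%. Your own logs will give different values.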

### Add GC Targets
### Add Garbage Collector Targets

You can manually provide targets for specific metrics and the GC will attempt to meet those requirements. For example, if you have a time-sensitive application such as a REST server, you may want to ensure that all customers receive a response within a specific time. You may find that if a client request is sent during GC you need to ensure that the GC pause time is minimised.
You can manually provide targets for specific metrics and the GC will attempt to meet those requirements. For example, if you have a time-sensitive application such as a REST server, you might want to ensure that all customers receive a response within a specific time. If a client request arrives during Garbage Collection, the pause delays the response, so you need to keep the GC pause time to a minimum.

Running the command with the `-XX:MaxGCPauseMillis=<N>` option sets a target maximum GC pause time:

Expand All @@ -55,19 +59,19 @@ Looking at the output below, you can see that at the same initial state after ~0
[2024-11-08T16:27:37.149+0000][0.853s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation Pause) 193M->3M(514M) 0.482ms
```

Here are some additional target options you can consider to tune performance:
Here are some additional target options that you can consider to tune performance:

- -XX:InitiatingHeapOccupancyPercent

Defines the old generation occupancy threshold to trigger a concurrent GC cycle. Adjusting this can be beneficial if your application experiences long GC pauses due to high old generation occupancy. For example, lowering this threshold can help start GC cycles earlier, reducing the likelihood of long pauses during peak memory usage.
This defines the old generation occupancy threshold to trigger a concurrent GC cycle. Adjusting this is beneficial if your application experiences long GC pauses due to high old generation occupancy. For example, lowering this threshold can help start GC cycles earlier, reducing the likelihood of long pauses during peak memory usage.

- -XX:ParallelGCThreads

Specifies the number of threads for parallel GC operations. Increasing this value can be beneficial for applications running on multi-core processors, as it allows GC tasks to be processed faster. For instance, a high-throughput server application might benefit from more parallel GC threads to minimize pause times and improve overall performance.
This specifies the number of threads for parallel GC operations. Increasing this value is beneficial for applications running on multi-core processors, as it allows GC tasks to be processed faster. For instance, a high-throughput server application might benefit from more parallel GC threads to minimize pause times and improve overall performance.

- -XX:G1HeapRegionSize

Determines the size of G1 regions, which must be a power of 2 between 1 MB and 32 MB. Adjusting this can be useful for applications with specific memory usage patterns. For example, setting a larger region size can reduce the number of regions and associated overhead for applications with large heaps, while smaller regions might be better for applications with more granular memory allocation patterns.
This determines the size of G1 regions, which must be a power of 2 between 1 MB and 32 MB. Adjusting this can be useful for applications with specific memory usage patterns. For example, setting a larger region size can reduce the number of regions and associated overhead for applications with large heaps, while smaller regions might be better for applications with more granular memory allocation patterns.

You can refer to [this technical article](https://www.oracle.com/technical-resources/articles/java/g1gc.html) for more information of G1GC tuning.
See [Garbage First Garbage Collector Tuning](https://www.oracle.com/technical-resources/articles/java/g1gc.html) for more information on G1GC tuning.
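
Before changing any of these options, it can help to see the values the JVM is currently using. The following is a minimal sketch that reads them through `HotSpotDiagnosticMXBean`, so it assumes a HotSpot-based JDK such as OpenJDK or Corretto. The class name `GcFlagReport` and the method `describe` are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class GcFlagReport {
    // Describe the current value of one VM flag and where the value came from.
    static String describe(String flag) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            VMOption option = bean.getVMOption(flag);
            return flag + " = " + option.getValue() + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the flag does not exist on this JVM.
            return flag + " is not available on this JVM";
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "MaxGCPauseMillis",
            "InitiatingHeapOccupancyPercent",
            "ParallelGCThreads",
            "G1HeapRegionSize"
        };
        for (String flag : flags) {
            System.out.println(describe(flag));
        }
    }
}
```

Running it with and without an option such as `-XX:MaxGCPauseMillis=<N>` shows whether a value came from the command line or from the JVM defaults.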

Original file line number Diff line number Diff line change
Expand Up @@ -3,16 +3,17 @@ title: Tune the Performance of the Java Garbage Collector

minutes_to_complete: 45

who_is_this_for: This learning path is designed for Java developers aiming to optimize application performance on Arm-based servers. It is especially valuable for those migrating applications from x86-based to Arm-based instances.
who_is_this_for: This Learning Path is for Java developers aiming to optimize application performance on Arm-based servers, especially those migrating applications from x86-based to Arm-based instances.

learning_objectives:
- Understand the key differences among Java garbage collectors (GCs).
- Monitor and interpret GC performance metrics.
- Describe the key differences between individual Java Garbage Collectors (GCs).
- Monitor and interpret Garbage Collector performance metrics.
- Adjust core parameters to optimize performance for your specific workload.

prerequisites:
- An Arm based instance from a cloud service provider, or an on-premise Arm server.
- Basic understanding of Java and [Java installed](/install-guides/java/) on your machine.
- An Arm-based instance from a cloud service provider, or an on-premise Arm server.
- Basic understanding of Java.
- An [installation of Java](/install-guides/java/) on your machine.

author_primary: Kieran Hejmadi

Expand Down