Merge pull request #1166 from ArmDeveloperEcosystem/main
Merge to production
pareenaverma authored Aug 12, 2024
2 parents c5e3745 + 3fd2635 commit 30e2a36
Showing 10 changed files with 503 additions and 9 deletions.
10 changes: 5 additions & 5 deletions content/install-guides/aperf.md
@@ -15,7 +15,7 @@ weight: 1

APerf (AWS Perf) is an open source command line performance analysis tool which saves time by collecting information which is normally collected by multiple tools such as `perf`, `sysstat`, and `sysctl`.

APerf was recently created by AWS to help with Linux performance analysis.
APerf was created by AWS to help with Linux performance analysis.

In addition to the CLI, APerf includes an HTML view to visualize the collected data.

@@ -50,19 +50,19 @@ Visit the [releases page](https://github.com/aws/aperf/releases/) to see a list
You can also download a release from the command line:

```bash { target="ubuntu:latest" }
wget https://github.com/aws/aperf/releases/download/v0.1.9-alpha/aperf-v0.1.9-alpha-aarch64.tar.gz
wget https://github.com/aws/aperf/releases/download/v0.1.12-alpha/aperf-v0.1.12-alpha-aarch64.tar.gz
```

Extract the release:

```bash { target="ubuntu:latest" }
tar xvfz aperf-v0.1.9-alpha-aarch64.tar.gz
tar xvfz aperf-v0.1.12-alpha-aarch64.tar.gz
```

Add the path to `aperf` in your `.bashrc` file.

```console
echo 'export PATH="$PATH:$HOME/aperf-v0.1.9-alpha-aarch64"' >> ~/.bashrc
echo 'export PATH="$PATH:$HOME/aperf-v0.1.12-alpha-aarch64"' >> ~/.bashrc
source ~/.bashrc
```

@@ -81,7 +81,7 @@ aperf --version
The output should print the version:

```output
aperf 0.1.0 (0c4f58c)
aperf 0.1.0 (4b910d2)
```

## Verify APerf is working
4 changes: 2 additions & 2 deletions content/install-guides/docker/docker-woa.md
@@ -44,8 +44,8 @@ Additional models of Windows on Arm computers are expected to be available in mi

### How do I install and test Docker Desktop for Windows on Arm?

The current version is 4.31.0 and you can
download [Docker Desktop for Windows on Arm](https://desktop.docker.com/win/main/arm64/153195/Docker%20Desktop%20Installer.exe) and run the installer.
The current version is 4.33.1 and you can
download [Docker Desktop for Windows on Arm](https://desktop.docker.com/win/main/arm64/161083/Docker%20Desktop%20Installer.exe) and run the installer.

Check the [Docker Desktop release notes](https://docs.docker.com/desktop/release-notes/) for the latest release information.

@@ -1,6 +1,6 @@
---
title: Code level Performance Analysis using the PMUv3 plugin
draft: false
draft: true
minutes_to_complete: 60

who_is_this_for: Engineers who want to do C/C++ performance analysis by instrumenting code at the block level.
@@ -0,0 +1,35 @@
---
title: Run the AV1 and VP9 codecs on Arm Linux
draft: true
author_primary: Odin Shen

minutes_to_complete: 30

who_is_this_for: This is an introductory topic for software developers who want to
    build and run the VP9 and AV1 codecs on Arm servers and measure performance.

learning_objectives:
    - Build the AV1 and VP9 codecs on Arm Linux.
    - Run the AV1 and VP9 codecs on Arm Linux using example videos with various resolutions and encodings.

armips:
    - Neoverse
    - Cortex-A

prerequisites:
    - An Arm Linux system or an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) from a
      cloud service provider.

skilllevels: Introductory
subjects: Libraries

test_images:
- ubuntu:latest
test_link: null
test_maintenance: false

tools_software_languages:

weight: 1
layout: learningpathall
---
@@ -0,0 +1,39 @@
---
# ================================================================================
# Edit
# ================================================================================

next_step_guidance: >
    You can continue learning about porting cloud applications to the Arm architecture for increased performance and cost savings. The Learning Path on MongoDB is a great next step.
# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended.

recommended_path: "/learning-paths/servers-and-cloud-computing/mongodb/"
# Link to the next learning path being recommended.


# further_reading links to references related to this path. Can be:
# Manuals for a tool / software mentioned (type: documentation)
# Blog about related topics (type: blog)
# General online references (type: website)

further_reading:
    - resource:
        title: Ampere Altra Max Delivers Sustainable High-Resolution H.265 Encoding
        link: https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/ampere-altra-max-delivers-sustainable-high-resolution-h-265-video-encoding-without-compromise
        type: blog
    - resource:
        title: Optimized Video Encoding with FFmpeg on AWS Graviton Processors
        link: https://aws.amazon.com/blogs/opensource/optimized-video-encoding-with-ffmpeg-on-aws-graviton-processors/
        type: blog
    - resource:
        title: OCI Ampere A1 Compute instances can significantly reduce video encoding costs versus modern CPUs
        link: https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/oracle-cloud-infrastructure-arm-based-a1
        type: blog

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
weight: 21 # set to always be larger than the content in this path, and one more than 'review'
title: "Next Steps" # Always the same
layout: "learningpathall" # All files under learning paths have this same wrapper
---
@@ -0,0 +1,49 @@
---
# ================================================================================
# Edit
# ================================================================================

# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember.
# question: A one sentence question
# answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes.
# correct_answer: An integer indicating what answer is correct (index starts from 0)
# explanation: A short (1-3 sentence) explanation of why the correct answer is correct. Can add additional context if desired


review:
    - questions:
        question: >
            Does AV1 run on Arm servers?
        answers:
            - "Yes"
            - "No"
        correct_answer: 1
        explanation: >
            libaom codec is fully supported on 64-bit Arm servers running Linux.
    - questions:
        question: >
            Does VP9 run on Arm servers?
        answers:
            - "Yes"
            - "No"
        correct_answer: 1
        explanation: >
            libvpx codec is fully supported on 64-bit Arm servers running Linux.
    - questions:
        question: >
            Does varying the preset settings on the images impact the codec performance?
        answers:
            - "Yes"
            - "No"
        correct_answer: 1
        explanation: >
            You can vary the preset settings on the different resolution images and measure the impact on performance.
# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
title: "Review" # Always the same title
weight: 20 # Set to always be larger than the content in this path
layout: "learningpathall" # All files under learning paths have this same wrapper
---
172 changes: 172 additions & 0 deletions content/learning-paths/servers-and-cloud-computing/codec1/libaom.md
@@ -0,0 +1,172 @@
---
layout: learningpathall
title: Build and Run the AV1 codec
weight: 2
---

## What is the AV1 codec?

AV1 is an open, royalty-free video coding format from the [Alliance for Open Media (AOM)](https://aomedia.org/).

The `libaom` library is free software and serves as the reference software implementation for the AV1 video coding format.

The open-source AV1 encoder in `libaom` has been significantly optimized for Arm Neoverse platforms using Neon and SVE2 instructions. The optimized code is available on [Google Git](https://aomedia.googlesource.com/aom/).

## Install the necessary software packages

You will need various development tools to build AV1 including CMake and the GNU compiler.

The instructions assume you are running Ubuntu.

Install the required tools by running:

```bash
sudo apt install gcc g++ git wget cmake p7zip-full -y
```

## Download and build AV1 from source

Download the AV1 source code:

```bash
git clone https://aomedia.googlesource.com/aom
```

Create a build directory inside the repository, change to it, configure the build, and build the library:

```bash
mkdir -p aom/build_aom
cd aom/build_aom
cmake ..
make -j$(nproc)
```

For additional details refer to the [README](https://aomedia.googlesource.com/aom/?pli=1#basic-build).
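
As a quick sanity check after the build, the sketch below reports whether the executables used later in this Learning Path are present. It assumes you run it from `aom/build_aom`. The function name `check_build_outputs` is a hypothetical helper, and `aomdec` is included on the assumption that the default configuration builds it alongside `aomenc`.

```bash
# Hypothetical helper: report which expected executables exist in a directory.
check_build_outputs() {
    local dir="${1:-.}"
    local exe
    for exe in aomenc aomdec test_libaom; do
        if [ -x "${dir}/${exe}" ]; then
            echo "found:   ${exe}"
        else
            echo "missing: ${exe}"
        fi
    done
}

# Check the current directory (expected to be aom/build_aom).
check_build_outputs .
```

If any executable is reported as missing, review the CMake and make output for errors before continuing.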

## Run AV1 unit tests

The AV1 library has a comprehensive suite of unit tests, written using the GTest framework.

The build above includes the `test_libaom` executable.

You can run all unit tests by starting `test_libaom` with no arguments. However, the number of tests is huge, and it takes a long time to run them all. Instead, you can constrain the number of tests by specifying a filter.

There is a help argument you can try:

```bash
./test_libaom --help
```

There is also an argument to list the tests:

```bash
./test_libaom --gtest_list_tests | less
```

To run a subset of tests, you can use a filter to run only the Neon Sum of Absolute Difference (SAD) tests.

Here is an example with the filter:

```bash
./test_libaom --gtest_filter="*NEON*SAD*"
```

The output is:

```output
Note: Google Test filter = *NEON*SAD*-:NEON_I8MM.*:NEON_I8MM/*:NEON_I8MM_*:SVE.*:SVE/*:SVE_*:SVE2.*:SVE2/*:SVE2_*
[==========] Running 3650 tests from 17 test suites.
[----------] Global test environment set-up.
[----------] 22 tests from NEON/MaskedSADTest
[ RUN ] NEON/MaskedSADTest.OperationCheck/0
[ OK ] NEON/MaskedSADTest.OperationCheck/0 (152 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/1
[ OK ] NEON/MaskedSADTest.OperationCheck/1 (152 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/2
[ OK ] NEON/MaskedSADTest.OperationCheck/2 (152 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/3
[ OK ] NEON/MaskedSADTest.OperationCheck/3 (152 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/4
[ OK ] NEON/MaskedSADTest.OperationCheck/4 (153 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/5
[ OK ] NEON/MaskedSADTest.OperationCheck/5 (152 ms)
[ RUN ] NEON/MaskedSADTest.OperationCheck/6
[ OK ] NEON/MaskedSADTest.OperationCheck/6 (152 ms)
<output omitted>
[----------] 90 tests from NEON_DOTPROD/SADSkipx4Test (1433 ms total)
[----------] Global test environment tear-down
[==========] 3650 tests from 17 test suites ran. (56720 ms total)
[ PASSED ] 3650 tests.
YOU HAVE 581 DISABLED TESTS
```
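
Before running a filtered subset, it can help to know how many tests a filter selects. The sketch below combines `--gtest_list_tests` with `--gtest_filter` and counts the listed test names. `count_listed_tests` is a hypothetical helper, and the count relies on the assumption that GTest prints each test name indented by two spaces under its suite name.

```bash
# Hypothetical helper: count test-name lines in --gtest_list_tests output.
# Assumes suite names start in column 0 and test names are indented by two spaces.
count_listed_tests() {
    grep -c '^  ' || true
}

# Only run the listing if the test binary is present in the current directory.
if [ -x ./test_libaom ]; then
    ./test_libaom --gtest_list_tests --gtest_filter="*NEON*SAD*" | count_listed_tests
fi
```

You can adjust the filter pattern and rerun the count until the subset is a manageable size.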

## Performance benchmarking

You can benchmark video encoding in either on-demand or live-stream mode.

To start, download some example `8-bit FHD`, `8-bit 4K` and `10-bit 4K` video files:

```bash
wget https://ultravideo.fi/video/Bosphorus_1920x1080_120fps_420_8bit_YUV_Y4M.7z
wget https://ultravideo.fi/video/Bosphorus_3840x2160_120fps_420_8bit_YUV_Y4M.7z
wget https://ultravideo.fi/video/Bosphorus_3840x2160_120fps_420_10bit_YUV_Y4M.7z
```

Next, extract the contents of the 7z files:

```bash
7za e Bosphorus_1920x1080_120fps_420_8bit_YUV_Y4M.7z
7za e Bosphorus_3840x2160_120fps_420_8bit_YUV_Y4M.7z
7za e Bosphorus_3840x2160_120fps_420_10bit_YUV_Y4M.7z
```
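
To confirm that extraction produced usable input, you can print the text header at the start of each file. This sketch assumes the standard Y4M layout, where the first line begins with `YUV4MPEG2` and lists parameters such as width (`W`), height (`H`), and frame rate (`F`). `show_y4m_header` is a hypothetical helper.

```bash
# Hypothetical helper: print the first (text) line of a Y4M file.
show_y4m_header() {
    local file="$1"
    if [ -f "${file}" ]; then
        printf '%s: ' "${file}"
        head -n 1 "${file}"
    else
        echo "${file}: not found"
    fi
}

# Print the header of every .y4m file in the current directory.
for f in *.y4m; do
    show_y4m_header "${f}"
done
```

The width and height in each header should match the resolution in the file name.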

### On-demand video encoding

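For on-demand encoding you can experiment with different encoder speed settings and monitor performance.

For example, run with `--good` and use the `--cpu-used` argument to vary the speed setting from 2 to 6. Despite its name, `--cpu-used` selects a speed and quality trade-off rather than a number of processors. Higher values encode faster.

Run a standard bit depth encode, then change the `--cpu-used` value and compare the results: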

```bash
./aomenc --good --cpu-used=4 --bit-depth=8 -o output.mkv Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m
```

Try the above command with different `--cpu-used` values.
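
One way to try several `--cpu-used` values is a small loop. The sketch below is an illustration, not part of the original instructions. `sweep_cpu_used` and the `AOMENC` variable are hypothetical names, and the output file names are arbitrary. The `--limit=60` argument is added only to stop each run after 60 input frames so the sweep finishes sooner. Remove it to encode the whole clip.

```bash
# Path to the encoder; override AOMENC if aomenc is somewhere else.
AOMENC="${AOMENC:-./aomenc}"

# Hypothetical helper: encode one input once per --cpu-used value given.
sweep_cpu_used() {
    local input="$1"
    shift
    local speed
    for speed in "$@"; do
        echo "=== speed ${speed} ==="
        "${AOMENC}" --good --cpu-used="${speed}" --bit-depth=8 --limit=60 \
            -o "output_speed${speed}.mkv" "${input}"
    done
}

# Run the sweep only if the encoder and the sample video are both present.
input=Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m
if [ -x "${AOMENC}" ] && [ -f "${input}" ]; then
    sweep_cpu_used "${input}" 2 3 4 5 6
fi
```

Comparing the frame rate reported at the end of each run shows the effect of each setting.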

You can do the same for high bit depth:

```bash
./aomenc --good --cpu-used=4 --bit-depth=10 -o output.mkv Bosphorus_3840x2160_120fps_420_10bit.y4m
```

### Live-stream video encoding

For live-stream encoding you can experiment with different encoder speed settings and monitor performance using the `--cpu-used` argument in the range from 5 to 9.

For standard bit depth with `--cpu-used=8` run:

```bash
./aomenc --rt --cpu-used=8 --bit-depth=8 -o output.mkv Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m
```

For high bit depth run:

```bash
./aomenc --rt --cpu-used=8 --bit-depth=10 -o output.mkv Bosphorus_3840x2160_120fps_420_10bit.y4m
```

## View results

The encoding frame rate (frames per second) for each video file is output at the end of each run.

Shown below is example output from running the AV1 codec on the 8-bit FHD sample video file:

```output
Pass 1/2 frame 600/601 139432B 1859b/f 55770b/s 62641 ms (9.58 fps)
Pass 2/2 frame 600/600 638429B 8512b/f 255360b/s 1103538 ms (0.54 fps)
```
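
The reported frame rate is the number of frames divided by the elapsed time in seconds. The sketch below reproduces the two values above from the frame count and the milliseconds, and extracts the fps figure from a saved log line. `fps_from_ms` and `extract_fps` are hypothetical helper names.

```bash
# Hypothetical helper: frames per second from a frame count and elapsed milliseconds.
fps_from_ms() {
    awk -v frames="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", frames / (ms / 1000) }'
}

# Hypothetical helper: print the number inside "(N fps)" for each matching input line.
extract_fps() {
    sed -n 's/.*(\([0-9.]*\) fps).*/\1/p'
}

fps_from_ms 600 62641      # pass 1, prints 9.58
fps_from_ms 600 1103538    # pass 2, prints 0.54

echo "Pass 2/2 frame 600/600 638429B 8512b/f 255360b/s 1103538 ms (0.54 fps)" | extract_fps   # prints 0.54
```

The second pass performs the final encode, so its frame rate is the more relevant figure when comparing settings.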