Encoder Performance

The encoder performance tests focus on the most relevant HD (1920x1080) and UHD (3840x2160) use cases, jointly denoted as the HD4K use case in the following, with random access encoding as defined in the JVET common test conditions (CTC) ¹⁰. Unless stated otherwise, all results are given for the CTC test sequences, i.e. classes A1 and A2 for UHD and class B for HD. Constant QP encoding is used with quantization parameter (QP) values of 22, 27, 32 and 37 according to the VTM CTC. Rate-distortion results are reported as Bjøntegaard delta rate (BD-rate) ¹¹,¹² and evaluate the compression performance based on the following weighted averages of the per-component peak signal-to-noise ratio (PSNR) and multiscale structural similarity measure (MS-SSIM) ¹³ values:

PSNR_YUV = (6 * PSNR_Y + PSNR_Cb + PSNR_Cr) / 8

MS-SSIM_YUV = (6 * MS-SSIM_Y + MS-SSIM_Cb + MS-SSIM_Cr) / 8
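The 6:1:1 weighting of the two formulas above can be written as a single helper. The following Python lines are a minimal sketch; the function name and the sample values are hypothetical and chosen only for illustration.

```python
def weighted_yuv(y, cb, cr):
    """Combine per-component metric values with the 6:1:1 luma/chroma weighting."""
    return (6.0 * y + cb + cr) / 8.0


# Hypothetical per-component PSNR values in dB for one encoded sequence.
psnr_yuv = weighted_yuv(y=40.0, cb=42.0, cr=44.0)
print(psnr_yuv)  # 40.75

# The same helper applies to per-component MS-SSIM values.
ms_ssim_yuv = weighted_yuv(y=0.98, cb=0.99, cr=0.99)
```

Because luma carries six of the eight weight units, a change in the Y component moves the combined value six times as much as the same change in either chroma component.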

For VVenC and x265, all results were generated with multi-threading enabled, using 8 threads. HM and VTM do not support multi-threading. All tests were performed on Dell servers with an AMD EPYC 7502P 32-core processor @2.5GHz. This AMD EPYC platform has been used since v1.7.0, and all anchors and older versions were remeasured on it. Before that, performance was measured on Supermicro servers with Intel Xeon E5-2697A v4 processors @2.6GHz.

PSNR Optimized Use Case

The PSNR_YUV BD-rate gain of VVenC over the HEVC test model reference software HM-17.0 is shown in Figure 1. The PSNR_YUV BD-rate represents the approximate average bit-rate difference between two encoders at the same objective quality (as quantified by PSNR_YUV above). Here, lower (more negative) values mean larger bit-rate savings with respect to the HM-17.0 anchor. Please note the logarithmic scale of the encoder runtime relative to HM-17.0.
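To make the meaning of a BD-rate value more concrete, the following Python sketch implements the classic cubic-fit variant of the Bjøntegaard delta rate for exactly four rate points per encoder (one per QP). It is an illustration under stated assumptions and not necessarily the exact procedure used to produce the results on this page; all function names and the sample rate points are hypothetical.

```python
import math


def _newton_coeffs(xs, ys):
    """Divided-difference coefficients of the polynomial through (xs, ys)."""
    coeffs = list(ys)
    for j in range(1, len(xs)):
        for i in range(len(xs) - 1, j - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - j])
    return coeffs


def _to_monomial(xs, coeffs):
    """Expand the Newton form into monomial coefficients, lowest degree first."""
    poly = [coeffs[-1]]
    for k in range(len(coeffs) - 2, -1, -1):
        shifted = [0.0] * (len(poly) + 1)
        for degree, a in enumerate(poly):  # multiply by (x - xs[k])
            shifted[degree + 1] += a
            shifted[degree] -= a * xs[k]
        shifted[0] += coeffs[k]
        poly = shifted
    return poly


def _integral(poly, lo, hi):
    """Definite integral of a polynomial given by monomial coefficients."""
    def antiderivative(x):
        return sum(a * x ** (d + 1) / (d + 1) for d, a in enumerate(poly))
    return antiderivative(hi) - antiderivative(lo)


def bd_rate(anchor, test):
    """BD-rate in percent; anchor and test are lists of (bitrate, quality)."""
    def fit(points):
        pts = sorted(points, key=lambda p: p[1])
        quality = [q for _, q in pts]
        log_rate = [math.log(r) for r, _ in pts]
        return quality, _to_monomial(quality, _newton_coeffs(quality, log_rate))

    qa, poly_a = fit(anchor)
    qt, poly_t = fit(test)
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])
    if hi <= lo:
        raise ValueError("quality ranges of the two curves do not overlap")
    # Average log-rate difference over the shared quality interval.
    avg_diff = (_integral(poly_t, lo, hi) - _integral(poly_a, lo, hi)) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


# Hypothetical rate points: (bitrate in kbit/s, PSNR_YUV in dB).
anchor = [(1000.0, 34.0), (2000.0, 36.5), (4000.0, 38.7), (8000.0, 40.5)]
test = [(rate / 2.0, quality) for rate, quality in anchor]
print(round(bd_rate(anchor, test), 3))
```

In this constructed example the test encoder reaches every quality level at half the anchor bitrate, so the result is a BD-rate of about -50%. The same function can be applied to MS-SSIM_YUV values in place of PSNR_YUV.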

With the slower preset and multi-threading enabled, the BD-rate gain of VVenC over HM is similar to that of VTM-19.2 under CTC, while a speedup of more than 20x over VTM is achieved for HD4K sequences.

With the faster preset and multi-threading, the BD-rate gain over HM is still approx. 13.0%, with a speedup of around 2400x over VTM-19.2 for HD4K sequences. Compared to HM, the speedup is around 380x.

As a good tradeoff between encoder runtime and BD-rate performance, we recommend the medium preset with multi-threading enabled. Here, the BD-rate gain over HM is approx. 35.7%, which is close to that of the slower preset and VTM CTC, while the encoder runs around 290x faster than VTM-19.2 for HD4K sequences. Compared to HM-17.0, this is an encoder runtime speedup of around 47x. A summary of all results is shown in Table III.

Table III: PSNR_YUV BD-rate and multi-threaded (8 threads) encoder speedup for HD, UHD and combined (HD4K) test sequences, for VVenC v1.12.0.

	HD			UHD			HD4K
Preset	PSNR BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM	PSNR BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM	PSNR BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM
faster	-10.9%	360x	2400x	-14.8%	400x	2500x	-13.0%	380x	2400x
fast	-25.7%	150x	990x	-26.9%	170x	1000x	-26.4%	160x	1000x
medium	-34.9%	40x	260x	-36.4%	53x	320x	-35.7%	47x	290x
slow	-39.1%	12.0x	78x	-40.2%	17x	100x	-39.7%	14x	90x
slower	-42.0%	2.6x	17x	-43.2%	3.8x	23x	-42.6%	3.2x	20x

Figure 1: PSNR_YUV BD-rate gain and relative encoder runtime for VVenC in comparison to HM-17.0 and VTM (JVET HD and UHD test sequences, MCTF enabled for HM-17.0 and VTM-19.2). Results are given for the 5 preset options: faster, fast, medium, slow and slower. VVenC is running multi-threaded using 6 threads for versions <= 0.2 and 8 threads for versions >= 1.0. Lower PSNR_YUV BD-rate values mean better compression for the same objective quality in terms of PSNR_YUV.

Additionally, Figure 2 includes multi-threaded results for the HEVC open-source encoder x265 v3.5 at comparable speed presets ¹⁴. For the comparison with VVenC, x265 was also configured to run with 8 threads. Besides sequence-specific parameters, the following parameter settings have been used for x265:

--preset {0,1,2,3,…,9} --tune psnr --crf {17,22,27,32} --keyint 1s --min-keyint 1s --profile main10 --output-depth 10
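The parameter line above describes a grid of 10 presets and 4 CRF values per sequence. The following Python sketch expands that grid into argument lists. It assumes that the notation "1s" stands for an intra period of one second, which is converted here to a frame count from the sequence frame rate; this interpretation, the function names, and the omission of the sequence-specific parameters (input, resolution, thread count) are assumptions made for illustration.

```python
def x265_args(preset, crf, fps, tune="psnr"):
    """Build the shared x265 arguments for one preset/CRF combination."""
    # Assumption: "1s" means an intra period of one second, i.e. about
    # fps frames, rounded to the nearest integer frame count.
    keyint = round(fps)
    return [
        "--preset", str(preset),
        "--tune", tune,
        "--crf", str(crf),
        "--keyint", str(keyint),
        "--min-keyint", str(keyint),
        "--profile", "main10",
        "--output-depth", "10",
    ]


def x265_grid(fps, tune="psnr"):
    """All preset/CRF combinations for one sequence: 10 presets x 4 CRF values."""
    return [
        x265_args(preset, crf, fps, tune)
        for preset in range(10)
        for crf in (17, 22, 27, 32)
    ]


# Example: a 60 fps sequence yields 40 parameter sets.
for args in x265_grid(fps=60)[:2]:
    print(" ".join(args))
```

Passing tune="ssim" yields the corresponding grid for an SSIM-tuned comparison.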

Figure 2: PSNR_YUV BD-rate gain and relative encoder runtime in comparison to HM-17.0 for VVenC and x265 running with 8 threads. Lower PSNR_YUV BD-rate values mean better compression for the same objective quality in terms of PSNR_YUV.

Perceptually Optimized Quantization Parameter Adaptation

To improve the perceived (subjective) coding quality, VVenC supports a low-complexity quantization parameter adaptation (QPA) algorithm based on the simplified model of the human visual system adopted in the XPSNR psychovisual video quality measure ¹⁵. To evaluate this perceptually optimized quantization parameter adaptation (PQPA), especially in comparison to the approaches used in VTM and x265, MS-SSIM_YUV (as defined above) is used as a measure of subjective video quality.

Figure 3 shows the MS-SSIM_YUV BD-rate gain of VVenC over the HEVC test model reference software HM-16.24 (lower is better). For the VTM simulations, the JVET CTC with PQPA additionally enabled (--PerceptQPA=1) are used. With PQPA enabled, the speedups achieved over HM and VTM are similar to the non-PQPA results presented in the previous section, which demonstrates the low-complexity nature of the PQPA algorithm. The MS-SSIM_YUV based BD-rates also show that an additional bit-rate reduction can be achieved. For the slower preset in particular, an MS-SSIM BD-rate gain of more than 4% over VTM CTC without PQPA is realized. We recommend the medium preset with multi-threading and PQPA enabled as a good tradeoff between encoder runtime and resulting perceived video quality.

Figure 3: MS-SSIM_YUV BD-rate gain and encoder runtime in comparison to HM-16.24 for VTM and VVenC with perceptually optimized quantization parameter adaptation enabled for HD4K sequences (MCTF enabled for HM-16.24 and VTM-19.2). VVenC results are given for the 5 preset options: faster, fast, medium, slow and slower. VVenC is running multi-threaded using 6 threads for versions <= 0.2 and 8 threads for versions >= 1.0. Lower MS-SSIM_YUV BD-rate values mean better compression for the same quality in terms of MS-SSIM_YUV.

Additionally, Figure 4 includes multi-threaded results for the HEVC open-source encoder x265 v3.5 tuned for SSIM at comparable speed presets ¹⁴. For the comparison with VVenC, x265 was also configured to run with 8 threads. Besides sequence-specific parameters, the following parameter settings have been used for x265:

--preset {0,1,2,3,…,9} --tune ssim --crf {17,22,27,32} --keyint 1s --min-keyint 1s --profile main10 --output-depth 10

Figure 4: MS-SSIM_YUV BD-rate gain and encoder runtime in comparison to HM-16.24 for VVenC with QPA enabled and x265 with --tune=ssim. Both VVenC and x265 are running with 8 threads. Lower MS-SSIM_YUV BD-rate values mean better compression for the same quality in terms of MS-SSIM_YUV.

A summary of the MS-SSIM_YUV results for PQPA is shown in Table IV.

Table IV: MS-SSIM_YUV BD-rate gain and multi-threaded encoder speedup for HD and UHD test sequences for VVenC v1.12.0 with perceptually optimized QPA enabled.

	HD			UHD			HD4K
Preset	MS-SSIM BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM	MS-SSIM BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM	MS-SSIM BD-rate vs. HM	Speedup vs. HM	Speedup vs. VTM
faster	-21.1%	340x	2400x	-20.6%	390x	2700x	-20.8%	370x	2500x
fast	-32.5%	150x	1000x	-31.8%	170x	1100x	-32.1%	160x	1100x
medium	-39.0%	39x	270x	-40.3%	52x	350x	-39.7%	45x	310x
slow	-42.7%	12.0x	81x	-43.8%	17x	110x	-43.3%	14x	98x
slower	-44.9%	2.6x	18x	-46.5%	3.7x	25x	-45.8%	3.2x	22x

Rate Control

To support encoding with a predefined target rate instead of a fixed QP, for which the final bitrate is generally unknown beforehand, VVenC includes one- and two-pass rate control ¹⁶,¹⁷. The one-pass (GOP-wise) rate control uses a short look-ahead window to collect information on the fly by fast encoding of all frames in the group of pictures (GOP) covered by the window. It is intended for applications that cannot perform a full first pass through the entire sequence. The two-pass (sequence-wise) rate control includes a fast first encoding pass in which the statistics for the entire sequence are collected in advance. This information is then used in the second pass to improve the rate control performance at the cost of increased encoding time. In addition, rate capping by means of a maximum rate parameter is supported in one- and two-pass mode ¹⁸.

PSNR_YUV BD-rate results of both rate control variants for HD4K content relative to a fixed QP VVenC encoding are shown in Table V. Note that 10-second versions of the CTC sequences from classes A1 and A2 are used for the rate control tests. The target rates for the rate control runs were set based on the bitrates resulting from the fixed QP runs. All runs were executed using 8 threads and an intra period of 1 second. The one-pass approach achieves a BD-rate performance similar to that of the fixed QP encoding while keeping the encoding runtime overhead low. The BD-rate performance of the two-pass version is sometimes even better than that of the fixed QP encoding. For both rate control variants, the relative computational overhead decreases as the presets become slower, owing to the constant complexity of the look-ahead pass. The average bitrate deviation is around 2% for the one-pass approach and around 1.3% for the two-pass approach.
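The two quantities reported alongside the BD-rate, the average bitrate deviation and the encoding time relative to fixed QP, can be computed as in the following Python sketch. It assumes that the deviation is the absolute difference between achieved and target bitrate, relative to the target and averaged over all runs; the exact definition used for the reported numbers is not stated here, and the function names and sample values are hypothetical.

```python
def mean_bitrate_deviation(targets, actuals):
    """Average absolute deviation of the achieved from the target bitrate, in percent."""
    if len(targets) != len(actuals) or not targets:
        raise ValueError("need equally long, non-empty bitrate lists")
    deviations = [abs(a - t) / t * 100.0 for t, a in zip(targets, actuals)]
    return sum(deviations) / len(deviations)


def relative_runtime(rc_seconds, fixed_qp_seconds):
    """Encoding time of a rate control run as a percentage of the fixed QP run."""
    return rc_seconds / fixed_qp_seconds * 100.0


# Hypothetical target bitrates (taken from fixed QP runs) and achieved bitrates, in kbit/s.
targets = [1000.0, 2000.0, 4000.0, 8000.0]
achieved = [1020.0, 1960.0, 4080.0, 7840.0]
print(mean_bitrate_deviation(targets, achieved))   # about 2 percent
print(relative_runtime(rc_seconds=107.0, fixed_qp_seconds=100.0))
```

A relative runtime above 100% indicates overhead compared to the fixed QP encoding.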

Table V: PSNR_YUV BD-rate and relative encoding runtime for 1- and 2-pass rate control on HD4K sequences in comparison to a VVenC fixed QP encoding for all presets. All encodings used VVenC v1.12.0 with 8 threads.

	HD4K
	1-pass RC		2-pass RC
Preset	PSNR_YUV BD-rate vs. Fixed QP	Encoding Time vs. Fixed QP	PSNR_YUV BD-rate vs. Fixed QP	Encoding Time vs. Fixed QP
faster	1.84%	116%	0.40%	118%
fast	2.19%	107%	1.02%	113%
medium	2.27%	107%	1.14%	109%
slow	2.45%	106%	1.25%	102%
slower	2.57%	104%	1.38%	105%

For improved perceived video quality, the rate control can be used in combination with the PQPA method introduced above. The MS-SSIM_YUV BD-rate results for a combination of rate control and QPA over a fixed QP VVenC encoding with QPA are shown in Table VI. The results are similar to the results shown in Table V.

Table VI: MS-SSIM_YUV BD-rate and relative encoding runtime for 1- and 2-pass rate control on HD4K sequences in comparison to a VVenC fixed QP encoding with QPA for all presets. All encodings used VVenC v1.12.0 with 8 threads.

	HD4K
	1-pass RC		2-pass RC
Preset	MS-SSIM_YUV BD-rate vs. Fixed QP with QPA	Encoding Time vs. Fixed QP with QPA	MS-SSIM_YUV BD-rate vs. Fixed QP with QPA	Encoding Time vs. Fixed QP with QPA
faster	4.67%	116%	2.10%	124%
fast	5.04%	109%	2.72%	117%
medium	2.44%	111%	0.99%	114%
slow	2.48%	102%	1.34%	104%
slower	2.40%	96%	0.96%	101%

To test the rate control under conditions that correspond to real-world use cases, experiments were also conducted with an intra period of approximately 4 seconds. The results for the rate control without and with PQPA are shown in Table VII and Table VIII, respectively. They are similar to the results for the intra period of 1 second in Table V and Table VI, which confirms the robustness of the rate control methods in VVenC.

Table VII: PSNR_YUV BD-rate and relative encoding runtime for 1- and 2-pass rate control on HD4K sequences in comparison to a VVenC fixed QP encoding for all presets with an intra period of 4 seconds. All encodings used VVenC v1.12.0 with 8 threads.

	HD4K
	1-pass RC		2-pass RC
Preset	PSNR_YUV BD-rate vs. Fixed QP	Encoding Time vs. Fixed QP	PSNR_YUV BD-rate vs. Fixed QP	Encoding Time vs. Fixed QP
faster	0.60%	120%	0.18%	121%
fast	1.02%	108%	0.59%	116%
medium	1.27%	107%	0.96%	108%
slow	1.40%	105%	0.97%	102%
slower	1.48%	106%	1.07%	104%

The MS-SSIM_YUV BD-rate results for a combination of rate control and QPA over a fixed QP VVenC encoding with QPA, for an intra period of 4 seconds, are shown in Table VIII. The results are similar to the results shown in Table VI.

Table VIII: MS-SSIM_YUV BD-rate and relative encoding runtime for 1- and 2-pass rate control on HD4K sequences in comparison to a VVenC fixed QP encoding with QPA for all presets with an intra period of 4 seconds. All encodings used VVenC v1.12.0 with 8 threads.

	HD4K
	1-pass RC		2-pass RC
Preset	MS-SSIM_YUV BD-rate vs. Fixed QP with QPA	Encoding Time vs. Fixed QP with QPA	MS-SSIM_YUV BD-rate vs. Fixed QP with QPA	Encoding Time vs. Fixed QP with QPA
faster	2.87%	110%	3.05%	120%
fast	3.39%	100%	2.51%	113%
medium	1.66%	106%	1.28%	110%
slow	1.78%	105%	1.19%	103%
slower	1.59%	107%	0.93%	105%

References

[10] F. Bossen, X. Li, V. Seregin, K. Sharman, and K. Sühring, “VTM and HM common test conditions and software reference configurations for SDR 4:2:0 10-bit video,” Doc. JVET-Y2010 of Joint Video Experts Team (JVET), Feb. 2022. [Online]. Available: https://www.jvet-experts.org/doc_end_user/current_document.php?id=11471
[11] G. Bjøntegaard, “Improvement of BD-PSNR Model,” Doc. VCEG-AI11 of ITU-T SG16/Q6, Berlin, Germany, July 2008. [Online]. Available: http://wftp3.itu.int/av-arch/video-site/0807_Ber/
[12] ITU-T and ISO/IEC JTC 1, Working practices using objective metrics for evaluation of video coding efficiency experiments, Technical Paper ITU-T HSTP-VID-WPOM and ISO/IEC DTR 23002-8, 2020.
[13] Z. Wang, E. Simoncelli, and A. C. Bovik, “Multi-Scale Structural Similarity for Image Quality Assessment,” in Proc. IEEE Asilomar Conf. Signals, Systems, and Comp., Pacific Grove, Nov. 2003.
[14] x265 software repository, version 3.4. Available online: https://github.com/videolan/x265/tree/Release_3.4
[15] C. R. Helmrich, S. Bosse, H. Schwarz, D. Marpe, and T. Wiegand, “A Study of the Extended Perceptually Weighted Peak Signal-to-Noise Ratio (XPSNR) for Video Compression with Different Resolutions and Bit Depths,” ITU Journal: ICT Discoveries – Special Issue: The Future of Video and Immersive Media, vol. 3, no. 1, May 2020. [Online]. Available: https://www.itu.int/en/journal/2020/001/Pages/08.aspx, https://github.com/fraunhoferhhi/xpsnr
[16] C. R. Helmrich, I. Zupancic, J. Brandenburg, V. George, A. Wieckowski, and B. Bross, “Visually Optimized Two-Pass Rate Control for Video Coding Using the Low-Complexity XPSNR Model,” in Proc. IEEE Int. Conf. Visual Communications and Image Processing (VCIP), Munich, Dec. 2021. DOI: 10.1109/VCIP53242.2021.9675364
[17] C. R. Helmrich, C. Bartnik, J. Brandenburg, V. George, T. Hinz, C. Lehmann, I. Zupancic, A. Wieckowski, B. Bross, and D. Marpe, “A Scene Change and Noise Aware Rate Control Method for VVenC, an Open VVC Encoder Implementation,” in Proc. IEEE Picture Coding Symposium (PCS), San Jose, Dec. 2022. DOI: 10.1109/PCS56426.2022.10018041
[18] C. R. Helmrich, C. Bartnik, J. Brandenburg, A. Wieckowski, B. Bross, and D. Marpe, “A Constrained Variable Bit Rate (CVBR) Algorithm for VVenC, an Open VVC Encoder Implementation,” in Proc. IEEE Int. Conf. Visual Communications and Image Processing (VCIP), Jeju, Dec. 2023.
