x265 Workload Brief
Open-source software library and application for encoding video streams.
The Ampere® Altra® Max processor is a complete system-on-chip (SoC) solution that supports up to 128 high-performance cores, with an innovative architecture that delivers predictable high performance, linear scaling and high energy efficiency. Online video continues to grow rapidly, driving demand for video encoding, which compresses video to greatly reduce both storage space and network bandwidth. We demonstrate that Ampere Altra Max is ideal for video encoding with x265, delivering both industry-leading performance and power efficiency.
Ampere Altra Max is designed to deliver exceptional performance and power efficiency for applications like video encoding. We use x265, which implements the H.265/MPEG-H Part 2 standard, the second most widely used video codec today after H.264 [1,2]. Previously, we reported industry-leading performance and power efficiency running x264 on Ampere Altra Max [3,4]. Compared to x264, more advanced video codecs such as x265 provide greater video compression at the expense of greater computing resources and power usage.
Ampere Altra Max uses an innovative architectural design, operating at consistent frequencies with single-threaded cores that make applications more resistant to noisy-neighbor issues. This allows workloads to run in a predictable manner with minimal variance. Additionally, the processors are designed to be highly power efficient. Recent x265 optimizations for the aarch64 architecture have improved performance significantly [5]. Excellent hardware running optimized software gives Ampere Altra Max outstanding performance and power efficiency running x265. With Ampere Altra Max, x265 can now run with both the highest performance and the most energy-efficient execution.
Cloud Native: Designed from the ground up for cloud customers, Ampere Altra Max processors are ideal for video encoding in the cloud using applications like x265.
Scalable: With an innovative scale-out architecture, Ampere Altra Max processors have a high core count with compelling single-threaded performance combined with consistent frequency for all cores delivering greater performance at the socket level.
Power Efficient: Industry-leading energy efficiency allows Ampere Altra Max processors to hit competitive levels of raw performance while consuming much lower power than the competition.
We evaluated x265 performance on the Ampere Altra Max M128-30 processor compared to Intel® Xeon® Platinum 8380 (Ice Lake) and AMD EPYC™ 7763 (Milan). We ran the tests using several x265 presets (medium, slower, veryslow and placebo) and video inputs at three resolutions (480p, 720p and 1080p) on CentOS 8.4 with the 4.18 kernel. To maximize platform throughput, we ran as many x265 instances as there are CPU cores available on the socket, using one thread per instance. To minimize OS overhead, the x265 binary, input files, and output files were stored on a RAM disk. We built the latest available version of x265 with gcc 11.2 on all platforms. See the Additional Benchmarking Details section below for more information.
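The launch pattern described above (one single-threaded x265 instance per core) can be sketched as follows. This is a minimal illustration, not the script used for these measurements. The RAM-disk paths, the helper names `build_x265_commands` and `run_all`, the use of `taskset` to pin each instance to a core, and the choice of `--pools none --frame-threads 1` to restrict each instance to one thread are all assumptions. The input is assumed to be a Y4M file, so x265 can read the resolution and frame rate from the file header.

```python
import os
import subprocess


def build_x265_commands(input_path, out_dir, preset, num_instances=None):
    """Build one x265 command line per instance, each pinned to its own core.

    Assumption: one instance per logical CPU unless num_instances is given.
    """
    if num_instances is None:
        num_instances = os.cpu_count() or 1
    commands = []
    for core in range(num_instances):
        out_path = os.path.join(out_dir, f"out_{core}.hevc")
        commands.append([
            "taskset", "-c", str(core),   # assumption: pin instance to one core
            "x265",
            "--input", input_path,
            "--preset", preset,           # e.g. medium, slower, veryslow, placebo
            "--pools", "none",            # assumption: disable the worker thread pool
            "--frame-threads", "1",       # assumption: encode one frame at a time
            "--output", out_path,
        ])
    return commands


def run_all(commands):
    """Start every instance at once, then wait for all of them to finish."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Hypothetical RAM-disk locations; adjust to the actual mount point.
    cmds = build_x265_commands("/mnt/ramdisk/input.y4m", "/mnt/ramdisk/out", "medium")
    print(f"Prepared {len(cmds)} x265 instances")
    # run_all(cmds)  # uncomment on a machine with x265 and taskset installed
```

Aggregate FPS for a run would then be the total number of frames encoded across all instances divided by the wall-clock time of the run.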
Ampere Altra Max has the best encoding performance running x265 compared to Intel® Xeon® Platinum 8380 (Ice Lake) and AMD EPYC™ 7763 (Milan). Figure 1 shows that Ampere Altra Max is consistently faster than the x86 platforms for all the x265 presets tested, averaged across the three input videos. Ampere Altra Max delivers a 2.0–2.5x average encoding speedup compared to Intel® Xeon® Platinum 8380 (Ice Lake) and is 1.1–1.3x faster than AMD EPYC™ 7763 (Milan).
In Figure 2, we plot aggregate FPS against the number of simultaneous x265 instances. Ampere Altra Max shows excellent platform scaling, with linear scaling from 1 to 128 cores, highlighting its innovative scale-out architecture. Intel® Xeon® Platinum 8380 (Ice Lake), with 40 physical cores, and AMD EPYC™ 7763 (Milan), with 64 physical cores, have lower overall performance, do not scale as well, and show the characteristic drop when running with hyperthreading (SMT).
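One way to quantify the scaling shown in a plot like Figure 2 is to compare aggregate FPS at each instance count with perfect linear scaling from a single instance. The sketch below assumes this definition of scaling efficiency. The function name and the sample numbers are hypothetical and are not measurements from this brief.

```python
def scaling_efficiency(aggregate_fps_by_instances):
    """Return efficiency per instance count, where 1.0 means perfectly linear.

    aggregate_fps_by_instances maps instance count -> aggregate FPS and
    must contain an entry for a single instance (key 1).
    """
    base_fps = aggregate_fps_by_instances[1]
    return {
        n: fps / (n * base_fps)
        for n, fps in sorted(aggregate_fps_by_instances.items())
    }


if __name__ == "__main__":
    # Made-up illustrative values, not measured data.
    sample = {1: 2.0, 64: 128.0, 128: 250.0}
    for n, eff in scaling_efficiency(sample).items():
        print(f"{n:>3} instances: {eff:.3f} of linear")
```

An efficiency that stays close to 1.0 up to the full core count corresponds to the linear scaling described above, while a falling value indicates the kind of drop seen once instances share physical cores.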
In addition to the best video encoding performance, Ampere Altra Max is the most power-efficient processor, reducing the carbon footprint of video encoding. Figure 3 shows the average power consumption at the socket level, with Ampere Altra Max using 0.79–0.81x of the power of Intel® Xeon® Platinum 8380 (Ice Lake) and 0.79–0.80x of the power of AMD EPYC™ 7763 (Milan).
With industry-leading performance and energy efficiency, Ampere Altra Max delivers outstanding performance per Watt. Figure 4 shows FPS/Watt (equivalent to Frames/Joule), with Ampere Altra Max delivering 2.5–3.1x greater FPS/Watt compared to Intel® Xeon® Platinum 8380 (Ice Lake) and 1.4–1.7x greater FPS/Watt compared to AMD EPYC™ 7763 (Milan).
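The relationship between the three reported metrics can be sketched as follows: FPS/Watt is aggregate FPS divided by average socket power, and a relative FPS/Watt figure is the encoding speedup divided by the relative power. Because a Watt is one Joule per second, (frames/s) / (J/s) reduces to frames per Joule. The function names are hypothetical, and the example inputs are illustrative values chosen from within the ranges quoted above, not a specific measured pair.

```python
def fps_per_watt(aggregate_fps, avg_socket_power_watts):
    """Frames per second per Watt, which is the same as frames per Joule."""
    return aggregate_fps / avg_socket_power_watts


def relative_fps_per_watt(speedup, relative_power):
    """Perf-per-Watt ratio between two platforms.

    speedup: FPS of platform A divided by FPS of platform B.
    relative_power: power of platform A divided by power of platform B.
    """
    return speedup / relative_power


if __name__ == "__main__":
    # Illustrative only: a 2.0x speedup at 0.80x of the power.
    ratio = relative_fps_per_watt(2.0, 0.80)
    print(f"Relative FPS/Watt: {ratio:.2f}x")
```

Note that the ends of the published ranges come from different presets, so dividing one range endpoint by another will not necessarily reproduce the endpoints of the FPS/Watt range exactly.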
Ampere Altra Max processors are a complete system-on-chip (SoC) solution built for Cloud Native workloads, designed to deliver exceptional performance, platform scalability and energy efficiency for applications like video encoding using x265. The H.265 compression standard is the second most widely used video format today after H.264, and x265 is the leading implementation of H.265. We previously showed that Ampere Altra Max delivers both industry-leading performance and power efficiency running x264 [3,4]. In this work, we demonstrate that Ampere Altra Max delivers both industry-leading performance and power efficiency running x265. More advanced video codecs such as x265, which provide greater video compression at the expense of greater computing resources and power usage, are a perfect fit for Ampere Altra Max.
Ampere Altra Max demonstrates up to 2.5x higher encoding performance, is 1.2–1.3x more energy efficient and has up to 3.1x greater FPS/Watt (equivalent to Frames/Joule) compared to Intel® Xeon® Platinum 8380 (Ice Lake). Compared to AMD EPYC™ 7763 (Milan), Ampere Altra Max is up to 1.35x faster, 1.2–1.3x more energy efficient and has up to 1.7x greater FPS/Watt (equivalent to Frames/Joule). In addition to providing the fastest video encoding, Ampere Altra Max delivers predictable high performance with linear scaling from 1 to 128 cores, using a highly energy-efficient design to reduce the carbon footprint of video encoding. It is now possible to encode with x265 at the highest levels of performance while making no compromises on energy efficiency.