Ampere AI

Ampere Optimized Frameworks
Ampere Model Library (AML)

AI Solutions

Solutions for AI on Ampere Altra

Ampere AI delivers world-class AI inference solutions. Ampere Optimized AI delivers a significant out-of-the-box inference performance benefit to any existing model that runs on our supported frameworks. Ampere AI currently supports the following frameworks, available for free download here or from some of our supporting partners:

  • TensorFlow
  • PyTorch
  • ONNX

Ampere hardware supports the native FP16 data format, providing nearly a 2X speedup over FP32 with almost no accuracy loss for most AI models.
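
As a rough illustration, the PyTorch sketch below casts a model to FP16; the toy two-layer network is our own stand-in, and half-precision CPU execution assumes a framework build with FP16 kernels, such as Ampere Optimized PyTorch:

    import torch
    import torch.nn as nn

    # Toy stand-in for any FP32-trained network (illustrative only).
    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
    x = torch.randn(8, 512)

    # Cast weights and inputs to FP16; on Ampere hardware with native FP16
    # support this can nearly double throughput versus FP32.
    model_fp16 = model.half()
    with torch.no_grad():
        out = model_fp16(x.half())
    print(out.dtype)  # torch.float16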

Ampere provides easy-to-use Docker containers that include Computer Vision and Natural Language Processing model examples and benchmarks, enabling developers to get started quickly. Download one of our Docker containers today to experience the benefits of our best-in-class performance. Read more about our solutions below.
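
To give a flavor of what such benchmarks measure, here is a minimal throughput loop in PyTorch; the tiny model, batch size, and iteration counts are illustrative assumptions, not the containers' actual harness:

    import time
    import torch
    import torch.nn as nn

    # Tiny illustrative model; the bundled containers use real CV/NLP models.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(),
        nn.Linear(16 * 222 * 222, 10),
    ).eval()
    x = torch.randn(8, 3, 224, 224)  # batch of 8 stand-in images

    with torch.no_grad():
        for _ in range(3):  # warm-up so one-time costs don't skew the timing
            model(x)
        iters = 20
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        elapsed = time.perf_counter() - start

    print(f"throughput: {iters * x.shape[0] / elapsed:.1f} images/sec")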

[Figure: Throughput performance on TensorFlow 2.7]

Key Benefits

Ampere AI optimized frameworks combined with Ampere Altra Max deliver disruptive value on MLPerf benchmarks and more:

  • Predictable Performance: Up to 5X higher throughput than AWS Graviton using FP16
  • Predictable Performance: More than 2X higher throughput than the x86 competition using FP16

System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere. The system configurations and components used in our testing are detailed here.

Downloads

Ampere Optimized AI Software

Ampere Altra and Ampere Altra Max, with high-performance Ampere optimized frameworks, offer best-in-class artificial intelligence inference performance for frameworks including TensorFlow, PyTorch, and ONNX Runtime. The Ampere Model Library (AML) offers pretrained models to help accelerate AI development.

Ampere Optimized PyTorch
Ampere's inference acceleration engine is fully integrated with the PyTorch framework. PyTorch models and software written with the PyTorch API run as-is, without any modifications.
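
For example, a completely standard script such as the sketch below runs unchanged (torchvision and ResNet-50 are illustrative assumptions; nothing in it is Ampere-specific):

    import torch
    import torchvision.models as models  # torchvision assumed to be installed

    # Ordinary PyTorch inference; with Ampere Optimized PyTorch installed,
    # this exact script is accelerated with no code changes.
    model = models.resnet50(weights=None).eval()  # weights=None skips the download
    x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image

    with torch.no_grad():
        logits = model(x)
    print(logits.argmax(dim=1))
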
Ampere Optimized TensorFlow
Ampere's inference acceleration engine is fully integrated with the TensorFlow framework. TensorFlow models and software written with the TensorFlow API run as-is, without any modifications.
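
Likewise, ordinary Keras inference code runs unchanged; the ResNet50 application and random input below are illustrative assumptions:

    import numpy as np
    import tensorflow as tf

    # Ordinary TensorFlow/Keras inference; unchanged on Ampere Optimized TensorFlow.
    model = tf.keras.applications.ResNet50(weights=None)  # weights=None skips the download
    x = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in for a real image batch
    preds = model.predict(x)
    print(preds.shape)  # (1, 1000)
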
Ampere Optimized ONNX Runtime
Ampere's inference acceleration engine is fully integrated with the ONNX Runtime framework. ONNX models and software written with the ONNX Runtime API run as-is, without any modifications.
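
For instance, a stock ONNX Runtime session like the sketch below needs no changes; "model.onnx" is a placeholder path and the input shape is an assumption about the exported model:

    import numpy as np
    import onnxruntime as ort

    # Ordinary ONNX Runtime inference; unchanged on Ampere Optimized ONNX Runtime.
    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
    outputs = sess.run(None, {name: x})
    print(outputs[0].shape)
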
Ampere Model Library (AML)
Ampere Model Library (AML) is a collection of AI model architectures that handle the industry's most demanding workloads. Access the open AML GitHub repository to validate the performance of Ampere AI with optimized frameworks on our Ampere Altra family of cloud-native processors.
How It Works

Ampere Optimized Framework Components

Ampere optimized frameworks, built around an inference acceleration engine, offer significant benefits. Click here to view the demo.

[Figure: Ampere AI optimized framework architecture layers]

Ampere helps customers achieve superior performance for AI workloads by integrating optimized inference layers into common AI frameworks.

This seamless integration into any AI framework accelerates inference without any accuracy loss, conversions, or model retraining. The architecture is diagrammed in the figure above. The main components are as follows:

  • Framework Integration Layer: Provides full compatibility with popular developer frameworks. Software works with the trained networks “as is”. No conversions or approximations are required.
  • Model Optimization Layer: Implements techniques such as structural network enhancements, changes to the processing order for efficiency, and data flow optimizations, without accuracy degradation.
  • Hardware Acceleration Layer: Includes a “just-in-time” optimization compiler that utilizes a small number of microkernels optimized for Ampere processors. This approach allows the inference engine to deliver high performance and support multiple frameworks.
FAQs

Ampere AI FAQ

Testing and Regression

Solutions and Regression Testing

Frameworks

Regression testing of the currently available Ampere Optimized AI images

Recommended Systems

Ampere Altra Systems

These systems, built around Ampere Altra and Ampere Altra Max, are flexible enough to meet the needs of any cloud deployment and come packed with Ampere's 80-core Altra or 128-core Altra Max processors.

Learn More
