
Ampere® AI



Solutions for AI on Ampere® Altra®

Ampere® platforms are the best choice for Artificial Intelligence, from training to inference. For CPU-based inference, Ampere AIO provides the best results for workloads using common AI frameworks such as TensorFlow and PyTorch. Ampere processors are also an ideal fit for high-performance training and inference when used in conjunction with GPUs or other accelerators.

Ampere AI is our new line of tools for enhancing the performance and efficiency of CPU-based inference. Check back frequently, as this line will expand in the future.

Ampere AI Solutions Downloads

Ampere AIO Supported Frameworks

Ampere® Altra®, with the high-performance Ampere® AI Optimizer, offers best-in-class Artificial Intelligence inference performance for standard frameworks including TensorFlow and PyTorch.

AIO for PyTorch

The Ampere® AIO inference acceleration engine is fully integrated with the PyTorch framework. PyTorch models and software written with the PyTorch API run as-is, without any modifications.
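As an illustration, a standard PyTorch inference script such as the following sketch is expected to run unmodified on an AIO-enabled PyTorch build; nothing in the script references AIO, and the ResNet-50 model here is just an example choice:

```python
import torch
import torchvision.models as models

# A standard, unmodified PyTorch inference script. With an AIO-enabled
# PyTorch build installed, this same code runs accelerated on Ampere CPUs.
model = models.resnet50(pretrained=True)
model.eval()

# Dummy batch: one 224x224 RGB image
batch = torch.rand(1, 3, 224, 224)

with torch.no_grad():
    output = model(batch)

print(output.argmax(dim=1))  # predicted ImageNet class index
```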

Click here to download a detailed document on running AIO PyTorch on Altra/Altra Max machines.

To download, accept the End User License Agreement (EULA). The download link will be sent via email.

AIO for TensorFlow

The Ampere® AIO inference acceleration engine is fully integrated with the TensorFlow framework. TensorFlow models and software written with the TensorFlow API run as-is, without any modifications.
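As with PyTorch, a standard TensorFlow inference script such as the following sketch should run unmodified on an AIO-enabled TensorFlow build (the Keras ResNet-50 model is just an example choice):

```python
import numpy as np
import tensorflow as tf

# A standard, unmodified TensorFlow/Keras inference script. With an
# AIO-enabled TensorFlow build installed, this same code runs accelerated
# on Ampere CPUs.
model = tf.keras.applications.ResNet50(weights="imagenet")

# Dummy batch: one 224x224 RGB image
batch = np.random.rand(1, 224, 224, 3).astype(np.float32)

predictions = model.predict(batch)
print(tf.keras.applications.resnet50.decode_predictions(predictions, top=1))
```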

Click here to download a detailed document on running AIO TensorFlow on Altra/Altra Max machines.

To download, accept the End User License Agreement (EULA). The download link will be sent via email.

Ampere AIO Components

How it Works

By deploying an optimized software inference accelerator, Ampere AIO offers significant performance benefits without diverging from common open-source AI frameworks. Check out the details below.

Ampere AIO helps customers achieve superior performance for AI workloads by integrating optimized inference layers into common AI frameworks.

This seamless framework integration accelerates inference without any accuracy loss, conversions, or model retraining. The main components of AIO are as follows:

  • Framework Integration Layer: Provides full compatibility with popular developer frameworks. AIO works with the trained networks “as is”. No conversions or approximations are required.
  • Model Optimization Layer: Implements techniques such as structural network enhancements, changes to the processing order for efficiency, and data flow optimizations, without accuracy degradation.
  • Hardware Acceleration Layer: Includes a “just-in-time” optimization compiler that utilizes a small number of microkernels optimized for Ampere processors. This approach allows the inference engine to deliver high performance and support multiple frameworks.


Ampere® AIO FAQ

What is Ampere AI?

Ampere AI is a family of software tools that optimizes the processing of AI and ML inference workloads on Ampere processors. With Ampere AI tools, CPU-based inference workloads can take advantage of the cost, performance, scalability, and power efficiency of Ampere processors, while users continue to program with common, standard AI frameworks.

What is Ampere AI Optimizer (AIO)?

AIO is a library integrated into many common AI frameworks such as TensorFlow, PyTorch, and ONNX Runtime. AIO accelerates inference without any accuracy loss, conversions, or model retraining. No API changes are required, and it works out of the box for any neural network and workload built for the supported standard frameworks.

What is Ampere Model Library (AML)?

AML is a collection of AI models for computer vision, natural language processing, and recommendation engine applications. The models have been pretrained on standard datasets and are available for our customers and partners to quickly build into their applications.

How do I get started with AIO?

We offer ready-to-use Docker images that can be pulled from Ampere, with code snippets and documentation; the download link is provided after accepting the End User License Agreement. Each Docker image includes a standard ML framework (TensorFlow, etc.) preinstalled with AIO and runs on any Ampere processor. You can run your inference scripts without change. Example models for tasks such as image classification and object detection are provided with the image.

Which Python version do the AIO Docker images use?

The AIO Docker images are pre-configured with Python 3.8.

Which Linux distribution are the images based on?

The images are currently based on Ubuntu 20.04. We will continue to expand support to other Linux distributions.

Can an application control AIO behavior?

AIO allows an application to control its behavior, such as the number of AIO threads and CPU binding, through a set of environment variables. Please refer to the AIO documentation for details.
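As a sketch of how such configuration might look in practice, the snippet below sets environment variables before the framework is imported. The variable names here are placeholders, not the documented AIO names; consult the AIO documentation for the actual set:

```python
import os

# NOTE: AIO_NUM_THREADS and AIO_CPU_BIND are hypothetical names used for
# illustration; refer to the AIO documentation for the real variable names.
# Configuration read from the environment must be set before the
# AIO-enabled framework is imported.
os.environ["AIO_NUM_THREADS"] = "16"  # assumed: number of AIO worker threads
os.environ["AIO_CPU_BIND"] = "1"      # assumed: pin worker threads to CPU cores

import torch  # imported after configuration so the settings take effect
```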

Does AIO accelerate training?

AIO accelerates inference workloads only. Ampere does, however, support GPU-accelerated training on Ampere® Altra® processors.

Where can I get support?

Please contact your Ampere sales representative, who can help you find more information. If you are using the software through a third party, please contact their customer support team for help. Alternatively, you can contact the AI team at ai-support@amperecomputing.com.

Testing and Regression on Ampere Solutions Portal

Current Regression Testing

Ampere AIO on Common Inference Frameworks

Regression of the current AIO images available from Ampere

Results reported as "Unverified" indicate that we were unable to collect a result due to an issue within our test infrastructure. When we root-cause an Unverified result, the write-up will appear in the Test Notes section of the solution page for the software under test. An Unverified test result does not imply an issue with the software under test; it means only that Ampere was unable to confirm one or more steps in our verification process.