Ampere and OnSpecta have collaborated to deliver world-class AI solutions. Specializing in custom-tailored compilation layers for a variety of silicon architectures, the OnSpecta DLS abstraction layer can deliver significant performance gains over typical non-optimized ML frameworks. On Ampere® Altra®, those gains can reach 4-6x, depending on the software stack and workload demands. See below for more on this AI technology collaboration.

Related Tags
AI Inference, ML


OnSpecta’s Inference Engine (DLS) optimizes the performance of trained neural networks. Customers can achieve up to 10x performance acceleration for AI workloads by deploying DLS across their existing compute environment, which can include CPUs, GPUs, and AI chips. DLS is a seamless binary drop-in library for any AI framework (e.g., TensorFlow or ONNX) that accelerates inference without any accuracy loss, conversions, or model retraining. No API changes are required, and it works with all neural network architectures (including image classification, object detection, language, and recommendation models).

Custom layers are also supported. DLS delivers best-in-class inference performance (latency or throughput) and performance per watt, and features a small memory footprint. OnSpecta’s DLS now offers industry-leading inference performance through optimizations made for the highly scalable Ampere® Altra® processor family.
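The "binary drop-in, no API changes" model described above is commonly realized by loading an optimized shared library underneath the framework, for example via `LD_PRELOAD` on Linux. The library path and filename below are hypothetical placeholders for illustration, not OnSpecta's actual artifact names; the point is that the existing inference script itself stays unchanged.

```shell
# Configuration sketch of a drop-in deployment, assuming DLS ships as a
# Linux shared library. The path and filename are hypothetical placeholders.
export LD_PRELOAD=/opt/onspecta/lib/libdls.so  # hypothetical location

# The existing inference script runs unmodified: no API changes,
# no model conversion, no retraining.
python run_inference.py
```

Because the library is injected at load time, the same mechanism works whether the application calls TensorFlow, ONNX, or another supported framework.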


Resources and Test Results
