The Kinara Ara-2 Edge AI processor is a high-performance inference engine for demanding workloads, including video analytics, Large Language Models (LLMs) and other Generative AI models, without the need for data-centre equipment. Its latency-optimised design combines balanced on-chip memories with high off-chip bandwidth to execute very large models with extremely low latency.
LLMs and Generative AI in general have become incredibly popular, but most of the associated applications are running on GPUs in data centres and are burdened with high latency, high cost and questionable privacy.
The Ara-2 also offers secure boot, encrypted memory access, and a secure host interface to enable highly secure enterprise AI deployments. It is supported by a comprehensive SDK that includes a model compiler and computing-unit scheduler, flexible quantisation options spanning the integrated Kinara quantiser as well as support for pre-quantised PyTorch and TFLite models, a load balancer for multi-chip systems, and a dynamically moderated host runtime.
Ara-2 is available as a stand-alone device, a USB module, an M.2 module, and a PCIe card featuring multiple Ara-2 devices.