The Kinara Ara-2 Edge AI processor is a high-performance inference engine for demanding workloads, including video analytics, Large Language Models (LLMs) and other Generative AI models, without the need for data-centre equipment. Its latency-optimised design combines balanced on-chip memories with high off-chip bandwidth to execute very large models with extremely low latency.
LLMs and Generative AI in general have become incredibly popular, but most of the associated applications are running on GPUs in data centres and are burdened with high latency, high cost and questionable privacy.
The Ara-2 also offers secure boot, encrypted memory access, and a secure host interface to enable highly secure enterprise AI deployments. It is supported by a comprehensive SDK that includes a model compiler and computing-unit scheduler, flexible quantisation options spanning the integrated Kinara quantiser as well as support for pre-quantised PyTorch and TFLite models, a load balancer for multi-chip systems, and a dynamically moderated host runtime.
Ara-2 is available as a stand-alone device, a USB module, an M.2 module, and a PCIe card featuring multiple Ara-2 devices.