HyperAccel

A Novel Silicon Solution
for Generative AI
with Ultimate Sustainability

LPU. Redefining AI Performance
at the Silicon Core.

At HyperAccel, we engineer next-generation AI semiconductors that deliver unparalleled
performance, efficiency, and scalability — empowering the future of intelligent computing.

AI-Specialized Core

A coarse-grained, programmable processor in which each core can accelerate every operation of LLM inference for high performance.

Streamlined Dataflow

Memory bandwidth and compute are precisely matched, and model parameters are reused across cores, for maximum efficiency.
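
Why matching bandwidth to compute matters: during autoregressive decoding, every model weight is streamed from memory once per generated token, so memory bandwidth, not peak FLOPS, usually sets the throughput ceiling. A back-of-the-envelope sketch with illustrative numbers (not HyperAccel specifications):

```python
# Illustrative estimate of the bandwidth ceiling on LLM decoding throughput.
# All figures below are assumptions for the sketch, not HyperAccel specs.
params = 7e9             # 7B-parameter model
bytes_per_param = 2      # FP16/BF16 weights
bandwidth = 460e9        # bytes/s of memory bandwidth (illustrative)

bytes_per_token = params * bytes_per_param   # each weight read once per token
tokens_per_s = bandwidth / bytes_per_token
print(f"Bandwidth-bound ceiling: ~{tokens_per_s:.0f} tokens/s")  # ~33 tokens/s
```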

Multi-Chip Scalability

Communication is overlapped with computation as AI models grow in size, enabling near-perfect scaling across multiple chips.
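
The general technique behind this claim, issuing communication asynchronously so the interconnect works while the cores keep computing, can be sketched with standard PyTorch collectives. This is a stand-in for, not a depiction of, HyperAccel's multi-chip interconnect.

```python
# A minimal sketch of communication/computation overlap, using PyTorch
# collectives as a stand-in for an LPU interconnect (an assumption, not
# HyperAccel's actual API). Run with: torchrun --nproc_per_node=2 overlap.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="gloo")  # CPU-friendly backend
    x = torch.randn(1024, 1024)              # activations to synchronize
    w = torch.randn(1024, 1024)              # local weights

    # Launch the all-reduce without blocking...
    handle = dist.all_reduce(x, op=dist.ReduceOp.SUM, async_op=True)
    # ...and do useful compute while the transfer is in flight.
    y = w @ w
    handle.wait()   # make sure the reduced activations have arrived
    out = x @ y     # consume both results
    print(f"rank {dist.get_rank()}: result {tuple(out.shape)}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```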

Traditional Dataflow

Hierarchical with limited data reuse

vs.

LPU Dataflow

Streamlined with maximum data reuse

AI-Optimized Solutions,
Purpose-Built for Every Scale

From edge devices to cloud datacenters, we deliver a full spectrum of AI semiconductor products.

Bertha 500

Enabling Generative AI with Unprecedented Efficiency

Optimized for high-throughput, cost-effective AI inference by integrating the LPU with Samsung LPDDR5X memory

Forte 55X

Running Generative AI at Unrivaled Speed

Optimized for low-latency, power-efficient AI inference by pairing LPU technology with the HBM-equipped AMD Alveo U55C high-performance compute card

HX-F55X

Enabling On-Premise AI with Unmatched Affordability

A HyperAccel Xceleration (HX) system, formerly called Orion, that combines Forte 55X cards with an ASUS server for high-speed datacenter inference

Built for AI Services,
Fully Compatible with
Leading AI Frameworks

To bridge AI applications and the LPU, we develop a user-friendly software platform.

  • Provides a standardized ecosystem for Generative AI inference
  • Supports LLM inference and model frameworks such as vLLM and HuggingFace (see the sketch after this list)
  • Accommodates all transformer-based LLMs (e.g., GPT, Llama, Qwen, Mistral, Grok, DeepSeek, Falcon, Gemma) and multi-modal models
  • Provides PyTorch support and a Python-embedded domain-specific language (eDSL) for authoring high-performance, efficient LPU kernels
  • Offers a developer page and model zoo for easy model compilation
  • Implements device runtime and driver to create and execute binaries on the LPU
  • Enables seamless LLM inference experience for developers familiar with GPUs
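
As a concrete example of the framework compatibility described above, the sketch below uses vLLM's public Python API. How an LPU backend would be registered and selected is an assumption here, so only the framework-level call pattern that a compatible device plugs into is shown; the model name is illustrative.

```python
# Standard vLLM inference flow; a vLLM-compatible backend (a GPU today, an
# LPU via a device plugin hypothetically) executes the model underneath.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")   # any HF-hosted LLM
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["What is a Latency Processing Unit?"], params)
print(outputs[0].outputs[0].text)
```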