8×8 Systolic Hardware Matrix Multiplier in Stratix-10 FPGA
This video showcases the Maitrix Systolic Matrix Multiplier, an advanced hardware accelerator built on a novel carry-free Residue Number System (RNS) architecture. Demonstrated at SC22, this design pushes the boundaries of matrix computation speed, precision, and efficiency. The architecture divides each matrix operation across eight parallel “digit matrix multipliers”, one for each residue digit of the RNS representation. This enables modular, carry-free computation, dramatically reducing multiplier complexity and internal bus widths while increasing throughput.
In this demo:
- Single-precision floating-point values are converted to RNS for matrix multiplication.
- Internal accumulation is performed entirely in modular form, with no large accumulators required.
- Final rounding is deferred until the full dot product completes, enabling precision that often exceeds traditional FP32 results.
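The digit-parallel idea can be illustrated in software. The following is a minimal sketch, not the Maitrix design: the eight moduli, the 12-bit input scaling, and every function name are assumptions chosen for illustration. Each residue digit gets its own small matrix multiply that never exchanges carries with the others, the accumulator in each digit stays below its modulus, and the only rounding after input quantization happens once, when the result is converted back.

```python
from math import prod

# Eight pairwise-coprime moduli, one per residue digit. These values are
# hypothetical; the moduli used in the demo are not stated.
MODULI = [233, 239, 241, 247, 251, 253, 255, 256]
M = prod(MODULI)
FRAC_BITS = 12  # assumed input scaling, not taken from the demo


def to_rns(x):
    """Quantize x to a scaled integer, then split it into residue digits."""
    n = round(x * (1 << FRAC_BITS))
    return [n % m for m in MODULI]  # Python's % is non-negative, so n < 0 maps to n mod M


def digit_matmul(A, B, m):
    """Multiply one residue-digit plane; every intermediate stays below m."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for t in range(inner):
                acc = (acc + A[i][t] * B[t][j]) % m  # small modular accumulator
            C[i][j] = acc
    return C


def from_rns(digits):
    """Chinese Remainder Theorem reconstruction, then a single final rounding."""
    x = sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(digits, MODULI)) % M
    if x > M // 2:  # upper half of the range encodes negative values
        x -= M
    return x / (1 << (2 * FRAC_BITS))  # products carry twice the input scaling


def rns_matmul(A, B):
    A_rns = [[to_rns(x) for x in row] for row in A]
    B_rns = [[to_rns(x) for x in row] for row in B]
    planes = []
    for d, m in enumerate(MODULI):  # one independent multiplier per digit
        Ad = [[e[d] for e in row] for row in A_rns]
        Bd = [[e[d] for e in row] for row in B_rns]
        planes.append(digit_matmul(Ad, Bd, m))
    rows, cols = len(A), len(B[0])
    return [[from_rns([p[i][j] for p in planes]) for j in range(cols)]
            for i in range(rows)]


A = [[1.5, -2.25], [0.5, 4.0]]
B = [[2.0, 0.25], [-1.0, 3.0]]
print(rns_matmul(A, B))  # [[5.25, -6.375], [-3.0, 12.125]]
```

In hardware the eight digit planes would run side by side; the loop over `MODULI` here only stands in for that parallelism.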
Fault Tolerant Product Summation in Arria-10 FPGA
This video highlights Maitrix’s breakthrough in error-correcting arithmetic using a carry-free, modular Residue Number System (RNS). Filmed at SC22, the demonstration shows how fault-tolerant, high-speed computation can be achieved using RNS-based architecture — a major leap forward for AI, aerospace, and mission-critical systems where silent data corruption is unacceptable.
Hosted on an Intel Arria-10 FPGA and clocked beyond 350 MHz, the demo features a fully pipelined arithmetic engine that performs:
- Binary fixed-point (32.32) to RNS conversion
- Modular dot products (15 accumulations, each with 16 multiplicands)
- Real-time error injection (via fault circuits and deliberate overclocking)
- Automatic error detection and correction using RNS redundancy
- Conversion back to binary and visual display of final results
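The conversion steps in this list depend on the RNS range being wide enough for the whole summation. As a rough sketch under stated assumptions: a signed 32.32 operand occupies 64 bits, a product needs about 127 bits, and a sum of 16 such products needs a signed range beyond 2^131. The moduli below (numbers of the form 2^p - 1 with distinct prime p, which are pairwise coprime) are a hypothetical choice that clears that bound. The function names are illustrative, and reading "16 multiplicands" as a 16-term dot product is an assumption.

```python
from math import prod

FRAC = 32  # fractional bits of the 32.32 format
# Hypothetical moduli: 2**p - 1 for distinct primes p are pairwise coprime.
MODULI = [2**p - 1 for p in (13, 17, 19, 23, 29, 31, 37)]
M = prod(MODULI)
assert M > 2**131  # enough signed range for 16 products of 64-bit operands


def fixed_from_float(x):
    """Quantize a float to a signed 32.32 integer."""
    return round(x * (1 << FRAC))


def to_rns(n):
    """Binary to RNS: one independent residue per modulus."""
    return [n % m for m in MODULI]


def rns_dot(xs, ys):
    """Modular multiply-accumulate, digit by digit, with no carries between digits."""
    acc = [0] * len(MODULI)
    for x, y in zip(xs, ys):
        acc = [(a + xd * yd) % m for a, xd, yd, m in zip(acc, x, y, MODULI)]
    return acc


def from_rns(digits):
    """RNS to binary via the Chinese Remainder Theorem, mapped to a signed value."""
    x = sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(digits, MODULI)) % M
    return x - M if x > M // 2 else x


def to_fixed_32_32(n):
    """Round a value with 64 fractional bits back to 32.32 (ties round up)."""
    return (n + (1 << (FRAC - 1))) >> FRAC


xs = [fixed_from_float(i + 0.5) for i in range(16)]
ys = [fixed_from_float(0.25 if i % 2 == 0 else -0.25) for i in range(16)]
acc = rns_dot([to_rns(x) for x in xs], [to_rns(y) for y in ys])
result = to_fixed_32_32(from_rns(acc))
print(result / 2**FRAC)  # -2.0
```

The rescaling from 64 fractional bits back to 32 is done after reconstruction here purely for simplicity; how the demo normalizes its results is not described in the text.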
Fault Tolerant Product Summation Tutorial
In this tutorial, Maitrix, LLC demonstrates a cutting-edge error-correcting matrix arithmetic engine powered by carry-free Residue Number System (RNS) computation. Built on an Intel Arria-10 FPGA and overclocked to 400+ MHz, this system delivers unmatched performance, fault detection, and self-healing arithmetic capabilities — all crucial for resilient AI, aerospace, and high-reliability computing.
🔍 Highlights:
- 32.32-bit signed fixed-point arithmetic accelerated via RNS conversion
- Fully pipelined multiply-and-accumulate (MAC) unit
- Real-time fault injection via overclocking and deliberately injected errors
- Detection and correction of single-digit modular errors
- High-speed throughput: up to 427 MHz observed in demo
- Visual indicators for:
- ✅ Correct computations
- 🟡 Corrected errors
- 🔴 Uncorrectable failures
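The three indicator states map naturally onto a redundant RNS check. The sketch below uses a textbook projection method, which may differ from the demo's circuit: four information moduli plus two larger redundant moduli (all values hypothetical, unsigned values only). A word whose full reconstruction lies inside the legitimate range is reported as correct. Otherwise each digit is dropped in turn, and if the remaining digits reconstruct to a legitimate value, that value is the correction. With two redundant moduli that are larger than every information modulus, any single-digit error is corrected this way.

```python
from math import prod

# Hypothetical moduli, pairwise coprime; the redundant ones must be the largest.
INFO = [7, 9, 11, 13]
REDUNDANT = [16, 17]
MODULI = INFO + REDUNDANT
LEGIT = prod(INFO)  # valid values lie in [0, LEGIT)


def encode(x):
    """Represent x with information digits plus redundant digits."""
    return [x % m for m in MODULI]


def crt(digits, moduli):
    """Chinese Remainder Theorem reconstruction over the given moduli."""
    M = prod(moduli)
    return sum(r * (M // m) * pow(M // m, -1, m)
               for r, m in zip(digits, moduli)) % M


def check(digits):
    """Return (status, value) where status mirrors the three indicators."""
    x = crt(digits, MODULI)
    if x < LEGIT:
        return "correct", x  # no error detected
    for skip in range(len(MODULI)):  # try discarding each digit in turn
        ds = digits[:skip] + digits[skip + 1:]
        ms = MODULI[:skip] + MODULI[skip + 1:]
        y = crt(ds, ms)
        if y < LEGIT:
            return "corrected", y  # the discarded digit was the faulty one
    return "uncorrectable", None


word = encode(1234)
print(check(word))  # ('correct', 1234)
word[2] = (word[2] + 5) % 11  # inject a fault into one digit
print(check(word))  # ('corrected', 1234)
```

Two simultaneous digit errors are always detected under these assumptions, but they cannot be reliably repaired: the check either reports "uncorrectable" or, in some cases, lands on a wrong legitimate value. Guaranteed double-error correction would need more redundant moduli.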
MaiTRIX provides on-line educational webinars and training at no cost. If you’re interested in learning the latest in Modular Computation on your own time and at your own pace, our on-line tutorials are an ideal choice. Corporate customers are invited!
On-line Tutorials and Webinars
Learn the inside basics of RNS-APAL, the first arbitrary precision library using residue arithmetic. Understand the basics and subtleties of Modular Computation research and our unique MaiTRIX research papers! Learn the Mod-9 architecture and how to program the Mod-9 ALU. Discover the inner workings of our advanced modular arithmetic Verilog modules. It’s all here in our extensive video tutorial library.