RNS Tensor Processor Unit (RNS TPU™)

MaiTRIX’s RNS TPU™ offers significantly higher speed, efficiency and accuracy than floating-point based TPU designs!

The RNS based matrix multiplier

MaiTRIX’s RNS TPU™ matrix multiplier technology provides the fastest, most accurate and most efficient processing of matrix multiplication in the world. Even FP16 and FP32 floating-point hardware cannot match the high accuracy, high speed and low power consumption of our technology.

Today, many top-tier companies rely on approximate computing techniques to attain greater speed and efficiency for their AI chips. The problem is that approximate computing is not accurate enough for general-purpose neural network training. MaiTRIX’s RNS TPU™ does not suffer from this problem, and is capable of higher efficiency, higher speed, and higher accuracy for AI matrix and vector operations.

Combine this fact with our unique carry-free operation and it’s no wonder our technology is a significant breakthrough in computation! Yet our RNS TPU™ IP is simple to use and integrate; it can be configured to use FP16, FP32 or FP64 floating-point data as its input and output formats!
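
The carry-free property comes from how a residue number system (RNS) represents a value: as a set of small remainders ("digits"), one per modulus, each of which is added and multiplied independently of the others. The Python sketch below illustrates that general, textbook idea with a multiply-accumulate and a dot product. It is not MaiTRIX's implementation; the moduli and the function names (`to_rns`, `rns_mac`, `from_rns`, `rns_dot`) are assumptions chosen only for illustration.

```python
from math import prod

# Hypothetical, pairwise-coprime moduli chosen only for this illustration.
MODULI = (251, 253, 255, 256)


def to_rns(x, moduli=MODULI):
    """Represent a non-negative integer as one residue (digit) per modulus."""
    return tuple(x % m for m in moduli)


def rns_mac(acc, a, b, moduli=MODULI):
    """Multiply-accumulate acc + a*b, digit by digit.

    Each digit uses only its own modulus, so no carry crosses digits.
    """
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))


def from_rns(digits, moduli=MODULI):
    """Recover the integer with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for d, m in zip(digits, moduli):
        Mi = M // m
        total += d * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M


def rns_dot(xs, ys, moduli=MODULI):
    """Dot product accumulated entirely in RNS, converted back once at the end."""
    acc = to_rns(0, moduli)
    for x, y in zip(xs, ys):
        acc = rns_mac(acc, to_rns(x, moduli), to_rns(y, moduli), moduli)
    return from_rns(acc, moduli)
```

For example, `rns_dot([3, 1000, 250], [7, 2000, 4])` reconstructs the ordinary dot product 2001021, because that value is smaller than the product of the four moduli. In hardware, each digit position could in principle be its own narrow multiplier and adder running in parallel.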


Technologies for implementation

The RNS TPU™ can be deployed on high-end FPGA devices, providing a significant performance increase for hardware matrix multiplication. MaiTRIX’s IP is incredibly flexible and scalable, providing even higher relative throughput when synthesized using conventional ASIC or custom IC technologies. Key parameters of the RNS TPU™ can be adjusted; for example, the RNS digit width may be adjusted to suit hard or soft multipliers with operands from 6 bits to 18 bits wide, thereby allowing very high-speed operation. The number of digits can be adjusted as well, thereby providing extreme numeric precision! Adding more RNS digits also provides unprecedented summation distance for large matrices without reduction of speed!
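
To make the effect of these two parameters concrete, the Python sketch below estimates how digit width and digit count set the dynamic range of a generic RNS. It reads "summation distance" as the number of full-scale products that can be accumulated before the range is exceeded, which is an interpretation rather than a MaiTRIX definition. The choice of moduli (the largest primes below 2^digit_bits) and all function names are likewise assumptions for illustration, not the moduli used in the RNS TPU™.

```python
from math import log2, prod


def is_prime(n):
    """Trial-division primality test; adequate for moduli up to 18 bits."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def pick_moduli(digit_bits, num_digits):
    """Return the largest `num_digits` primes below 2**digit_bits.

    Distinct primes are pairwise coprime, as an RNS requires.
    This selection rule is an assumption made for the sketch.
    """
    moduli = []
    candidate = (1 << digit_bits) - 1
    while len(moduli) < num_digits and candidate >= 2:
        if is_prime(candidate):
            moduli.append(candidate)
        candidate -= 1
    if len(moduli) < num_digits:
        raise ValueError("not enough primes of that width")
    return moduli


def dynamic_range_bits(moduli):
    """Equivalent binary width of the RNS range (log2 of the moduli product)."""
    return log2(prod(moduli))


def max_accumulations(moduli, operand_bits):
    """Number of worst-case unsigned products that fit in the range when summed."""
    max_product = ((1 << operand_bits) - 1) ** 2
    return (prod(moduli) - 1) // max_product
```

Under these assumptions, two 8-bit digits cannot hold even one worst-case 8-bit by 8-bit product (`max_accumulations(pick_moduli(8, 2), 8)` is 0), while each added digit multiplies the range by roughly 2^8. Since every digit is processed independently, the per-digit operation itself does not get wider as digits are added.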

For off-the-shelf solutions, our RNS TPU™ IP cores are tailored to run on Arria 10, Stratix 10, and Agilex FPGA devices from Intel. Demonstration cores are available for popular development boards from Intel and Terasic. MaiTRIX IP operates on a wide range of accelerator cards supporting Intel FPGAs, serving applications that demand high-capacity processing, high-bandwidth memory and high-speed communication interfaces.

Applications for the RNS TPU™

The RNS TPU™ can increase speed, improve accuracy and reduce power consumption for convolutional neural networks, convolutional image processing, high-speed radar applications, autonomous vehicles, space-based applications, scientific processing applications, weather and turbulence modeling, and many more applications!


Because of the carry-free nature of modular computation, our technology is ideally suited to implementation in 3-D IC technologies and to advanced quantum computing applications! For extreme-reliability applications, such as deep space exploration, please see our error-correcting EC-TPU™! These technologies may be leveraged or simulated for the development of advanced hybrid computers of the future!


Access our public RNS-TPU research papers

Access our RNS-TPU via the cloud!

Preliminary Specifications for Arria 10 based RNS TPU 1.0

See Erica, our Electronic Residue Integration and Computation Accelerator!

* RNS TPU™ is a trademark of MaiTRIX, LLC

*Arria, Stratix and Agilex are trademarks of Intel Corporation.

*RNS-TPU technologies are inventions of MaiTRIX, LLC and are protected by the following US patents: 10,992,314, 10,649,737, 10,649,736, 10,599,398, 10,387,122, 9,712,185, 9,395,952, 9,081,608, 9,311,050, and by CA2868833A1.