Robotics Society of Singapore



Neuromorphic Chip Gets $1 Million in Pre-Orders

13 Aug 2022 8:31 AM | Anonymous

Neuromorphic computing company GrAI Matter has $1 million in pre-orders for its GrAI VIP chip, the company told EE Times.

To date, the startup has engagements with consumer Tier-1s, module makers (including ADLink, Framos, and ERM), U.S. and French government research, automotive Tier-1s and system integrators, white-box suppliers, and distributors.

As with previous generations of the company's Neuron Flow core, the GrAI VIP chip uses concepts from event-based sensing and sparsity to process image data efficiently. A stateful neuron design (one that remembers the past) processes only the information that has changed between one video frame and the next, avoiding repeated work on unchanged parts of the image. Combined with a near-memory compute/dataflow architecture, the result is low-latency, low-power, real-time computer vision.
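The frame-to-frame delta idea can be sketched in a few lines. Everything here (the function name, the threshold value) is illustrative, not GrAI Matter's actual API; it only shows how a stored state lets a system touch just the pixels that changed:

```python
import numpy as np

def event_driven_update(prev_frame, new_frame, state, threshold=8):
    """Illustrative sketch: pixels whose value changed by more than
    `threshold` fire an 'event' and are reprocessed; everything else
    reuses the stored state (the stateful neuron remembering the past)."""
    # Cast to a wider type so the subtraction cannot wrap around.
    delta = np.abs(new_frame.astype(np.int16) - prev_frame.astype(np.int16))
    events = delta > threshold           # sparse mask of changed pixels
    state = state.copy()
    state[events] = new_frame[events]    # update state only where events fired
    work_fraction = events.mean()        # share of pixels actually processed
    return state, work_fraction

prev = np.zeros((4, 4), dtype=np.uint8)
new = prev.copy()
new[0, 0] = 200                          # a single pixel changes
state, frac = event_driven_update(prev, new, prev.copy())
print(frac)  # 0.0625 -- only 1 of 16 pixels triggers any work
```

On a mostly static scene the work fraction stays near zero, which is the source of the power savings the article describes.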

The company's first-generation chip, GrAI One, was launched in autumn 2019. A second generation was produced solely for a project GrAI Matter worked on with the U.S. government, making GrAI VIP a third-generation product.

GrAI VIP can run MobileNetv1-SSD at 30 fps while consuming 184 mW, around 20× the inferences per second per watt of a comparable GPU, the company said, adding that optimizations in sparsity and voltage scaling could improve this figure further.
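As a quick sanity check on that efficiency figure (simple arithmetic on the numbers quoted above, not a vendor benchmark):

```python
# 30 inferences per second at 184 mW (0.184 W), per the figures above.
fps = 30
power_w = 0.184
inf_per_s_per_w = fps / power_w           # inferences per second per watt
print(round(inf_per_s_per_w))             # ~163 inferences/s/W

# The claimed ~20x advantage implies a comparable GPU at roughly:
gpu_inf_per_s_per_w = inf_per_s_per_w / 20  # ~8 inferences/s/W
```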

The GrAI VIP chip is an SoC pairing an updated version of the company's neuron flow fabric with dual Arm Cortex-M7 CPUs (including DSP extensions) for pre- and post-processing. It has dual MIPI Rx/Tx camera interfaces.

“It’s about moving on to a new application case of AI,” GrAI Matter CEO Ingolf Held told EE Times. “Today, most of the world cares about understanding audio and video, and you get metadata out of it. So, nobody really cares what happened to the original feed, not really. All the architectures basically cram as many MACs into their architecture with as little precision as possible to basically get to the metadata. But that only brings us so far… We want to transform the audio and video experience for the consumer at home and in the workplace. And in order to transform it, you need a different architecture. The architecture has much different requirements to satisfy in terms of latency, in terms of quality, the metrics are very different.”

The key upgrade to the company's neuron flow fabric in this third generation is that the core is now FP16-capable, explained Mahesh Makhijani, VP of business development at GrAI Matter Labs. This is unusual for an endpoint chip, where precision is usually reduced as much as possible to save power.

“All our MAC operations are done in 16-bit floating point, which is kind of unique compared to pretty much any other edge architecture out there,” Makhijani said. “A lot of people trade off for power and efficiency by going to 8-bit INT… with sparsity and event-based processing, we had to do 16-bit floating point just because we keep track of what’s happened in the past. But we essentially come out ahead, because there’s so much to be gained that the 16-bit floating point is not an overhead for us. And in fact, it helps us quite a bit in some key use cases in terms of real-time processing.”

This has advantages from a development standpoint, too. Models trained in 32-bit floating point can be quantized to 16-bit floating point while typically losing less than one percentage point of accuracy (typical INT8 quantization would lose two to three percentage points, Makhijani said). Quantized models therefore don't need retraining, cutting out a step that can consume significant development time.
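The reason FP32-to-FP16 conversion needs no retraining is that it is a plain cast with a small, bounded round-off. A minimal NumPy sketch (synthetic weights, purely illustrative) shows the relative error staying below FP16's half-ULP bound of about 2⁻¹¹ ≈ 5×10⁻⁴:

```python
import numpy as np

# Synthetic "weights" kept away from zero so every value sits in FP16's
# normal range and the relative-error bound applies cleanly.
rng = np.random.default_rng(0)
weights_fp32 = rng.uniform(-1.0, 1.0, 10_000).astype(np.float32)
weights_fp32 = weights_fp32[np.abs(weights_fp32) > 1e-3]

# Post-training quantization to FP16 is just a cast -- no retraining step.
weights_fp16 = weights_fp32.astype(np.float16)

# FP16 keeps a ~10-bit mantissa, so rounding error is at most ~2**-11
# relative, which is why task accuracy typically drops under one point.
rel_err = np.abs(weights_fp16.astype(np.float32) - weights_fp32) / np.abs(weights_fp32)
print(rel_err.max() < 1e-3)  # True
```

INT8 quantization, by contrast, needs per-tensor scale factors and usually calibration or retraining to recover the lost accuracy, which is the development-time cost Makhijani refers to.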

“If you want to maximize the throughput relative to power consumption, accuracy can be sacrificed to some extent, especially for detection tasks… but there is a trade-off in terms of training time; you will consistently spend a lot more time training models,” Makhijani said. “It adds up when situations change in the market and you need to retrain.”

GrAI Matter offsets the power cost of the higher-precision MACs with its energy-saving, event-based processing and sparsity techniques. And since the higher precision preserves accuracy better, models can be pruned more aggressively, shrinking them for a given prediction accuracy.

For example, for ResNet-50 trained on the ImageNet dataset, quantizing from FP16 to FP8 with pruning reduced the model size from 51.3 MB to 5.8 MB (about 9×) while preserving accuracy to within 0.5%. This is possible without removing layers, branches, or output classes. The size could be reduced further using mixed precision (i.e., a combination of FP4 and FP8), Makhijani said.
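The quoted sizes are roughly self-consistent. A back-of-envelope check using only the article's numbers (nothing vendor-specific) separates the contribution of the precision change from that of pruning:

```python
# ResNet-50 figures quoted above: 51.3 MB at FP16, 5.8 MB at FP8 with pruning.
fp16_size_mb = 51.3
params_millions = fp16_size_mb / 2    # 2 bytes/weight -> ~25.6 M parameters
fp8_dense_mb = params_millions * 1    # 1 byte/weight, unpruned -> ~25.6 MB
kept = 5.8 / fp8_dense_mb             # pruning keeps roughly 23% of weights
```

So halving the precision accounts for a 2× reduction, and pruning away roughly three quarters of the weights supplies the rest of the ~9×.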

GrAI Matter positions its offering between edge server chips and tinyML, though the device is intended to sit next to sensors in the system. An ideal use case would be GrAI VIP next to a camera in a compact camera module, Makhijani added.

“We are aiming to provide capabilities in the tens to hundreds of milliwatts range, depending on the use case,” Makhijani said.

Compared to the first-generation GrAI One, the third-generation GrAI VIP is slightly smaller physically at 7.6 × 7.6 mm, but the company skipped a process node, migrating to TSMC 12 nm. The chip has slightly fewer neuron cores, 144 versus 196, but each core is bigger. The result is a jump from about 200,000 neurons (250,000 parameters) to around 18 million neurons, for a total of 48 million parameters. On-chip memory has grown from 4 MB to 36 MB.

An M.2 hardware development kit featuring GrAI VIP is available now, shipping with GrAI Matter's GrAI Flow software stack and a model zoo for image classification, object detection, and image segmentation.
