How startup Kalray takes on machine learning behemoth Nvidia

December 14, 2018 // By Christoph Hammerschmidt
With a massively parallel processor architecture (up to 288 cores), startup Kalray (Grenoble, France) claims superior performance for compute-intensive real-time tasks, in particular artificial intelligence applications. In an exclusive interview with eeNews Europe, Stéphane Cordova, Vice President of Kalray's Embedded Business Unit, explains why these processors are beneficial in automotive environments.

eeNews Europe: What makes your architecture different from other multicore approaches?

Stéphane Cordova: Its high computing performance makes our manycore architecture particularly suited to real-time applications where results must be delivered within a predictable time span. This is the case in some data center applications, but above all in embedded systems with stringent requirements, such as autonomous vehicles.

eeNews Europe: Safety-critical systems must meet the stringent requirements of ISO 26262, which, among other things, mandates deterministic timing behavior. However, machine learning and determinism do not go well together.

Cordova: True, determinism and machine learning are difficult to reconcile. But basic safety does not require very sophisticated machine learning algorithms; the point is redundancy. And, by the way, a new safety standard is already under discussion that will apply to autonomous cars on top of the ISO 26262 requirements: SOTIF, "Safety Of The Intended Functionality". This standard will take these aspects into account.
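
To make the redundancy point concrete, here is a minimal sketch of a two-channel plausibility check, the kind of cross-comparison a safety monitor might perform between a complex ML channel and a simpler deterministic fallback. The function name, inputs, and threshold are illustrative assumptions, not Kalray's actual implementation.

```python
# Illustrative sketch only: a simple two-channel redundancy check.
# Channel A could be an ML perception stack, channel B a simpler,
# deterministic fallback (e.g. radar-based). Names and the threshold
# are hypothetical.

def redundant_distance_estimate(ml_estimate_m: float,
                                fallback_estimate_m: float,
                                max_divergence_m: float = 2.0) -> float:
    """Cross-check two independent obstacle-distance estimates.

    If the channels diverge too far, return the more conservative
    (smaller) value so downstream logic brakes earlier.
    """
    if abs(ml_estimate_m - fallback_estimate_m) > max_divergence_m:
        # Channels disagree: distrust the ML channel, act conservatively.
        return min(ml_estimate_m, fallback_estimate_m)
    return ml_estimate_m

# Example: ML channel sees an obstacle at 35 m, fallback at 20 m.
print(redundant_distance_estimate(35.0, 20.0))  # -> 20.0 (conservative)
```

The safety argument then rests on the simple, verifiable comparator rather than on the ML channel itself.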

eeNews Europe: What makes your architecture different from those of established market players, such as Nvidia?

Cordova: Along with the massively parallel approach, we also use on-chip memory, so all relevant data sit in close proximity to the compute units. This eliminates long data paths, with their signal propagation delays and lost clock cycles. The entire neural network resides on chip. For the data exchange between processors and memory, all on-chip, we reach transfer bandwidths of 600 gigabytes per second.
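
A rough back-of-envelope calculation illustrates why that bandwidth matters for bandwidth-bound inference. The model size and the off-chip bandwidth figure below are illustrative assumptions; only the 600 GB/s on-chip figure comes from the interview.

```python
# Back-of-envelope: time to stream a network's weights once per inference.
# Model size and off-chip bandwidth are illustrative assumptions.

WEIGHTS_BYTES = 25e6 * 2   # e.g. a ~25M-parameter net in 16-bit precision
ON_CHIP_BW = 600e9         # 600 GB/s, the on-chip figure quoted above
OFF_CHIP_BW = 25e9         # ~25 GB/s, typical of a DDR4 interface (assumed)

t_on = WEIGHTS_BYTES / ON_CHIP_BW
t_off = WEIGHTS_BYTES / OFF_CHIP_BW

print(f"on-chip : {t_on * 1e6:.0f} us per weight pass")   # ~83 us
print(f"off-chip: {t_off * 1e6:.0f} us per weight pass")  # ~2000 us
```

Under these assumptions, keeping the weights on chip cuts the per-pass memory time by roughly the bandwidth ratio, here about 24x, which is what makes predictable real-time latencies easier to guarantee.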

