AI breakthrough renders interactive 3D environments

December 03, 2018 // By Rich Pell
Nvidia (Santa Clara, CA) has introduced what it says is "groundbreaking" AI research that enables developers to render entirely synthetic, interactive 3D environments using a model trained on real-world videos. Such technology offers the potential to quickly create virtual worlds for gaming, automotive, architecture, robotics, or virtual reality.

Nvidia researchers used a neural network to render synthetic 3D environments in real time, in contrast to current methods, which require that every object in a virtual world be modeled individually - an expensive and time-consuming process. The Nvidia research instead uses models automatically learned from real video to render objects such as buildings, trees, and vehicles.

"Nvidia has been inventing new ways to generate interactive graphics for 25 years, and this is the first time we can do so with a neural network," says Bryan Catanzaro, vice president of Applied Deep Learning Research at NVIDIA, who led the team developing this work. "Neural networks - specifically generative models - will change how graphics are created. This will enable developers to create new scenes at a fraction of the traditional cost."

The research resulted in a simple driving game that allows participants to navigate an urban scene. All content is rendered interactively by a neural network that transforms sketches of a 3D world, produced by a traditional graphics engine, into video. The generative neural network learned to model the appearance of the world, including lighting, materials, and their dynamics.

Since the scene is fully synthetically generated, it can be easily edited to remove, modify, or add objects. The demo was made possible, say the researchers, by Nvidia Tensor Core GPUs.

The neural network works by first operating on high-level descriptions of a scene - for example, segmentation maps or edge maps that describe where objects are and their general characteristics, such as whether a particular part of the image contains a car or a building, or where the edges of an object are. The network then fills in the details based on what it learned from real-life videos.
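To make the idea concrete, here is a minimal sketch assuming a PyTorch-style setup: a small, hypothetical convolutional generator takes a per-pixel segmentation map of class IDs (car, building, road, and so on) and fills in an RGB frame. The architecture, layer sizes, and class count are illustrative assumptions, not Nvidia's actual model.

```python
# Hypothetical sketch: a conditional generator that "fills in the details"
# of a high-level scene description. A per-pixel segmentation map of class
# IDs is one-hot encoded and passed through a small convolutional network
# that outputs an RGB frame. Architecture and sizes are assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 35  # assumed label set, e.g. Cityscapes-style classes


class SketchToImageGenerator(nn.Module):
    """Toy encoder/decoder mapping a segmentation map to an RGB image."""

    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
            nn.Tanh(),  # RGB values in [-1, 1]
        )

    def forward(self, label_map: torch.Tensor) -> torch.Tensor:
        # label_map: (batch, height, width) tensor of integer class IDs
        one_hot = nn.functional.one_hot(label_map, NUM_CLASSES)  # (B, H, W, C)
        one_hot = one_hot.permute(0, 3, 1, 2).float()            # (B, C, H, W)
        return self.net(one_hot)


# Usage: a random 256x512 "sketch" produces one synthetic frame.
generator = SketchToImageGenerator()
sketch = torch.randint(0, NUM_CLASSES, (1, 256, 512))
frame = generator(sketch)
print(frame.shape)  # torch.Size([1, 3, 256, 512])
```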

For training, the researchers used Nvidia Tesla V100 GPUs on a DGX-1 with the cuDNN-accelerated PyTorch deep learning framework, and thousands of videos from the Cityscapes and Apolloscapes datasets.
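A toy training loop along those lines might look like the following sketch, assuming pairs of segmentation maps and real frames such as those in Cityscapes. Random tensors stand in for the data, and a simple per-pixel L1 loss stands in for the adversarial, generative training the researchers describe; none of the names or hyperparameters below come from the actual work.

```python
# Hypothetical training sketch: random tensors stand in for (segmentation
# map, real frame) pairs from a driving dataset, and a plain L1
# reconstruction loss stands in for the generative, GAN-style objective
# used in the actual research. Model, sizes, and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 35  # assumed label count

generator = nn.Sequential(  # stand-in for the real generative model
    nn.Conv2d(NUM_CLASSES, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
    nn.Tanh(),
)
optimizer = torch.optim.Adam(generator.parameters(), lr=2e-4)
l1_loss = nn.L1Loss()

device = "cuda" if torch.cuda.is_available() else "cpu"
generator.to(device)

for step in range(100):  # placeholder for iterating over real video pairs
    # Fake batch: segmentation maps and matching "real" frames in [-1, 1].
    labels = torch.randint(0, NUM_CLASSES, (4, 128, 256), device=device)
    seg = nn.functional.one_hot(labels, NUM_CLASSES).permute(0, 3, 1, 2).float()
    real_frames = torch.rand(4, 3, 128, 256, device=device) * 2 - 1

    fake_frames = generator(seg)             # render synthetic frames
    loss = l1_loss(fake_frames, real_frames) # compare to real video frames

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```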

"The capability

