AI breakthrough renders interactive 3D environments

By Rich Pell

Nvidia (Santa Clara, CA) has introduced what it describes as "groundbreaking" AI research that enables developers to render entirely synthetic, interactive 3D environments using a model trained on real-world videos. The technology offers the potential to quickly create virtual worlds for gaming, automotive, architecture, robotics, or virtual reality applications.
Nvidia researchers used a neural network to render synthetic 3D environments in real time, in contrast to current methods, which require every object in a virtual world to be modeled individually, an expensive and time-consuming process. The Nvidia approach instead uses models learned automatically from real video to render objects such as buildings, trees, and vehicles.

“Nvidia has been inventing new ways to generate interactive graphics for 25 years, and this is the first time we can do so with a neural network,” says Bryan Catanzaro, vice president of Applied Deep Learning Research at NVIDIA, who led the team developing this work. “Neural networks – specifically generative models – will change how graphics are created. This will enable developers to create new scenes at a fraction of the traditional cost.”

The research resulted in a simple driving game in which participants navigate an urban scene whose content is rendered interactively by a neural network. The network transforms sketches of a 3D world, produced by a traditional graphics engine, into video; the generative model learned to reproduce the appearance of the world, including lighting, materials, and their dynamics, from real footage.

Since the scene is fully synthetically generated, it can be easily edited to remove, modify, or add objects. The demo was made possible, say the researchers, by Nvidia Tensor Core GPUs.
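Because every frame is driven by the scene's semantic layout rather than hand-built geometry, editing the scene amounts to editing the label map. The following is a minimal NumPy sketch of that idea; the class ids are invented for illustration and are not Nvidia's actual labeling scheme:

```python
import numpy as np

# Illustrative class ids for a semantic label map (hypothetical, not Nvidia's).
ROAD, CAR = 0, 2

# A tiny 3x5 "scene": all road, with one car pixel.
seg_map = np.full((3, 5), ROAD, dtype=np.int64)
seg_map[1, 2] = CAR

# "Remove" the car by relabeling its pixels as road. A neural renderer
# conditioned on the edited map would then paint that region with the
# road appearance it learned from video.
edited = seg_map.copy()
edited[edited == CAR] = ROAD

print((seg_map == CAR).sum(), (edited == CAR).sum())  # 1 0
```

Adding or moving an object works the same way in reverse: write the object's class id into the desired pixels and re-render.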

The neural network first operates on a high-level description of the scene, such as a segmentation map or edge map that encodes where objects are and their general characteristics: whether a particular part of the image contains a car or a building, for example, or where an object's edges lie. The network then fills in the details based on what it learned from real-life videos.
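To make that input format concrete, here is a hedged NumPy sketch: a label map is expanded into a one-hot tensor, the form a segmentation-conditioned generator typically consumes. The class ids are invented, and the per-class colour lookup standing in for the generator is purely illustrative; the real network synthesizes learned textures, lighting, and dynamics rather than flat colours:

```python
import numpy as np

# Hypothetical class ids (not Cityscapes' real label scheme).
ROAD, BUILDING, CAR = 0, 1, 2
NUM_CLASSES = 3

# A 4x6 "scene": road everywhere, a building block, and one car.
seg_map = np.full((4, 6), ROAD, dtype=np.int64)
seg_map[0:2, 0:3] = BUILDING
seg_map[3, 4] = CAR

def one_hot(seg, num_classes):
    """Expand an HxW label map into an HxWxC one-hot tensor, the usual
    conditioning input for a segmentation-driven generator."""
    return np.eye(num_classes, dtype=np.float32)[seg]

cond = one_hot(seg_map, NUM_CLASSES)  # shape (4, 6, 3)

# Stand-in "generator": a fixed per-class colour palette. A trained
# network would fill in realistic detail instead, but the mapping from
# semantic layout to image follows the same shape.
palette = np.array([[90, 90, 90],    # road: grey
                    [180, 120, 80],  # building: brown
                    [200, 30, 30]],  # car: red
                   dtype=np.float32)
rgb = cond @ palette  # shape (4, 6, 3), an RGB image

print(cond.shape, rgb.shape)  # (4, 6, 3) (4, 6, 3)
print(rgb[3, 4])  # the car pixel takes the car class's appearance
```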

For training, the researchers used Nvidia Tesla V100 GPUs on a DGX-1 with the cuDNN-accelerated PyTorch deep learning framework, and thousands of videos from the Cityscapes and Apolloscapes datasets.

“The capability to model and recreate the dynamics of our visual world is essential to building intelligent agents,” say the researchers in their paper. “Apart from purely scientific interests, learning to synthesize continuous visual experiences has a wide range of applications in computer vision, robotics, and computer graphics.”

For more, see "Video-to-Video Synthesis" (PDF).

Related articles:
Nvidia AI creates realistic slow motion from standard video
New Nvidia GPU architecture achieves ‘Holy Grail’ of computer graphics
New Nvidia GPU turns PC into AI supercomputer
AI generates synthetic MRIs in medical research advance
AI-created fake fingerprints a ‘wake-up call’ for biometric systems
