AI technique helps robots learn by observing humans

Technology News
By Rich Pell

The artificial intelligence (AI) technique is presented as a first-of-its-kind deep learning-based system that can teach a robot to complete a task simply by observing the actions of a human. According to the researchers, the method is designed to improve communication between humans and robots, while furthering research that will enable people to work alongside robots seamlessly.

“For robots to perform useful tasks in real-world settings, it must be easy to communicate the task to the robot,” say the researchers in a paper on the research. “This includes both the desired result and any hints as to the best means to achieve that result. With demonstrations, a user can communicate a task to the robot and provide clues as to how to best perform the task.”

The researchers used NVIDIA TITAN X GPUs to train neural networks to perform the tasks of perception, program generation, and program execution. In their method, a camera first acquires a live video feed of a scene. The positions and relationships of objects in the scene are inferred in real time by a pair of neural networks.

The resulting percepts are fed to another network that generates a plan to explain how to recreate those perceptions. Finally, an execution network reads the plan and generates actions for the robot, taking into account the current state of the world to ensure robustness to external disturbances.
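The three stages described above can be sketched as a simple pipeline. This is only an illustration: every name below (`Percept`, `perceive`, `generate_program`, `execute`) is a hypothetical stand-in, and the learned neural networks of the actual system are stubbed out with fixed placeholders.

```python
# Illustrative sketch of the perception -> program -> execution pipeline.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Percept:
    name: str                             # object identity, e.g. "red_cube"
    position: Tuple[float, float, float]  # inferred 3D position
    on_top_of: Optional[str]              # inferred support relationship, if any

def perceive(frame) -> List[Percept]:
    """Perception stage: infer objects and relationships from a camera frame."""
    # A real system runs neural networks here; we return a fixed example scene.
    return [Percept("blue_cube", (0.0, 0.0, 0.0), None),
            Percept("red_cube", (0.0, 0.0, 0.05), "blue_cube")]

def generate_program(percepts: List[Percept]) -> List[str]:
    """Program stage: emit human-readable steps that would recreate the scene."""
    return [f"place {p.name} on {p.on_top_of}" for p in percepts if p.on_top_of]

def execute(program: List[str], world_state: dict) -> List[tuple]:
    """Execution stage: turn each step into an action, re-reading the world
    state at every step so execution stays robust to external disturbances."""
    return [("pick_and_place", step, dict(world_state)) for step in program]

program = generate_program(perceive(frame=None))
print(program)  # ['place red_cube on blue_cube']
```

Note that the intermediate program is plain text, which is what lets a user inspect and correct the robot's interpretation before anything is executed.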

As a result, say the researchers, the robot was able to learn a task from a single demonstration in the real world. Once the robot sees a task, it generates a human-readable description of the steps necessary to re-perform the task, allowing the user to identify and correct any issues with the robot’s interpretation of the human demonstration before execution by the real robot.

The key to achieving this capability, say the researchers, is leveraging the power of synthetic data, rather than the large amounts of manually labeled training data that current approaches usually require, to train the neural networks. With synthetic data generation, an almost unlimited amount of labeled training data can be produced with very little effort.
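The reason synthetic labels come "for free" is that the scene generator knows exactly what it placed where. The minimal sketch below assumes a toy cube-placement setup; the function and parameter names are illustrative, not taken from the paper.

```python
# Sketch of synthetic data generation: because we construct each scene
# ourselves, perfect ground-truth labels require no human annotation.
import random

COLORS = ["red", "green", "blue", "yellow"]

def make_synthetic_scene(rng, n_objects=3):
    """Place n random cubes and return (scene, labels) with exact labels."""
    scene, labels = [], []
    for _ in range(n_objects):
        cube = {"color": rng.choice(COLORS),
                "x": rng.uniform(-0.5, 0.5),
                "y": rng.uniform(-0.5, 0.5)}
        scene.append(cube)
        # The label is known exactly at generation time.
        labels.append((cube["color"], (cube["x"], cube["y"])))
    return scene, labels

rng = random.Random(0)
dataset = [make_synthetic_scene(rng) for _ in range(1000)]  # cheap to scale up
```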

This method is also claimed to be the first time an image-centric domain randomization approach has been used on a robot. In this approach, synthetic data with large amounts of diversity fools the perception network into seeing real-world data as simply another variation of its training data. The researchers chose to process the data in this manner to ensure that the networks do not depend on any particular camera or environment.
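In practice, domain randomization means rendering every synthetic training image with freshly sampled nuisance factors such as lighting, textures, and camera pose. The sketch below shows the sampling idea only; the parameter names and ranges are invented for illustration and are not the values used in the paper.

```python
# Sketch of domain randomization: sample nuisance parameters per render so
# the perception network cannot come to rely on any one of them.
import random

def randomize_rendering(rng):
    """Sample one set of randomized rendering parameters."""
    return {
        "light_intensity": rng.uniform(0.2, 2.0),
        "light_direction": (rng.uniform(-1, 1), rng.uniform(-1, 1),
                            rng.uniform(0.1, 1.0)),
        "background_texture_id": rng.randrange(10_000),  # random distractor
        "camera_jitter_deg": rng.uniform(-10.0, 10.0),
        "object_hue_shift": rng.uniform(-0.1, 0.1),
    }

params = randomize_rendering(random.Random(42))
```

A network trained on millions of such randomized renders tends to treat a real camera image as just one more variation, which is the effect the researchers describe.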

“The perception network as described applies to any rigid real-world object that can be reasonably approximated by its 3D bounding cuboid,” say the researchers. “Despite never observing a real image during training, the perception network reliably detects the bounding cuboids of objects in real images, even under severe occlusions.”

The researchers demonstrated their method by training object detectors on several colored blocks and a toy car (see video). The system perceives the physical relationships of the blocks (whether they are stacked on top of one another or placed next to each other), infers an appropriate program, and places the cubes in the correct order. Because it takes the current state of the world into account during execution, the system is able to recover from mistakes in real time.
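A stacking relationship like the one above can be inferred from detected cuboids with a simple geometric test: one cube is "on" another if it sits roughly one edge length above it with horizontal overlap. The function and thresholds below are hypothetical, chosen only to illustrate the idea.

```python
# Toy check for a "stacked on" relationship between two detected cube centers.
def is_stacked_on(top, bottom, size=0.05, tol=0.01):
    """top/bottom are (x, y, z) cube centers; size is the cube edge length."""
    horizontally_aligned = (abs(top[0] - bottom[0]) < size / 2 and
                            abs(top[1] - bottom[1]) < size / 2)
    resting_on = abs((top[2] - bottom[2]) - size) < tol
    return horizontally_aligned and resting_on

print(is_stacked_on((0.0, 0.0, 0.05), (0.0, 0.0, 0.0)))  # True: directly above
print(is_stacked_on((0.3, 0.0, 0.05), (0.0, 0.0, 0.0)))  # False: offset to the side
```

Re-running a check like this against the live world state during execution is what lets the system notice, and recover from, a block that has been knocked out of place.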

The researchers plan to continue to explore the use of synthetic training data for robotics manipulation to extend the capabilities of their method to additional scenarios. For more, see “Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations.”
