Robots need to learn from experience to solve complex tasks in the real world. Deep reinforcement learning is the most common approach to robot learning, but it requires a large amount of trial and error, which limits its deployment in the physical world and makes robot training heavily dependent on simulators. Simulators, however, fail to capture many aspects of the real world, and their inaccuracies affect the training process. Recently, the Dreamer algorithm outperformed pure reinforcement learning in video games, learning from small amounts of interaction by planning within a learned world model. Because the world model can predict the outcomes of potential actions, the agent can plan in imagination, which reduces the amount of trial and error needed in the real environment.
However, it was unknown whether Dreamer could enable faster learning on physical robots. The researchers therefore applied Dreamer directly to real robots, without simulators. The algorithm trained a four-legged robot to get up and walk within an hour, and the robot then adapted to external pushes within 10 minutes. Dreamer also allowed robotic arms to pick and place multiple objects from camera images in a sparse-reward setting. On a wheeled robot, Dreamer learned to navigate to a target position purely from camera images, resolving ambiguity about the robot's orientation on its own. The researchers found that Dreamer can learn online in the real world, which establishes a strong baseline for future work.
One of the fundamental questions in robotics research is how to teach robots to tackle challenging real-world problems. Deep reinforcement learning (RL), a popular method for teaching robots, lets them learn through trial and error and gradually improve their behavior. However, current algorithms are impractical for many real-world tasks because they require too much interaction with the environment to acquire effective behavior. Modern world models have recently shown significant potential for data-efficient learning in simulated environments and video games. By learning a world model from past experience, a robot can predict the outcomes of potential actions, and this prediction reduces the amount of real-world trial and error required to develop effective behaviors.
Although accurate world models can be challenging to learn, they have attractive properties for robot learning. Because world models anticipate future outcomes, they enable planning and behavior learning with little real-world interaction. In addition, world models condense general knowledge about environment dynamics that, once learned, can be reused for a variety of downstream tasks. World models also learn representations that fuse multiple sensor modalities into latent states, removing the need for manual state estimation. Finally, world models generalize well from available offline data, which could accelerate real-world learning even further.
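To make the idea of fusing sensor modalities into a latent state more concrete, here is a minimal sketch in plain Python. It is not the encoder used in the paper: the function names (`flatten_image`, `make_encoder`, `encode`), the tiny 2x2 image, and the random untrained weights are all assumptions chosen for illustration. A real world model would learn these weights from data with a neural network.

```python
import math
import random


def flatten_image(image):
    """Turn a 2-D grid of pixel intensities in [0, 255] into a flat list in [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]


def make_encoder(input_size, latent_size, seed=0):
    """Random, untrained projection weights (assumption: a real model learns these)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(input_size)
    return [
        [rng.uniform(-scale, scale) for _ in range(input_size)]
        for _ in range(latent_size)
    ]


def encode(image, joint_angles, weights):
    """Fuse camera pixels and proprioception into one latent vector.

    The two modalities are concatenated and passed through a single
    linear projection followed by tanh.
    """
    features = flatten_image(image) + list(joint_angles)
    if len(features) != len(weights[0]):
        raise ValueError("encoder was built for a different input size")
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


# Hypothetical inputs: a 2x2 grayscale image and three joint angles in radians.
image = [[0, 128], [255, 64]]
joint_angles = [0.1, -0.4, 0.7]
weights = make_encoder(input_size=4 + 3, latent_size=3)
latent = encode(image, joint_angles, weights)
print(latent)  # three numbers, each strictly between -1 and 1
```

The point of the sketch is only that downstream components see a single compact vector rather than raw pixels and joint readings, so no hand-written state estimator is needed.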
Despite these benefits, learning accurate world models for the real world remains a major open problem. The study uses recent advances in the Dreamer world model to train a range of robots in the simplest and most fundamental problem setting: online reinforcement learning in the real world, without simulators or demonstrations. The accompanying figure illustrates how Dreamer builds a world model from a replay buffer of past experience, learns behaviors from rollouts imagined in the world model's latent space, and continuously interacts with the environment to explore and refine those behaviors.
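The loop described above (collect experience, fit a world model, improve behavior inside the model, act again) can be sketched with a deliberately tiny example. This is not Dreamer itself: the 1-D corridor environment, the tabular `TabularWorldModel`, and the use of value iteration in place of Dreamer's neural actor-critic are all simplifying assumptions, and every name below is hypothetical.

```python
import random
from collections import deque

GOAL = 5            # hypothetical corridor length
ACTIONS = (-1, 1)   # step left or step right


def env_step(pos, action):
    """Toy stand-in for the physical robot: a 1-D corridor with a sparse reward."""
    next_pos = max(0, min(GOAL, pos + action))
    return next_pos, (1.0 if next_pos == GOAL else 0.0)


class TabularWorldModel:
    """Memorises observed transitions; Dreamer trains a neural latent model instead."""

    def __init__(self):
        self.table = {}

    def fit(self, replay):
        for state, action, reward, next_state in replay:
            self.table[(state, action)] = (next_state, reward)

    def predict(self, state, action):
        # Transitions never seen are imagined as "stay put, no reward".
        return self.table.get((state, action), (state, 0.0))


def plan_in_imagination(model, gamma=0.9, sweeps=50):
    """Compute action values by querying only the model, never the environment."""
    values = {s: 0.0 for s in range(GOAL + 1)}
    q = {}
    for _ in range(sweeps):
        for s in values:
            for a in ACTIONS:
                next_state, reward = model.predict(s, a)
                q[(s, a)] = reward + gamma * values[next_state]
            values[s] = max(q[(s, a)] for a in ACTIONS)
    return q


def greedy_action(q, state, rng):
    """Pick the best-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(episodes=20, steps=30, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    replay = deque(maxlen=10_000)        # replay buffer of past experience
    model = TabularWorldModel()
    q = plan_in_imagination(model)
    for _ in range(episodes):
        state = 0
        for _ in range(steps):           # interact with the "real" environment
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state, rng)
            next_state, reward = env_step(state, action)
            replay.append((state, action, reward, next_state))
            state = next_state
        model.fit(replay)                # update the world model from the buffer
        q = plan_in_imagination(model)   # improve behavior inside the model
    return model, q


if __name__ == "__main__":
    _, q_values = train()
    print({s: greedy_action(q_values, s, random.Random(0)) for s in range(GOAL + 1)})
```

In this toy setting, real interaction is needed only to discover transitions; all value updates happen against the model. That is the property the article credits for reducing real-world trial and error.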
The researchers wanted to push the limits of robot learning in the real world while providing a strong baseline for future research that demonstrates the benefits of world models for robot learning. The paper's main contributions are summarized below:
- Demonstrating successful learning directly on real robots without introducing new algorithms. The tasks cover a range of challenges, including different action spaces, sensory modalities, and reward structures.
- Training a four-legged robot to roll off its back, stand up, and walk in less than an hour. After that, the robot adapts to being pushed within 10 minutes.
- Visual pick and place: training robotic arms to pick and place objects from sparse rewards, which requires locating objects from pixels and combining images with proprioceptive input.
- Releasing the software infrastructure for all experiments as open source. It supports different action spaces and sensory modalities and provides a flexible platform for future research on world models for real-world robot learning.
By showing that current world models enable sample-efficient robot learning across a variety of tasks, from scratch in the real world and without simulators, this work contributes to the future of physical robot learning.
This article is written as a summary article by Marktechpost Staff based on the research paper 'DayDreamer: World Models for Physical Robot Learning'. All credit for this research goes to the UC Berkeley researchers.