An artificial intelligence (AI) system has for the first time worked out how to collect diamonds in the popular video game Minecraft, a difficult task that requires many steps, without being shown how to play. Its creators say the system, called Dreamer, is a step towards machines that can generalize knowledge learned in one domain to new situations, a main goal of AI.
“Dreamer marks an important step towards general AI systems,” says Danijar Hafner of Google DeepMind in California. “It allows AI to understand its physical environment and also to self-improve over time, without a human having to tell it what to do.” Hafner and his colleagues describe Dreamer in a study published in Nature on April 2.
In Minecraft, players explore a virtual 3D world with a variety of terrains, including forests, mountains, deserts and swamps. Players use the world’s resources to create objects, such as chests, fences and swords, and they collect items, the most prized of which include diamonds.
Importantly, says Hafner, no two experiences are the same. “Every time you play Minecraft, it is a new, randomly generated world,” he says. This makes the game useful for challenging an AI system that researchers want to be able to generalize from one situation to another. “You really need to understand what is in front of you; you cannot just memorize a specific strategy,” he says.
Collecting a diamond is “a very hard task,” says Jeff Clune of the University of British Columbia in Canada, who was part of a separate team that trained a program to find diamonds using videos of human play. “There is no doubt that this represents an important breakthrough for the field.”
Diamonds are forever
AI researchers have focused on finding diamonds, says Hafner, because it requires a series of intricate steps, including finding trees and collecting wood, which players can use to build a crafting table.
This, along with more wood, can be used to make a wooden pickaxe, and so on. “There is a long chain of these milestones, so it requires very deep exploration,” he says.
Previous attempts to get AI systems to collect diamonds relied on videos of human play or on researchers guiding the systems through the steps.
By contrast, Dreamer explores everything about the game on its own, using a trial-and-error technique known as reinforcement learning. Reinforcement learning underpins some important advances in AI. But previous programs were specialists: they could not apply knowledge to new domains from scratch.
Build me a world model
Key to Dreamer’s success, says Hafner, is that it builds a model of its surroundings and uses this “world model” to imagine future scenarios and guide its decision-making. The world model is not a detailed replica of the world. But it lets Dreamer predict the potential rewards of different actions using less computing than would be required to actually carry out those actions in Minecraft. “The world model really equips the AI system with the ability to imagine the future,” says Hafner.
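The idea of scoring actions inside a model, rather than in the game itself, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_model` is a hand-written stand-in for a learned world model, the state is a position on a line, and the reward sits at position 5. The sketch also swaps in a simple "try each action and compare" rule, which is not a description of how Dreamer itself turns imagined futures into decisions.

```python
from typing import Callable, Sequence, Tuple

# A model maps (state, action) to a predicted (next_state, reward).
Model = Callable[[int, int], Tuple[int, float]]


def toy_model(state: int, action: int) -> Tuple[int, float]:
    """Hypothetical stand-in for a learned world model.

    The agent moves along a line; reaching position 5 is predicted
    to pay a reward of 1.
    """
    next_state = state + action
    reward = 1.0 if next_state == 5 else 0.0
    return next_state, reward


def imagined_return(model: Model, state: int, action: int,
                    horizon: int, discount: float = 0.9) -> float:
    """Sum discounted predicted rewards from repeating one action.

    No step is taken in the real environment; only the model is queried.
    """
    total, weight = 0.0, 1.0
    for _ in range(horizon):
        state, reward = model(state, action)
        total += weight * reward
        weight *= discount
    return total


def choose_action(model: Model, state: int,
                  actions: Sequence[int], horizon: int = 5) -> int:
    """Pick the action whose imagined rollout predicts the most reward."""
    return max(actions,
               key=lambda a: imagined_return(model, state, a, horizon))
```

Starting at position 3, `choose_action(toy_model, 3, (-1, 1))` prefers moving right, because only that imagined rollout passes through the rewarded position. The saving the article describes comes from the fact that each call to the model is far cheaper than a real step in the game.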
This ability could also help to create robots that can learn to interact in the real world, where the costs of trial and error are much higher than in a video game, says Hafner.
Dreamer was not designed with the diamond challenge in mind. “We built the whole algorithm without considering this,” Hafner says. But it occurred to the team that the challenge was a way to test whether its algorithm could work, out of the box, on an unfamiliar task.
In Minecraft, the team gave Dreamer a “plus one” reward each time it completed one of 12 progressive steps involved in diamond collection.
These intermediate rewards encouraged Dreamer to select actions that were more likely to lead to a diamond. The team reset the game every 30 minutes, so that Dreamer did not get used to one particular configuration but instead learned general rules for obtaining rewards.
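A reward scheme of this kind can be sketched in a few lines. This is a minimal illustration, not the team’s code: the milestone names in `MILESTONES` are assumed for the example (the text above gives only the count of 12 and the final goal), the inventory is assumed to be a plain dictionary of item counts, and paying each milestone only once per episode is also an assumption.

```python
# Assumed milestone names, in rough order of progress towards a diamond.
MILESTONES = (
    "log", "planks", "stick", "crafting_table",
    "wooden_pickaxe", "cobblestone", "stone_pickaxe", "iron_ore",
    "furnace", "iron_ingot", "iron_pickaxe", "diamond",
)

# The game is reset every 30 minutes of play.
EPISODE_SECONDS = 30 * 60


class MilestoneReward:
    """Pays +1 the first time each milestone item appears in an episode."""

    def __init__(self):
        self.reached = set()

    def reset(self):
        # Called when the game is reset, so every milestone can pay again.
        self.reached.clear()

    def step(self, inventory):
        # inventory: dict mapping item name to count (hypothetical format).
        reward = 0
        for item in MILESTONES:
            if inventory.get(item, 0) > 0 and item not in self.reached:
                self.reached.add(item)
                reward += 1
        return reward


def should_reset(elapsed_seconds):
    """True once the current episode has used up its 30 minutes."""
    return elapsed_seconds >= EPISODE_SECONDS
```

Under these assumptions an episode can earn at most 12 points, and gathering a second log pays nothing, so the only way to keep scoring is to move further along the chain.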
In this set-up, it takes Dreamer around nine days of play to find at least one diamond, says Hafner. Expert human players take 20-30 minutes to find a diamond, and beginners need more time.
“This paper is about training a single algorithm to perform well across reinforcement-learning tasks,” says Keyon Vafa of Harvard University in Massachusetts. “This is a hard problem and the results are fantastic.”
An even greater goal for AI, says Clune, is the ultimate challenge for Minecraft players: killing the Ender Dragon, the virtual world’s most fearsome creature.
This article is reproduced with permission and was first published on April 2, 2025.