
Deepmind’s latest AI agent learns by exploring AI-built worlds | SIMA 2 improves itself by learning new tasks through trial and error, without relying on human training data. The examples and feedback are generated by Gemini.
https://the-decoder.com/deepminds-latest-ai-agent-learns-by-exploring-unfamiliar-games-and-ai-built-worlds/
4 comments
“SIMA 2 is Deepmind’s latest AI agent for 3D virtual environments. Unlike its predecessor, SIMA 1, which could only follow simple voice commands, SIMA 2 is built to understand tasks, apply reasoning, and make its own decisions.
The agent navigates complex 3D worlds by analyzing on-screen visuals and simulating keyboard and mouse inputs – all without direct access to internal game data. This makes SIMA 2 an ‘embodied agent’ that interacts with virtual environments much like a human player would.
One of the biggest upgrades is SIMA 2’s ability to improve itself. It can learn new tasks through trial and error without relying on human training data. The process begins with examples and feedback generated by Gemini. Once that foundation is set, SIMA 2 creates its own training data, evaluates its own performance, and uses that feedback to guide further learning – all autonomously.”
An example of the level of autonomy achieved here.
```
> User: Go up and slightly to the left to the little cave and mine to get some coal
```
Interesting! But also not very autonomous. The thing I’m most interested in is this:
> SIMA 2 also has a relatively short memory of its interactions – it must use a limited context window to achieve low-latency interaction.
Infinitely long context windows aren’t a goal in LLMs, because they reduce the overall effectiveness of each individual piece of context. So is the goal to use this work to parallelize the creation of training data that gets retrained into the next model?
That’s not really “it teaches itself”, especially with the level of user prompting above. But still interesting.
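To make the loop in the quoted passage a bit more concrete, here is a toy sketch of "attempt tasks, score the attempts, keep the successful ones as new training data, update, repeat", with a short bounded memory standing in for the limited context window. Everything in it is a hypothetical stand-in (`attempt`, `self_improve`, the `difficulty` field, the "skill" counter used in place of retraining); it is not how SIMA 2 or Gemini actually work, just one way to picture the idea.

```python
from collections import deque


def attempt(task, skill):
    """Hypothetical rollout: the toy agent succeeds if its skill covers the task difficulty."""
    return {"task": task["name"], "success": skill >= task["difficulty"]}


def self_improve(tasks, rounds=3, memory_size=4):
    """Toy self-improvement loop; a stand-in, not the real training procedure."""
    skill = 1
    memory = deque(maxlen=memory_size)  # short memory: oldest interactions fall out
    training_data = []                  # self-generated data kept across rounds

    for _ in range(rounds):
        for task in tasks:
            result = attempt(task, skill)
            memory.append(result)
            if result["success"]:       # self-evaluation: keep only successful attempts
                training_data.append(result)
        # Stand-in for retraining: skill grows with the number of distinct tasks solved.
        skill = 1 + len({r["task"] for r in training_data})

    return skill, training_data, list(memory)


tasks = [
    {"name": "walk to cave", "difficulty": 1},
    {"name": "mine coal", "difficulty": 2},
    {"name": "build shelter", "difficulty": 3},
]
skill, data, memory = self_improve(tasks)
print(skill, len(data), len(memory))
```

Note that the collected data outlives the short memory: only the last four interactions stay "in context", while every successful attempt is kept for the next update. That is roughly the split the question above is getting at.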
Seeing several new projects that could all be categorized as explorative AI. Most involve the use of dynamically expanding multi-agent systems.
Cool concept, we will see how it goes. Humans learn like this too; it’s why we proofread, after all. However, I still think we have an issue.