The new system, which DeepMind is calling MuZero, is based in part on DeepMind's work with the AlphaZero AI, which taught itself to master rule-based games like chess and Go. But MuZero also adds a new twist that makes it substantially more flexible.
That twist is called "model-based reinforcement learning." In a system that uses this approach, the software uses what it can see of a game to build an internal model of the game state. Critically, that state isn't prestructured based on any understanding of the game—the AI is able to have a lot of flexibility regarding what information is or is not included in it. The reinforcement learning part of things refers to the training process, which allows the AI to learn how to recognize when the model it's using is both accurate and contains the information it needs to make decisions.