
World Models To The Rescue


With the GUG release complete, we’re re-emerging with the next step of our journey to improve video games. During GUG’s development we created Nomic, a procedural logical rules engine that lets games dynamically alter their own rules and introduce novel mechanics on the fly. However, a big problem arises when your game can logically do anything: presenting those changes to players in a comprehensible way.
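To make that concrete, here’s a minimal sketch of the kind of runtime-mutable rule system we mean. Every name below is an illustrative assumption for this post, not Nomic’s actual API; the key idea is simply that rules are data, so new mechanics can be injected while a match is running.

```python
# Hypothetical sketch of a runtime-mutable rules engine in the spirit
# of Nomic. Names and structure are illustrative, not Nomic's API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class GameState:
    players: dict[str, int] = field(default_factory=dict)  # player -> health

# A rule is just data: a trigger predicate plus an effect, so new rules
# can be added mid-game instead of being hard-coded.
@dataclass
class Rule:
    name: str
    trigger: Callable[[GameState, str], bool]
    effect: Callable[[GameState, str], None]

class RulesEngine:
    def __init__(self) -> None:
        self.rules: list[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        """Introduce a novel mechanic while the game is running."""
        self.rules.append(rule)

    def dispatch(self, state: GameState, event: str) -> None:
        for rule in self.rules:
            if rule.trigger(state, event):
                rule.effect(state, event)

# Usage: inject a brand-new "poison" mechanic mid-game.
engine = RulesEngine()
state = GameState(players={"p1": 20, "p2": 20})
engine.add_rule(Rule(
    name="poison",
    trigger=lambda s, e: e == "turn_end",
    effect=lambda s, e: s.players.update(
        {p: hp - 1 for p, hp in s.players.items()}),
))
engine.dispatch(state, "turn_end")  # every player now takes poison damage
```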

Since we couldn't create bespoke VFX for every effect, we had to fall back on generic shapes to visualize each change. This gets underwhelming fast.

When visualizing through conventional means, you very quickly run into a barrier: composing graphical primitives blows up combinatorially. You need to anticipate how every pair of visual changes will interact, which produces endless edge cases. For example, the rules engine conceptually needs to support arbitrarily adding a third player to a GUG battle.
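To get a feel for the scaling (the effect counts below are made up for illustration), just count the pairs a designer would have to hand-author:

```python
# With n distinct visual effects, pairwise interactions grow
# quadratically: n * (n - 1) / 2 edge cases to anticipate.
from math import comb

for n in (10, 50, 100):
    print(f"{n} effects -> {comb(n, 2)} pairs to anticipate")
# 10 effects -> 45 pairs to anticipate
# 50 effects -> 1225 pairs to anticipate
# 100 effects -> 4950 pairs to anticipate
```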

Adding the player works logically: gugs fight each other, each player maintains their own health points, and win conditions are respected. But the layout has nowhere to put this third player, resulting in a cluttered screen that's almost impossible to parse.

Luckily, recent developments in machine learning come to the rescue once again! Controllable video generation, also called world modelling, lets you specify loose requirements for the final picture and leave it to the model to interpolate all the fine details. Importantly, this introduces a whole new design surface for interactions, where the outcomes of actions are neurally interpolated.
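As a rough sketch of what that design surface could look like (every name below is a hypothetical stand-in, not a real model or our implementation), the game hands the model coarse controls and the model resolves everything left unspecified:

```python
# Illustrative-only interface sketch; Controls, WorldModel and
# next_frame are assumptions made up for this post.
from dataclasses import dataclass, field

Frame = list[list[tuple[int, int, int]]]  # H x W grid of RGB pixels

@dataclass
class Controls:
    prompt: str                                   # high-level scene description
    layout_hints: list[tuple[float, float, str]] = field(default_factory=list)
    action: str | None = None                     # optional player input this tick

class WorldModel:
    """Stand-in for the neural network: a real model would interpolate
    the fine details; this stub just repeats the previous frame."""

    def next_frame(self, history: list[Frame], controls: Controls) -> Frame:
        return history[-1]

# Usage: the game never specifies where the third player's HP bar goes;
# it only states that the player exists and lets the model resolve it.
model = WorldModel()
blank: Frame = [[(0, 0, 0)] * 4 for _ in range(3)]
frame = model.next_frame(
    [blank],
    Controls(prompt="GUG battle, three players, each with a visible HP bar",
             layout_hints=[(0.5, 0.9, "player_3")]),
)
```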

The fantastic demo of neural interpolation capabilities showcased by MotionStream.

So, we’ve now created a local, real-time video generation model that supports sophisticated forms of control. Stay tuned for a series of technical blog posts on how this is possible on current consumer hardware.