Most AI benchmarks don't tell us much ... "mindcraft," that gives a model control over a Minecraft character and tests its ability to design structures, along the lines of Microsoft's Project Malmo.