Most AI benchmarks don't tell us much ... "mindcraft," that gives a model control over a Minecraft character and tests its ability to design structures, along the lines of Microsoft's Project Malmo.