- cross-posted to:
- AI_Coding_Agents@lemmy.ml
cross-posted from: https://lemmy.ml/post/45705520
The benchmark is a set of handcrafted 2D puzzle games that are easy for humans to solve but require capabilities such as skill acquisition and long-term planning from agents.

