Launch of ARC-AGI-3 - the next edition of a benchmark for agents

2mon 9d ago by lemmy.ml/u/vermaterc in AI_Coding_Agents@lemmy.ml from www.youtube.com

The benchmark is a set of handcrafted 2D puzzle games that are easy for humans to solve, but that require agents to show capabilities such as skill acquisition and long-term planning.

Easy by humans my ass