A benchmark task where an agent can achieve high scores without actually solving the intended problem.
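
As a hypothetical illustration (the sorting task, both graders, and both agents below are invented for this sketch and are not drawn from any real benchmark), consider a task meant to test sorting whose grader checks only the shape of the output. An agent that echoes its input back would then earn a perfect score without sorting anything, while a stricter grader would expose it.

```python
from typing import Callable, List

# Hypothetical types for this sketch: an agent maps an input list to an output list.
Agent = Callable[[List[int]], List[int]]
Grader = Callable[[List[int], List[int]], bool]


def weak_grader(inp: List[int], out: List[int]) -> bool:
    """Flawed check: only verifies length and element type, not ordering."""
    return len(out) == len(inp) and all(isinstance(x, int) for x in out)


def strict_grader(inp: List[int], out: List[int]) -> bool:
    """Checks the intended behaviour: the output is the sorted input."""
    return out == sorted(inp)


def echo_agent(xs: List[int]) -> List[int]:
    """Does not solve the task; returns the input unchanged."""
    return list(xs)


def sorting_agent(xs: List[int]) -> List[int]:
    """Actually solves the intended problem."""
    return sorted(xs)


def score(agent: Agent, grader: Grader, cases: List[List[int]]) -> float:
    """Fraction of test cases the grader accepts."""
    passed = sum(grader(case, agent(case)) for case in cases)
    return passed / len(cases)


# Assumed test inputs; none of them is already sorted.
CASES = [[3, 1, 2], [5, 4], [9, 7, 8, 6]]

if __name__ == "__main__":
    print("echo agent, weak grader:  ", score(echo_agent, weak_grader, CASES))
    print("echo agent, strict grader:", score(echo_agent, strict_grader, CASES))
    print("sorting agent, strict:    ", score(sorting_agent, strict_grader, CASES))
```

Under these assumptions, the gap between the echo agent's score on the weak grader and on the strict grader is what marks the task as exploitable: the score measures conformance to the check, not success at the intended problem.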