Planning in entropy-regularized Markov decision processes and games — ThinkLLM