Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning — ThinkLLM