End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions