Reasoning effort, not tool access, buys first-try reliability in agentic code generation: an observational study — ThinkLLM