Safety and accuracy follow different scaling laws in clinical large language models — ThinkLLM