Self-Evaluation Is Already There: Eliciting Latent Judge Calibration in Base LLMs with Minimal Data — ThinkLLM