How Do Instructions Shape Speech? Cross-Attention Attribution for Style-Captioned Text-to-Speech — ThinkLLM