Replay-buffer engineering for noise-robust quantum circuit optimization — ThinkLLM