Complete-muE: Optimal Hyperparameter Transfer and Scaling for MoE Models — ThinkLLM