Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models — ThinkLLM