Quantifying Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate — ThinkLLM