The component of adaptive optimizers like Adam that tracks the squared gradients to scale learning rates per parameter.