Generic training stage that teaches a model to suppress the influence of erased spans by training it on diverse span-deletion examples.
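
As a rough illustration of what "diverse span-deletion examples" could look like, the sketch below erases randomly placed contiguous spans of varying length from token sequences. Everything in it is an assumption made for illustration: the names `SpanDeletionExample`, `make_example`, and `make_dataset`, the `min_len`/`max_len` parameters, and the choice of uniform random sampling are hypothetical and not taken from the description above. How the training stage actually consumes such examples (loss, targets, markers for the erased region) is not specified here and is left out.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SpanDeletionExample:
    """Hypothetical container for one span-deletion example."""
    original: List[str]   # full token sequence before erasure
    start: int            # inclusive start index of the erased span
    end: int              # exclusive end index of the erased span
    remaining: List[str]  # sequence with the span removed


def make_example(tokens: List[str], rng: random.Random,
                 min_len: int = 1, max_len: int = 3) -> SpanDeletionExample:
    """Erase one randomly chosen contiguous span from `tokens`."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    # Clamp the length bounds so the span always fits inside the sequence.
    max_len = min(max_len, len(tokens))
    min_len = min(min_len, max_len)
    length = rng.randint(min_len, max_len)        # inclusive on both ends
    start = rng.randint(0, len(tokens) - length)  # any valid start position
    end = start + length
    return SpanDeletionExample(tokens, start, end, tokens[:start] + tokens[end:])


def make_dataset(sequences: List[List[str]], n_per_sequence: int = 4,
                 seed: int = 0) -> List[SpanDeletionExample]:
    """Draw several deletions per sequence to vary span position and length."""
    rng = random.Random(seed)
    return [make_example(seq, rng)
            for seq in sequences
            for _ in range(n_per_sequence)]


if __name__ == "__main__":
    for ex in make_dataset([["the", "cat", "sat", "on", "the", "mat"]], 3):
        print(ex.start, ex.end, ex.remaining)
```

Seeding the generator keeps the sampled deletions reproducible across runs. A real pipeline would likely vary more than span position and length (for example the source domain or the content of the erased span), which this sketch does not attempt.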