PEFT methods should be evaluated on a stability-plasticity trade-off: how well they adapt to new tasks (plasticity) versus how much of the pretrained model's capabilities they retain (stability). Orthogonal finetuning achieves the best balance among the methods compared, and rewinding to an earlier checkpoint can improve results further.
This paper evaluates parameter-efficient finetuning (PEFT) methods by measuring both downstream task performance and how well they preserve a model's original capabilities.
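As a rough illustration of this kind of two-sided evaluation, the sketch below scores each checkpoint on two axes: gain on the downstream task and loss on a benchmark of original capabilities. It then picks the checkpoint with the highest gain whose loss stays within a budget, which is one simple way to read "rewinding". The metric definitions, the names `Checkpoint`, `plasticity`, `forgetting`, `select_checkpoint`, and the budget-based rule are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Checkpoint:
    """Scores for one finetuning checkpoint (hypothetical structure)."""
    step: int
    downstream_score: float  # accuracy on the new task, higher is better
    pretrain_score: float    # accuracy on an original-capability benchmark


def plasticity(ckpt: Checkpoint, base_downstream: float) -> float:
    """Gain on the downstream task relative to the pretrained model."""
    return ckpt.downstream_score - base_downstream


def forgetting(ckpt: Checkpoint, base_pretrain: float) -> float:
    """Drop on the original-capability benchmark relative to the pretrained model."""
    return base_pretrain - ckpt.pretrain_score


def select_checkpoint(
    ckpts: Sequence[Checkpoint],
    base_downstream: float,
    base_pretrain: float,
    forgetting_budget: float,
) -> Optional[Checkpoint]:
    """Return the most plastic checkpoint whose forgetting is within budget.

    Assumed selection rule for illustration only; returns None if no
    checkpoint satisfies the budget.
    """
    eligible = [
        c for c in ckpts if forgetting(c, base_pretrain) <= forgetting_budget
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: plasticity(c, base_downstream))
```

With made-up scores, a later checkpoint may adapt better but forget more, so a stricter budget selects an earlier one:

```python
ckpts = [
    Checkpoint(step=100, downstream_score=0.625, pretrain_score=0.75),
    Checkpoint(step=200, downstream_score=0.75, pretrain_score=0.6875),
    Checkpoint(step=300, downstream_score=0.8125, pretrain_score=0.5),
]
best = select_checkpoint(ckpts, base_downstream=0.5, base_pretrain=0.75,
                         forgetting_budget=0.125)
print(best.step)
```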