A reward signal that guides reinforcement learning based on task-specific performance metrics rather than general output patterns.