A learning algorithm that selects actions based on the observed context and uses the feedback it receives on those actions to improve future decisions.
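
The definition above can be illustrated with a minimal sketch. It assumes discrete contexts, a small fixed set of actions, a numeric reward as the feedback signal, and an epsilon-greedy rule for choosing actions. None of these details come from the definition itself: the class name `EpsilonGreedyContextualLearner`, the `simulate` helper, and the toy "morning"/"evening" environment are all hypothetical, chosen only to make the select-then-learn loop concrete.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualLearner:
    """Toy learner: keeps a running mean reward per (context, action) pair."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon              # probability of exploring at random
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)      # times each (context, action) was tried
        self.values = defaultdict(float)    # mean reward seen for each pair

    def select_action(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def update(self, context, action, reward):
        # Incremental mean: fold the new feedback into the stored estimate.
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulate(rounds=2000, seed=0):
    # Hypothetical environment: each context has exactly one rewarded action.
    best = {"morning": "coffee", "evening": "tea"}
    learner = EpsilonGreedyContextualLearner(["coffee", "tea"], seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(rounds):
        context = env_rng.choice(list(best))        # observe the context
        action = learner.select_action(context)     # choose an action
        reward = 1.0 if action == best[context] else 0.0
        learner.update(context, action, reward)     # learn from the feedback
    return learner


if __name__ == "__main__":
    trained = simulate()
    for context in ("morning", "evening"):
        estimates = {a: trained.values[(context, a)] for a in trained.actions}
        print(context, estimates)
```

Per-pair running means are only one possible design. With continuous or high-dimensional contexts, a model that generalizes across contexts would typically replace the lookup table, and other exploration rules could stand in for epsilon-greedy.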