A reinforcement learning technique that estimates action values using only trajectories that satisfy specified conditions.
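
One way to read this definition is as Monte Carlo action-value estimation restricted to a filtered set of trajectories. The following is a minimal sketch under that assumption, not a definitive implementation: the function name `conditional_q_estimate`, the `condition` predicate, the `(state, action, reward)` tuple layout, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def conditional_q_estimate(trajectories, condition, gamma=1.0):
    """Every-visit Monte Carlo estimate of Q(s, a).

    Only trajectories for which condition(trajectory) is true contribute.
    Each trajectory is a list of (state, action, reward) tuples
    (an assumed layout, not taken from the source).
    """
    totals = defaultdict(float)  # sum of returns per (state, action)
    counts = defaultdict(int)    # number of visits per (state, action)
    for traj in trajectories:
        if not condition(traj):
            continue  # skip trajectories that do not meet the condition
        g = 0.0
        # Walk backwards so the discounted return accumulates in one pass.
        for state, action, reward in reversed(traj):
            g = reward + gamma * g
            totals[(state, action)] += g
            counts[(state, action)] += 1
    return {sa: totals[sa] / counts[sa] for sa in totals}


# Toy data: three two-step episodes.
trajectories = [
    [("s0", "a", 0.0), ("s1", "b", 1.0)],  # ends with a reward
    [("s0", "a", 0.0), ("s1", "b", 0.0)],  # ends without a reward
    [("s0", "c", 0.0), ("s1", "b", 1.0)],  # ends with a reward
]


def reached_goal(traj):
    """Hypothetical condition: the final step yielded a positive reward."""
    return traj[-1][2] > 0


q_conditional = conditional_q_estimate(trajectories, reached_goal)
q_all = conditional_q_estimate(trajectories, lambda traj: True)
```

In this toy data, `q_conditional[("s0", "a")]` is 1.0 while `q_all[("s0", "a")]` is 0.5, because the unrewarded episode is excluded from the conditional estimate. Conditioning on the outcome in this way changes what the estimate means, so the two values should not be compared as if they measured the same quantity.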