An RL agent's ability to produce several distinct strategies or outputs rather than converging to a single deterministic policy.
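
One common proxy for this property, offered here as an illustration rather than as part of the definition, is the Shannon entropy of the policy's action distribution in a given state. A deterministic policy has zero entropy, while a policy that spreads probability over many actions has higher entropy. The sketch below assumes a discrete action space; the function name `policy_entropy` and the example distributions are hypothetical.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution.

    Illustrative proxy only: `probs` is assumed to be the policy's
    probabilities over actions in a single state.
    """
    if any(p < 0.0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Actions with zero probability contribute nothing to the entropy.
    return sum(p * math.log(1.0 / p) for p in probs if p > 0.0)


# Hypothetical policies over four actions.
deterministic = [1.0, 0.0, 0.0, 0.0]    # always picks the same action
peaked = [0.7, 0.1, 0.1, 0.1]           # mostly one action, some variety
uniform = [0.25, 0.25, 0.25, 0.25]      # maximally spread out

for name, dist in [("deterministic", deterministic),
                   ("peaked", peaked),
                   ("uniform", uniform)]:
    print(f"{name}: {policy_entropy(dist):.3f} nats")
```

Entropy only captures per-state action variety. It says nothing about whether whole trajectories or final outputs differ, so it should be read as one possible measurement, not the only one.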