Encouraging an agent to explore diverse state-action pairs by maximizing the entropy of its occupancy measure.
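
As a rough illustration of the quantity being maximized, the sketch below computes a discounted state-action occupancy measure for a small tabular MDP and then its Shannon entropy. Everything in it is an assumption made for illustration rather than something taken from the source: the discounted definition d(s, a) = (1 - gamma) * sum_t gamma^t * Pr(s_t = s, a_t = a), the two-state toy MDP, and the helper names occupancy_measure and entropy.

```python
import numpy as np

def occupancy_measure(P, pi, mu, gamma=0.9):
    """Discounted state-action occupancy d(s, a) for a tabular MDP.

    P  : (S, A, S) array, P[s, a, t] = Pr(next state t | state s, action a)
    pi : (S, A) array, pi[s, a] = Pr(action a | state s)
    mu : (S,) array, initial state distribution
    """
    n_states = P.shape[0]
    # State-to-state transitions under pi: P_pi[s, t] = sum_a pi[s, a] * P[s, a, t]
    P_pi = np.einsum("sa,sat->st", pi, P)
    # Solve d_state = (1 - gamma) * mu + gamma * P_pi^T d_state for d_state
    d_state = (1 - gamma) * np.linalg.solve(np.eye(n_states) - gamma * P_pi.T, mu)
    # Joint occupancy over state-action pairs
    return d_state[:, None] * pi

def entropy(d, eps=1e-12):
    """Shannon entropy (in nats) of a distribution, skipping near-zero entries."""
    d = d.ravel()
    nz = d > eps
    return -np.sum(d[nz] * np.log(d[nz]))

# Hypothetical toy MDP: 2 states, action 0 stays put, action 1 switches state.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[1, 0, 1] = 1.0  # action 0: stay
P[0, 1, 1] = P[1, 1, 0] = 1.0  # action 1: switch
mu = np.array([1.0, 0.0])      # always start in state 0

pi_uniform = np.full((2, 2), 0.5)                # picks each action with prob. 0.5
pi_stay = np.array([[1.0, 0.0], [1.0, 0.0]])     # always chooses "stay"

d_uniform = occupancy_measure(P, pi_uniform, mu)
d_stay = occupancy_measure(P, pi_stay, mu)
h_uniform = entropy(d_uniform)
h_stay = entropy(d_stay)

print("entropy (uniform policy):", h_uniform)
print("entropy (always stay):   ", h_stay)
```

In this toy setting the always-stay policy concentrates all occupancy on a single state-action pair, so its entropy is essentially zero, while the uniform policy spreads occupancy over all four pairs and scores higher. An entropy-maximizing objective would therefore favor the latter, which is the sense in which it encourages visiting diverse state-action pairs.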