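A foundational result in reinforcement learning showing how to compute the gradient of expected return with respect to policy parameters, expressed as an expectation over the policy's own sampled actions or trajectories.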
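
As an illustration only, the sketch below checks the idea in the simplest setting: a one-step problem (a bandit) with a softmax policy. The setting, the reward values, and the function names (`softmax`, `expected_return`, `score_function_gradient`, `finite_difference_gradient`) are assumptions made for this example and do not come from the text above. It computes the gradient of expected return in the likelihood-ratio form, E[R(a) ∇θ log π(a)], summed exactly over actions, and compares it with a numerical finite-difference gradient.

```python
import numpy as np


def softmax(theta):
    """Softmax policy over discrete actions (assumed policy class)."""
    z = np.exp(theta - np.max(theta))  # subtract max for numerical stability
    return z / z.sum()


def expected_return(theta, rewards):
    """J(theta) = sum_a pi(a) * R(a) for a one-step problem."""
    return float(softmax(theta) @ rewards)


def score_function_gradient(theta, rewards):
    """Gradient of J in likelihood-ratio form:
    sum_a pi(a) * R(a) * grad_theta log pi(a), summed exactly over actions."""
    pi = softmax(theta)
    grad = np.zeros(len(theta))
    for a in range(len(theta)):
        # For a softmax policy, grad_theta log pi(a) = one_hot(a) - pi
        grad_log_pi = -pi.copy()
        grad_log_pi[a] += 1.0
        grad += pi[a] * rewards[a] * grad_log_pi
    return grad


def finite_difference_gradient(theta, rewards, eps=1e-6):
    """Central-difference approximation of grad J, used only as a check."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = eps
        up = expected_return(theta + step, rewards)
        down = expected_return(theta - step, rewards)
        grad[i] = (up - down) / (2 * eps)
    return grad


# Hypothetical parameters and per-action rewards, chosen for illustration.
theta = np.array([0.2, -0.5, 1.0])
rewards = np.array([1.0, 0.0, 2.0])

analytic = score_function_gradient(theta, rewards)
numeric = finite_difference_gradient(theta, rewards)
print("likelihood-ratio gradient:", analytic)
print("finite-difference gradient:", numeric)
```

In practice the sum over actions is replaced by an average over sampled actions or trajectories, which is what makes this form useful when the environment cannot be differentiated; the exact sum is used here only so the comparison is deterministic.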