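A training method for recurrent networks that computes gradients by unrolling the network across time steps and applying standard backpropagation to the unrolled graph, summing the gradient of each shared weight over all steps.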
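
To make the unrolling concrete, here is a minimal sketch in plain Python. It assumes a toy scalar recurrent network, h_t = tanh(w * h_{t-1} + u * x_t), with a squared-error loss at every step. The model, the loss, and the names `forward`, `loss`, and `bptt` are illustrative assumptions, not part of the definition above.

```python
import math

def forward(w, u, xs, h0=0.0):
    """Unroll the toy RNN over the inputs and return every hidden state.

    hs[0] is the initial state; hs[t] is the state after consuming xs[t - 1].
    """
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, ys):
    """Sum of squared errors between each hidden state and its target."""
    hs = forward(w, u, xs)
    return 0.5 * sum((h - y) ** 2 for h, y in zip(hs[1:], ys))

def bptt(w, u, xs, ys):
    """Return (dL/dw, dL/du) by walking the unrolled steps in reverse."""
    hs = forward(w, u, xs)
    dw = 0.0
    du = 0.0
    dh_next = 0.0  # gradient reaching h_t from all later time steps
    for t in range(len(xs), 0, -1):
        # Local loss term at step t plus the contribution from the future.
        dh = (hs[t] - ys[t - 1]) + dh_next
        # Back through tanh: d tanh(a) / da = 1 - tanh(a)^2.
        da = dh * (1.0 - hs[t] ** 2)
        # The same w and u are reused at every step, so their gradients add up.
        dw += da * hs[t - 1]
        du += da * xs[t - 1]
        # Pass the gradient back to the previous hidden state.
        dh_next = da * w
    return dw, du

# Hypothetical inputs and targets, chosen only for illustration.
xs = [0.5, -1.0, 0.25]
ys = [0.1, -0.2, 0.3]
dw, du = bptt(0.7, -0.4, xs, ys)
```

The reverse loop carries a single running value, `dh_next`, from each step to the one before it. This is why the cost of the backward pass grows linearly with the number of unrolled steps, and why every hidden state from the forward pass has to be kept in memory.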