Splitting a model's layers into sequential stages placed on different GPUs, so that each stage processes a different micro-batch at the same time, keeping more GPUs busy and improving training throughput.
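The overlap between stages can be illustrated with a toy schedule. The sketch below is not tied to any framework. It assumes a forward-only pipeline in which every stage takes one time step per micro-batch (each batch in flight is treated as one micro-batch), and the names `pipeline_schedule` and `idle_fraction` are hypothetical helpers introduced only for this example.

```python
def pipeline_schedule(num_stages, num_microbatches):
    """Build a forward-only pipeline schedule.

    schedule[t][s] is the index of the micro-batch that stage s processes
    at time step t, or None if that stage is idle at that step.
    Assumption: every stage takes exactly one time step per micro-batch.
    """
    # The last micro-batch enters stage 0 at step num_microbatches - 1 and
    # needs num_stages - 1 further steps to reach the final stage.
    total_steps = num_stages + num_microbatches - 1
    schedule = []
    for t in range(total_steps):
        row = []
        for s in range(num_stages):
            mb = t - s  # micro-batch mb reaches stage s at step mb + s
            row.append(mb if 0 <= mb < num_microbatches else None)
        schedule.append(row)
    return schedule


def idle_fraction(schedule):
    """Fraction of (time step, stage) slots in which a stage sits idle."""
    slots = [slot for row in schedule for slot in row]
    return sum(slot is None for slot in slots) / len(slots)


if __name__ == "__main__":
    sched = pipeline_schedule(num_stages=3, num_microbatches=4)
    for t, row in enumerate(sched):
        print(f"step {t}: {row}")
    print(f"idle fraction: {idle_fraction(sched):.3f}")
```

In this simplified model, once the pipeline is full every stage works on a different micro-batch in the same time step. The idle slots occur only while the pipeline fills and drains, so under these assumptions the idle fraction shrinks as the number of micro-batches grows relative to the number of stages.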