A technique that allows a model to handle multiple requests or tasks simultaneously within a single forward pass, improving efficiency on concurrent workloads.
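
As a rough illustration, the idea can be sketched with plain batching, which is assumed here to be one common way of realizing it: pending requests are stacked into one batch and answered by a single forward call. The `ToyModel` class, its fixed weights, and the `serve_batched` helper are hypothetical names invented for this sketch and do not come from any particular library.

```python
from typing import List

Vector = List[float]


class ToyModel:
    """Hypothetical stand-in for a model: one linear layer with fixed weights."""

    def __init__(self) -> None:
        # Weight matrix of shape (3 input features, 2 output features).
        self.weights = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0]]
        self.calls = 0  # number of forward passes executed so far

    def forward(self, batch: List[Vector]) -> List[Vector]:
        """Run one forward pass over a batch; each row is one request."""
        self.calls += 1
        n_out = len(self.weights[0])
        # This toy loops over rows in Python. A real framework would instead
        # execute the whole batch as a single tensor operation on the device.
        return [
            [sum(x[i] * self.weights[i][j] for i in range(len(x)))
             for j in range(n_out)]
            for x in batch
        ]


def serve_batched(model: ToyModel, requests: List[Vector]) -> List[Vector]:
    """Answer all pending requests with a single forward pass."""
    return model.forward(requests)


if __name__ == "__main__":
    model = ToyModel()
    pending = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
    results = serve_batched(model, pending)
    print(results)      # one output row per request
    print(model.calls)  # forward passes used for all three requests
```

The outputs match what three separate single-request calls would produce; the difference is that the per-call overhead is paid once for the whole batch.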