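Multiple GPU operations combined into a single kernel so that intermediate results stay in registers or shared memory instead of being written to and re-read from global memory, which reduces memory traffic and kernel launch overhead and improves overall efficiency.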
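To make the memory-traffic saving concrete, here is a minimal CPU-side sketch in plain Python. It is not GPU code: `GlobalMemory`, the three `*_kernel` functions, `unfused`, and `fused` are all hypothetical names invented for this illustration, and the example operation `relu(a * x + y)` is an assumption chosen only because it chains three elementwise steps. Each "kernel" is modeled as a loop over elements, and every access to a buffer is counted as one global-memory load or store.

```python
class GlobalMemory:
    """Toy stand-in for GPU global memory that counts element loads and stores."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def load(self, buf, i):
        self.reads += 1
        return buf[i]

    def store(self, buf, i, value):
        self.writes += 1
        buf[i] = value


# Three separate "kernels": each one reads its inputs from global memory
# and writes its result back to global memory.
def scale_kernel(mem, x, out, a):
    for i in range(len(x)):
        mem.store(out, i, a * mem.load(x, i))


def add_kernel(mem, x, y, out):
    for i in range(len(x)):
        mem.store(out, i, mem.load(x, i) + mem.load(y, i))


def relu_kernel(mem, x, out):
    for i in range(len(x)):
        mem.store(out, i, max(mem.load(x, i), 0.0))


def unfused(mem, x, y, a):
    """relu(a * x + y) as three kernel launches with two temporary buffers."""
    n = len(x)
    t1, t2, out = [0.0] * n, [0.0] * n, [0.0] * n
    scale_kernel(mem, x, t1, a)
    add_kernel(mem, t1, y, t2)
    relu_kernel(mem, t2, out)
    return out


def fused(mem, x, y, a):
    """The same computation as one kernel: intermediates never leave a local variable."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        v = a * mem.load(x, i) + mem.load(y, i)  # plays the role of a register
        mem.store(out, i, max(v, 0.0))
    return out


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, -4.0]
    y = [0.5, 0.5, 0.5, 0.5]
    m_unfused, m_fused = GlobalMemory(), GlobalMemory()
    print(unfused(m_unfused, x, y, 2.0), m_unfused.reads, m_unfused.writes)
    print(fused(m_fused, x, y, 2.0), m_fused.reads, m_fused.writes)
```

In this model, for n elements the unfused version performs 4n loads and 3n stores, while the fused version performs 2n loads and n stores and produces the same result. On a real GPU the benefit depends on whether the operations are memory-bound, and the fused version also avoids two of the three kernel launches.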