Optical nonlinear computation can eliminate a key latency bottleneck in transformers without sacrificing accuracy, opening a path to faster inference through specialized hardware.
Researchers use optical hardware (lithium niobate modulators) to accelerate the Softmax and Sigmoid functions in transformers, which create latency bottlenecks even though they account for only a tiny fraction of total operations. The system maintains accuracy even under aggressive quantization and operates at very high speeds, suggesting that optical components could speed up transformer inference in hybrid hardware setups.
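To make the quantization claim concrete, here is a minimal software sketch of what low-precision Softmax and Sigmoid outputs look like. It is not the researchers' method and does not model the optical hardware: it only rounds the outputs of the exact functions to a fixed number of uniform levels. The function names (`quantize_unit`, `quantized_softmax`) and the bit widths (8 and 4) are illustrative assumptions, not values from the source.

```python
import math

def softmax(xs):
    """Numerically stable reference softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Reference logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def quantize_unit(y, bits):
    """Round a value in [0, 1] to the nearest of 2**bits uniform levels.

    Hypothetical stand-in for a low-precision readout; the real system's
    quantization scheme is not described in the text above.
    """
    levels = (1 << bits) - 1
    return round(y * levels) / levels

def quantized_softmax(xs, bits):
    """Exact softmax followed by uniform output quantization."""
    return [quantize_unit(p, bits) for p in softmax(xs)]

def max_abs_error(a, b):
    """Largest element-wise absolute difference between two lists."""
    return max(abs(x - y) for x, y in zip(a, b))

if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1, -1.5]  # made-up attention scores
    exact = softmax(logits)
    for bits in (8, 4):
        approx = quantized_softmax(logits, bits)
        print(f"{bits}-bit max abs error: {max_abs_error(exact, approx):.4f}")
    print("4-bit sigmoid(0.3):", quantize_unit(sigmoid(0.3), 4))
```

With uniform rounding, the per-element error is bounded by half a quantization step, so the ranking of the largest attention weights tends to survive even at low bit widths. Whether that carries over to end-to-end model accuracy depends on the model and the hardware, which this sketch does not capture.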