Computer Science & AI | 1 June 2026
Light-Speed AI: How Optical Deep Learning Could Transform Transformer Models
Source Publication: IEEE Transactions on Neural Networks and Learning Systems
Primary Authors: Ibadulla, Chen, Reyes-Aldasoro

To meet the growing processing demands of modern artificial intelligence, researchers have developed ConvShareViT, a Vision Transformer-like architecture designed to run deep learning models directly on light-based hardware.
Vision Transformers (ViTs) are highly effective at image processing but require significant computational resources. By mapping these complex models onto a 4f free-space optical system, scientists aim to process visual data using light. This hardware-software integration represents a significant step forward for optical neural networks, though current designs remain at the theoretical modelling stage.
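To make the role of the 4f system concrete: an idealised 4f arrangement takes a Fourier transform of the input with one lens, multiplies it by a filter in the Fourier plane, and transforms back with a second lens, which amounts to a convolution. The sketch below simulates that idea numerically with NumPy and checks it against a direct convolution. The function names `conv2d_4f` and `conv2d_direct` are hypothetical, and the model ignores all physical effects (noise, finite aperture, intensity-only detection); it is an illustration of the convolution theorem, not the authors' simulator.

```python
import numpy as np


def conv2d_4f(image, kernel):
    """Idealised 4f pipeline: Fourier transform, pointwise multiply, inverse transform.

    Returns the 'valid' part of the linear 2D convolution of image with kernel.
    """
    h, w = image.shape
    kh, kw = kernel.shape
    # Zero-pad to the full linear-convolution size so the FFT's circular
    # convolution does not wrap around.
    full_shape = (h + kh - 1, w + kw - 1)
    image_f = np.fft.fft2(image, s=full_shape)    # first lens
    kernel_f = np.fft.fft2(kernel, s=full_shape)  # filter in the Fourier plane
    full = np.real(np.fft.ifft2(image_f * kernel_f))  # second lens
    # Keep only positions where the kernel fully overlaps the image.
    return full[kh - 1:h, kw - 1:w]


def conv2d_direct(image, kernel):
    """Reference 'valid' 2D convolution computed with explicit loops."""
    h, w = image.shape
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))
print(np.allclose(conv2d_4f(image, kernel), conv2d_direct(image, kernel)))
```

Because the expensive step becomes a pointwise multiplication in the Fourier plane, an optical system can in principle perform it as light passes through, which is the motivation for expressing a network's layers as convolutions.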
ConvShareViT replaces the standard linear layers of a ViT with shared-weight convolutional layers, and the study evaluated how well the resulting model performs. The researchers found that configurations using valid-padded shared convolutions successfully learned attention mechanisms, achieving quantitative attention scores comparable to those of standard ViTs. Critically, the model showed a theoretical speedup of up to 3.04 times over traditional GPU-based systems.
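One way to see why a linear layer can be swapped for a valid-padded convolution: when a kernel is as long as its input, a valid (unpadded) convolution produces a single number, which is a dot product. The sketch below shows this for a small bank of kernels reused across every token. It is an assumption-laden illustration only: the summary does not describe the paper's actual layer layout, and the names `linear_layer` and `shared_valid_conv_layer`, the shapes, and the 1D setting are all hypothetical.

```python
import numpy as np


def linear_layer(tokens, weights):
    """Standard linear layer: tokens (n_tokens, d_in), weights (d_out, d_in)."""
    return tokens @ weights.T


def shared_valid_conv_layer(tokens, weights):
    """Same mapping expressed as valid-padded 1D convolutions.

    Each row of `weights` acts as one kernel, and the same kernel bank is
    shared across all tokens. With kernel length equal to input length,
    'valid' mode yields exactly one output value per kernel.
    """
    n_tokens = tokens.shape[0]
    d_out = weights.shape[0]
    out = np.empty((n_tokens, d_out))
    for t in range(n_tokens):
        for k in range(d_out):
            # np.convolve flips its second argument, so pre-flip the kernel
            # to recover a plain dot product.
            out[t, k] = np.convolve(tokens[t], weights[k][::-1], mode="valid")[0]
    return out


rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 6))   # 4 tokens, 6 features each (assumed sizes)
weights = rng.standard_normal((3, 6))  # 3 output features
print(np.allclose(linear_layer(tokens, weights),
                  shared_valid_conv_layer(tokens, weights)))
```

Writing the layer this way matters for optical hardware because convolution is the operation a 4f system performs natively, whereas a general matrix multiplication is not.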
The Future of Optical Deep Learning
This potential acceleration suggests that optical neural networks could transition from conceptual designs to viable, high-speed hardware. Over the next decade, this architecture may pave the way for:
- More efficient optical deep learning applications that leverage light-based processing.
- The adaptation of complex transformer models to free-space optical systems.
- Hardware-software co-design that optimises neural network structures for optical components.
- Alternative acceleration pathways that complement traditional GPU-based systems.
Cite this Article (Harvard Style)
Ibadulla, Chen, Reyes-Aldasoro (2026). 'ConvShareViT: A Vision Transformer-Like Architecture for Free-Space Optical Accelerators'. IEEE Transactions on Neural Networks and Learning Systems. Available at: https://doi.org/10.1109/tnnls.2026.3689450