Fujitsu Develops PHOTON, a Transformer Alternative That Boosts AI Efficiency 475 Times
PHOTON directly addresses the Transformer's memory bottleneck on long documents and multi-user queries, offering a path to much cheaper inference for multi-agent systems and large-scale LLM deployment.
Reporting from 1 source: GIGAZINE.
Fujitsu announced PHOTON, a new architecture for large language models that improves throughput per GPU resource by up to 475 times compared with the Transformer. PHOTON processes text in meaningful chunks rather than token by token and uses multi-query integration for stable performance. The company will present details at ACL 2026 in July.
Fujitsu's new architecture, PHOTON, treats text as hierarchical chunks instead of individual tokens, cutting the computational load that makes the Transformer slow on long inputs or simultaneous queries. In tests with a 1.2-billion-parameter model, PHOTON achieved up to 475 times the multi-query throughput per GPU resource of a standard Transformer, with only slight performance degradation. The architecture also uses a multi-query integration technique that combines multiple candidate outputs via majority voting or best-candidate selection, producing stable results from a single inference pass. Fujitsu plans to present the full details at ACL 2026, the top natural language processing conference, running July 2-7 in San Diego.
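The two ways of combining candidates mentioned above, majority voting and best-candidate selection, can be illustrated with a minimal Python sketch. This is not Fujitsu's implementation, whose details have not been published. The function names, the tie-breaking rule, and the scoring function are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(candidates: Sequence[str]) -> str:
    """Return the most frequent candidate (hypothetical helper).

    Ties go to the candidate that appears first; this rule is an
    assumption, not something described in the announcement.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    counts = Counter(candidates)
    top = max(counts.values())
    # Walk the candidates in their original order so ties resolve
    # deterministically.
    for candidate in candidates:
        if counts[candidate] == top:
            return candidate
    raise AssertionError("unreachable")


def best_candidate(candidates: Sequence[str],
                   score: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate (hypothetical helper).

    `score` stands in for whatever quality measure a real system uses,
    such as a model's own confidence; that choice is an assumption.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=score)


if __name__ == "__main__":
    # Made-up candidate answers to one query, for illustration only.
    answers = ["Paris", "Lyon", "Paris", "Marseille"]
    print(majority_vote(answers))

    # Made-up confidence scores, for illustration only.
    confidence = {"Paris": 0.9, "Lyon": 0.4, "Marseille": 0.2}
    print(best_candidate(answers, score=lambda a: confidence[a]))
```

Majority voting needs no scoring model but only works when candidates can be compared for equality, as with short answers. Best-candidate selection handles free-form text but depends entirely on the quality of the scoring function.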
Synthesized by Yomimono from the 1 cited source below, including Japanese-language reporting where cited, then editorially reviewed before publishing.