Microsoft Announces Seven AI Models Including MAI-Thinking-1 and MAI-Voice-2
Microsoft is positioning its MAI model family as a self-sufficient alternative to leading models from Anthropic and OpenAI, claiming competitive performance without relying on third-party data or infrastructure.
Reporting from 1 source: GIGAZINE.
On June 2, 2026, Microsoft announced seven proprietary AI models, including the reasoning model MAI-Thinking-1, which the company says outperformed Anthropic's Claude Sonnet 4.6 in human evaluation, and the voice cloning model MAI-Voice-2. The models were developed using Microsoft's own Maia 200 chip and clean training data, with no distillation from other companies' models.
Microsoft announced seven proprietary AI models on June 2, 2026, led by the reasoning model MAI-Thinking-1. The company claims the model outperformed Anthropic's Claude Sonnet 4.6 in human evaluation across 1,276 tasks, though benchmark scores show it trailing in several standard tests. MAI-Thinking-1 is a mixture-of-experts model with 1 trillion total parameters, of which 35 billion are active. Microsoft says it was trained on clean, licensed data, developed using the company's own Maia 200 chip, and built without distillation from other companies' models.
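For a sense of scale, the reported parameter counts can be turned into an active share. The short Python sketch below is illustrative only: the two figures come from the announcement as reported above, while the function name `active_fraction` is made up for this example, and the usual reading of "active parameters" as the portion used for each token is an assumption about how Microsoft counts them.

```python
# Figures as reported for MAI-Thinking-1 in the announcement.
TOTAL_PARAMS = 1_000_000_000_000  # 1 trillion total parameters
ACTIVE_PARAMS = 35_000_000_000    # 35 billion active parameters


def active_fraction(active: int, total: int) -> float:
    """Return the share of a mixture-of-experts model's parameters that are active.

    Hypothetical helper for illustration; it assumes "active" means the
    parameters used at once, as is usual for MoE models.
    """
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share of parameters: {fraction:.1%}")
```

Under these assumptions, only a few percent of the model's parameters are in use at any one time, which is the main reason mixture-of-experts designs can be large in total size while remaining comparatively cheap to run.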
The lineup also includes MAI-Code-1-Flash, a 5-billion-parameter coding model that Microsoft says consistently beats Claude Haiku 4.5 on benchmarks; MAI-Image-2.5, an image generation model ranked third in text-to-image and second in image editing on the Arena leaderboard; MAI-Image-2.5 Flash, a faster and more cost-efficient variant of that model; MAI Transcribe-1.5, a transcription model supporting 43 languages; and MAI-Voice-2, a speech synthesis model supporting 15 languages, along with its high-speed variant, MAI-Voice-2 Flash. MAI-Thinking-1 is available in private preview on Microsoft Foundry, while MAI-Code-1-Flash will roll out to Visual Studio Code and GitHub Copilot.
- MAI-Thinking-1: MoE model, 1 trillion total parameters, 35 billion active; Microsoft says it outperformed Claude Sonnet 4.6 in human evaluation
- MAI-Code-1-Flash: 5 billion parameters, high-speed coding; Microsoft says it beats Claude Haiku 4.5 on benchmarks
- MAI-Image-2.5: Image generation, ranked 3rd in text-to-image, 2nd in image editing on Arena
- MAI-Image-2.5 Flash: Faster, more cost-efficient image generation variant
- MAI Transcribe-1.5: Transcription model supporting 43 languages
- MAI-Voice-2: Speech synthesis supporting 15 languages
- MAI-Voice-2 Flash: High-speed speech synthesis variant
Synthesized by Yomimono from the 1 cited source below, including Japanese-language reporting where cited, then editorially reviewed before publishing.