Microsoft MAI-Thinking-1

Microsoft MAI-Thinking-1

Microsoft's first in-house reasoning model. Announced at Microsoft Build 2026 on June 2, 2026. ~1T total parameters, 35B active via sparse Mixture of Experts. Trained entirely on commercially licensed data — no distillation from OpenAI or other third-party models, and no AI-generated content in pre-training. Runs on Microsoft's Maia 200 inference accelerators inside Azure.

Architecture

Sparse MoE. ~1T total parameters, 35B active per token. 256K token context window. Supports function calling and system instructions. Compatible with the standard Chat Completions API.

Benchmarks

Benchmark Score
AIME 2025 97.0%
AIME 2026 94.5%
SWE-Bench Pro ~53% (competitive with Claude Opus 4.6)
Human preference (blind side-by-side, 1,276 tasks) Preferred over Claude Sonnet 4.6

Strong in mathematical and scientific reasoning for its weight class. Slightly below the very top tier on coding benchmarks (Claude Opus 4.8, GPT-5.5).

Training

Ground-up training on enterprise-grade, commercially licensed data. No distillation from third-party models. AI-generated synthetic content explicitly excluded from pre-training. Microsoft describes this as enabling clean enterprise licensing without legal ambiguity.

Hardware

Runs on Maia 200 — Microsoft's second-generation in-house AI inference accelerator. Specs: TSMC 3nm process, 216GB HBM3e memory, 7 TB/s memory bandwidth, 140B+ transistors, >10 petaflops FP4 / >5 petaflops FP8, 750W TDP. Four chips per tray with direct non-switched interconnects; scales to 6,144 accelerators. Deployed in Azure US Central (Iowa) and US West 3 (Phoenix), with Italy, Australia, and South Korea expanding next.

MAI Family

MAI-Thinking-1 is the reasoning flagship of a 7-model MAI family announced at Build 2026. Known members:

Model Role
MAI-Thinking-1 Reasoning flagship
MAI-Code-1-Flash 5B coding model, live in GitHub Copilot
MAI-Image-2.5 Multimodal image generation and editing
MAI-Voice-2 Text-to-speech and voice cloning
MAI-Transcribe-1.5 Speech-to-text

Project Polaris

Project Polaris is a coding-specialized MAI model (companion to MAI-Thinking-1) being positioned as the GitHub Copilot default engine. Replaces GPT-4 Turbo as the default model for all GitHub Copilot subscribers starting August 2026. Three-month fallback period offered for teams that want to stay on GPT-4.

MAI-Code-1-Flash (5B) is already live in the Copilot model picker as of June 2, 2026, scoring 51.2% on SWE-Bench Pro vs. 35.2% for Claude Haiku 4.5.

Availability

  • Azure AI Foundry (enterprise, private preview at launch)
  • MAI Playground (public preview planned)
  • OpenRouter, Fireworks AI, Baseten (third-party API access)
  • GitHub Copilot model picker (MAI-Code-1-Flash)

Pricing

Not disclosed at launch.

Strengths

  • No licensing ambiguity: fully commercially licensed training data, no OpenAI distillation — clean for enterprise IP-sensitive deployments.
  • Strong math/reasoning: 97.0% AIME 2025, 94.5% AIME 2026 — top-tier scores.
  • Native Azure integration: runs on Maia 200 inside Azure, which reduces latency and lowers routing cost vs. third-party model calls.
  • 256K context: sufficient for most enterprise document-processing workloads.

Weaknesses

  • Context window (256K) is smaller than Nemotron 3 Ultra or claude-opus-4-8 (both 1M).
  • SWE-Bench Pro score (~53%) trails the current frontier (claude-opus-4-8 ~69%, MiniMax M3 59%).
  • Not open-weights; Microsoft controls weights and deployment.
  • Pricing not yet disclosed — hard to model cost vs. alternatives.

Use Cases

Best for: enterprise reasoning tasks on Azure, math/science workflows, organizations that need a model with clean commercial training provenance. Will be the default Copilot coding engine from August 2026 via Project Polaris.

Not ideal for: very long context tasks over 256K tokens (use claude-opus-4-8 or nvidia-nemotron-3-ultra), pure coding benchmark maximization.

Related

claude-opus-4-8 · claude-sonnet-4-6 · agentic-workflows · evals · extended-thinking

Sources