Gemma 4
Gemma 4 is Google's open-weight model family. The 31B variant is the current reference point for open-weight models that can run locally at professional quality. It is licensed under Apache 2.0, which permits commercial use and modification as long as the license and attribution notices are preserved.
Key Specs
- Parameters: 31B active parameters (primary variant; smaller variants exist)
- License: Apache 2.0
- AIME 2025: 89.2% — stronger than most frontier closed models on math reasoning
- LiveCodeBench v6: 80.0%
- Arena ranking: #3 among open models on the Chatbot Arena leaderboard
Benchmarks
The AIME 2025 score (89.2%) is notable: AIME is the American Invitational Mathematics Examination, a competition whose problems are used as a proxy for hard multi-step reasoning. Gemma 4 outperforms many closed frontier models here. At 80.0% on LiveCodeBench v6, it is in the same range as models that cost 10-50x more to run via API.
Arena ranking at #3 for open models reflects actual user preference in blind comparisons, not just cherry-picked benchmarks.
Why It Matters
Gemma 4 at 31B is the first open model that's genuinely competitive for production coding tasks. Prior open models (Llama, Mistral) fell short on complex multi-step problems. Gemma 4 closes that gap meaningfully.
Apache 2.0 means you can fine-tune on proprietary data, modify architecture, run in air-gapped environments, and redistribute — none of which is possible with closed APIs.
Running Locally
Works with ollama and lm-studio. Approximate VRAM requirements for the 31B variant:
- 4-bit quantization: ~20GB VRAM
- 8-bit quantization: ~34GB VRAM
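A rough sanity check on these figures, as a sketch rather than a measurement: the weights alone take about (parameters x bits per weight / 8) bytes, and whatever is left of the quoted total goes to the KV cache, activations, and runtime overhead. The function names below are illustrative, and the calculation assumes exactly 4 or 8 bits per weight. Real quantization formats store extra scaling metadata, so effective bits per weight are usually somewhat higher.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def headroom_gb(total_vram_gb: float, params_billion: float, bits_per_weight: float) -> float:
    """What remains of a VRAM budget once the weights are loaded."""
    return total_vram_gb - weight_footprint_gb(params_billion, bits_per_weight)


# Figures quoted in the list above: (bits per weight, approximate total VRAM in GB).
QUOTED = [(4, 20.0), (8, 34.0)]

for bits, total in QUOTED:
    weights = weight_footprint_gb(31, bits)
    spare = headroom_gb(total, 31, bits)
    print(f"{bits}-bit: weights ~{weights:.1f} GB, ~{spare:.1f} GB left for KV cache and runtime")
```

Under these assumptions the weights account for about 15.5 GB at 4-bit and 31 GB at 8-bit, so only a few GB of the quoted totals are left for context. Long prompts can push actual usage higher.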
Not fast for interactive use on consumer hardware, but viable for overnight runs or batch processing.
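For the batch case, a minimal sketch of an unattended run against a local ollama server might look like the following. It assumes ollama is listening on its default port and uses the `/api/generate` endpoint with streaming disabled. The model tag `gemma4:31b` is a placeholder assumption, not a confirmed name (check `ollama list` for the tag actually installed), and `run_batch` is a hypothetical helper written for this note.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # ollama's default local endpoint
MODEL_TAG = "gemma4:31b"  # placeholder tag (assumption); use the name shown by `ollama list`


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for the local ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout_s: float = 3600.0) -> str:
    """Send one prompt and return the generated text; the long timeout suits slow local inference."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]


def run_batch(prompts, generate_fn=generate):
    """Run prompts one at a time, recording failures instead of aborting the whole run."""
    results = []
    for prompt in prompts:
        try:
            results.append({"prompt": prompt, "output": generate_fn(prompt), "error": None})
        except OSError as exc:  # covers connection failures and timeouts
            results.append({"prompt": prompt, "output": None, "error": str(exc)})
    return results
```

Calling `run_batch(["prompt one", "prompt two"])` before leaving the machine overnight returns one record per prompt. Processing prompts sequentially is deliberate here, since a single consumer GPU gains little from concurrent requests to a 31B model.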
Weaknesses
- 31B is expensive to run locally: roughly 20GB of VRAM even at 4-bit quantization
- Smaller variants (2B, 7B) trail closed models significantly
- Less agentic-workflow hardening than Claude or GPT — more prompt engineering required
- Context window shorter than gemini-2-5-pro
Use Cases
Best for: local inference where data privacy matters, fine-tuning for domain-specific tasks, math/reasoning pipelines, teams that need Apache 2.0 for commercial products.
Related
ollama · lm-studio · deepseek-r1 · evals