Google has released Gemma 4, its latest family of open-weight models, built on the same research foundation as the proprietary Gemini 3. The release, made available on 2 April 2026, ships under the commercially permissive Apache 2.0 licence — a meaningful move that lets enterprises integrate the models with fewer legal restrictions.
## Four Models, One Family
Gemma 4 is available in four sizes, each targeting a different deployment environment:
| Model | Parameters | Context Window | Best For |
|---|---|---|---|
| E2B (Effective 2B) | ~2B effective | 128K | Edge devices, low-latency apps |
| E4B (Effective 4B) | ~4B effective | 128K | On-device reasoning |
| 26B MoE | 26B (mixture-of-experts) | 256K | High performance, efficient compute |
| 31B Dense | 31B | 256K | Maximum capability, data-centre scale |
The “Effective” designation on the smaller models means that, although the raw parameter count is higher, only roughly 2B or 4B parameters are active for any given token — yielding the speed and memory footprint of a small model with quality closer to that of a larger one.
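The arithmetic behind an effective-parameter claim can be sketched as follows. Note that the expert count, expert size, and top-k routing below are illustrative assumptions, not published Gemma 4 internals:

```python
# Illustrative sketch of "effective parameters" in a sparse (MoE-style) model.
# All numbers here are assumptions for illustration, not Gemma 4's real config.

def active_params(shared: float, expert_size: float, top_k: int) -> float:
    """Parameters touched per token: shared weights plus the top-k routed experts."""
    return shared + top_k * expert_size

# Hypothetical config: 1B shared weights, 16 experts of 0.5B each, 2 active per token.
shared, n_experts, expert_size, top_k = 1.0, 16, 0.5, 2

total_params = shared + n_experts * expert_size           # parameters stored in memory
effective = active_params(shared, expert_size, top_k)     # parameters used per token

print(f"total: {total_params}B, effective: {effective}B")
```

Under these made-up numbers, a 9B-parameter model does only 2B parameters' worth of compute per token — which is the sense in which an "effective 2B" model can run at small-model speed.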
## Native Multimodal and Agentic Support
Unlike earlier Gemma generations, all Gemma 4 variants ship with native multimodal support, handling text, images, and audio/video across more than 140 languages. The models are also designed with multi-step agentic workflows in mind, making them viable for building AI agents that can plan, reason, and act across complex tasks — without requiring a cloud call.
## Where to Get It
Gemma 4 is available through:
- Hugging Face and Kaggle (full model weights)
- Google AI Studio (API access)
- Vertex AI, GKE, and Cloud Run (enterprise deployment)
- Android AICore Developer Preview (on-device development)
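As a hedged sketch, pulling the open weights from Hugging Face might look like the following. The model ids here are assumptions for illustration — check the official model cards for the real repository names — and `transformers` is imported lazily so the id lookup works without it installed:

```python
# Hypothetical model-id lookup for the four Gemma 4 variants.
# The ids below are assumptions for illustration -- consult the official
# Hugging Face model cards for the actual repository names.
VARIANTS = {
    "edge": "google/gemma-4-e2b",            # assumed id
    "on-device": "google/gemma-4-e4b",       # assumed id
    "server-moe": "google/gemma-4-26b-moe",  # assumed id
    "server-dense": "google/gemma-4-31b",    # assumed id
}

def pick_variant(target: str) -> str:
    """Map a deployment target to its (assumed) Hugging Face model id."""
    return VARIANTS[target]

def load_gemma(target: str = "edge"):
    """Download tokenizer and weights (requires `pip install transformers`)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model_id = pick_variant(target)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model
```

The lazy import keeps the id-selection logic usable in environments where the heavyweight dependency isn't present; `device_map="auto"` is the standard `transformers` option for spreading a large model across available accelerators.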
## Why It Matters
Gemma 4 continues Google’s twin-track strategy: push the frontier with proprietary Gemini, while giving the open-source ecosystem capable, commercially usable alternatives. For teams that need strong reasoning and multimodal capability without sending data to an external API, Gemma 4’s 26B MoE and 31B Dense models are now arguably the most capable open-weight options available.
The 128K–256K context windows and agentic-first design also signal that Google intends open models to be first-class citizens in the emerging multi-agent landscape — not just language completers, but active reasoning participants.
Source: blog.google, google.dev, mashable.com