Google has released Gemma 4, its latest family of open-weight models, built on the same research foundation as the proprietary Gemini 3. The release, made available on 2 April 2026, ships under the commercially permissive Apache 2.0 licence — a meaningful move that lets enterprises integrate the models with fewer legal restrictions.
## Four Models, One Family
Gemma 4 is available in four sizes, each targeting a different deployment environment:
| Model | Parameters | Context Window | Best For |
|---|---|---|---|
| E2B (Effective 2B) | ~2B effective | 128K | Edge devices, low-latency apps |
| E4B (Effective 4B) | ~4B effective | 128K | On-device reasoning |
| 26B MoE | 26B (mixture-of-experts) | 256K | High performance, efficient compute |
| 31B Dense | 31B | 256K | Maximum capability, data-centre scale |
The “Effective” designation on the smaller models means that, although the raw parameter count is higher, only roughly 2B or 4B parameters are active for any given token — yielding the speed and memory footprint of a small model with quality closer to that of a larger one.
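The arithmetic behind an effective-parameter claim can be sketched as follows. Note that the expert count, expert size, and top-k routing below are illustrative assumptions, not published Gemma 4 internals:

```python
# Illustrative sketch of "effective parameters" in a sparse (MoE-style) model.
# All numbers here are assumptions for illustration, not Gemma 4's real config.

def active_params(shared: float, expert_size: float, top_k: int) -> float:
    """Parameters touched per token: shared weights plus the top-k routed experts."""
    return shared + top_k * expert_size

# Hypothetical config: 1B shared weights, 16 experts of 0.5B each, 2 active per token.
shared, n_experts, expert_size, top_k = 1.0, 16, 0.5, 2

total_params = shared + n_experts * expert_size           # parameters stored in memory
effective = active_params(shared, expert_size, top_k)     # parameters used per token

print(f"total: {total_params}B, effective: {effective}B")
```

Under these made-up numbers, a 9B-parameter model does only 2B parameters' worth of compute per token — which is the sense in which an "effective 2B" model can run at small-model speed.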
## Native Multimodal and Agentic Support
Unlike earlier Gemma generations, all Gemma 4 variants ship with native multimodal support, handling text, images, and audio/video across more than 140 languages. The models are also designed with multi-step agentic workflows in mind, making them viable for building AI agents that can plan, reason, and act across complex tasks — without requiring a cloud call.
## Where to Get It
Gemma 4 is available through:
- Hugging Face and Kaggle (full model weights)
- Google AI Studio (API access)
- Vertex AI, GKE, and Cloud Run (enterprise deployment)
- Android AICore Developer Preview (on-device development)
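As a hedged sketch, pulling the open weights from Hugging Face might look like the following. The model ids here are assumptions for illustration — check the official model cards for the real repository names — and `transformers` is imported lazily so the id lookup works without it installed:

```python
# Hypothetical model-id lookup for the four Gemma 4 variants.
# The ids below are assumptions for illustration -- consult the official
# Hugging Face model cards for the actual repository names.
VARIANTS = {
    "edge": "google/gemma-4-e2b",            # assumed id
    "on-device": "google/gemma-4-e4b",       # assumed id
    "server-moe": "google/gemma-4-26b-moe",  # assumed id
    "server-dense": "google/gemma-4-31b",    # assumed id
}

def pick_variant(target: str) -> str:
    """Map a deployment target to its (assumed) Hugging Face model id."""
    return VARIANTS[target]

def load_gemma(target: str = "edge"):
    """Download tokenizer and weights (requires `pip install transformers`)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model_id = pick_variant(target)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model
```

The lazy import keeps the id-selection logic usable in environments where the heavyweight dependency isn't present; `device_map="auto"` is the standard `transformers` option for spreading a large model across available accelerators.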
## Why It Matters
Gemma 4 continues Google’s twin-track strategy: push the frontier with proprietary Gemini, while giving the open-source ecosystem capable, commercially usable alternatives. For teams that need strong reasoning and multimodal capability without sending data to an external API, Gemma 4’s 26B MoE and 31B Dense models are now arguably the most capable open-weight options available.
The 128K–256K context windows and agentic-first design also signal that Google intends open models to be first-class citizens in the emerging multi-agent landscape — not just language completers, but active reasoning participants.
Source: blog.google, google.dev, mashable.com