Artificial Intelligence · High Priority (9/10)

Google DeepMind Launches Gemma 4: Frontier AI Running on Single GPU

Google DeepMind has released Gemma 4, a suite of four open-weight AI models that run entirely on a single 80GB Nvidia H100 GPU while delivering benchmark scores comparable to models twenty times their size.

Key Points

  • Four open-weight models run entirely on a single 80GB Nvidia H100 GPU
  • Benchmark scores rival models 20 times larger in size
  • Trained on over 140 languages for multinational deployments
  • E4B edge model exceeds Gemma 3 27B at roughly one-sixth the size

Full Details

Google DeepMind unveiled Gemma 4 this week, marking the company's most aggressive move yet in the open AI model race against Meta's Llama. The four models in the family fit entirely on a single 80GB Nvidia H100 GPU, making them accessible for enterprises that want to run AI locally rather than pay per-token cloud fees.

Google trained the family on over 140 languages, positioning it as a practical option for multinational deployments where a single model needs to handle diverse language requirements. The smaller E4B edge model exceeds Gemma 3 27B on most benchmarks at roughly one-sixth the size, validating Google's claim of delivering more intelligence per parameter than any previous release.

The release signals a deeper alignment between Google and Nvidia on the open model front and gives enterprises a new reason to invest in local GPU infrastructure.
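To see why "fits on a single 80GB GPU" is a meaningful threshold, a rough back-of-the-envelope check works: weight memory is roughly parameter count times bytes per parameter, plus headroom for the KV cache and activations. The sketch below uses illustrative numbers (bf16 weights, a 1.2× overhead factor, and example parameter counts), not published Gemma 4 figures.

```python
def fits_on_gpu(n_params_billion: float,
                bytes_per_param: int = 2,   # assume bf16/fp16 weights (2 bytes each)
                overhead: float = 1.2,      # rough headroom for KV cache/activations
                gpu_gb: int = 80) -> bool:
    """Back-of-the-envelope: do the weights plus overhead fit in GPU memory?"""
    weight_gb = n_params_billion * bytes_per_param
    return weight_gb * overhead <= gpu_gb

# A ~27B-parameter model in bf16: 27 * 2 * 1.2 ≈ 65 GB, which fits in 80 GB.
print(fits_on_gpu(27))   # True
# A ~70B-parameter model in bf16: 70 * 2 * 1.2 = 168 GB, which does not.
print(fits_on_gpu(70))   # False
```

By this estimate, any model in the 27B range and below clears a single H100 without quantization, which is what makes a whole model family under that ceiling attractive for local deployment.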

Why It Matters

Gemma 4 could accelerate enterprise adoption of local AI infrastructure, reducing reliance on cloud-based AI services and potentially disrupting the per-token pricing model of cloud AI providers.

Source: forbes.com

