Gemma 4 Is Free, Fast, and Apache-Licensed

April 20, 2026 · Martin Bowling

A $200-a-month AI bill just became optional

On April 2, Google released Gemma 4 — a family of four open-weight AI models that you can legally download, run on your own hardware, and use in a commercial product without paying Google a cent. The largest variant, a 31-billion-parameter dense model, ranks #3 on the Arena AI open-source leaderboard. The smallest fits on a phone.

For a small business paying hundreds of dollars a month in AI subscription fees, the headline is that Gemma 4 is capable enough to do a lot of that work for free. For a small business that does not pay hundreds of dollars a month because it cannot justify the cost, the headline is that the baseline price of usable AI just dropped to zero.

Neither headline is quite the whole story. Here is what actually changed, and what it means if you run a restaurant, a shop, or a contracting business in the Appalachian region.

What Google shipped

Gemma 4 is four models, not one:

  • E2B — an “edge” model with a 2-billion-parameter effective footprint, designed to run on phones and laptops
  • E4B — a 4-billion effective-parameter edge model, still laptop-class
  • 26B A4B — a Mixture-of-Experts model with 26 billion total parameters that activates only about 3.8 billion per token, which keeps inference fast
  • 31B — a dense 31-billion-parameter model, the high-end variant

All four process images and video. The two edge models also accept audio input. Context windows are 128K tokens on the edge models and 256K tokens on the larger two — enough to fit a full year of email threads, a 400-page contract, or every review your business has ever received.
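
Claims like "a 400-page contract fits in the context window" are easy to sanity-check yourself. The sketch below uses a rough rule of thumb of about 4 characters per token for English text; real tokenization varies by model and content, so treat the numbers as estimates, not guarantees.

```python
# Back-of-envelope check: will a document fit in a model's context window?
# Assumes ~4 characters per token, a common rough heuristic for English.

def approx_tokens(char_count: int, chars_per_token: float = 4.0) -> int:
    """Estimate token count from raw character length."""
    return int(char_count / chars_per_token)

def fits_in_context(char_count: int, context_tokens: int) -> bool:
    """True if the text should fit, leaving 10% headroom for prompt and reply."""
    return approx_tokens(char_count) <= context_tokens * 0.9

# A 400-page contract at roughly 2,000 characters per page:
contract_chars = 400 * 2_000                       # 800,000 characters
print(approx_tokens(contract_chars))               # ~200,000 tokens
print(fits_in_context(contract_chars, 256_000))    # True  -- fits the 256K window
print(fits_in_context(contract_chars, 128_000))    # False -- too big for the edge models
```

The same arithmetic explains why the 128K edge models are fine for a week of email but not a year of it.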

The entire family trained on over 140 languages and supports function calling and structured JSON output out of the box, which matters the moment you want to wire one of these models into a booking system, a CRM, or an invoice workflow.
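
"Structured JSON output" still deserves a seatbelt before it touches a booking system or CRM: models occasionally return malformed or incomplete JSON, and you want to catch that before it writes a bad record. A minimal validation sketch using only the standard library follows; the field names are illustrative, not part of any real Gemma API.

```python
# Validate a model's "structured JSON output" before trusting it.
# The schema below is a made-up example for a restaurant booking.
import json

REQUIRED_FIELDS = {"customer_name": str, "party_size": int, "date": str}

def parse_booking(model_reply: str) -> dict:
    """Parse and sanity-check a model's JSON reply, raising on bad output."""
    data = json.loads(model_reply)  # raises ValueError on non-JSON text
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or malformed field: {field}")
    return data

reply = '{"customer_name": "J. Alvarez", "party_size": 4, "date": "2026-05-02"}'
booking = parse_booking(reply)
print(booking["party_size"])  # 4
```

The same pattern applies to function calling: treat the model's output as untrusted input, exactly as you would a web form.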

The license is the real news

Google shipped the first three Gemma generations under a custom license with acceptable-use carve-outs. Gemma 4 ships under standard Apache 2.0 — the same widely understood license that covers projects like Kubernetes and TensorFlow. As VentureBeat put it, the license change “may matter more than benchmarks.”

That is not hyperbole. A custom license means your lawyer needs to read it, your compliance team needs to approve it, and your product team needs to track whether your use case crosses any of the restrictions. Apache 2.0 is a known quantity. You can build on it, ship it in a commercial product, and move on.

For comparison: Meta quietly moved the other way this month, keeping its new flagship models partially proprietary. Google made the opposite bet.

Why this matters for a small business

Most small business owners are not going to download a 31-billion-parameter model and run it themselves. That is fine. The practical effects of a free, permissively licensed, genuinely competitive model show up through the tools you already use.

AI tool prices are getting squeezed

When a capable model is free, the floor on what anyone can charge for “basic AI features” drops. The software vendors who sell you a booking system, a receipt scanner, or an email assistant now have a credible open-weight alternative they can integrate — which means their margin on AI features shrinks, and some of that margin gets passed along. Anthropic already holds a reported 40% of enterprise LLM API spend against OpenAI’s 27%, and Gemma 4 adds a third credible pressure point on pricing.

You will not see a line-item price cut. You will see AI features that used to cost extra get folded into base pricing.

The “run it yourself” math actually works now

For a business with a technical operator — a shop that already runs its own server for point-of-sale, or a restaurant with a tech-savvy owner — the numbers have genuinely shifted. Self-hosting Gemma 4 on a single modern GPU can cut inference costs by 60 to 80 percent compared to calling a proprietary API, at volumes where that matters. The 26B MoE model fits on a single RTX 4090 and delivers roughly 97 percent of the 31B model’s quality at a fraction of the compute.
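
The 60-to-80-percent figure is the kind of claim worth running for your own volumes. The sketch below compares a per-token API price against an amortized GPU plus electricity; every price in it is an illustrative assumption, not a quote from any vendor, and it ignores the operator's time, which is often the real cost.

```python
# Rough monthly cost comparison: proprietary API vs. a self-hosted GPU.
# All prices are illustrative assumptions, not vendor quotes.

def api_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Monthly spend on a metered API at a flat per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million

def self_host_cost(gpu_price: float, lifespan_months: int,
                   power_watts: float, kwh_price: float,
                   hours_per_month: float = 730) -> float:
    """Amortized GPU cost plus electricity for an always-on machine."""
    amortized = gpu_price / lifespan_months
    electricity = power_watts / 1000 * hours_per_month * kwh_price
    return amortized + electricity

# An HVAC firm processing ~5,000 service tickets a month at ~10K tokens each:
tokens = 5_000 * 10_000                                   # 50M tokens/month
api = api_cost(tokens, price_per_million=5.0)             # $250/month
local = self_host_cost(gpu_price=1_800, lifespan_months=36,
                       power_watts=350, kwh_price=0.12)   # ~$81/month
print(f"API: ${api:.0f}/mo, self-hosted: ${local:.0f}/mo")
print(f"savings: {1 - local / api:.0%}")                  # ~68%
```

Under these assumptions the savings land inside the article's 60-to-80-percent range; at low volumes the amortized GPU dominates and the API wins.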

That is not a fit for a two-person bakery. It is a real option for a regional HVAC company processing thousands of service tickets a month, or a property management firm summarizing stacks of lease applications.

On-device AI stops being a pitch deck slide

The E2B model runs in about 5 GB of RAM at 4-bit quantization. That is not a data-center figure. That is “fits on a decent laptop” territory. For the first time, a business can run a serious AI feature — a receipt categorizer, a voicemail summarizer, a photo-to-work-order converter — entirely on its own hardware, without sending customer data to a cloud API.
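
The 5 GB figure follows from simple arithmetic: raw weight storage is parameters times bits per weight, divided by 8 bits per byte, and the remainder is KV cache, activations, and runtime overhead on top. A sketch of the weight math, using ballpark numbers rather than measurements of any specific runtime:

```python
# Why a 2B-parameter model fits in laptop RAM at 4-bit quantization:
# raw weight storage = params * bits-per-weight / 8, in gigabytes.
# KV cache and runtime overhead add a few GB on top of these figures.

def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Raw weight storage in GB (1 billion params at 8 bits = 1 GB)."""
    return params_billions * bits_per_weight / 8

print(weights_gb(2, 4))    # 1.0  -- 4-bit 2B model: ~1 GB of weights
print(weights_gb(2, 16))   # 4.0  -- the same model at fp16: 4x larger
print(weights_gb(31, 4))   # 15.5 -- the 31B dense model, even quantized
```

Quantizing from 16-bit to 4-bit cuts the weight footprint by 4x, which is the whole trick behind "fits on a decent laptop."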

For any business in a regulated industry (healthcare, legal, financial services), that changes the compliance conversation. “The model runs locally” is a much easier answer than “our vendor processes the data in accordance with their privacy policy.”

What is actually different about Gemma 4 vs other open models

The open-weight space is crowded. Mistral Small 4 shipped in March under Apache 2.0. Meta is still releasing Llama variants, though with more strings attached than before. AI2’s OLMo is genuinely, fully open.

Three things set Gemma 4 apart in practical terms:

  1. Native multimodal across the family. Every Gemma 4 model handles image and video input. Most competitors in this size range are text-first with vision bolted on as a separate model.
  2. The 256K context window on the mid-tier model. Long context used to be a premium API feature. Fitting a full quarter of customer emails into a single prompt is now a free-model capability.
  3. Google’s infrastructure as a fallback. If you do not want to self-host, the same model weights are available through Vertex AI with managed scaling — so you can prototype locally and deploy to production without rewriting your code.

That combination — multimodal, long context, genuinely free license, managed cloud available — is rare enough to make Gemma 4 the default starting point for new small-business AI projects, not a last resort.

What you should do this week

For most Appalachian small businesses, the right move is not to download a 40 GB model file tonight. The right moves are quieter.

  1. Ask your current AI vendors what they are doing with Gemma 4. Any AI-powered tool you pay for — booking assistants, review responders, inventory predictors — should have a credible answer. “We are evaluating it” is fine. “We had not heard of it” is a warning sign about how closely that vendor tracks the space.
  2. Audit what you pay per month for AI features. Write down every AI-powered tool with a subscription. Note the monthly spend. In six months, compare. If nothing has dropped or expanded in capability, your vendors are not passing along the savings — and you have leverage.
  3. Flag any use case where data privacy blocks adoption. If you have held off on AI for a specific workflow because you did not want customer data going to an outside API, that workflow is now viable on-device. Write it down. The tooling to build it will get cheaper every month from here.
  4. Do not rebuild what already works. If a paid tool is doing the job well, the existence of a free model underneath it is not a reason to switch. The reason to switch is price, capability, or data control — not novelty.

If you want to talk through where a local or open model might fit in your workflow — or where it probably does not — get in touch. We help small businesses in the region sort through what is actually useful from what is just loud.

The bottom line

Gemma 4 is not going to replace ChatGPT for a business owner drafting an email. It is going to do something quieter and more durable: it sets a new floor on what “basic AI” costs, and it takes away the license excuses for vendors who wanted to keep AI features in a premium tier. The ripple effects will show up in your subscription invoices over the next twelve months, not in a flashy launch.

Free does not mean effortless. But it does change the math, and for small businesses watching every line item, that matters.

AI Tools · Industry News · Small Business · Cost Savings