Google’s dropped a bombshell that no one expected, but everyone seems to want to try. The AI Edge Gallery, launched on May 31, is an app that delivers artificial intelligence straight to your smartphone, without cloud reliance, internet, or Big Tech snooping on your data.
While it might sound like a weird experiment, the app's actually pretty cool. Available on GitHub under the Apache 2.0 license (that means you can use, modify, and redistribute it, even commercially, as long as you keep the license and attribution notices), it’s currently up for grabs on Android, with an iOS version on the way.
The app runs offline, processing everything from image analysis to writing code using only the phone’s hardware. Google’s Gemma 3n is one of the models it uses, and to be honest, it works surprisingly well.
This app, while aimed at developers for now, comes with three key features: AI Chat for chatting, Ask Image for visual analysis, and Prompt Lab for quick tasks like rewriting text.

Models like Gemma-3n-E2B and Qwen2.5-1.5B are downloadable, though there aren’t a ton of options just yet. Reddit’s already questioning whether this is anything new, comparing it to existing apps like PocketPal. Some even raised security concerns, but given it’s hosted on Google’s official GitHub, the chances of a fake-out are slim. No malware has popped up so far, either.
Testing the App: Surprisingly Slick for an AI Running on Your Phone
We gave it a spin on a Samsung Galaxy S24 Ultra, downloading both the large and small Gemma 3 models available. Here’s how it works: each model is a self-contained file, packed with everything the model’s learned during training. Instead of a massive database, it’s like downloading a compressed snapshot. The largest Gemma 3 model? A solid 4.4 GB. The smallest? A tidy 554 MB.
Once it’s downloaded, you’re good to go; no need for any data after that. The app runs totally offline, answering questions and completing tasks using only what the model learned before being released.
Even on lower-speed CPU inference, it delivers about the same experience GPT-3.5 did at launch: not lightning-fast with the bigger models, but totally usable. The smaller Gemma 3 1B model even hit over 20 tokens per second, making it feel pretty smooth with decent accuracy.
The smaller model on the GPU was a beast, delivering 105 tokens per second, but even the CPU was a solid 39 tokens per second. Not bad, right? The real kicker? You can run all this without sending any data to Google or OpenAI’s servers.
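To put those throughput numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the 105 and 39 tokens-per-second figures from our test; the 300-token reply length is an assumption picked for illustration, and the math ignores prompt processing and model load time, so treat the results as a ballpark rather than a benchmark.

```python
# Decode speeds measured in the article for the smaller Gemma 3 model.
TOKENS_PER_SECOND = {"GPU": 105, "CPU": 39}

# Assumption: a 300-token reply. This length is illustrative, not from the test.
REPLY_TOKENS = 300


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to emit `tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


for backend, rate in TOKENS_PER_SECOND.items():
    seconds = generation_seconds(REPLY_TOKENS, rate)
    print(f"{backend}: {seconds:.1f} s for {REPLY_TOKENS} tokens")
```

Under those assumptions, a reply of that length would take roughly 3 seconds on the GPU and a bit under 8 seconds on the CPU.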
This is a game-changer for privacy. If you’re handling sensitive data or working in a field like healthcare or journalism, you can now run AI tasks without having to worry about your data leaving your phone.
Privacy, Cost, and Speed
“No internet required” means this app works just as well in the middle of nowhere as it does in a busy city. Plus, it’s free to use, unlike cloud-based AI services that charge per use. No subscriptions, no credits, no fees. Just your phone doing the work.
Latency gets a boost, too. With no server round-trip, responses start without any network delay, which is a big deal if you’re using AI for real-time tasks like chatbots or image analysis. And since it’s running on your phone, a server outage can’t take your chatbot or app down.
The big question is: Why run a slower version of your favorite AI on your phone when you could just use an online chatbot? For starters, privacy. This is perfect for those who want the power of AI without sharing every little thing they type with Google or OpenAI. It also works great for those in areas with bad internet or those traveling with no access to Wi-Fi.
Of course, there are some hiccups. It’s not quite as fast or robust as cloud-based AI, and the app doesn't support the widely used .safetensors format, so model options are limited. There’s also a small learning curve when it comes to setup.
Still, for basic tasks like rewriting text or summarizing concepts, it works like a charm. Plus, it runs entirely on your device, so no one but you gets to see your input.
This app signals a shift in how AI might be deployed in the future. Google’s taking a bold step by offering privacy and power at the same time, even if it’s still in its early stages. With more model support and a few tweaks, it could become a must-have for privacy-conscious users.

Disclaimer: All materials on this site are for informational purposes only. None of the material should be interpreted as investment advice. Please note that despite the nature of much of the material created and hosted on this website, HODL FM is not a financial reference resource, and the opinions of authors and other contributors are their own and should not be taken as financial advice. If you require advice, HODL FM strongly recommends contacting a qualified industry professional.