Is the competition between developers of artificial intelligence (AI) about to spell the end of the human race? We could all end up becoming to robots what dogs are to humans. This article examines Google’s voice-operated AI assistant Project Astra, which has emerged as a serious competitor to OpenAI’s GPT-4o.

Related: OpenAI Updates: Flirting ChatGPT & Unexpectable Partnerships

In what may look like a case of keeping up with the Joneses, Google has introduced a new multimodal AI-powered agent dubbed Project Astra, one day after OpenAI announced GPT-4o. The new system, expected to launch later this year, will be capable of answering human queries in real time using text, audio, or video inputs. To demonstrate Project Astra’s capabilities, Google shared a demo video showing the AI assistant interacting with humans and answering questions in real time via voice input.

Responds in a Humanlike Voice

Google’s announcement follows the launch of GPT-4o a day earlier. OpenAI’s AI-powered model can quickly respond to voice prompts and describe what it “sees” through a computer screen or smartphone camera, in an emotionally expressive tone that simulates emotions like flirtatiousness and surprise. Project Astra likewise responds in a humanlike voice.

Project Astra, which Google describes as a “universal AI agent,” could become truly helpful in day-to-day use. It’s intended to be a teachable, proactive agent that can decipher natural language. According to Google, its pioneering AI-for-everything agent processes information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching that data for recall.

AI Models with a Deeper Understanding of the Physical World

The AI assistant is designed to sound natural, and users can switch between different voices. Commenting on the tech giant’s push toward an autonomous agent, Google DeepMind CEO Demis Hassabis said the agent needs to understand and respond to a complex and dynamic world just as humans do.

The concept is likely to deliver on a promise Hassabis made regarding Gemini’s potential when Google first introduced the project last December. Google says it will make Project Astra available through a new interface dubbed Gemini Live, and the tech giant is also testing several prototype smart glasses as it seeks to reestablish its leadership in AI. Hassabis argues that imbuing AI models with a deeper understanding of the physical world will make systems like Project Astra more effective. He said of the AI agent:

It needs to understand and respond to the complex and dynamic world just like people do—and take in and remember what it sees and hears to understand the context and take action. […] It also needs to be proactive, teachable, and personal so users can talk to it naturally and without lag or delay. 

Expected Later This Year 

Thanks to multimodal AI, which combines neural network models capable of processing input from multiple sources such as cameras and microphones, Project Astra can cache information and later recall it at speed. Without specifying when Project Astra would be incorporated into Google’s products, Hassabis said the Gemini app would most likely gain some of these capabilities later this year, adding:

I’d be very surprised if that doesn’t mean the Google Pixel 9, which we’re expecting to arrive later this year.
