Google is integrating its powerful AI model Gemini into a wide range of products, from Gmail and YouTube to Android smartphones, giving users a more intuitive and personalized experience.

Gemini’s capabilities are impressive: from having in-depth conversations with the AI assistant on your smartphone to automating complex tasks and searching your photo library using natural language. With Gemini, Google promises to radically transform the way we use technology.

The future of interaction with technology

During a keynote address at the company’s I/O 2024 developer conference on May 14, Google CEO Sundar Pichai revealed some of the upcoming places where their AI model will appear.

Pichai mentioned AI 121 times during his 110-minute keynote, and Gemini, the model the company launched in December 2023, stole the show.

Google is integrating the large language model (LLM) into its offerings, including Android, Search, and Gmail. Here’s what users can expect in the future:

Interaction with apps: Gemini gains more context because it can work across applications. In a future update, users will be able to summon Gemini to interact with apps, for example by dragging and dropping an AI-generated image into a message. YouTube users can also tap “Ask this video” to ask the AI specific questions about a video’s content.

Gemini in Gmail: Google’s email platform Gmail will also get AI integration. Users can search, summarize, and compose their emails using Gemini. The AI assistant can take action on emails for more complex tasks, such as helping process e-commerce returns by searching the inbox, finding the receipt, and filling out online forms.

Gemini Live: Google also unveiled a new experience called Gemini Live, where users can have “in-depth” voice chats with the AI on their smartphones. The chatbot can pause mid-answer for clarification, and it adapts to the user’s speech patterns in real time. In addition, Gemini can also see and interact with physical environments through photos or videos captured on the device.

Multimodal developments: Google is working to develop intelligent AI agents that can reason, plan, and perform complex multi-step tasks under the user’s supervision. Multimodal means that the AI can go beyond text and process image, audio, and video input.

Gemini in action

Early demonstrations include automating store returns and using the AI to explore a new city. Other planned updates include replacing Google Assistant on Android with Gemini, which will be fully integrated into the mobile operating system.

A new feature called “Ask Photos” makes it possible to search the photo library using natural language queries supported by Gemini. This feature can understand context, recognize objects and people, and summarize photo moments in response to questions.

In addition, Google Maps will show AI-generated summaries of places and areas, using insights from the platform’s map data.
