Google Gemini: learn everything about ChatGPT’s competitor

Recently launched as an app and personal assistant for Android phones in Brazil and other territories, Gemini, Google’s generative artificial intelligence, is the big tech company’s big bet for 2024. However, its integration with the company’s ecosystem, its functions and the differences between its models, apps, services and prices can cause a lot of confusion, which is nothing new when it comes to Google products.

Below, we explain everything you need to know about Gemini, its ramifications, pricing, and capabilities.

What is Gemini?

Gemini is Google’s family of generative AI models. Developed by the company’s AI research labs, DeepMind and Google Research, Gemini stands out for its ability to understand and generate multimodal content, including audio, images and videos. This differs from ChatGPT, for example, which can transcribe audio, speak and listen but natively understands only text and code.

The Gemini apps are interfaces through which the AI models can be accessed. In other words, Gemini is not just the app or website with chatbot and assistant functions that you download from the Play Store or access via the web, but rather a set of models with different capabilities and applications, presented in three versions: Ultra, Pro and Nano.

How is Gemini different from ChatGPT and other generative AIs?

According to Google, Gemini is “natively multimodal”, capable of working with audio, images, videos and text in different languages. This means that instead of passing prompts to a separate image generator (as ChatGPT does with DALL-E 3), Gemini generates images “natively”, without an intermediate step.

Furthermore, according to Google, Gemini 1.0 in its most powerful version, Ultra, outperforms GPT-4 on numerous benchmarks, including math problems, Python code generation, reading comprehension and general knowledge.

What can you do with Gemini?

Due to its multimodal structure, Gemini is, in theory, capable of solving a range of problems, from the simplest to the most complex, including transcribing videos, generating images and graphs, pointing out errors in a spreadsheet and analyzing academic texts.

Furthermore, its integration with Google’s productivity suite, Google Workspace, allows you to use it to write texts and emails, create spreadsheets, presentations and more.

However, not all of Gemini’s capabilities are available in its free version, in the form of an app or website. You need to know its different versions to understand which one best fits what you want to do:

Gemini Ultra

Available in version 1.0

Gemini Ultra is the most advanced and complete version of the AI and the one that makes the best use of its multimodality, according to Google. The company claims that Gemini Ultra is capable of identifying scientific articles relevant to a given problem, extracting the most relevant information from these texts and updating a pre-existing graph, generating the formulas necessary to recreate the graph with the most recent data.

In addition, Google claims that, in its most advanced version, Gemini is capable of understanding and interpreting images and videos in different languages of human expression, including music, visuals and code. This means that the AI is capable of describing a musical score or creating an image of a wool doll from a photo or video of two balls of yarn.

Google says that, compared to rival GPT-4, the Ultra model has better results in Python code generation tests, math challenges and general knowledge answers.

This is the model used across Google Workspace, including Gmail, Docs, Sheets and Google Meet recordings, bringing a host of additional features to each program. The service costs R$96.99/month through the Google One AI Premium subscription.

Gemini Ultra is also available to developers as an API through the Vertex AI platform and AI Studio, allowing it to be applied to new services.

Gemini Pro

Available in versions 1.0 and 1.5

Designed for developers, Gemini Pro is a lighter version than the Ultra model, with a more “efficient” architecture. In addition to text, Gemini Pro is capable of understanding different languages and extracting information from audio and videos without the need for a written transcription. However, this can take time: searching through an hour of video can take anywhere from 30 seconds to a minute.

In its latest version, 1.5 (in testing phase), the model is capable of processing up to 1 million tokens, equivalent to around 700 thousand words or approximately 30 thousand lines of code, eight times more than OpenAI’s GPT-4 Turbo.

Its main feature is the amount of context it can process. As an example, Google claims that Gemini 1.5 can analyze the PDF containing the 402-page transcripts of the Apollo 11 mission, which is equivalent to approximately 327 thousand tokens. In tests, the AI was able to identify comical moments in the transcript, upon request, as well as understand that a simple drawing of a boot stepping on the ground, sent by the user, represented the moment Neil Armstrong stepped on the Moon.
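To put these numbers in perspective, the short Python sketch below checks how much of the context window the Apollo 11 transcript would occupy. It uses only the approximate figures quoted above; the function name `fits_in_context` is illustrative and not part of any Google tool.

```python
# Approximate figures quoted in the article for Gemini 1.5 Pro.
CONTEXT_WINDOW_TOKENS = 1_000_000   # maximum tokens the model can process
WORDS_PER_WINDOW = 700_000          # rough word equivalent of the full window
APOLLO_TRANSCRIPT_TOKENS = 327_000  # 402-page Apollo 11 transcript PDF


def fits_in_context(tokens: int, window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if a document of `tokens` tokens fits in the window."""
    return tokens <= window


# Share of the window taken up by the transcript (about one third).
window_share = APOLLO_TRANSCRIPT_TOKENS / CONTEXT_WINDOW_TOKENS

# Rough words-per-token ratio implied by the article's figures.
words_per_token = WORDS_PER_WINDOW / CONTEXT_WINDOW_TOKENS

print(fits_in_context(APOLLO_TRANSCRIPT_TOKENS))
print(f"{window_share:.1%} of the context window")
print(f"about {words_per_token:.1f} words per token")
```

In other words, by the article's figures the whole transcript takes up roughly a third of the window, leaving room for the user's questions and the model's answers.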

Gemini 1.5 Pro is available to the public in a “preview” version on Vertex AI, a platform for building AI applications aimed at companies.

Gemini Nano

Available in version 1.0

Gemini Nano is a “compact” version of the Pro and Ultra models, capable of running directly on cell phones instead of servers. Some more modern devices, such as the Pixel 8 Pro and the Samsung Galaxy S24, already have some Gemini Nano features.

One of them is a recorder app capable of transcribing audio from meetings and interviews and highlighting the most important parts, even if you don’t have internet access.

Google’s keyboard, Gboard, also has a Gemini Nano function that tries to predict your next words during a conversation, working in conjunction with WhatsApp.

How to use the different versions of Gemini?

Outside of its chatbot and assistant format, which can be downloaded from the Play Store and used on the web, Gemini is offered as an API aimed at developers who want to embed its AI models into their applications. However, anyone can test it for free and try its features simply by accessing AI Studio.
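For developers curious about what embedding the model looks like, here is a minimal Python sketch that only builds a text request for the Gemini API, without sending it. The endpoint, the `gemini-pro` model name and the request body follow the public REST interface as it stood at the time of writing and should be treated as assumptions that may change; `build_generate_request` and the `YOUR_API_KEY` placeholder are illustrative names, not part of Google's SDK.

```python
import json
import urllib.request

# Assumed base URL of the Gemini REST API (may change between versions).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(prompt: str, api_key: str,
                           model: str = "gemini-pro") -> urllib.request.Request:
    """Build (but do not send) a text-generation request for a Gemini model."""
    url = f"{API_ROOT}/models/{model}:generateContent?key={api_key}"
    # The prompt travels as one text "part" inside a list of "contents".
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    "Summarize the Apollo 11 mission in one sentence.",
    api_key="YOUR_API_KEY",  # placeholder: a real key is created in AI Studio
)

# Sending the request needs a real key and network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is built separately from the network call so the structure can be inspected without a key; in practice, most developers would likely use Google's official client libraries instead.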

How much does it cost?

In preview format, Gemini 1.5 Pro can be tested for free on the AI Studio and Vertex AI platforms. When Gemini 1.5 Pro leaves its testing version on Vertex, the model will cost US$0.0025 per character typed and US$0.00005 per character in the responses. According to an analysis carried out by the website TechCrunch, submitting an article containing 2,000 characters could cost approximately US$5.

The price of the Ultra model for developers has not yet been announced, but it can also be tested for free on Vertex AI. Consumers can access the model through the Google One AI Premium subscription, which costs R$96.99/month and brings it to Google Workspace programs such as Gmail, Docs, Sheets and Google Meet.
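The US$5 estimate follows directly from the per-character prices quoted above. The minimal Python sketch below reproduces the arithmetic; the function name `estimate_cost` is illustrative, and the prices are the ones reported in this article, not an official price list.

```python
# Per-character prices (US$) quoted in the article for Gemini 1.5 Pro on Vertex.
INPUT_PRICE_PER_CHAR = 0.0025    # each character typed (input)
OUTPUT_PRICE_PER_CHAR = 0.00005  # each character in the response (output)


def estimate_cost(input_chars: int, output_chars: int = 0) -> float:
    """Estimate the cost in US$ of one request from its character counts."""
    return (input_chars * INPUT_PRICE_PER_CHAR
            + output_chars * OUTPUT_PRICE_PER_CHAR)


# A 2,000-character article sent as input, ignoring the response.
article_cost = estimate_cost(input_chars=2000)
print(f"US${article_cost:.2f}")

# The same article with a 2,000-character response adds only ten cents.
with_response = estimate_cost(input_chars=2000, output_chars=2000)
print(f"US${with_response:.2f}")
```

At these rates, almost all of the cost comes from the text sent to the model, since each input character is priced 50 times higher than each output character.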

With information from Estadão Conteúdo
Image: Shutterstock
