OpenAI has been dominating the world of artificial intelligence (AI) and chatbots lately, with its GPT-4 large language model (LLM) powering ChatGPT and taking the world by storm. The company got an early lead and everyone else has been playing catch-up ever since.
Yet OpenAI has a fresh challenger in the form of Google Gemini. This new arrival burst onto the scene in December 2023 and stunned onlookers with its impressive capabilities (even if the demos were somewhat exaggerated). We’ve been waiting for months to see what Google has up its sleeve, and the results look pretty spectacular.
But is it enough to defeat GPT-4? What can it do right now, and what about in the future? And if you want to use Gemini, how exactly do you do that? We’ve taken a deep dive into the world of Gemini to find the answers to all these questions and more. If you’re curious about Google’s latest AI efforts, this is the place to be.
What is Google Gemini?
Gemini is Google’s latest large language model (LLM). What’s an LLM? It’s the system that underpins the types of AI tools you’ve probably seen and interacted with on the internet. For example, GPT-4 powers ChatGPT Plus, OpenAI’s advanced paid-for chatbot.
In Google’s case, Gemini will be woven into a wide array of tools, such as the Bard chatbot, Google Search, YouTube, and more. In other words, Gemini isn’t a chatbot itself, but the “brain” that makes it (and other tools) tick.
Google also specified that it has created three variants, or “sizes,” of Gemini: Nano, Pro and Ultra. Nano is now inside the Pixel 8 Pro and destined for other mobile devices, while Gemini Pro has already found its way into Google Bard. Ultra, meanwhile, is designed for “highly complex tasks,” although it will also come to Bard once Google has completed extensive testing and safeguarding.
What can Gemini do?
In a press release, Google explained that Gemini is a multimodal AI tool. In other words, it can deal with various forms of input and output, including text, code, audio, images and videos. That gives it a lot of flexibility to perform a wide range of tasks.
Google’s Gemini launch event saw it showcase the tool’s abilities in a “hands on” video, and it’s safe to say it was pretty mind-blowing (even if it wasn’t quite representative of today’s reality).
Gemini could be seen following a paper ball hidden under a cup and understanding a user’s sleight-of-hand coin trick. It could predict what a dot-to-dot puzzle showed before a single line was drawn and explain when one path on a map might lead to danger and one may lead to safety.
Better yet, all of this seemingly happened in real-time, with a human asking Gemini a question and rapidly getting an accurate response. It suggested that natural, flowing conversations will be possible with Google’s chatbot. However, the reality might not quite live up to the video demo’s hype.
A separate Google blog post showed how the demo had actually been created – by feeding Gemini still image frames from the captured footage and prompting the AI model using text, rather than voice. So while the video does show real outputs from Gemini, we’re still quite far from the real-time conversations it depicts.
Gemini Pro has recently been incorporated into Google Bard but, as in the early days of other tools like ChatGPT (and earlier versions of Bard), it seems prone to mistakes.
For instance, it has struggled to name recent Oscar winners and to produce accurate code. It has also shown itself to be inaccurate when working in non-English languages – one user on X (formerly Twitter) asked Gemini for a six-letter French word, and Gemini responded with a five-letter word. (Then again, ChatGPT also sometimes struggles with this task.)
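A word-length answer like that one is easy to check mechanically. Here is a minimal Python sketch, assuming letters are counted after Unicode NFC normalization so that an accented character counts once. The helper name `letter_count` and the sample words are our own illustrations, not taken from the exchange on X.

```python
import unicodedata

def letter_count(word: str) -> int:
    """Count alphabetic characters after composing accents (NFC)."""
    composed = unicodedata.normalize("NFC", word)
    return sum(1 for ch in composed if ch.isalpha())

# Sample words chosen for illustration only.
print(letter_count("maison"))  # six letters
print(letter_count("école"))   # five letters; the accented "é" counts once
```

Running a chatbot's answer through a check like this is a quick way to catch the kind of slip described above.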
Google also claimed that Gemini beat OpenAI’s GPT-4 model in almost every test the two systems took. Yet in many cases the difference was only a couple of percentage points. GPT-4 had already been out for about nine months when Gemini launched, suggesting that Google’s progress is not as impressive as it might have seemed. It’s caught up to a nine-month-old AI tool, but we’d have hoped for a bit more than that.
This all implies there’s plenty of work for Google to do. Gemini has some impressive abilities, but it’s probably not the all-conquering AI that Google wants you to believe it is – at least, not yet.
When was Gemini released?
Gemini Pro is already out in the wild, as Google Bard has been updated to contain the tech. It has some limitations, though, as it only works with text prompts and is available solely in English. Both of those things will change soon, Google says.
Gemini Pro is also rolling out on December 13 to Google AI Studio and Google Cloud Vertex AI, which are tools for developers to prototype apps and manage data, respectively.
Gemini Ultra will take a little longer to reach the public, as Google says it is currently “completing extensive trust and safety checks” to ensure it is trustworthy and accurate. Since it’s the more powerful Gemini model, it might be more capable of creating dangerous content and misinformation, hence the need for more extensive testing.
Still, Google says it aims to add Gemini Ultra to Bard in 2024. It will be able to handle different modal types, from images to audio, and will “think more carefully before answering” tricky questions. This version will be called Bard Advanced.
As for Gemini Nano, that’s also available right now, albeit in a very limited way. Google issued a software update to the Pixel 8 Pro smartphone, which added Gemini Nano to the device’s capabilities. The company says it has added Gemini to the Smart Reply feature in its Gboard keyboard and to the Recorder app’s Summarize feature.
Looking beyond these Pixel 8 Pro features, Google says “the broader family of Gemini models will unlock new capabilities for the Assistant with Bard experience early next year on Pixel.” Keep your eyes peeled for updates there.
Is Google Gemini free?
Right now, we don’t know a huge amount about Gemini pricing, although we can take some cues from what has already been released. Gemini Pro in Google Bard is free and does not require any payment or credit system to use. Likewise, Gemini Nano came to the Pixel 8 Pro smartphone in a free update.
It’s possible that Google will charge for Gemini Ultra given its more powerful capabilities, in a similar way to how OpenAI charges $20 / £16 a month for access to ChatGPT Plus. There’s been no official word on this from Google, though, so for now it’s just speculation.
How do I use Google Gemini?
The way you use Google Gemini depends on the version you’re interested in and the product it has been woven into. The most obvious way to use it, though, is with Google Bard.
Here, you simply enter a prompt and wait for Bard’s response. You can ask for almost anything – the weather forecast, a request for Bard to create some poetry, help with your coding project, and more – although it has safeguards built in against illegal or harmful content.
If you have a Pixel 8 Pro phone, there are a couple of ways you can use Gemini Nano. The first is using the Gboard keyboard. In a WhatsApp conversation, you’ll see suggested replies appear underneath a message from a contact. You can then just tap the reply and it will be sent. This feature – called Smart Reply – is coming to other apps next year, Google says.
In the Recorder app on a Pixel 8 Pro, Gemini is able to summarize recorded conversations, presentations and more. It does this on-device, meaning it will work even without an internet connection.
We’ll have to wait to find out how Gemini Ultra works, but given that Google has positioned it as something designed for “highly complex tasks,” many of its applications might be aimed at researchers and industry users rather than the general public. That said, we know it’s coming to Google’s chatbot as Bard Advanced, so we’ll be able to try that out when it finally arrives.
Gemini vs GPT-4: what’s the difference?
While Gemini and GPT-4 are both large language models meant to underpin AI tools, they have their differences.
For one thing, Google says that Gemini is more advanced than GPT-4. In a blog post, Google showed results from eight text-based benchmarks, with Gemini winning in seven of those tests. Across 10 multimodal benchmarks, Gemini came out on top in every one, according to Google at least.
That would seem to imply that Gemini is the superior system, but it’s not quite so straightforward. GPT-4 came out in March 2023, so Gemini is essentially catching up to a nine-month-old AI tool. We don’t know how capable OpenAI’s next version of GPT will be, so it’s hard to say which is truly the better tool at the moment.
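The nine-month figure is simple date arithmetic. Here is a minimal Python sketch using the release months given in this article; the helper name `months_between` is our own, and the day of the month is set to 1 purely as a placeholder.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of the month."""
    return (end.year - start.year) * 12 + (end.month - start.month)

gpt4_release = date(2023, 3, 1)    # GPT-4: March 2023 (placeholder day)
gemini_launch = date(2023, 12, 1)  # Gemini: December 2023 (placeholder day)

gap = months_between(gpt4_release, gemini_launch)
print(gap)  # 9
```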
As well as that, Google only put Gemini Ultra up against GPT-4. That means we don’t know how well Gemini Pro and Nano can compete with GPT-4 right now, but given the often-slim margins between GPT-4 and Gemini Ultra, OpenAI’s model probably comes out ahead of Gemini Pro and Nano.