For much of last year, knocking OpenAI off its perch atop the tech industry looked all but impossible, as the company rode a riot of excitement and hype generated by a remarkable, garrulous, and occasionally unhinged program called ChatGPT.
Google DeepMind CEO Demis Hassabis has lately given Sam Altman at least some healthy competition, leading the development and deployment of an AI model that appears both as capable and as innovative as the one that powers OpenAI’s barnstorming bot.
Ever since Alphabet forged Google DeepMind by merging two of its AI-focused divisions last April, Hassabis has been responsible for corralling its scientists and engineers in order to counter both OpenAI’s remarkable rise and its collaboration with Microsoft, which is seen as a potential threat to Alphabet’s cash-cow search business.
Google researchers came up with several of the ideas that went into building ChatGPT, yet the company chose not to commercialize them due to misgivings about how they might misbehave or be misused. In recent months, Hassabis has overseen a dramatic shift in the pace of research and releases with the rapid development of Gemini, a “multimodal” AI model that already powers Google’s answer to ChatGPT and a growing number of Google products. Last week, just two months after Gemini was revealed, the company announced a quick-fire upgrade to the free version of the model, Gemini Pro 1.5, that is more powerful for its size and can analyze vast amounts of text, video, and audio at a time.
A similar boost to Alphabet’s most capable model, Gemini Ultra, would help give OpenAI another shove as companies race to develop and deliver ever more powerful and useful AI systems.
Hassabis spoke to WIRED senior writer Will Knight over Zoom from his home in London. This interview has been lightly edited for length and clarity.
WIRED: Gemini Pro 1.5 can take vastly more data as an input than its predecessor. It is also more powerful, for its size, thanks to an architecture called mixture of experts. Why do these things matter?
Demis Hassabis: You can now ingest a reasonable-sized short film. I can imagine that being super useful if there’s a topic you’re learning about and there’s a one-hour lecture, and you want to find a particular fact or when they did something. I think there’s going to be a lot of really cool use cases for that.
We invented mixture of experts—[Google DeepMind chief scientist] Jeff Dean did that—and we developed a new version. This new Pro version of Gemini, it’s not been tested extensively, but it has roughly the same performance as the largest of the previous generation of architecture. There’s nothing limiting us creating an Ultra-sized model with these innovations, and obviously that’s something we’re working on.
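[For readers curious about the mechanics: in rough outline, a mixture-of-experts layer replaces one monolithic feed-forward block with several smaller “expert” networks plus a learned gate that routes each token to only one or a few of them, so a model’s total parameter count can grow without every parameter running on every input. Below is a minimal, illustrative PyTorch sketch of top-1 routing; the class, names, and sizes are invented for this example, and Gemini’s actual implementation is not public.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfExperts(nn.Module):
    """Toy top-1 mixture-of-experts layer: a learned gate routes each
    token to a single small feed-forward 'expert', so only a fraction
    of the layer's parameters run for any given input."""

    def __init__(self, dim: int, num_experts: int = 4, hidden: int = 256):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # one routing score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights = F.softmax(self.gate(x), dim=-1)   # routing probabilities
        top_w, top_idx = weights.max(dim=-1)        # choose one expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                     # tokens assigned to expert i
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(1) * expert(x[mask])
        return out

# Route a batch of eight 64-dimensional token vectors through the layer.
layer = MixtureOfExperts(dim=64)
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

[The routing is what buys the efficiency Hassabis alludes to: per token, only one expert’s parameters are exercised, even though the layer as a whole holds several experts’ worth.]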
Over the past few years, increases in the computing power and data used to train AI models have driven amazing advances. Sam Altman is said to be looking to raise up to $7 trillion for more AI chips. Is vastly more computing power the thing that will unlock artificial general intelligence?
Was that a misquote? I heard someone say that maybe it was yen or something. Well, look, you do need scale; that’s why Nvidia is worth what it is today. That’s why Sam is trying to raise whatever the real number is. But I think we’re a little bit different to a lot of these other organizations in that we’ve always been fundamental research first. At Google Research and Brain and DeepMind, we’ve invented the majority of machine learning techniques we’re all using today, over the last 10 years of pioneering work. So that’s always been in our DNA, and we have quite a lot of senior research scientists that maybe other orgs don’t have. These other startups and even big companies have a high proportion of engineering to research science.
Are you saying this won’t be the only way that AI advances from here on?
My belief is, to get to AGI, you’re going to need probably several more innovations as well as the maximum scale. There’s no letup in the scaling; we’re not seeing an asymptote or anything. There are still gains to be made. So my view is you’ve got to push the existing techniques to see how far they go, but you’re not going to get new capabilities like planning or tool use or agent-like behavior just by scaling existing techniques. It’s not magically going to happen.
The other thing you need to explore is compute itself. Ideally you’d love to experiment on toy problems that take you a few days to train, but often you’ll find that things that work at a toy scale don’t hold at the mega scale. So there’s some sort of sweet spot where you can extrapolate maybe 10X in size.
Does that mean that the competition between AI companies going forward will increasingly be around tool use and agents—AI that does things rather than just chats? OpenAI is reportedly working on this.
Probably. We’ve been on that track for a long time; that’s our bread and butter really, agents, reinforcement learning, and planning, since the AlphaGo days. [AlphaGo, an AI program developed by DeepMind, made a historic breakthrough in 2016 by defeating a world champion at the complex board game Go.] We’re dusting off a lot of ideas, thinking of some kind of combination of AlphaGo capabilities built on top of these large models. Introspection and planning capabilities will help with things like hallucination, I think.
It’s sort of funny, if you say “Take more care” or “Lay out your reasoning,” sometimes the model does better. What’s going on there is you are priming it to sort of be a little bit more logical about its steps. But you’d rather that be a systematic thing that the system is doing.
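[As a concrete illustration of that priming effect, consider the sketch below; the call_model helper is hypothetical, standing in for whichever chat-completion API you use.]

```python
# Sketch of the priming effect described above. `call_model` is a
# hypothetical stand-in for a real chat-completion API; it is not an
# actual library function.

def call_model(prompt: str) -> str:
    """Placeholder for an LLM API call; swap in your actual client."""
    return "<model response>"

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Asked directly, models are often tempted by the intuitive but wrong $0.10.
direct = call_model(question)

# Priming the model to lay out its steps nudges it toward the correct
# chain of arithmetic (the right answer is $0.05).
primed = call_model(
    "Take more care. Lay out your reasoning step by step, then give a "
    "final answer.\n\n" + question
)
```

[Agent-like systems would, in effect, make this kind of deliberate step-by-step behavior systematic rather than something the user has to prompt for.]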
This definitely is a huge area. We’re investing a lot of time and energy into that area, and we think that it will be a step change in capabilities of these types of systems—when they start becoming more agent-like. We’re investing heavily in that direction, and I imagine others are as well.
Won’t this also make AI models more problematic or potentially dangerous?
I’ve always said, in safety meetings and at events, that the switch to agent-like systems will mark a significant transition. Today’s systems are mostly passive question-and-answer systems, but agent-like systems will be active learners. That obviously makes them more useful, because they can carry out tasks for you and actually complete them. But it also means we’ll have to be much more careful.
I’ve long advocated building robust simulation sandboxes to trial agents in before putting them out on the open web. There are many other proposals too, but I think the industry should start seriously thinking about the advent of these systems. It could be a few years away, maybe sooner. They’re a completely different class of system.
You’ve said before that it took longer to test your most powerful model, Gemini Ultra. Is that just down to the pace of development, or was the model itself more challenging?
It was both, actually. The bigger the model, the more involved some things, like fine-tuning, become, so they take longer. Bigger models also have more capabilities that need to be tested.
As Google DeepMind settles in as a single organization, our approach is to release things early and experiment with a small group of trusted testers. Their feedback helps us modify and improve things before we roll them out to the general public.
So, how are the discussions with government organizations like the UK AI Safety Institute going?
It’s going well, though I’m limited in what I can say, because a lot of the discussions are confidential. We give them access to our frontier models, and we work closely with them; they’ve even tested Ultra. A similar institute is being set up in the US. These are good outcomes of the Bletchley Park AI Safety Summit. They can assess things we may not have the security clearance to check ourselves, particularly CBRN [chemical, biological, radiological, and nuclear weapons] risks.
Today’s systems aren’t yet powerful enough to pose major concerns, but it’s important to prepare all the stakeholders now: government, industry, and academia. I think the next significant change will most likely come from agent systems. There will be incremental improvements along the way, and possibly breakthroughs, but those systems will feel different.