Mustafa Suleyman has once again found himself at the forefront of an artificial intelligence revolution.
As a cofounder of DeepMind, the UK-based company Google acquired in 2014, he played a crucial role in developing deep reinforcement learning, a technique that lets computers solve complex problems through practice guided by positive and negative feedback. DeepMind showcased the technique with AlphaGo, a superhuman Go-playing AI that defeated one of the world’s top Go players in 2016.
Now, Suleyman is promoting a new type of AI innovation.
As the CEO of Microsoft AI, Suleyman leads efforts to embed the same kind of AI technology that powers ChatGPT into Microsoft’s software, including Windows, the operating system that runs most of the world’s personal computers.
In a recent announcement, Microsoft revealed that its AI assistant, Copilot, now has a humanlike voice, the ability to see a user’s screen, and enhanced reasoning capabilities.
Suleyman says the effort is about getting users to fall in love with the PC again. He spoke with WIRED senior writer Will Knight from Redmond, Washington, via Microsoft Teams, of course. The conversation has been lightly edited for clarity.
Will Knight: What’s the new vision for Copilot?
Mustafa Suleyman: We’re truly at a remarkable turning point. AI companions can now see what we see, hear what we hear, and communicate with us the way we communicate with one another.
That’s a new kind of design material, one built around persistence, relationships, and emotions. I’m creating experiences that foster a lasting relationship with a companion.
You came to Microsoft from Inflection AI, where the focus was on building supportive, empathetic AI. It seems you’ve brought that approach into your current role.
I believed in AI’s ability to offer support even before my days at DeepMind. Emotional support was one of the first things I ever worked on: at 19, I launched a telephone counseling service.
That’s the fascinating thing about this technological era. A sustained interaction with an entity that truly understands you, one that offers coaching, encouragement, support, and education, will eventually feel less like interacting with a machine.
What’s the concept behind Copilot Vision, the experimental feature available for Pro users to explore?
The vision mode allows users to inquire, “What’s that thing over there [on my screen]?” or “Hold on, what’s that? What do you think about it? Is it interesting?”
There are countless small moments like that when you’re at your computer. Having an AI assistant that can see exactly what you’re looking at and talk with you about it in real time is incredible. It changes how you move through your digital life, removing the friction of having to type anything.
This resembles Recall, the contentious and currently opt-in Windows feature that tracks what users see on their screens.
Right now, the Copilot Vision tool doesn’t retain anything once you close your browser. Everything is completely erased. But I’m contemplating adding that kind of capability in the future, because a lot of people want it. Imagine being able to ask, “What was that picture I saw online the other day? What about that meme?” It seems worth exploring down the line.
For now, though, the Copilot Vision experience is ephemeral. We’ll need to experiment over time to figure out what truly makes sense for users.
What about the privacy risks when users share sensitive information with Copilot?
We do keep the logs generated from your conversations, and we store them securely, to the highest Microsoft security standards. That’s necessary because, naturally, you may want to refer back to your conversation history.
You’re also introducing Think Deeper, which will let Copilot tackle more complex problems. It’s based on OpenAI’s o1 model, codenamed Strawberry, correct?
Yes, it’s similar to Strawberry. We’ve adapted an OpenAI model for our consumer-oriented objectives, ensuring it aligns more closely with our AI companion concept.
What are the distinctions?
OpenAI’s model prioritizes mathematical and scientific problem-solving. We’ve focused instead on comparative analysis and other consumer-oriented tasks.
When you’re faced with a challenging problem or want to reason through a complex topic, a tool that lays out a side-by-side comparison or a broad analysis can be incredibly helpful.
Is the new version of Copilot already in use at Microsoft?
Absolutely, everyone is using it. We rolled it out for general use across the company just a few days ago, so everyone is playing with it and giving extensive feedback. Our feedback channels are incredibly busy right now. It’s great.
Many people remember Clippy, the animated assistant Microsoft built into Office years ago. Do people at Microsoft see any similarities?
I ran into Bill Gates recently, and he joked that we’ve misnamed the entire AI phenomenon; it ought to be called Clippy. I couldn’t help but respond, “Come on, really!”
That exchange speaks to the extraordinary vision of people like Bill. They can look far beyond the present, forecasting not merely two years into the future but two decades.
Do the new features represent progress toward what people call AI agents, AI that can perform useful tasks on a computer?
Indeed, they do. The first phase is AI processing the same information a human does: seeing what you see, hearing what you hear, consuming the same texts. The next step is AI developing a long-term, persistent memory, which builds a shared understanding over time. The final stage is AI engaging with external parties, sending commands and taking actions, whether that’s making purchases, booking reservations, or organizing schedules. We’re currently testing two such features in an experimental research and development phase.
Wait, you have an AI agent for Windows that can go off and buy things for you?
It’s a bit of a journey, but yes, we’ve managed to complete transactions. The challenge with this technology is reliability: getting it to work 50 or 60 percent of the time is easy; hitting the 90 percent mark takes significant work. I’ve seen impressive demonstrations where it autonomously makes a purchase, and I’ve seen moments where it really struggles to understand what it’s supposed to do.
Tell me more about one of those struggles. Did it end up purchasing a Lamborghini using Bill’s credit card?
If it had used Bill’s credit card, that would certainly be amusing! But as I said, we’re working through these challenges step by step. The technology is still very much experimental. There’s a considerable distance to cover, but I believe progress will be measured in quarters rather than years.
What is going to be the biggest challenge for you in turning the envisioned AI future into reality?
The primary challenge is building technology that users can trust, because this will be a highly personal, intimate experience. The security has to be robust, and privacy has to come first. Ultimately, the goal is to design interactions in which the AI can clearly express its limits and say when it isn’t prepared to engage on certain matters.
Achieving that would lay the groundwork for a trusted experience. Once it’s in place, we can move into more complex functionality: letting the AI make purchases on your behalf, negotiate terms, or enter into contracts for you. It could even plan a detailed Saturday itinerary with multiple stops, the kind of thing where you think, “I trust you, Copilot. You’ve got this, right?” That’s the direction we’re heading.