OpenAI Reveals Their Plans For 2025 In An Exclusive Interview...
- Sam Altman’s AMA on Reddit covered significant insights about AGI and hardware.
- The o1 model shows marked improvements over o1-preview, indicating a new paradigm.
- OpenAI emphasizes advancements in algorithms over hardware for AGI development.
- Hallucinations in AI models remain a challenge, affecting their reliability in critical domains.
- Future developments focus on agents and improving AI adaptability in a rapidly changing world.
So Sam Altman recently had an AMA on Reddit, and there were 10 significant things that I think you should all know. So let's not waste any time.
The first thing is titled Wait, what? Because this is a Wait, what moment. You can see here that someone asks, is artificial general intelligence achievable with known hardware, or will it take something entirely different? And I think this is arguably one of the most important questions you could ask.
When we look at AGI, there's a lot of things that people say when it comes to what we're going to need in terms of hardware requirements. Some people have said we're going to need biological hardware; some people have said we're going to need quantum computing to achieve it.
But Sam Altman clearly states that they believe it is achievable with current hardware. Now, it's important to note the wording here: he says they believe it is achievable, which isn't the same as saying it's 100% certain.
But the implications of this are significant, because it means they have a roadmap to AGI on their current hardware. They're not thinking, okay, we're going to need new hardware to make that a possibility.
Now, that doesn't mean we won't benefit from new hardware. I was thinking, okay, so AGI can be achieved with current hardware, but that doesn't mean better hardware can't make it even faster.
Think about what we've done with Transformers and how fast they are. And of course, as the hardware has improved and we've gotten more specialized AI chips, inference speeds have become completely insane.
So AGI being achievable with current hardware, I want to say I'm a little bit surprised, but somehow I'm not that surprised.
I guess this is just assuming the right algorithms and models are in place. And what this does imply is that the bottleneck for AGI isn't hardware but rather advancements in AI research, algorithm optimization, and perhaps data quality.
This means that, of course, OpenAI probably has seen some things and are like, you know what? Hardware isn't the issue. Maybe it's just our algorithms and maybe we just need to scale this up, fix a few issues here, and things are going to get really crazy.
But of course, once AGI is achieved with current hardware, AI-specialized hardware is probably going to speed things up 10 or 20 times. So the future is probably going to get pretty crazy a lot faster than we initially thought.
Number two was of course o1 versus o1-preview. I'm pretty sure all of you guys know by now what the o1 model is. This is the new paradigm of models that think before they talk.
Apparently someone asked the question, is the full o1 really a noticeable improvement over o1-preview? And the OpenAI VP of Engineering said yes.
For those of you who are currently using o1-preview, I'm guessing you'll really notice the difference when we do get full access to o1. And crazily enough, guys, a little side story for you all: o1 was accidentally released today. I actually got to use it. It's pretty cool.
Someone actually tweeted a link, and you were able to use o1 with images. It was pretty crazy. But apparently o1 is an insanely noticeable improvement over o1-preview.
If you have looked at the benchmarks, that is a true statement, because o1 is, I think, about 10 to 15% better across a range of different benchmarks.
So, in those areas where you think, ah, this model isn't that good, just trust me, it is good. And for those of you who are struggling with o1-preview or o1-mini, I would say just try your hardest to give the model as much relevant context as possible.
Not too much context, but if you have a problem you're reasoning through, give it the relevant context. That's going to help it solve the problem.
For example, if you've got a health issue, it's probably good to include your age, your lifestyle habits, your ethnicity, anything that could influence the answer.
And you’ll see that there are really nice reasoning jumps now.
Coming in at number three, they also talk about the future of scaling. Someone asks, how will o1 influence scaling LLMs? Will we continue scaling LLMs as per the scaling laws, or will inference-time compute scaling mean that smaller models with faster and longer inference will be the main focus?
This is where they say that we're not just moving to that paradigm; it's both paradigms. Kevin Weil, the OpenAI CPO, says it's not either/or, it's both: better base models plus more "strawberry" scaling of inference-time compute.
And having both makes sense, because when you think about it, right guys, if we say that humans are AGI in the sense that we're the base standard for human-level reasoning, we need to think about how we actually reason for a second.
One thing that I keep telling you guys is that humans don’t just have one chain of reasoning. We have system one thinking, which is, you know, the quick kind of thinking that is immediate. Like where I say, what's your favorite food? And you say pizza.
Then I say, okay, what's the fastest way to get to the healthiest food spot in town? Now we have to plan a journey. That's system two thinking, where you have to figure things out slowly and deliberately. And when you think about it, humans inherently have both, and we use different ones for different scenarios.
So I think that if we ever do get to AGI, we're going to need a miniature system that just decides, does this question need a quick answer or a long, deliberate one? Just like humans have.
Because we wouldn't route a hard question to our quick-thinking brain; we would route it to our slow, deliberate thinking. And that's exactly what these AI systems are going to do in the future.
That actually does make sense, because we can't expect everything to be inference-time reasoning, and we can't expect everything to be instant zero-shot responses either.
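To picture the "route the question to the right kind of thinking" idea, here's a toy sketch. The keyword heuristic and model names are placeholder assumptions purely for illustration; a real router would be far smarter than this.

```python
# Toy illustration of the "system 1 / system 2" routing idea: send quick,
# factual questions to a fast model and multi-step problems to a slower
# reasoning model. The heuristic and model names are placeholder assumptions.
from openai import OpenAI

client = OpenAI()

def needs_deliberation(question: str) -> bool:
    # Crude stand-in for a real router: look for signs of multi-step work.
    keywords = ("plan", "prove", "step by step", "optimize", "debug", "compare")
    return len(question) > 200 or any(k in question.lower() for k in keywords)

def answer(question: str) -> str:
    model = "o1-preview" if needs_deliberation(question) else "gpt-4o-mini"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(answer("What's your favorite food?"))                                     # fast path
print(answer("Plan the fastest route to the healthiest restaurant in town."))   # slow path
```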
Coming in at number four, we have the fact that hallucinations do persist.
Someone said thanks for the great work, love you, and so on, and then asked: are hallucinations going to be a permanent feature? Why is it that even o1-preview, when approaching the end of a chain of thought, hallucinates more and more?
And how will you handle old data, even data that's only two years old but is no longer true? Will you continuously retrain models or do some sort of garbage collection? It's a big issue for truthfulness.
And I'm not going to lie, this was one thing I genuinely hadn't thought about. One person said something really interesting: these models are essentially time capsules.
They're trained on data up to a certain date; they've got a knowledge cutoff at, say, April, May, or June of a given year, and of course after that they just don't have any more information.
Now of course you've got search and things like that. But it is a real issue to have old information that is no longer true, where you're trying to reason based on a paradigm that no longer exists.
And remember, we're in the AI era now, which means a lot of things are changing quickly. OpenAI's SVP of Research responds, saying we're putting a lot of focus on decreasing hallucinations, but it's fundamentally a hard problem.
Our models learn from human-written text, and humans sometimes confidently declare things they aren't sure about. Our models are improving at citing, which grounds their answers in trusted sources, and we believe that reinforcement learning will help with hallucinations as well.
When we can programmatically check whether models hallucinate, we can reward them for not doing so. Now, I think this is good news for those of you who don’t want AI to take your jobs, but bad news for those of you who want ChatGPT for certain applications.
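That line about programmatically checking for hallucinations and rewarding models is easy to picture with a toy example. To be clear, this is my own illustrative sketch of the general idea (check an answer against a trusted source, turn the check into a reward), not OpenAI's actual training setup.

```python
# Toy sketch of "programmatically checking" an answer against trusted
# sources and turning that into a reward signal. The checker and reward
# scheme here are illustrative assumptions, not OpenAI's real pipeline.
TRUSTED_FACTS = {
    "boiling point of water at sea level": "100 °C",
    "speed of light in vacuum": "299,792,458 m/s",
}

def grounded(claim_key: str, model_answer: str) -> bool:
    """Return True if the model's answer contains the trusted value."""
    expected = TRUSTED_FACTS.get(claim_key)
    return expected is not None and expected in model_answer

def reward(claim_key: str, model_answer: str) -> float:
    # +1 for a grounded answer, -1 for a confident-but-wrong one.
    return 1.0 if grounded(claim_key, model_answer) else -1.0

print(reward("boiling point of water at sea level", "Water boils at 100 °C at sea level."))  # 1.0
print(reward("boiling point of water at sea level", "Water boils at 90 °C at sea level."))   # -1.0
```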
One of the reasons that ChatGPT, GPT-4, whatever you want to call it, generative AI, isn't suitable in some applications is that if a mistake is made in certain fields, the fallout is ridiculous.
What you are placing on the line is too much. If a generative AI model makes a mistake when it comes to healthcare, someone could die, and we're not going to put lives at stake.
And I know, of course, you can look at the research and say, okay, these models are better than certain doctors at summarizing notes and diagnosing this and that, because they're able to look at all the data.
But because of how certain industries work, and because of the rules and regulations we have, it's quite unlikely that these models will see large, wide-scale commercial use in those fields until they reach a ridiculous level of reliability.
Now, it could just be a situation where someone develops the right software and then, boom, we're off to the races. That's going to be a huge moment, because then there's going to be wide-scale adoption.
But not with a hallucination rate of something like 3 to 5%. Imagine if 3 to 5 out of every 100 planes crashed; that just wouldn't be something you'd travel in.
We just honestly wouldn't. Or imagine if 3 to 5 out of every 100 engines exploded; that's just not something we're going to put in our cars.
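To put some rough numbers on that analogy: even a small per-call error rate compounds quickly when you use the system over and over. A quick back-of-the-envelope calculation, assuming independent calls:

```python
# Back-of-the-envelope math on why a 3-5% per-call error rate is a problem
# for safety-critical use: the chance of at least one failure compounds
# quickly over repeated calls.
for error_rate in (0.03, 0.05):
    for n_calls in (10, 100):
        p_at_least_one_failure = 1 - (1 - error_rate) ** n_calls
        print(f"error rate {error_rate:.0%}, {n_calls} calls -> "
              f"{p_at_least_one_failure:.0%} chance of at least one failure")
```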
So I think this is probably one of the biggest problems. And if OpenAI are saying that fundamentally this is a hard problem, it’s probably a hard problem.
Now of course we do have the next breakthrough. You can see right here that it says, what's the next breakthrough in the GPT line of products, and what's the expected timeline? He says we'll have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents.
So if you want to figure out where to go next and where to focus your attention, it seems like OpenAI are working on some incredible agents. We’ve already seen a bunch of different agents with Microsoft and Google's Project Astra.
But I think agents being the next breakthrough is going to be really interesting, because agents require reliability over a longer period of time, and it will be fascinating to see what OpenAI manages to do.
Usually they are at the frontier of this, and if they're at the frontier, they usually come out with something pretty crazy, and we have other companies spending two years or so trying to catch up to what they are doing.
So that's going to be really interesting because they said agents are the next giant breakthrough.
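If you've never looked at how an agent is wired up, here's a bare-bones sketch of the loop, just to show why long-horizon reliability matters: every step's output feeds the next step, so one bad turn can derail the whole task. The stub tool, the prompt format, and the model name are my own assumptions, not anything OpenAI has shipped.

```python
# A bare-bones agent loop, only to illustrate why agents need reliability
# over many steps. The tool, prompt, and model name are illustrative
# assumptions.
from openai import OpenAI

client = OpenAI()

def run_tool(command: str) -> str:
    # Stand-in tool: in a real agent this might search the web or run code.
    return f"(pretend result of running: {command})"

def run_agent(task: str, max_steps: int = 5) -> None:
    messages = [{
        "role": "user",
        "content": (
            f"Task: {task}\n"
            "Reply with either a single shell-style command to run next, "
            "or the word DONE followed by your final answer."
        ),
    }]
    for _ in range(max_steps):
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        content = reply.choices[0].message.content
        if content.strip().upper().startswith("DONE"):
            print(content)
            return
        messages.append({"role": "assistant", "content": content})
        messages.append({"role": "user", "content": f"Tool output: {run_tool(content)}"})
    print("Gave up after", max_steps, "steps")

run_agent("Find the cheapest flight from London to Tokyo next month.")
```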
Coming in at number six, we do have Surviving in the age of AI. This is good for those of you who are worried about post AGI economics.
Someone asked: regarding the future, if you were 15 today, what skills would you focus on to succeed? And one of the co-hosts says, being adaptable and learning how to learn is probably the most important thing.
I would agree because in an ever-changing world, you have to be adaptable. You can't be rooted in one belief or the other. You have to adapt to the environment. You know, as they say, adapt or die, and that is something that I truly believe.
Yes, you may have existing skills, but learning how to learn quickly and efficiently without wasting time and translating knowledge into a skill is going to be something that's really important as many different industries spring up and many older ones disappear.
Now one of the biggest things that most people were wondering was, of course, what did Ilya Sutskever see? This was a question that was trending on Twitter for quite some time.
What did Ilya see? What did Ilya see? What did Ilya see? That was literally all I saw in my timeline when I was scrolling on Twitter for, I think it was like the first three weeks after Reuters broke the news that there was some advanced AI that could end the world.
Sam Altman responded to this, saying he saw the transcendent future. Ilya is an incredible visionary and sees the future more clearly than almost anyone else. His early ideas, excitement, and vision were critical to so much of what we have done.
For example, he was one of the key initial explorers and champions for some of the early ideas that eventually became o1. So it's clear that Ilya Sutskever saw something far ahead, and, you know, he's now gone and started his own company.
And of course, I do wonder if he can get to Superintelligence before OpenAI. It will be interesting to see if they can do that. I think running your own company is very hard but considering the fact that they now have complete focus just to do that, I think they definitely have a shot.
They don't have to wait on any products; they don't have to abide by certain deadlines. All of their compute is going toward superintelligence, which, surprisingly, skips right past AGI.
But yeah, it's going to be a really interesting time, because if they come out and say, hey, we've done it, we built superintelligence, I think the world is going to change on that day.
So that’s what everyone’s racing towards, which is pretty crazy.
Now, coming in at number eight is something most people forgot about. I still haven't forgotten about it, because I think it's really cool.
Basically, there's Advanced Voice Mode with vision. Someone asked: any timeline on when we'll get Advanced Voice Mode with vision? Why is GPT-5 taking so long? What about the full o1?
Sam Altman says we’re prioritizing shipping 01 and its success. All of the models have gotten quite complex, and we can’t ship as many things in parallel as we’d like to.
We have a lot of limitations and hard decisions about how we basically allocate compute towards many great ideas, and we don’t have a date for Advanced Voice Mode with the Vision yet.
Basically, what they’re saying is that like look, advanced Voice mode is a cool feature, really amazing, but there’s not really a return on investment considering the fact that most people might not use it as much as we initially thought.
So we’re going to focus on shipping 01 because 01 is our frontier model. It’s much smarter; it can do a lot more and I’m guessing some of their big enterprise clients are going to be using those a lot more.
Especially considering that o2, o3, o4, or o5 is probably going to be bordering on AGI, maybe edging toward ASI.
And I know that sounds crazy, but if you look at how far we've come from GPT-1 to GPT-4, it's definitely pretty wild, and inference scaling sounds like it has a lot of room left, so I'll be intrigued to see what happens there.
As for Advanced Voice Mode with vision: if you don't remember what this is, there was a short demo with Be My Eyes. Be My Eyes is an application where, if you are visually impaired, you can take a picture of something.
Then other people act as your eyes and tell you exactly what you're looking at. Now, with Advanced Voice Mode with vision, instead of relying on random sighted people around the Internet to see for you, you get ChatGPT's Advanced Voice Mode, but with vision built in.
So it’s like a live stream. You’re basically on FaceTime with an AI and the AI is just watching what you’re watching. It’s pretty cool. It’s actually really innovative, and I'm actually glad that these are the kinds of applications that a lot of people are going to get out of AI.
A lot of people say AI is boring, and maybe some of it is. But this person was able to book a taxi: they just held the phone up, and Advanced Voice Mode said, hey, hold your hand out now, the taxi is coming now.
And this person was able to get the taxi. So a lot of people with disabilities are going to have, you know, a much easier life, all things considered, once this AI technology gets completely rolled out.
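For a rough idea of how vision plus language already works through the API today, here's a sketch that sends a single photo and asks the model to describe the scene. It's the still-image version only; the placeholder image URL is mine, and the live, FaceTime-style streaming experience from the demo isn't something this reproduces.

```python
# Rough sketch of sending an image to a vision-capable model and asking it
# to describe the scene, in the spirit of the Be My Eyes demo. The image
# URL is a placeholder.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is in front of me and anything I should watch out for."},
            {"type": "image_url", "image_url": {"url": "https://example.com/street-scene.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```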
Now for the next update, they were asked: when will you give us an update on a new text-to-image model? DALL-E 3 is kind of outdated. Sam Altman said, look, the next update is going to be worth the wait, but we don't have a release plan yet, which means it's not at the top of the agenda.
As they said before, the top of the agenda is o1. So I'm guessing the next thing we can expect to ship is, of course, o1.
And remember, they are focusing on agents right now, so that is probably why we haven't seen much from there yet.
They also stated that they're going to be working on a longer context window. Someone said, hello, I would like to ask when the token context window for GPT-4 will be increased. In my opinion, 32k, especially for longer coding or writing tasks, is way too small compared to other AI models out there.
They agreed: we're working on it. And I agree, a 32,000-token context window just isn't enough. A lot of the time I'm trying to write something long, and it just doesn't cut it.
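If you're bumping into that limit, one practical thing you can do today is count tokens before you send anything, for example with the tiktoken library. The 32k limit and the encoding name below are assumptions based on GPT-4-era models; check the limits for whichever model you're actually using.

```python
# Quick way to see whether a prompt will blow past a 32k context window
# before you send it, using the tiktoken tokenizer.
import tiktoken

CONTEXT_LIMIT = 32_000  # assumed limit; adjust for your model

def fits_in_context(text: str, limit: int = CONTEXT_LIMIT) -> bool:
    enc = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens} tokens (limit {limit})")
    return n_tokens <= limit

long_document = "def example():\n    pass\n" * 5_000
print(fits_in_context(long_document))
```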
So this is going to be something that's really cool. Now this last one here, coming in at number 11, is insane, because I forget about it from time to time, and then I get reminded and I'm like, I can't believe this still isn't out yet.
So, where will we get information about GPT-4o image and 3D model generation? They said soon, and they actually showed a screenshot.
We basically get a look at this real-time HTML editor. So it seems like this is going to be one of the GPT-4o features they ship first. If you're not familiar with what I'm talking about, GPT-4o has a bunch of advanced features that just weren't included at release.
Well, you might not know, but GPT-4o is actually omnimodal, basically meaning anything in, anything out.
That means audio, images, video, 3D models: absolutely anything in, anything out. With that, of course, some people are wondering: when are the 3D models coming? When will we be able to manipulate them?
But they actually showed us this, so it seems we're going to get a real-time HTML renderer where you can simply enter something, see it rendered in real time, and manipulate it.
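While we wait, here's a rough stand-in for that idea that you can run today: ask a model for a self-contained HTML page, write it to disk, and open it in your browser. OpenAI's actual feature presumably renders inside ChatGPT; the model name and prompt here are just my assumptions.

```python
# A rough approximation of the "real-time HTML editor" idea: ask a model
# for a self-contained HTML page and open it locally in your browser.
import webbrowser
from pathlib import Path

from openai import OpenAI

client = OpenAI()

prompt = (
    "Write a single self-contained HTML page with a bouncing-ball canvas "
    "animation. Return only the HTML, no explanation."
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# Strip markdown code fences if the model wrapped its answer in them.
html = response.choices[0].message.content.strip()
html = html.removeprefix("```html").removesuffix("```").strip()

out = Path("generated.html")
out.write_text(html, encoding="utf-8")
webbrowser.open(out.resolve().as_uri())
```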
So, that’ll be cool. But I’m not sure when that is going to be released.
Now if there’s anything that you guys wanted to discuss, leave a comment down below, and I'll see you guys on there.