OpenAI Podcast Revisited: The AI Coding War! Developers Are the Most Fortunate: Specialized Code Models Will Emerge! Host Leaks: "I Like Claude the Most!"

Compiled and organized by Yi Feng

The second OpenAI podcast is here! The lineup is quite impressive!

Guests are OpenAI Chief Research Officer Mark Chen and ChatGPT Lead Nick Turley. The host remains OpenAI researcher Andrew Mayne.

Speaking of Mark Chen: he has effectively been acting as OpenAI's CTO. Just these past two days, as Meta aggressively poached talent and the AI talent war in Silicon Valley raged, Mark was the first to step forward to steady the ship, sending a strongly worded memo on internal Slack:

"I now feel a strong, visceral sensation, as if someone broke into our home and stole something." "Please believe that we are by no means standing idly by."

Left: Nick Turley, Right: Mark Chen

There was a surprising moment in this podcast: asked about the strongest programming model, host Andrew Mayne actually said he liked Claude Sonnet… Is this podcast really that candid??

Back to the main topic, this podcast still has a lot of programming insights worth checking out:

🔹 OpenAI can be called a pioneer of AI programming: when GPT-3 first emerged, OpenAI discovered it could generate complete React components and recognized the potential and demand for AI programming tools.

🔹 Agentic programming should be asynchronous: you give the model a complex task, let it process in the background for a while, and it returns a near-optimal solution.

🔹 "Writing good code" still requires "taste": OpenAI is also training models to understand code style.

🔹 OpenAI's product development philosophy: start with the technology, observe who finds value in it, and then iterate around those users.

🔹 Codex is an AI tool for professional programmers; more general consumer products will be launched in the future.

🔹 Stop fixating on "disappearing jobs": typewriter repairmen did disappear, and some traditional development jobs will be replaced, but "what code can do" far exceeds imagination.

"The space for writing code is far vaster than you or I can imagine."

No Single Programming Model Will Dominate; "Agentic Programming" Has Great Potential

Host Andrew Mayne:

Among the many surprising capabilities, code has always been a very interesting direction. I remember when GPT-3 first came out, we suddenly found it could generate complete React components, and at that time, we realized it had practical use. After that, we specifically trained a model focused on code, which led to Codex and Code Interpreter. Now Codex is back in a new form, with the same name but stronger capabilities.

We have watched AI coding evolve from its initial integration into VS Code, to Cursor, and now to Windsurf, which I use frequently. Competition in the code domain is also intense; if you ask who has the strongest code model, you'll get different answers.

Mark Chen:

That's right, and it also reflects the fact that the term "programming" itself covers a very broad range. For example, writing code in an IDE, requiring function completion, is completely different from submitting a Pull Request and having AI complete the entire task—these are two very different "programming styles."

Host Andrew Mayne:

Could you elaborate on what "Agentic Programming" specifically means?

Mark Chen:

Certainly. You can compare it to "real-time response models," where the traditional ChatGPT approach is that you give a prompt, and it responds quickly. But Agentic programming is more like: you give the model a complex task, let it process in the background for a period, and then it returns a near-optimal solution.

We are increasingly convinced that the future will be this type of asynchronous interaction: you submit a complex request, and then let the model take time to think and reason, ultimately providing you with the best possible result. We are seeing this evolution in the code domain as well. In the future, you might simply describe the functionality you want to achieve, and the model will process it and return a complete solution.

Our first version of Codex embodied this direction: it received large, PR-level tasks, such as new features or broad bug fixes, expecting the model to genuinely take time to solve them, rather than just respond quickly.
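The asynchronous interaction Mark describes — submit a task, let the agent work in the background, collect the result later — can be sketched in a few lines. This is a minimal illustration of the pattern only; `run_agent_task` is a hypothetical stand-in, not the actual Codex API.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_agent_task(task: str) -> str:
    """Stand-in for a long-running agentic job (e.g. a PR-level fix).
    A real agent would plan, edit code, and run tests here."""
    time.sleep(0.1)  # stand-in for minutes of background reasoning
    return f"proposed patch for: {task}"

# Asynchronous interaction: submit the task, keep working, collect later.
with ThreadPoolExecutor() as pool:
    future = pool.submit(run_agent_task, "fix flaky login test")
    # ... the developer moves on to other work here ...
    result = future.result()  # blocks only when you come back for the answer

print(result)  # -> "proposed patch for: fix flaky login test"
```

The contrast with a chat model is the interface, not just the latency: the caller hands over a whole task and reclaims attention until the agent is done, rather than steering turn by turn.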

Nick Turley:

Yes, ultimately, "programming" covers such a broad scope that it can almost be compared to a grand term like "knowledge work." So I don't think there will be an absolute winner or best model; instead, there will be many options. Developers are actually the luckiest group of people; they now have access to a wide variety of tools. For us, "Agentic programming" is currently one of the most exciting directions.

When I develop products, I often use an evaluation criterion: "If the model gets twice as capable, does the product get twice as good?" In the past, ChatGPT indeed achieved this. But as models become smarter, users no longer just want to chat with a model like a PhD student; they start to care about the model's personality, practical abilities, and so on.

Codex represents an ideal product form: you define the task, the model spends time thinking, and then it produces results. This interaction method is particularly suitable for more powerful future models. Although it is still in the early research preview stage, just as we predicted with ChatGPT, we believe that obtaining feedback as early as possible is more meaningful, and we are very optimistic about the future.

Host Andrew Mayne:

I used Sonnet (Claude) a lot before, and I really liked it. Then I tried Windsurf with o4-mini on the medium setting, and I felt it was excellent: fast responses and a smooth experience. Everyone has their preferred model, and I understand that, but for the tasks I use it for, this was the first time I was truly satisfied.

Mark Chen:

When we release this, we actually hope users can experience it, and we also know there are many "low-hanging fruits" to explore in the code domain. Code is a key area of focus for us, and in the future, you will see more and more code models adapted to specific needs.

Host Andrew Mayne:

If I just want to quickly look up syntax or a specific way of writing something now, I'll directly ask GPT-4.1. But for more complex tasks, the situation is different. Evaluation methods are actually a bit saturated, and everyone's standards are different, so adapting to various needs is a challenge.

Mark Chen:

Especially in the code domain, it's not just about "can it generate the correct answer." Users also care about code style, the thoroughness of comments, whether the model proactively does extra things, like writing helper functions, etc. People's preferences vary greatly, and we need to do better in many aspects.

Nick Turley:

In the past, if you asked me which fields would be revolutionized by AI first, I would definitely say code. Because, similar to mathematics, it is verifiable and scorable, making it very suitable for reinforcement learning.
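Nick's point that code is "verifiable and scorable" is what makes it a natural fit for reinforcement learning: a candidate solution can be executed against unit tests and turned into a numeric reward. Here is a toy sketch of that idea; the function name and shape are illustrative, not any actual training pipeline.

```python
def score_candidate(code: str, tests: list[str]) -> float:
    """Toy RL-style reward for code: the fraction of unit tests
    a candidate solution passes. Purely illustrative."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # load the candidate's definitions
    except Exception:
        return 0.0  # code that doesn't even run scores zero
    passed = 0
    for test in tests:
        try:
            exec(test, namespace)
            passed += 1
        except Exception:
            pass
    return passed / len(tests)

# A correct and a buggy candidate for the same task:
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = ["assert add(2, 3) == 5", "assert add(-1, 1) == 0"]

print(score_candidate(good, tests))  # 1.0
print(score_candidate(bad, tests))   # 0.0
```

Mathematics has the same property — a proof either checks or it doesn't — which is why both domains have moved fastest.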

The "Taste" of Programming Remains Most Valuable

Nick Turley:

I still believe that, but what surprised me is that code still contains a lot of "taste" aspects. A person becomes a professional software engineer not because their IQ increased, but because they learned how to build software in a team—what kind of tests to write, how to write good documentation, how to respond when others disagree with your code…

These are the components required for "true software engineering," and we also need to teach models to understand these. So I believe AI's progress in code will be rapid; code remains a very suitable area for Agentic products, but "style," "taste," and "real development practices" must also be considered.

Host Andrew Mayne:

Another interesting point is that one of the challenges ChatGPT currently faces is finding a balance between "consumers" and "professional users." I tell my friends they can connect ChatGPT to a code model, and they find it amazing, for example, letting it control an IDE, automatically create folders, write documentation—all these things can actually be done.

Now there's image generation, Codex, and you can even use the models via GitHub. Sora is also integrated. You can see these functionalities gradually merging: so how do we differentiate between consumer-grade, professional-grade, and enterprise-grade features?

Nick Turley:

What we are building is very general technology, and it will be used by all kinds of people. Unlike many companies that usually start from a specific user type and use technology to solve that group's problems, we more often start from the technology, observe who can find value in it, and then iterate around those users.

Taking Codex as an example, our original intention was to build this product for professional software engineers. But we also know that this "spillover effect" will affect many other user groups, who can also benefit from it, and we will work hard to make it more convenient for them to use.

In fact, there are many opportunities for non-engineer users. I personally would very much like to help build a world where "everyone can code." Codex is not a mass-market product, but you can imagine such tools appearing in the future. However, generally speaking, it's difficult for us to accurately predict who the target users are from the outset; we must release these general technologies and then empirically observe where the value truly lies.

Mark Chen:

Yes, to delve a bit deeper, some users might primarily use ChatGPT to write code, but they might occasionally want to chat with the model or generate a beautiful image. So while different user personas exist, in practice, we find that everyone wants the model to have multiple capabilities.

Internal Use of Codex: Goal is to Increase Engineer Efficiency Tenfold

Host Andrew Mayne:

When Codex was launched, I had a very strong impression: some tools generate immense interest because the internal teams themselves have strong needs. Do you use this tool much internally?

Mark Chen:

Increasingly so.

Nick Turley:

I am very excited about the internal use of Codex. The application scenarios are very broad, for example, developers use it to reduce the burden of writing tests, and analysts use it to process log errors, automatically tag issues, and notify relevant personnel on Slack. Some even use it as an entry point for a "to-do list," sending future tasks they wish to complete to Codex in advance.

I believe this type of tool is very suitable for "dogfooding," and it indeed significantly improves development efficiency. Our goal is: without expanding headcount, to increase the output efficiency of existing engineers, ideally making one engineer ten times more efficient than before. From this perspective, internal usage is a very important indicator for the future direction of our products.

Mark Chen:

Yes, we wouldn't release a tool we wouldn't want to use ourselves. Before Codex went live, we had some super users internally who could write hundreds of Pull Requests daily with Codex, which shows that we genuinely gained immense value from it internally.

Nick Turley:

Moreover, internal use also serves as a "reality check": people are busy, and introducing new tools requires a certain "activation cost." You'll find that when we try to promote a certain tool, it takes time for everyone to adapt to new workflows, and this process is actually quite humbling. We can learn about the limitations of the technology and also see the practical resistance in the "adoption path."

Secret Weapons for Adapting to the AI Era: Curiosity and Proactivity

Host Andrew Mayne:

Indeed, when building these kinds of tools, team members need to spend time learning and adapting to them. There's a hot topic now: what skills will be needed in the future? What capabilities do you prioritize when building your team?

Nick Turley:

I've thought a lot about this question. Hiring is truly difficult, especially when you want to build a small, strong, humble, and highly execution-oriented team. The ability I value most is curiosity.

Many students ask me what to do in a rapidly changing world. My advice to them is: curiosity is key. We ourselves have many unknowns, and we must delve into research with a humble attitude to understand what is valuable and what is risky.

Especially in AI-related work, whether it's code or other areas, the real bottleneck is not getting answers, but asking the right questions. So I strongly believe we need to hire people who are curious about the world and what we do.

As for whether someone has an AI background, I don't really care that much. Of course, Mark might have a different view. But for me, the success of a product team largely depends on the strength of its curiosity.

Mark Chen:

I'm actually pretty similar. Even in research directions, there's less and less emphasis on needing an AI PhD. I myself joined as a "resident researcher" and didn't have any formal AI training background at the time.

What Nick mentioned is very true, and another important quality is "proactivity." OpenAI isn't a place where someone tells you every day: "Today you need to do these three things." Instead, you have to see a problem and proactively jump in to solve it.

And then there's adaptability. This industry changes very quickly, so you must be able to quickly judge what's important, when to adjust direction, and how to shift your work priorities.

Nick Turley:

Indeed, we are often asked, "How can OpenAI continuously release products?" Many people feel we release new things almost every week. But we ourselves never feel we're fast enough; on the contrary, we often feel we could be faster.

The real reason is that we have a large number of people with execution power; whether it's the product, research, or policy teams, everyone can drive projects forward. Although "release" means different things in different teams, we reduce unnecessary bureaucratic processes, except in very few areas that require strict procedures. Most of the time, we can move quickly, and this is the ideal type of talent we want to hire.

Host Andrew Mayne:

I joined the company because I got early access to GPT-3 and started making all sorts of demos, posting videos every week. Sometimes it might have been annoying (laughs), but I was genuinely very excited.

Mark Chen:

Not annoying, rather interesting.

Host Andrew Mayne:

That period was truly exciting. When I described it to others, I'd say it was like they built a UFO, and I got to play with it. For example, I saw it could "hover," and I thought: "Wow, they really made it fly!" I just pressed a button, but the power I felt was real.

I taught myself to code, watched many Udemy courses, and then, as part of the engineering team, I was encouraged to "go do something myself." While it wasn't core work, and I didn't break anything (laughs), that freedom made me feel empowered.

I think that spirit is still present today, and it's one reason why OpenAI can continue to advance its products. After all, GPT-4 was accomplished by 150 to 200 people working together.

I think people often forget that.

Nick Turley:

Completely agree. In fact, even ChatGPT was "pieced together" this way. Initially, the research team was working on "instruction following" related research, and subsequent models continued along this path, making them more suitable for dialogue through post-training. And ChatGPT's product prototype was actually a hackathon project.

I remember clearly, we asked everyone then: "Who wants to build a consumer-grade product?" And people from various backgrounds joined in—for example, someone from the supercomputing team said, "I've done iOS apps before, I'll do it"; researchers also came to write backend code. It was a convergence point for people who "wanted to get something done."

I believe that the organizational culture that allows this to happen is key to giving birth to the next ChatGPT, and it's something we need to maintain as our organization continues to expand.

Host Andrew Mayne:

We mentioned earlier that the team looks for "curiosity" when hiring, and Mark agrees. If I were someone outside the AI industry, say 25 or 50 years old, facing rapid technological development, I might feel a bit scared, especially with ChatGPT performing so well in writing and programming.

But I personally believe that there will never be enough "people who write code," because what code can achieve is far more than we can imagine.

Don't Worry About Being Replaced: The Space for Writing Code is Vaster Than You Can Imagine

Host Andrew Mayne:

So, what advice would you give to ordinary people? Regardless of their stage in life, how should they prepare, adapt, and participate in this AI future?

Mark Chen:

I think the most important thing is to realize that your abilities can be augmented—AI can make you more efficient and more impactful.

In the future, experts will still exist, but AI helps most those who do not possess advanced skills themselves. For example, if a model can provide better medical advice, it helps most those who cannot access high-quality medical resources.

Image generation is the same; it doesn't replace professional artists, but it lets people like Nick and me express ourselves creatively.

AI provides a "capability-enhancing infrastructure," allowing ordinary people to excel at many things they weren't good at originally.

Nick Turley:

The future will indeed change significantly. I think almost everyone will have a moment when AI does something they originally considered "sacred and uniquely human."

Host Andrew Mayne:

I know an investor who felt very threatened when his job content was replaced by AI.

Mark Chen:

That's right, some models are indeed better than me at writing code and solving problems now.

Nick Turley:

That's a very human reaction. When facing such powerful systems, we feel awe, and perhaps fear.

As Mark said, you need to use it yourself to truly understand it and reduce that fear.

Many of us grew up hearing about "artificial intelligence," but back then it was about algorithms, ad recommendations, or robots in movies; AI meant different things to different people. So, feeling scared is not surprising at all.

To have a constructive conversation with others, the best way is to personally use AI.

As for how to prepare for the future, I don't think it necessarily means delving into prompt engineering or AI programming principles.

Instead, what's more important are essential human abilities, such as "how to delegate tasks."

You will have an "agent in your pocket" that can be your mentor, advisor, engineer. The real key is: do you know yourself, know what problems you want to solve, and how others can help you?

As I mentioned before, curiosity is very important. The questions you can ask determine the results you get. In addition, the ability to continuously learn is also extremely crucial.

The faster you can learn new fields and new knowledge, the better you will adapt to this world with an unprecedented pace of change. I myself am prepared to accept that someday in the future, my current role as a product manager might disappear. But I also look forward to learning something new. If you have this mindset, you will be able to make good use of AI.

Host Andrew Mayne:

Sometimes we overemphasize the disappearance of certain jobs, like how we no longer need typewriter repairmen. Certain types of programming jobs will also disappear.

But I'll say it again: the space for writing code is far vaster than you can imagine.

You just mentioned the healthcare industry, and some people worry about AI replacing doctors. But I'd actually be happy for AI to diagnose me, perform surgery, and even handle many things.

But I still want someone to talk to me, explain the surgery process, and hold my hand. I take a lot of vitamins every day, but I can't go ask a doctor about these small things every day, right?
