Deep Dive | Interview with Character.AI CEO: The Best Applications Haven't Been Invented Yet, AI Field is Like Alchemy, Nobody Knows Exactly What Will Work

Image source: 20VC

If you have a very general technology that can be applied to billions of different scenarios, and ordinary people can use it, then it should be pushed to billions of people.

Instead of directly engaging in medical research or similar work, I have more leverage to advance AI technology, which can help solve many of the remaining problems.

What we want to build is something that is both usable and very general.

We can focus on making our product better while also advancing AI development, and these two aspects are complementary.

Noam Shazeer is the co-founder and CEO of Character.AI. Shazeer is a former member of the Google Brain team, where he built Google's spell correction system and core algorithms behind AdSense. Character.AI is a full-stack AI computing platform that offers people access to flexible super-intelligence.

Google Experience and Character.AI Introduction

Harry Stebbings: Welcome to 20VC, a show that interviews the world's best founders and investors. Today we're joined by Noam Shazeer, a top expert in the field of AI and NLP (Natural Language Processing). Noam is the co-founder and CEO of Character.AI, a full-stack AI computing platform designed to provide people with flexible super-intelligence.

Noam, I'm so excited to chat with you! I've heard so many good things about you from different people; Eric Schmidt, Sarah Wang, Prajit, among others, have all recommended you. Thank you so much for joining us today.

Noam Shazeer: Thank you. It's great to be here, Harry!

Harry Stebbings: I want to start with some background, as few people stay at a rapidly expanding company like Google for 20 years. First, I'd like to look back at how you joined Google. I heard your joining story is a bit special, can you tell me about the “spelling corrector” story?

Noam Shazeer: Yes, that was the first project I did at Google. Back then, Google used third-party software for spell correction, similar to what you might encounter in word processing software at the time. It was based on a manually compiled dictionary of about 50,000 words. If a query had a word not in the dictionary, it would say: “Did you mean…?” This method was very effective for spell correction.

But for web search, it was terrible because people searched for a wide variety of content, and many words simply weren't in the dictionary. For example, if you searched for “turbotax,” it would show “Did you mean ‘***’?” People gradually learned to ignore this prompt. So our first project was to investigate why people were dissatisfied when using Google, and spell correction became the biggest problem. I thought, OK, I'll help solve this problem. At the time, there was one person working on this, Paul Buchheit, who later did many excellent things, and today he is also one of our investors at Character.AI. He was preparing to take a few weeks off for winter break.

Harry Stebbings: Staying at Google for 20 years, you must have gained a lot. Can you tell me one or two of the most important lessons? How has 20 years impacted you?

Noam Shazeer: I think a big takeaway is that if you have a very general technology that can be applied to billions of different scenarios, and ordinary people can use it, then you should push it to billions of people.

I remember when I first joined Google, many people were working on an enterprise search appliance. While it was decent, perhaps it was generally believed at the time that B2B was the only way to make money. It turned out that the real bigger opportunity was B2C, serving something to everyone.

Harry Stebbings: How did that idea change your mindset? Does it mean pursuing bigger goals?

Noam Shazeer: Yes. Now I've founded Character.AI, and we are bringing large language model technology to consumers, directly to users. Here's a technology that is more flexible and easier to use than web search; you can use it to make friends, do homework, brainstorm, get ideas, or even do a thousand different things. We haven't even thought of the best use cases yet. It's incredibly easy to use; you just talk to it.

So it has two characteristics: first, it is very general; second, it is very easy to use. For me, this means it should be pushed to the whole world, for everyone in the world to use it. Whereas I think some other companies have taken a more B2B approach, first building a foundational model company, and then building industry-specific application companies on top of that. I'm really inspired by the Google model, from fundamental research to product launch, and then directly to consumers.

This is very interesting and empowering because engineers love building things, then launching them and letting everyone use them immediately. And it also allows you to co-design across the entire tech stack, which feels both powerful and fun.

Harry Stebbings: Can I ask, you just mentioned why you joined Google. Many people are influenced by past experiences. When you look back at your experiences, how do you view the things you once “avoided”?

Noam Shazeer: Well, why did I start working in AI? Partly because it's fun, and it's what I like to do anyway. After all, what's more fun than trying to get computers to do things they can't do yet? Another reason is to advance technology. There are many technical problems in the world that can be solved.

Approximately 15 million people die each year from aging, cancer, heart disease, and various other illnesses, for which we might find cures. So, instead of directly engaging in medical research or similar work, I believe I have more leverage to advance AI technology, which can help solve many of the remaining problems.

Character.AI's Vision, Growth, and Ethical Considerations

Harry Stebbings: So how do you view the company's mission, vision, and values? When you talk about the world's most serious problems, from climate change to wealth inequality to natural resource scarcity, how do you understand mission, vision, and values? I think if we're honest, many people misunderstand this.

Noam Shazeer: What are our mission and vision? I think we need to remain humble, because we are not the ones in control of this world; God is. We can't even control governments, nor can we control individual behavior.

Harry Stebbings: When you stand in front of the company and share, and now there are many people on the team, how do you express the mission to them?

Noam Shazeer: I like the slogan: “A billion users invent a billion use cases,” because that's the superpower of this technology. It puts our company in the right position; we can't fully guess the best uses for this technology. We've observed time and again that when you launch something, it's not what people wanted, and others discover better uses for it.

For example, we created a psychologist character, and you might want to chat with it to feel better. It saw some use, but we heard even more feedback like: "I'm talking to a video game character, and it's now become my new therapist, and it makes me feel better." We didn't anticipate that at all. It shows a huge demand for entertainment, companionship, and emotional support. We're not specialists in those areas; our job is just to launch something general and respect user choice, letting everyone use it freely.

Harry Stebbings: When you think about the astonishing growth you've experienced, sending 450 million messages daily, with 20 million users, what do you think are the two main factors driving this growth?

Noam Shazeer: One factor is that we actually launched, which had been the hurdle in the past; previously, large companies might have felt there was too much brand risk in launching something like this. A second is that we launched a general product, letting people find the use cases themselves. And a third is that there's huge demand in the world; for example, billions of people feel they need someone to talk to. Put those together: we give users a general tool, they have a need, and they will find it.

Harry Stebbings: I'm very interested in your thoughts and completely agree with your view on horizontal use cases. I'm fascinated by how people interact with it in different ways. Aren't you concerned that people might lose connection with others, and only when there's no one around them do they turn to talk to machines?

Noam Shazeer: Human connection is incredibly valuable and has moral worth. The last thing I want to do is make people lose connection with other humans; we want to help people build human connections. Many people who don't have friends or good social connections struggle with social anxiety. We've received many reports from people saying they used to feel uncomfortable talking to others, but by interacting with this system they can practice social interaction, which helps them become more confident in social settings.

Harry Stebbings: Or do you think it's actually just building a habit? You get used to talking to a non-human entity, and ultimately it will depend on the user. What is the most difficult product challenge you currently face? This product model is very complex, with many use cases and user needs. As a team, what do you consider your biggest challenge?

Noam Shazeer: The main thing we need to do is keep the product general. This way, we don't limit use cases and make it usable. Many people think generality and usability are opposing, and it's very difficult to be both flexible and easy to use. Like when we first talked to some potential product managers, they all said the same thing: Oh, pick your vertical, narrow it down to make it more usable. But we weren't going to hire those people. That's actually the opposite of what we want to do. We want to build something that is both usable and very general. So there's a dichotomy there.

Harry Stebbings: Maybe I'm being naive here, so I'm asking for your insight. The usual view is that the more specialized something is, the deeper and richer the conversational value it can provide. So how do you maintain sufficient quality across such a wide range of domains?

Noam Shazeer: Yes, that's the magic of neural language models. Previous systems were rule-based: very complex, with millions of hand-written rules, requiring knowledge of linguistics, psychology, and various other fields. With neural language models it's completely different: you can know almost nothing about language except that it's a sequence of words. It has nothing to do with understanding language itself, and there are no millions of rules; it's actually relatively simple, like a big black box. Everything boils down to a simple goal: you have a sequence of words, the beginning of a document. Guess what the next word is; give me a probability for each candidate next word. This problem is called language modeling: guessing the next word based on the preceding words. Around 2015, I started getting into this problem, and some Google colleagues were working on it too. They were asking: how good can we get?

I think this problem is fantastic because it's expressed very simply, and there's a huge amount of free training data: you can download text from the internet, just grab it. You have billions to trillions of training samples teaching you how to guess the next word. If you do it well, this system can talk to you; the better it does, the smarter it gets. It's very general, very useful, and very simple. Now we just need to do it well. People started building better and better neural networks, which were later rebranded as deep learning, a repackaging of neural networks, whose reputation had suffered from the days when the hardware wasn't good enough.
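The "guess the next word" objective Shazeer describes can be sketched with a toy bigram model. This is purely an illustration (not anything Character.AI uses); real systems replace the frequency table with a large neural network, but the objective is the same:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "text downloaded from the internet".
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each preceding word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Return P(next word | previous word) from raw counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# After "the", the model assigns 2/3 to "cat" and 1/3 to "mat".
print(next_word_probs("the"))
```

The "training data is free" point shows up directly: the only input is raw text, with the next word itself serving as the label.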

So people use deep learning for language modeling; bigger is better, more powerful is smarter. Around 2016, the most useful application was machine translation. It was smart enough to translate English into French or similar languages, which was very useful. It allowed everyone in the world to communicate with each other, but it wasn't smart enough to have interesting conversations, do homework, or handle other things. But there seemed to be a clear path: make this system smarter, bigger, better, and more intelligent, and it would gain these capabilities.

AI and Machine Learning Technology and Business

Harry Stebbings: Can I ask? You're talking about 2015, 2016, when the hype cycle was very different from now. I remember in 2015, 2016 we had a brief, super exciting chatbot phase, but there wasn't the sustained belief we have today that AI would fundamentally change how society operates. Is the current situation a result of technological progress, or have investors and society simply caught up with long-running technological developments?

Noam Shazeer: I think it's both. I believe technology has progressed a lot, both quantitatively and qualitatively. Models around 2016 were just too dumb. Before neural networks, chatbot technology meant rule-based systems, and those systems were very fragile, with no room for progress: you just needed ever more rules.

And there was no way to predict all possible situations, nor could those methods be widely applied, so they didn't work. But at the same time, we were making progress on neural network solutions, and those solutions are scalable. It took some time; around 2020, impressive results started appearing in labs, but they hadn't been launched yet. My co-founder Daniel De Freitas is a very smart, very driven person. Ever since he was a kid in Brazil, he's had one goal: to build an open-domain chatbot.

Harry Stebbings: You said the more you do, the better the results, the faster and more accurate the response. I always have a bit of a question, though not too worried, I just want to ask, is the size of the data more important, or the size of the model?

Noam Shazeer: Yes, perhaps the size of the model is the bigger challenge. We can get a lot of data, but actually the most important thing is how much computational power you use to train it. So you want to train a bigger model, and you want to train it longer. Both are important, but the real limiting factor is how many computational operations it takes to train it. Because if you make it bigger and train it longer, these two will compound, leading to very long training times. So people have been building increasingly powerful supercomputers to train these models.
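The compounding Shazeer describes (bigger model × longer training) is often captured by a standard rule of thumb from the scaling-law literature, not a figure from this interview: training a dense transformer costs roughly 6 FLOPs per parameter per token. The model and token counts below are hypothetical:

```python
# Rule of thumb (an outside approximation, not from the interview):
# training compute ~= 6 FLOPs per parameter per training token.
def train_flops(n_params, n_tokens):
    return 6 * n_params * n_tokens

# Hypothetical example: a 10B-parameter model trained on 200B tokens.
flops = train_flops(10e9, 200e9)
print(f"{flops:.1e} FLOPs")  # ~1.2e22

# Doubling both model size and training tokens quadruples the compute,
# which is the compounding that drives ever-bigger supercomputers.
assert train_flops(2 * 10e9, 2 * 200e9) == 4 * flops
```

This is why "how many computational operations" is the real limiting factor: data and model size multiply together into total compute.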

Harry Stebbings: What do you think is the biggest limitation facing your models currently?

Noam Shazeer: Basically, it's computational power. The model we are currently using was trained last summer, costing about $2 million in computing resources to complete the training. If we can get better hardware and spend more time training the model, we will do even better in the future. In 2016, you could train a model smart enough for language translation, but it wasn't smart enough to answer questions or be entertaining.

Harry Stebbings: From the perspective of models and data, how do you view proprietary data versus non-proprietary data? You just mentioned that in the early days, internet data could be downloaded. So for a company like Character.AI, a lot of proprietary data is generated in your conversations, as is the case for many vertical solutions in fields like healthcare and finance. So, what is the value of proprietary data? How does its ownership relate to whether data can always be downloaded by everyone?

Noam Shazeer: Data obtained from users is very good because it tells us what users like, overall or in specific applications. It's a bit like training a person: most of what matters is the decades of experience that train your brain on things not directly related to your task or profession, through which you gain a general understanding of the world and become smart enough. Then you can significantly enhance that with a small amount of training on the task at hand, and both contribute. We do have a lot of data flowing in from users, and obviously we are very careful not to violate anyone's privacy, but simply from aggregate data on how people use the service, we can learn how to improve it.

Harry Stebbings: I totally understand. Doing it well and ensuring compliance is important. You have to ensure compliance.

Noam Shazeer: This is very important, because if you just take the simple approach of taking every conversation people have and using that data to train models, then you might leak someone's private life. People communicate and confide in these chatbots, and you certainly don't want to share that content with everyone randomly.

Harry Stebbings: Why is Character.AI an independent company, and not part of Facebook? If you consider Facebook's natural expansion into the metaverse, as an extension of physical friendship groups into the metaverse or non-physical world, why does Character need to be an independent company, rather than part of Facebook, Snapchat, or other social platforms?

Noam Shazeer: My experience from Google is that startups can move faster than large companies, launching products in a quicker way, while large companies tend to be slowed down by concerns about impacting existing products.

Harry Stebbings: I completely understand the speed difference you're talking about. So, do you think startups will win in the next wave of AI innovation and entrepreneurship? Because many people on the show now are saying that this is no longer about companies like Facebook, but more like Microsoft and Adobe. If you had to pick a side, who would win, startups or traditional enterprises?

Noam Shazeer: I think ultimately the end users will win; users will have many choices. From a business perspective, I think there will be many winners: multiple large companies doing what they're good at, and startups doing what they're good at. We will try our best to grow our company from a startup into a large company. And for many individuals, universities, and so on, hardware is advancing so quickly that what only a large company could do a few years ago, you can do in a university lab or your own garage a few years later.

Business Strategy and AI Philosophy

Harry Stebbings: I completely understand and agree with the point about individuals and universities. I've had Yann LeCun on the show, and he was excellent. He talked about the future of open vs. closed, and why he's so supportive of open. Do you agree that open is a community-driven approach and mechanism, or do you think closed will ultimately win? Many people believe closed will win.

Noam Shazeer: I think there will be a large ecosystem of both open and closed, with people who don't share their "secret sauce" and people who share all of it. The ability to experiment at small scale will keep driving research releases, even if some large entities no longer release research. Of course, closed also has its economies of scale: if you serve thousands of people at once, you can batch requests far more efficiently than one person running a language model in their basement on their own equipment.

Harry Stebbings: Can I ask you a question? You've been at the center of this ecosystem for many years, you've done many interviews, and you always get asked the same questions and see the same headlines: AI will kill everyone, AI will replace all jobs, clickbait everywhere. What perception about AI do you most wish society would change?

Noam Shazeer: The best applications haven't even been invented yet; we are currently at a moment similar to the invention of electricity, or the invention of computers. We still don't know what the coolest things will be.

Harry Stebbings: Sometimes in conversation, your friends do some crazy things. Yes, they do some pretty cool stuff. Models can do the same, bringing in a bit of creativity through hallucinations. Is hallucination a feature or a bug?

Noam Shazeer: We consider them a feature, at least for now. Basically, our goal and strategy is to release something general and let people use it however they want. These models do hallucinate, and we say so publicly, so the first use cases to emerge will be the ones where hallucination is a feature. I'm happy to see entertainment, emotional support, and fun become the earliest use cases, and I'd be just as happy to see productivity among them; let these things evolve naturally based on what the technology excels at.

Harry Stebbings: If you consider Google, its role is to help people find information faster, better, and more efficiently.

Noam Shazeer: Yes.

Harry Stebbings: What do you think Character.AI will be like? Because as you said, a million people doing a million things, I say that respectfully, you know? Will we know?

Noam Shazeer: You can ask any company that sells very general tools the same question, like companies selling computers, mobile phones, or telephone services, or even electricity. What is electricity used for? Is it for entertainment? Is it for productivity?

Harry Stebbings: What's been the hardest aspect of building characters? Is there any element that particularly stands out?

Noam Shazeer: Yes, thankfully, everything has gone smoothly. My team is fantastic, all the best researchers and engineers in the industry.

Harry Stebbings: Do you enjoy the transition from founding CEO to scaling CEO?

Noam Shazeer: Yes, I still do a lot of technical work and leadership work now, which is very important. I will continue to serve as CEO because I want the company to make the right decisions. So, I don't measure what I do by how interesting it is, but by how useful it is. So, I am very happy to do what I can do.

Harry Stebbings: Can you elaborate on that for me? Sorry, I'm just interested: "I don't judge it by how interesting it is, but by how useful it is."

Noam Shazeer: Yes, it's not a question of “Is being a startup CEO more interesting than being an ML researcher at Google?” It's more like, “Hey, I want to advance this technology, what's most effective?”

Harry Stebbings: I guess I wonder whether the useful and the interesting sometimes coincide for you. I love what I do, I enjoy doing it, and it's also what's most useful.

Noam Shazeer: Many aspects of being a parent are amazing, super fun, but I think it has made me more religious. I decided to change my attitude from "Is what I'm doing now fun?" to "I should be grateful for the opportunity to do something important and meaningful," which is a big attitude shift that comes with growing up.

Harry Stebbings: If you could call yourself — the day before your first child was born, and give yourself one piece of advice, what would you say?

Noam Shazeer: Get some sleep first.

Harry Stebbings: Really? What do you know now that you would tell yourself: “You know what, I should tell myself…”

Noam Shazeer: Neural language models. Deep dive into neural language models.

Harry Stebbings: One example is, I just had one of the world's most famous hedge fund managers as a guest. He said the only important thing is my wife. Children are not important, I am not important. The only important thing is to take care of my wife. If I take care of her, she will take care of the children and she will take care of me.

Noam Shazeer: Yes, not everything in the world is your responsibility. You should understand what your responsibilities are and what are not your responsibilities, and I think this is very effective in marriage and parenting. I think religion also has a lot to say about this. For example, about the responsibilities you should and shouldn't take on, what you should do, what you should care about, and what things shouldn't worry you.

Harry Stebbings: I'd like to move into a quick final segment. I'll say a short phrase, and you give me your immediate thoughts. You might say, dude, what's going on? I came here to talk about AI.

Noam Shazeer: Oh no no, this is good. Luckily we recorded a lot of content, so you can cut out the crazy parts.

Harry Stebbings: So, what is true that others don't know, but you do?

Noam Shazeer: This technology will become increasingly intelligent. We are currently at a moment similar to the Wright brothers' first flight in the field of AI, driven by both better hardware and research progress. So, no matter how amazing the applications you see now, what will happen in the future might be completely incomparable.

Harry Stebbings: What do you think the timeline for the widespread adoption of this technology is? Is it one to three years? Or is it like the Industrial Revolution, which actually took 20 to 30 years to see the widespread adoption of mechanization and optimization?

Noam Shazeer: Things will develop very quickly; in the next 1 to 3 years, we will see many very cool things happen.

Harry Stebbings: What misconceptions do you think people have about Character.AI, and what do you hope they understand?

Noam Shazeer: From the outside, it looks like an entertainment application, but in fact we are a full-stack company: first and foremost an AI company, and also a product-first company. Having that positioning means choosing the single most important product priority, which is the quality of the AI. This way, we can focus on making our product better while also advancing AI development, and the two are complementary.

Harry Stebbings: Tell me, what's one aspect of the AI community you'd most like to change? I wish there were a transparent log where you could check the legitimacy of the authors of AI publications or blogs. A lot of people quickly proclaim themselves experts, and it's really hard to tell who is Yann LeCun, who is Yoshua Bengio, who is Noam Shazeer, and who has just pivoted over from Web3 and crypto and is now an "AI expert."

Noam Shazeer: Too much is being published now, making it hard to know what's good content. I think this has a lot to do with the current state of the field, which is akin to alchemy. No one knows exactly what will work.

So, many people try many different things, and with good intuition about machine learning combined with a mathematical understanding of the hardware, whether hardware you can buy or build yourself, you can come up with some effective solutions. That leads to successful approaches that people adopt, but it also comes with a lot of noise, because negative results aren't necessarily informative: something may have failed simply because someone made a mistake or there was a bug.

What's interesting are positive results verified by experiments. For example, if someone can say, “Hey, I did something and got better results on this famous problem,” that becomes very interesting. Then people start to think, why does this work? How can I adopt it?

Harry Stebbings: What did you once believe that later turned out to be wrong?

Noam Shazeer: When I first got into deep learning, around 2012, I had some early failures trying to do sparse computation. I thought surely I could build something better and more efficient with a sparse network. But I was completely wrong, because I didn't understand that the reason this field is so successful now is that we have hardware that is excellent at dense matrix multiplication. Using that hardware, you can be orders of magnitude faster than anything memory-bound.

When I first started with deep learning, no one could explain this to me. It wasn't until I understood this part that I realized: OK, we can do sparsity, but we have to build it out of these dense building blocks so it can run fast. Then I published the idea of mixture-of-experts models, or sparsely-gated mixture of experts, an idea that is only now starting to see widespread adoption, but that was back in 2016. Since then, I've had a string of successes, which I attribute to divine intervention but also to a quantitative understanding of hardware mechanics and the computational domain.
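The idea can be sketched in a few lines. This is a minimal single-input illustration with assumed shapes, not Shazeer's actual implementation: a learned router scores all experts, only the top-k run, and each selected expert is an ordinary dense matrix multiply, i.e. sparsity built out of dense building blocks:

```python
import numpy as np

# Sparsely-gated mixture of experts, minimal sketch (assumed toy shapes).
rng = np.random.default_rng(0)
d_model, d_hidden, n_experts, top_k = 8, 16, 4, 2

router = rng.normal(size=(d_model, n_experts))             # gating weights
experts = rng.normal(size=(n_experts, d_model, d_hidden))  # expert matrices

def moe_forward(x):
    logits = x @ router                   # score every expert
    top = np.argsort(logits)[-top_k:]     # indices of the top-k experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                  # softmax over selected experts only
    # Each selected expert is a dense matmul: the "dense building block".
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

out = moe_forward(rng.normal(size=d_model))
print(out.shape)  # (16,)
```

Total parameters scale with `n_experts`, but per-input compute scales only with `top_k`, which is how these models get big without proportionally more dense math per token.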

Harry Stebbings: What will Character.AI look like in 2033, 10 years from now?

Noam Shazeer: On Mars? I have no idea. We'll just have to wait and see what the technology looks like then, but for us, it's important to remain flexible. It's like if you asked a company in 1900 what it would look like in 2000; there would be such immense technological progress before then that it would be almost impossible to predict where any company would be.

Harry Stebbings: I feel like this interview was completely different from what you've done before. I feel these questions pushed the boundaries of parental roles, questions no one had asked you before. I'm so glad to have you, Noam. I hope you enjoyed the interview too.

Noam Shazeer: Very much enjoyed it!

Original Address: Noam Shazeer: How We Spent $2M to Train a Single AI Model and Grew Character.ai to 20M Users | E1055

https://www.youtube.com/watch?v=w149LommZ-U

Translation: Christine Liu

Please note that this article is translated from the original link at the end, and does not represent the stance of Z Potentials. If you have any thoughts or insights on this article, feel free to leave comments and interact.
