Jen-Hsun Huang's latest 10,000-word interview: AGI is coming, AI will revolutionize productivity


On October 4, NVIDIA CEO Jen-Hsun Huang appeared as a guest on the talk show Bg2 Pod for a wide-ranging conversation with hosts Brad Gerstner and Clark Tang.

They covered topics such as how to scale intelligence toward AGI, NVIDIA's competitive advantage, the importance of inference and training, future market dynamics in the AI space, the impact of AI on various industries, Elon Musk's Memphis supercluster and xAI, OpenAI, and more.

Jen-Hsun Huang emphasized the rapid evolution of AI technology, especially the breakthroughs on the path to artificial general intelligence (AGI). He stated that AGI assistants are coming soon in some form and will get better over time.

Jen-Hsun Huang also discussed NVIDIA's leadership in the computing revolution, noting that by lowering the cost of computing and innovating hardware architectures, NVIDIA has built a significant advantage in driving machine learning and AI applications. He specifically mentioned NVIDIA's "moat": a decade-long ecosystem of hardware and software that competitors cannot surpass with a single chip improvement.

In addition, Jen-Hsun Huang praised xAI and Musk's team for building the 100,000-GPU Memphis supercluster in just 19 days, calling it an "unprecedented" achievement. The cluster is undoubtedly one of the fastest supercomputers in the world and will play an important role in AI training and inference.

Talking about the impact of AI on productivity, Jen-Hsun Huang was optimistic that AI will greatly improve the efficiency of enterprises and bring more growth opportunities, and will not lead to mass unemployment. At the same time, he also called on the industry to strengthen its focus on AI security to ensure that the development and use of the technology benefits society.

The key points of the full text are summarized below:

  • (AGI assistants are coming) in some form soon... At first they will be very useful, but not perfect. Then, over time, they will become more and more perfect.
  • We have reduced the marginal cost of computing by a factor of 100,000 in 10 years. Our entire stack is growing, and our entire stack is innovating.
  • People think the reason for designing a better chip is that it has more FLOPS, more bits and bytes... But machine learning isn't just about software; it's about the entire data pipeline.
  • It's the machine learning flywheel that counts. You have to think about how to make this flywheel faster.
  • Simply having a powerful GPU does not guarantee a company's success in AI.
  • Musk has a unique understanding of engineering, building large systems, and marshaling resources... 100,000 GPUs as a single cluster... in 19 days.
  • AI won't change every job, but it will have a huge impact on the way people work. When companies use AI to improve productivity, it usually shows up as better earnings or growth.

The Evolution of AGI and AI Assistants

Brad Gerstner:

This year's theme is scaling intelligence to AGI. When we did this two years ago, it was the age of AI, two months before ChatGPT, which is incredible considering all the changes since. So I think we can start with a thought experiment and a prediction.

If I think of AGI colloquially as a personal assistant in my pocket, one that knows everything about me, has a perfect memory of me, can communicate with me, and can book a hotel for me or make a doctor's appointment for me, then, looking at the speed of change in today's world, when do you think we will have such personal assistants?

Jen-Hsun Huang:

Soon, in some form or another. And this assistant will get better and better as time goes on. That's the wonderful thing about the technology we know. So I think at first it will be very useful, but not perfect. Then, over time, it will become more and more perfect. Like all technology.

Brad Gerstner:

When we look at the pace of change, I think Musk said that the only thing that really matters is the pace of change. We do feel like the pace of change has dramatically accelerated, and it's the fastest pace of change we've ever seen on these issues because we've been poking around in AI for a decade, if not longer. Is this the fastest rate of change you've seen in your career?

Jen-Hsun Huang:

This is because we reinvented computing. A lot of this happened because we reduced the marginal cost of computing by a factor of 100,000 in 10 years; Moore's Law would have delivered about 100 times. We achieved this in a number of ways. First, we introduced accelerated computing, moving work that is inefficient on CPUs onto GPUs. We achieved it by inventing new numerical precisions, by inventing new architectures, inventing the Tensor Core, building NVLink into the system, plus very, very fast memory, and scaling up and working across the entire stack with NVLink. Basically, everything I've described about how NVIDIA does things has led to a rate of innovation beyond Moore's Law.
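To make the comparison concrete, here is a back-of-the-envelope calculation (not from the interview; the 18-month doubling period used for Moore's Law is an assumption) of the implied annual improvement rates:

```python
# Sanity check on the scaling claims above: a 100,000x drop in marginal cost
# over 10 years versus Moore's Law (assumed doubling every ~18 months,
# i.e. roughly 100x per decade). Illustrative arithmetic only.

years = 10
nvidia_factor = 100_000
moore_factor = 2 ** (years * 12 / 18)          # ~101x over a decade

nvidia_annual = nvidia_factor ** (1 / years)   # implied annual improvement
moore_annual = moore_factor ** (1 / years)

print(f"Moore's Law over {years} years: ~{moore_factor:.0f}x")
print(f"Annual rate: ~{nvidia_annual:.2f}x vs Moore ~{moore_annual:.2f}x")
# ~3.16x/year vs ~1.59x/year -- the gap compounds into a ~1,000x difference.
```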

Now what's really amazing is that since then, we've moved from manual programming to machine learning. The amazing thing about machine learning is that it learns very quickly; that's proven to be true. So when we reformulated the way we allocate computation, we did all kinds of parallelism: tensor parallelism, all kinds of pipeline parallelism. We became good at inventing new algorithms and new training methods on top of that, and all of these techniques, all of these inventions, stack on top of each other.
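As a rough illustration of the tensor parallelism Huang mentions, here is a minimal NumPy sketch (toy sizes, not NVIDIA code) in which one layer's weight matrix is split across hypothetical devices:

```python
import numpy as np

# Tensor parallelism in miniature: one layer's weight matrix is split
# column-wise across "devices", each computes its shard locally, and the
# shards are gathered back together.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # a batch of activations
W = rng.standard_normal((8, 16))       # the full weight matrix

num_devices = 4
shards = np.split(W, num_devices, axis=1)   # one column block per "device"
partial = [x @ w for w in shards]           # each device's local matmul
y = np.concatenate(partial, axis=1)         # all-gather of the outputs

assert np.allclose(y, x @ W)                # matches the unsharded result
```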

In retrospect, if you look at how Moore's Law worked, software was static. It was pre-compiled, like shrink-wrapped software put into a store. It was static, and the hardware underneath grew at the rate of Moore's Law. Now, our whole stack is growing, and the whole stack is innovating. So I think we're suddenly seeing scaling.

That's certainly remarkable. But we used to talk about pre-training models and scaling at that level, and how we doubled the size of the model and therefore doubled the data size accordingly. As a result, the amount of computing power required quadruples every year. That was a big deal. But now we're seeing scaling for post-training, and we're seeing scaling for inference. People used to think that pre-training was hard and inference was easy. Now everything is hard. That makes sense; the idea that all human thinking is one-shot is kind of ridiculous. So there has to be a concept of fast thinking, slow thinking, reasoning, reflection, iteration, and simulation. And now it's emerging.
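The "quadruples every year" remark can be sanity-checked with a common scaling-law heuristic: training compute is roughly proportional to model size times data size (C ≈ 6ND, a rule of thumb from the literature, not a figure given in the interview):

```python
# If training compute scales as C ~ 6 * N * D (parameters times tokens,
# a standard heuristic, not an interview figure), then doubling the model
# and doubling the data quadruples the compute requirement.

def training_flops(params, tokens):
    return 6 * params * tokens          # approximate FLOPs for one run

base = training_flops(1e9, 1e9)
doubled = training_flops(2e9, 2e9)      # double the model AND the data
print(doubled / base)                   # -> 4.0: compute demand quadruples
```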

NVIDIA's competitive moat

Clark Tang:

I think one of the most misunderstood things about NVIDIA is how deep the real NVIDIA moat goes. There's a perception that if someone invents a better chip, they've won. But the truth is, you've spent a decade building the full stack from GPU to CPU to networking, and especially the software and libraries that support the applications that run on NVIDIA. So, when you think about the moat NVIDIA has today, do you think the moat is bigger or smaller than it was three or four years ago?

Jen-Hsun Huang:

Well, I appreciate that you recognize how computing has changed. In fact, it was thought (and many people still think) that you design a better chip by giving it more FLOPS, more bits and bytes. Do you see what I mean? You see the slides from their keynotes, with all the FLOPS and bar charts and that sort of thing. These are great. I mean, look, horsepower does matter. It does. So these things fundamentally matter.

Unfortunately, however, that's old thinking. It's old thinking in the sense that software used to be some application running on Windows, and the software was static, right? That meant the best way to improve the system was to build faster and faster chips. But we realized that machine learning is not human programming. Machine learning isn't just about software; it's about the entire data pipeline. In fact, it's the flywheel of machine learning that matters. So how do you think about enabling this flywheel? On the one hand, it's about enabling data scientists and researchers to work efficiently inside it, and that flywheel starts at the very beginning. A lot of people don't even realize that it takes AI to manage the data needed to teach an AI, and that AI itself is pretty complex.

Brad Gerstner:

Is AI itself improving? Is it also accelerating? Again, when we think about competitive advantage, it's the combination of all of those.

Jen-Hsun Huang:

Exactly; it's the availability of smarter AI to manage data that drives this. We even now have synthetic data generation and all sorts of different ways of managing and presenting data. So you're doing a lot of data processing before you even train. People think, oh, PyTorch, that's the beginning of the world and the end of the world. It is very important.

But don't forget, there is work before PyTorch and after PyTorch. The flywheel is the thing you have to think about: how should I think about the whole flywheel, and how should I design a computing system, a computing architecture, that helps you run this flywheel as efficiently as possible? It's not the training of a single application. Does that make sense? Training is just one step. Every step of the flywheel is hard. So the first thing you should do is not think about how to make Excel faster or how to make Doom faster; that's in the past. Now you have to think about how to make this flywheel faster. There are a lot of different steps in this flywheel, and machine learning is not easy, as you know.

What the OpenAI or X or Gemini teams do is not easy; they think deeply about it. I mean, what they do is not easy. So we decided, look, this is what you should be thinking about: the whole process, and you want to accelerate every part of it. You want to respect Amdahl's Law, and Amdahl's Law suggests that if a step is 30% of the total time and I accelerate it by three times, I haven't really accelerated the whole process by much. Does that make sense? You really want to create a system that accelerates every step of the way, because only by doing the whole thing can you materially improve the cycle time, and that flywheel, that learning rate, is ultimately what leads to exponential growth.
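Huang's point here is Amdahl's Law. A minimal sketch of the arithmetic, using the 30% figure from his example:

```python
# Amdahl's Law: if only a fraction p of the pipeline is accelerated by a
# factor s, the overall speedup is bounded by the untouched remainder.

def amdahl_speedup(p, s):
    return 1.0 / ((1.0 - p) + p / s)

# Accelerating a step that is 30% of total time by 3x:
print(amdahl_speedup(0.30, 3.0))    # ~1.25x overall -- far from 3x
# Accelerating every step (p -> 1.0) is the only way to approach s:
print(amdahl_speedup(1.0, 3.0))     # 3.0x
```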

So, what I'm trying to say is that our view of what the company is really doing is reflected in the product. Notice I keep talking about this flywheel, the whole cycle. Yes, that's right. We accelerate everything.

Right now, a major focus is video. A lot of people are focused on physical AI and video processing. Imagine the front end: there are terabytes of data coming into the system every second. As an example, a pipeline receives all this data and first prepares it for training. Yes, so the whole process can be accelerated.

Clark Tang:

Today people only think about text models. Yes, but the future includes video models, and it includes using text models like o1 to process lots of data before we even get there.

Jen-Hsun Huang:

Yes. Language models will be involved in everything. The industry has spent a tremendous amount of technology and effort training these large language models, and now we use large language models at every step of the way. That's pretty remarkable.

Brad Gerstner:

What I hear you saying is that in a combined system, the advantage grows over time. So I hear you saying that we have a greater advantage today than we did three or four years ago because we're improving every component. That's the combination. When you think about, say, Intel as a business case study, it once had the dominant model, the dominant position in the stack, relative to where you are now. Maybe, to summarize briefly, compare your competitive advantage to the one they had at the peak of their cycle.

Jen-Hsun Huang:

Intel is different, because they were probably the first company to excel in manufacturing process engineering and manufacturing, meaning making chips. Designing chips, building them on the x86 architecture, and making faster and faster x86 chips was where their talent lay, and they blended that with manufacturing.

Our company is a little different. We recognized that, in fact, parallel processing does not require every transistor to perform well; serial processing requires every transistor to perform well. Parallel processing wants lots of transistors that are more cost-effective. I'd rather have 10 times more transistors that are 20% slower than 10 times fewer transistors that are 20% faster. Does that make sense? They want the opposite. So single-threaded performance, single-threaded processing, and parallel processing are very different. And we observed that, while their world would not keep getting better over time, our world really is getting better, and we want to do it as well as we possibly can.
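The trade-off reads cleanly as throughput arithmetic; the numbers below simply restate Huang's hypothetical:

```python
# Parallel throughput ~ transistor count x per-transistor speed
# (illustrative figures from the hypothetical above, not real chip specs).

slow_and_wide = 10 * 0.8     # 10x the transistors, each 20% slower -> 8.0x
fast_and_narrow = 0.1 * 1.2  # 10x fewer transistors, each 20% faster -> 0.12x
print(slow_and_wide / fast_and_narrow)   # ~67x in favor of the parallel design
```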

Parallel computing, parallel processing is hard because each algorithm requires a different way of refactoring and re-architecting. What people don't realize is that you can have three different CPUs, each with its own C compiler, and compile software down to each instruction set.

That's not possible in accelerated computing. The company that comes up with the architecture has to come up with its own OpenGL. That's how we revolutionized deep learning: we have a domain-specific library called cuDNN (the deep neural network library), a domain-specific library called cuOpt (for optimization), a domain-specific library called cuQuantum.

Brad Gerstner:

For the domain-specific algorithms that sit below the PyTorch layer that everyone focuses on. Like I hear all the time.

Jen-Hsun Huang:

If we hadn't invented them, none of the applications on top would work. Do you understand what I'm saying? It's the algorithms that NVIDIA is really good at, the fusion of the science on top of the underlying architecture. That's what we're really good at.

NVIDIA is building a complete AI computing platform, including hardware, software and an ecosystem

Clark Tang:

All the attention is focused on inference now. But I remember, two years ago, when I had dinner with you and Brad, I asked you a question: do you think your moat will be as strong in inference as it is in training?

Jen-Hsun Huang:

I'm not sure I said it would be stronger.

Clark Tang:

You just mentioned a lot of these elements, the composability between the two. It's very important for customers to be able to maintain flexibility between them. But, since we're in the age of inference now, can you talk about that?

Jen-Hsun Huang:

Training is inference at scale. I mean, you're right. If you train properly, then there's a good chance you'll inference properly, and if you built it on this architecture, then without any extra thought it will run on this architecture. You can still go and optimize it for other architectures, but at least, because it was built on NVIDIA, it will run on NVIDIA.

Now, the other aspect is of course the capital investment aspect: when you train a new model, you want to train it on your best new equipment. That leaves behind the equipment you used yesterday, which is perfect for inference. So there's a trail of free equipment behind the new infrastructure, and it's all compatible. We're very disciplined about always staying compatible, so that everything we leave behind continues to be excellent.

Now, we also put a lot of effort into constantly inventing new algorithms, so that when the time comes, the Hopper architecture is two, three, four times better than when customers purchased it, and that infrastructure remains really effective. So all the work we do improving new algorithms and new frameworks helps every installed base we have: Hopper is better for it, Ampere is better for it, even Volta is better for it.

I think Sam Altman just told me that they recently decommissioned OpenAI's Volta infrastructure. So we leave behind this trail of installed base, just as every computing installed base matters. NVIDIA is involved in every cloud, on-premises, and at the edge.

VILA visual language models can be created in the cloud and, without modification, work perfectly at the edge on a robot. They're all compatible. So I think architectural compatibility is very important for large devices, just as it was for the iPhone and other devices. I think the installed base is very important for inference.

Jen-Hsun Huang:

But what really benefits us is that, because we're trying to train these large language models on new architectures, we're able to think about how to create architectures that will also perform well at inference when the time comes someday. So we've been thinking about iterative models for reasoning and how to create very interactive inference experiences for them, for your personal agent. You don't want to finish speaking and have it go away and think for a while; you want it to interact with you very quickly. So how do we create something like that?

NVLink lets us deploy these systems that are perfect for training, but when you're done with them, their inference performance is also excellent. You want to optimize the time to first token, and time to first token is actually very hard to achieve, because it requires a lot of bandwidth. And if your context is also large, then you need a lot of FLOPS. So you need an enormous amount of bandwidth and, at the same time, an enormous amount of FLOPS in order to achieve a response time of a few milliseconds. That architecture is really hard to build. We invented Grace Blackwell NVLink for that.
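A rough model of why time to first token stresses both resources at once; every number below is an assumed, illustrative spec, not an NVIDIA figure:

```python
# Time to first token: the prefill over a long context is compute-bound,
# while streaming the weights each step is bandwidth-bound. All numbers
# here are hypothetical, chosen only to show the two terms.

params = 70e9                 # assumed 70B-parameter model
context_tokens = 32_000       # assumed long prompt
flops_needed = 2 * params * context_tokens   # ~2*N FLOPs/token heuristic
bytes_needed = 2 * params                    # fp16 weights read at least once

gpu_flops = 1e15              # assumed 1 PFLOP/s of usable compute
gpu_bandwidth = 3e12          # assumed 3 TB/s of memory bandwidth

compute_time = flops_needed / gpu_flops      # prefill compute time
memory_time = bytes_needed / gpu_bandwidth   # weight-streaming time
print(f"compute-bound: {compute_time:.2f}s, bandwidth-bound: {memory_time:.3f}s")
# Both terms must shrink together to reach millisecond-class first tokens.
```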

Brad Gerstner:

I had dinner with Andy Jassy (Amazon's president and CEO) earlier this week, and Andy said, we have Trainium and Inferentia coming. But then he said, NVIDIA is a key partner for us and will continue to be a key partner for us; as far as I can see, the world will run on NVIDIA in the future.

So when you think about the custom ASICs being built, each targets a specific application: maybe it's Meta's inference accelerator, maybe it's Amazon's Trainium, or Google's TPU. And then you think about the supply shortages you're facing today. Do those factors change that dynamic, or do they complement the systems they're buying from you?

Jen-Hsun Huang:

We're just doing different things. Yes, we're trying to accomplish different things. NVIDIA is trying to build a computing platform for this new world: this machine learning world, this generative AI world, this agentic AI world. One of the most profound things in computing is that after 60 years, we've reinvented the entire computing stack: from programming to machine learning, from CPUs to GPUs, from software to AI, from software tools to AI tools. Every aspect of the computing stack and the technology stack has changed.

What we want to do is create a computing platform that is available everywhere. That's really the complexity of what we're doing: if you think about it, you realize we're building an entire AI infrastructure, and we think of it as one computer. As I said before, the data center is now the unit of computing. When I think about a computer, I'm not thinking about a chip; I'm thinking about this thing, all the software, all the orchestration, all the machinery inside it. That is my computer.

We try to build a new one every year. Yes, it's crazy; no one has ever done that before. We try to build a brand-new one every year, and every year we deliver two to three times the performance. So every year we reduce the cost by two to three times, and every year we improve energy efficiency by two to three times. So we ask our customers not to buy everything at once, but just a little each year, right? The reason is that we want their costs to be averaged over time. Everything is architecturally compatible, so building these things individually at the pace we're moving would be very difficult.

Now, the doubly hard part is that we take all of that and, rather than selling it as infrastructure or as a service, we disaggregate it. We integrate it into GCP, AWS, Azure, X. Everyone's integration is different. We have to integrate all of our architectural libraries, all of our algorithms, and all of our frameworks into their frameworks. We integrate our security system into their system, we integrate our networking into their system, right? And then we basically do ten integrations, and we do that every year. That's the miracle.

Brad Gerstner:

I mean, you try to do this every year, which is crazy. What drives you to do this every year?

Jen-Hsun Huang:

Yeah, it's crazy when you break it down systematically. The more you break it down, the more anyone who breaks it down is surprised. How can the entire electronics ecosystem today commit to working with us to ultimately build this cube of a computer that integrates into all of these different ecosystems, with coordination this seamless? Clearly, what we propagate backward are APIs, methodologies, business processes, and design rules, and what we propagate forward are methodologies, architectures, and APIs.

Brad Gerstner:

That's what they were supposed to be.

Jen-Hsun Huang:

They've been refined for decades, yes, and they evolve as we do. But these APIs have to be integrated.

Clark Tang:

All someone has to do is call the OpenAI API and it works. That's it.

Jen-Hsun Huang:

Yeah. Yeah, it's a little crazy. It's a whole; we invented this massive computing infrastructure that the whole planet works with. It blends in everywhere. You can sell it through Dell, you can sell it through HP. It's hosted in the cloud. It's everywhere and nowhere. People are using it in robotic systems, robots and humanoid robots, in self-driving cars. They're all architecturally compatible. Pretty crazy.

Brad Gerstner:

This is crazy.

Jen-Hsun Huang:

I don't want to leave the impression that I didn't answer the question. In fact, I did answer it, by first laying out the foundation, the way of thinking. We're just doing something different. As a company, we want to be well informed; I'm very knowledgeable about everything around the company and the ecosystem, right?

I know what everyone else is doing. Sometimes it works against us, sometimes it doesn't. I'm very aware of that, but it doesn't change the goal of the company. The company's only goal is to build a platform architecture that can be used everywhere. That is our goal.

We're not trying to take share from anyone. NVIDIA is a market maker, not a share taker. If you looked at our internal slides, you'd see that this company doesn't spend a single day talking about market share. All we talk about is how we create the next thing.

What is the next problem we can tackle in this flywheel? How can we serve people better? How can we take a flywheel that used to take about a year down to about a month? Yes. And what is the speed of light for that?

So we're thinking about all these different things. We know everything about everyone, but we're sure that our mission is very unique. The only question is whether that mission is necessary. Does that make sense? All companies, all great companies, should have this at their core: what are you doing?

Of course. The only questions are: is it necessary? Does it have value? Does it have impact? Does it help people? I'm sure if you're a developer, a generative AI startup, deciding how you want to become a company...

...one choice you don't have to make is which ASIC to support. If you support CUDA, you can go anywhere. You can always change your mind later. We are the on-ramp to the AI world, aren't we?

Once you decide to join our platform, you can postpone all the other decisions. You can always build your own ASIC later. We're not against it; we don't get upset about it. When we work with all the clouds, GCP, Azure, we show them our roadmap years in advance.

They don't show us their chip roadmaps, and that never offends us. Does that make sense? If you have a singular purpose, your goal makes sense, and your mission is precious to you and to others, then you can be transparent. Note that my roadmap is shown openly at GTC, and it goes even deeper for our friends at Azure, AWS, and others. We have no problem doing any of this, even as they build their own chips.

Brad Gerstner:

I think when people look at the business, you recently said that demand for Blackwell is insane, and that one of the hardest parts of the job is saying "no" to people when the world lacks the computing power you can produce and deliver. But the critics say, wait a minute, this is like Cisco in 2000, when we overbuilt fiber; it's going to be boom and bust. I think back to when we had dinner in early '23. At that dinner in January '23, the forecast was that NVIDIA's revenue in 2023 would be $26 billion. You did $60 billion.

Jen-Hsun Huang:

Just let the facts be known: that was the biggest forecasting failure in history. Right. We can at least admit it.

GPUs play an increasingly important role in AI computing

Brad Gerstner:

That was something we were very excited about in November '22, because we had people like Mustafa from Inflection and people from Character coming to our office talking about investing in their companies, and they'd say, if you can't invest in our company, then buy NVIDIA, because everyone in the world is trying to get NVIDIA chips to build these applications that are going to change the world. Of course, the Cambrian moment came with ChatGPT. Nonetheless, the 25 analysts covering the stock were so focused on the crypto winter that they couldn't imagine what was happening in the world. So it ended up being much bigger. In very plain English, demand for Blackwell is insane, and it will stay that way for as far as you can foresee. Of course, the future is unknown and unknowable. But why are the critics wrong in thinking this will be overbuilt like Cisco in 2000?

Jen-Hsun Huang:

The best way to think about the future is from first principles, right? So, what are the first principles of what we're doing? Number one: we're reinventing computing, aren't we? We just said that the future of computing will be heavily machine-learned. Almost everything we do, almost every application, Word, Excel, PowerPoint, Photoshop, Premiere, AutoCAD, your favorite application, was hand-designed. I promise you, in the future it will be heavily machine-learned, right? So all of these tools will be, and on top of that, you'll have machines, agents, helping you use them. So now we know that's true, right? We've reinvented computing. We're not going back. The entire computing technology stack is being reinvented. Okay. So software will be different, what software can do will be different, and the way we use software will be different. Let's acknowledge that. Those are my basic truths now.

The question now is: what happens? Let's look back at the computing of the past. There's $1 trillion of computing installed from the past. Just open the door, look at the data centers, and ask: are these the computers you want for the future? The answer is no. You have all these CPUs in there, and we know what they can and cannot do. What we know is that we have $1 trillion of data centers that we need to modernize. So right now, as we speak, we're going to modernize these old things over the next four or five years. That's not unreasonable.

So we have one trend: talk to the people who have to modernize it; yes, they're modernizing it with GPUs. That's one.

Let's do another test. You have $50 billion in capital expenditures. Would you like to spend it on Option A, building capex for the future, right?

Or Option B, building capex the way you did in the past? You already have the capex of the past, right? It's sitting right there, and it's not getting much better anyway. Moore's Law is basically over. So why rebuild it?

We'd take that $50 billion and put it into generative AI, right? So now your company is better. Right? Now, how much of that $50 billion would you put in? Well, I would put 100% of the $50 billion in, because I already have four years of past infrastructure behind me.

So now I'm just reasoning from the perspective of somebody thinking from first principles, and that's what they're doing. Smart people doing smart things. Now, the second part is this: we have a trillion dollars' worth of capacity to go build.

Trillions of dollars' worth of infrastructure, about $150 billion of it so far. So we have a trillion dollars of infrastructure that needs to be built over the next four or five years. The second thing we observed is that software is written differently, and software is used differently.

In the future, we'll have agents. Our companies will have digital employees. In your inbox today you see little dots with people's faces; in the future, you'll see little icons of AIs. Right? And I'll send work to them.

I won't be programming computers in C++ anymore; I'll be programming AIs with prompts. Right? That's no different from how I talked to my team this morning.

I wrote a lot of emails before I came here. I was certainly prompting my team. I described the background, the basic constraints I knew about, and their tasks. I left enough space, and gave enough direction, that they understood what I needed. I was as clear as possible about the outcome, but I left enough room for ambiguity, a little room for creativity, so they could surprise me.
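Huang's email recipe maps naturally onto a prompt template. A hypothetical sketch, with illustrative placeholder content:

```python
# Context, constraints, task, desired outcome, room for creativity --
# the structure and the example values below are purely illustrative,
# not a format prescribed in the interview.

prompt = """
Context: {background}
Known constraints: {constraints}
Your task: {task}
Desired outcome: {outcome}
You may depart from these directions if you find a better approach.
""".format(
    background="Q3 launch slipped two weeks",
    constraints="no new headcount; existing CI pipeline only",
    task="draft a recovery plan",
    outcome="a one-page plan with dates and owners",
)
print(prompt)
```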

Right? It's no different from how I prompt AI today. Yes, that's exactly how I prompt AI. So on top of the infrastructure we're modernizing, there will be a new infrastructure. This new infrastructure will be AI factories operating these digital humans, and they will run 24/7.

We'll have them for every company around the world. We'll have them in our factories and in our autonomous systems, right? So there's a whole layer of computing architecture, a whole layer I call the AI factory, that the world has to build but that simply doesn't exist today.

So the question is, how big is this? It's not known yet. It could be in the trillions of dollars. But the wonderful thing, as we sit here and build it, is that the architecture of this modernized new data center and the architecture of the AI factory are the same. That's a good thing.

Brad Gerstner:

So to make it clear: you have a trillion dollars of old stuff that has to be modernized, and at least a trillion dollars of new AI workloads coming. You'll do about $125 billion in revenue this year. You were once told that this company's market capitalization would never exceed $1 billion. As you sit here today, if you're only at $125 billion of a multi-trillion-dollar TAM, is there any reason why your future revenue won't be 2x or 3x what it is now? There isn't, right?

Jen-Hsun Huang:

As you know, not everything is like that. A company is only limited by the size of its fish pond; a goldfish pond can only be so big. So the question is, what is our fish pond? That takes a lot of imagination, and that's why market makers think about the future and create a new fish pond. It's hard to do that by looking backward and trying to take share. Right. Share takers can only grow so big. Market makers can be very big.

So I think the good fortune our company has had is that from the very beginning we had to create the market in order to swim in it. People didn't realize it at the time, but we were there at the start of creating the 3D gaming PC market; we essentially invented that market, along with the ecosystems around it, the graphics card ecosystem. So inventing a new market in order to later serve it is very comfortable for us.

Jen-Hsun Huang: I'm happy for OpenAI's success

Brad Gerstner:

As you know, OpenAI raised $6.5 billion this week at a $150 billion valuation. We're all in.

Jen-Hsun Huang:

Yeah, really happy for them, really glad they came together. Yeah, they did a great job and the team did a great job.

Brad Gerstner:

They're reportedly going to have revenue of about $5 billion this year, and possibly $10 billion next year. If you look at the business today, it has about twice the revenue Google had at the time of its IPO. They have 250 million, yes, 250 million average weekly users, which we estimate is twice what Google had at its IPO. And if you believe it will do $10 billion next year, it's trading at about 15 times forward revenue, similar to Google and Meta at their IPOs. This is a company that 22 months ago had zero revenue and zero weekly users.

Talk to us about how important OpenAI is to you as a partner and the power of OpenAI as a driver of public awareness and use of AI.

Jen-Hsun Huang:

Okay. This is one of the most important companies of our time, a pure AI company pursuing the AGI vision, whatever the definition. I hardly think the definition matters at all, nor do I think timing matters. One thing I do know is that AI will have a roadmap of capabilities over time, and that roadmap will be spectacular. And along the way, long before it reaches anyone's definition of AGI, we'll make the most of it.

All you have to do right now, as we speak, is go talk to digital biologists, climate tech researchers, materials researchers, physical scientists, astrophysicists, quantum chemists. Go ask video game designers, manufacturing engineers, roboticists. Pick your favorite. Whatever industry you choose, dig deep, talk to the people who matter, and ask them: has AI revolutionized the way you work? Collect those data points and then ask yourself how skeptical you want to be, because they're not talking about the conceptual benefits of AI someday; they're talking about using AI right now. Agtech, materials tech, climate tech, pick your tech, pick your field of science. They're advancing, and AI is helping them advance their work.

Now, as we said, every industry, every company, every university is using it. Unbelievable, right? AI is absolutely going to change business in some way. We know that; it's so tangible.

It's happening today. So I think the awakening of AI that ChatGPT triggered is totally incredible. And I love their speed and their singular goal of advancing the field. It's really important.

Brad Gerstner:

They built the economic engine that can fund the next frontier of models. I think a consensus was emerging in Silicon Valley that the whole model layer was being commoditized, with Llama making it very inexpensive to build models. So early on we had a whole list of model companies.

A lot of people question whether those companies can reach escape velocity on the economic engine needed to fund the next generation. My own feeling is that's why you're seeing consolidation. OpenAI clearly reached escape velocity; they can fund their own future. I'm not sure many other companies can. Is that a fair assessment of the state of the model layer? As in many other markets, there will be consolidation toward the market leaders who can afford it, who have the economic engines and the applications that allow them to keep investing.

Just having a powerful GPU doesn't guarantee a company's success in AI

Jen-Hsun Huang:

First, there's a fundamental difference between a model and AI. A model is an essential ingredient; it's necessary but not sufficient for AI. AI is a capability, but for what? What's it applied to? AI for self-driving cars is related to, but not the same as, AI for humanoid robots, which is related to, but not the same as, AI for chatbots.

So you have to understand the taxonomy. Yes, the taxonomy of the stack. At every level of the stack, there are opportunities, but not every level of the stack offers unlimited opportunities for everyone.

Now, let me make a comment: take what you just said and replace the word "model" with "GPU". In fact, that's a great observation we made as a company 32 years ago: there's a fundamental difference between a GPU, a graphics chip, and accelerated computing. And accelerated computing is not the same as the work we do in AI infrastructure. They're related, built on top of each other, but not the same. And each layer of abstraction requires a completely different skill set.

People who are really good at building GPUs don't necessarily know how to become an accelerated computing company. I can give you an example: there are a lot of people who make GPUs. We invented the GPU, but we're not the only company that makes GPUs today, right? There are GPUs everywhere, but they're not accelerated computing companies. Many of them build accelerators that do application acceleration, but that's not the same as being an accelerated computing company. A very specialized AI accelerator, for example, could be a very successful thing, right?

Brad Gerstner:

That's MTIA (Meta's in-house next-generation AI accelerator chip).

Jen-Hsun Huang:

Right. But it may not be the kind of company that builds influence and capability. So you have to decide what you want to be. There may be opportunities in all these different areas, but as with building any company, you have to be mindful of how the ecosystem shifts and what gets commoditized over time, recognizing what's a feature, what's a product, and what's a company. I've just talked through a number of different ways you can think about this.

xAI and the Memphis supercomputer cluster have reached "the age of 200,000 to 300,000 GPU clusters."

Brad Gerstner:

Of course, there's one new entrant with money, smarts, and ambition: xAI. Right. And there are reports that you had dinner with Larry Ellison and Musk, and they talked you into parting with 100,000 H100 chips. They went to Memphis and built a large coherent supercluster in a matter of months.

Jen-Hsun Huang:

Let's keep the story straight, okay? Yes, I had dinner with them.

Brad Gerstner:

Do you think they have the ability to build this supercluster? There are rumors they want another 100,000 H200s to scale it up. First, talk to us about xAI, their ambitions, and what they've accomplished. But also: have we reached the age of 200,000- to 300,000-GPU clusters?

Jen-Hsun Huang:

The answer is yes. But first, recognize the achievement: from the moment of conception, to the data center being ready for NVIDIA to install our equipment, to the moment we powered it up, hooked everything together, and ran the first training job.

Jen-Hsun Huang:

Okay. So the first part of it, building a huge facility in that short a time, water-cooling it, powering it, getting the permits, I mean, it's like Superman. As far as I know, there's only one person in the world who could do that. Musk's understanding of engineering, of building large systems, and of marshaling resources is unique. Yes, it's incredible. And of course his engineering team is great: the software team is great, the networking team is great, the infrastructure team is great. Musk understands that.

From the moment we decided to start, the planning with the engineering team, the networking team, the infrastructure computing team, and the software team was all done in advance. Then all of the infrastructure and logistics, the sheer amount of technology and equipment that shipped on day one, the NVIDIA computing and networking infrastructure, and all the technology needed for training, was stood up in 19 days. Can you imagine?

Take a step back and think about that. Do you know how many days 19 days is? How many weeks? If you saw it in person, the amount of technology is incredible. All the cabling and networking: networking NVIDIA equipment is very different from networking a hyperscale data center. Think about how many cables one node needs; the backs of the computers are solid cables. Just integrating that mountain of technology and all the software is incredible.

So I think what Musk and the X team did, and I'm very grateful that he recognized the engineering and planning work we did with him, is unique; it has never been done before. Just to put it in perspective: 100,000 GPUs, as one cluster, is easily the fastest supercomputer on the planet. A supercomputer normally takes three years to plan, and then after the equipment is delivered it takes a year to get it all running. Yes, we're talking about 19 days.

Clark Tang:

How much of the credit goes to NVIDIA?

Jen-Hsun Huang:

Everything already runs. Yes, of course, there's a whole mountain of X algorithms, X frameworks, X stacks, and so on, and there was a ton of integration we had to do, but the planning was excellent. Just the pre-planning.

Large-scale distributed computing is an important direction for future AI development

Brad Gerstner:

That's one end of the spectrum; Musk is one of a kind. But you began your answer by saying yes, 200,000- to 300,000-GPU clusters are here. Can that scale to 500,000? Can it scale to a million? Does demand for your products depend on it scaling to two million?

Jen-Hsun Huang:

The answer to the last part is no. My feeling is that distributed training has to work. My feeling is that distributed computing will be invented; some form of federated learning and asynchronous distributed computing will be discovered.

I'm very enthusiastic and optimistic about that. Of course, the thing to realize is that the scaling laws used to be about pre-training. Now we've moved to multimodality and to synthetic data generation, and post-training is now scaling incredibly well: synthetic data generation, reward systems, reinforcement-based learning. And now inference scaling is taking off, too. A model may have done 10,000 internal inferences before it gives you an answer.

That may not be unreasonable. It may have done a tree search. It may have done reinforcement learning on top of that. It may have run some simulations, certainly done a lot of reflection, probably looked up some data and some information, right? So its context could be quite large. This type of intelligence is, well, what we do, isn't it? So the capability scales; I just did the math, compounding model size and compute at four times per year.
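A toy sketch of the inference-time scaling being described, in its simplest best-of-N form; the generator and scorer below are placeholders, not any lab's actual method:

```python
import random

# Best-of-N as the simplest form of inference-time scaling: sample many
# candidate answers, score each with a verifier/reward function, return
# the best. Both functions are stand-ins for real models.

def generate_candidate(prompt, rng):
    return rng.random()                 # placeholder for a sampled answer

def score(candidate):
    return candidate                    # placeholder for a reward model

def best_of_n(prompt, n, seed=0):
    rng = random.Random(seed)
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)   # more samples -> more compute -> better pick

print(best_of_n("prove X", n=10_000))   # quality now scales with inference compute
```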

On the other hand, demand keeps growing in terms of usage. Do we think we'll need millions of GPUs? No doubt; that's a yes. So the question is how we build it from a data center perspective, and a lot of that depends on whether data centers come a few gigawatts at a time or 250 megawatts at a time. My sense is that you'll get both.

Clark Tang:

I think analysts always focus on the current architecture bets, but one of the biggest takeaways from this conversation is that you're thinking about the entire ecosystem and many years out. So when NVIDIA scales up or scales out, it's to meet future demand, and it's not to say the world can rely only on a 500,000- or even million-GPU cluster. When distributed training arrives, you'll write the software to enable it.

Jen-Hsun Huang:

We developed Megatron seven years ago, so that scaling of these large training jobs could happen. We invented Megatron, so all the model parallelism that's happening now, all the breakthroughs in distributed training and batching and so on, exist because we did the early work, and now we're doing the early work for the next generation.
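For a sense of what Megatron-style model parallelism means, here is a toy pipeline-parallel forward pass (pure illustration, not the Megatron-LM library):

```python
# Pipeline parallelism in miniature: the layer stack is cut into stages,
# and each "device" runs its stage on micro-batches in sequence. Real
# schedules overlap these steps across devices; this toy runs them serially.

def stage(layer_id):
    return lambda x: x + layer_id       # stand-in for a block of layers

stages = [stage(i) for i in range(4)]   # 4 pipeline stages on 4 "devices"

def pipeline_forward(microbatches):
    outputs = []
    for mb in microbatches:
        for s in stages:                # hand the micro-batch stage to stage
            mb = s(mb)
        outputs.append(mb)
    return outputs

print(pipeline_forward([0, 10, 20]))    # -> [6, 16, 26]
```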

AI changes the way work is done

Brad Gerstner:

So let's talk about Strawberry and o1. I think it's fitting that it shares a name with the O-1 visa, which recognizes the best and brightest people in the world who come to America. I know we're both passionate about that. So I love this idea of building models that think, taking us to the next level of scaling intelligence, and it's a tribute to the fact that it was people who came to America through immigration, bringing their collective intelligence with them, who made us what we are.

Jen-Hsun Huang:

Of course. And alien intelligence.

Brad Gerstner:

Of course. This was spearheaded by our friend Noam Brown. Inference-time reasoning, as a whole new vector for scaling intelligence, separate from just building bigger models: how big a deal is it?

Jen-Hsun Huang:

It's a big deal. It's a big deal. I think a lot of intelligence can't be done a priori, right? A lot of computation can't even be reordered. I mean, out-of-order execution can be done a priori, but a lot of things can only be done at runtime.

So whether you think about it from a computer science perspective or an intelligence perspective, too many things need context. And the quality, the type, of answer you're looking for matters. Sometimes a quick answer is enough; it depends on the consequences of the answer, its impact, the nature of its use. So some answers, take an evening for; some answers take a week.

Yes. Right? So I can totally imagine sending my AI a prompt and telling it: think about this overnight. Don't tell me right away. I want you to think about it all night and tell me tomorrow what your best answer is, and your reasoning. So I think, from a product standpoint, intelligence will be segmented by quality: there will be one-shot versions, absolutely, and some that take five minutes.

Right? And there will be humans, too. So, if you like, we're going to have one enormous workforce: some of them digital humans made of AI, some of them biological humans, and I hope some of them even super-robots.

Brad Gerstner:

I think that's a grossly misunderstood thing from a business perspective. You've just described a company with the output of a 150,000-person company that achieves it with only 50,000 people. That's right. Now, you're not saying, I'm going to fire all my employees. No. You're still growing headcount, but the output of the organization will increase dramatically.

Jen-Hsun Huang:

This is often misunderstood. AI is not going to change every job, but it will have a huge impact on the way people work. Let's acknowledge that. AI has the potential to do incredible good, and it has the potential to cause harm; we must build safe AI. Yes, let's lay that foundation. Okay.

Jen-Hsun Huang:

The part people overlook is that when a company uses AI to improve productivity, it likely shows up as better earnings or better growth, or both. And when that happens, the CEO's next email is probably not a layoff announcement.

Brad Gerstner:

Of course not, because you're growing.

Jen-Hsun Huang:

The reason is that we have more ideas we can explore, and we need people to help us think them through before we automate them. The automation part, AI can help us with. Obviously it will help us think, too, but we still have to figure out: what problems am I trying to solve? There are trillions of problems we could work on. A company has to pick its problems and its ideas, and figure out how to automate and scale them. So as we become more productive, we'll hire more people. People forget that. If you go back in time, obviously we have more ideas today than 200 years ago; that's why GDP is bigger and more people are employed, even though we're automating like crazy underneath.

Brad Gerstner:

It's a very important point for the period we're entering: almost all human productivity, almost all human prosperity, is a byproduct of the automation of the last 200 years of technology. I mean, from Adam Smith to Schumpeter's creative destruction, you can look at the chart of GDP per capita growth over the last 200 years, and now it's accelerating.

Which got me thinking about this. In the '90s, U.S. productivity grew at about 2.5% to 3% per year. Then in the 2000s it slowed to about 1.8%, and the last 10 years have been the slowest decade of productivity growth on record: the output we get from a fixed amount of labor and capital is growing at the slowest rate we've measured.

A lot of people are debating the reason for this. But if the world is really as you describe, and we are going to utilize and make intelligence, then aren't we on the verge of a dramatic expansion of human productivity?

Jen-Hsun Huang:

It's our hope. This is our hope. Of course, we live in this world, so we have direct evidence.

We have direct evidence, whether isolated cases or individual researchers who can use AI to explore science at previously unimaginable scale. That is productivity, one hundred percent a measure of productivity. Or the fact that we're designing such incredible chips at such a pace: the complexity of the chips and the computers we build is growing exponentially while the company's headcount is not. That's another measure of productivity.

The software we develop keeps getting better because we use AI and supercomputers to help us build it, while our headcount grows only roughly linearly. Another reflection of productivity.

So I can dig into that, and I can sample a lot of different industries and check it for myself.

Of course, we might be overfitting. But the art of it is to generalize from what we're observing and ask whether it will show up in other industries.

There's no doubt that intelligence is the most valuable commodity the world has ever known, and now we're going to mass-produce it. All of us have to get good at what happens when we're surrounded by AIs that do things very well, much better than we can. When I look back, that has been my life. I have 60 direct reports.

They're world-class in their fields, and they do what they do better than me, much better than me. I have no trouble interacting with them, and no trouble "programming" them, either. So I think what people will learn is that they're all going to be CEOs.

They're all going to be CEOs of AI agents. Their ability to bring creativity, some domain knowledge, and the skill of reasoning and breaking problems down means they can program these AIs to help them achieve what I achieve: running a company.

AI Safety Requires a Multi-Party Effort

Brad Gerstner:

Now, you mentioned something about misaligned AI and safe AI, and you mentioned the tragedy in the Middle East. There's a lot of autonomy and a lot of AI being used around the world. So let's talk about bad actors, safe AI, and coordination with Washington. How do you feel today? Are we on the right path? Do we have an adequate level of coordination? I think Mark Zuckerberg has said that the way we beat bad AI is by making good AI better. How would you characterize your view of how we ensure a positive net benefit for humanity, rather than ending up in the dystopian world people fear?

Jen-Hsun Huang:

The conversation about safety is really important and good. The abstract view, the conceptual view of AI as one giant neural network, is not so good, right? And the reason is that, as we know, AI and large language models are related but not the same. There are a lot of things being done that I think are very good. First, open-source models, so that the entire research community, every industry, and every company can participate in AI and learn how to use this capability for their applications. Very good.

Second, people underestimate how much technology is dedicated to inventing AI that keeps AI safe: AI to curate data, to carry information, to train; AI created to align AI; synthetic data generation to broaden AI's knowledge and reduce hallucination. All the AI systems built for vectorization, graphing, or grounding; AI used to inform AI, to guard AI, to monitor other AIs. This whole apparatus of safe AI, built with AI, deserves celebration, right?

Brad Gerstner:

And those systems are being built.

Jen-Hsun Huang:

Right, we're building all of it. Across the industry: the methodologies, the red teaming, the processes, the model cards, the evaluation systems, the benchmarking systems, the harnesses, all of it is being built at an unbelievable rate. And that deserves to be celebrated. Do you understand? Yes.

Brad Gerstner:

And there's no government regulation saying you have to do this. The players building AI in this space today are taking these critical issues seriously and converging on best practices. That's right.

Jen-Hsun Huang:

Exactly, and that hasn't been fully appreciated or understood. Everybody needs to start talking about AI as what it is: an AI system, an engineered system, carefully designed, built from first principles, fully tested, and so on. Remember, AI is a capability that gets applied. It's not necessary to regulate the underlying technology in one universal way, but it's also important not to over-regulate; most regulation should happen at the application level. All the different bodies that already regulate applications of technology must now regulate applications of technology that incorporate AI.

So I think it's important not to misunderstand, and not to ignore, the huge amount of regulation around the world that will have to be activated for AI. Don't rely on a single universal AI council to do it all, because all of these different agencies exist for a reason; all of these regulatory bodies were created for a reason. Going back to first principles, that's how I would look at it.

Open source vs. closed source is a false dichotomy

Brad Gerstner:

You have launched a very important, very large, very powerful open source model.

Jen-Hsun Huang:

Nemotron.

Brad Gerstner:

Yes, and obviously Meta has made a significant contribution to open source. When I read Twitter, there's a lot of discussion about open versus closed. How do you think about open source, and about your own open-source models keeping up with the frontier? That's the first question. The second is: do you see open-source models and closed-source models, the ones powering commercial operations, coexisting in the future? Do the two create a healthy tension, including for safety?

Jen-Hsun Huang:

Open source versus closed source is related to safety, but it's not only about safety. For example, there's absolutely nothing wrong with closed-source models; they're the engine that sustains the economic model necessary to fund innovation. I totally agree with that. I think framing it as closed versus open is a false dichotomy.

Because openness is necessary to activate many industries. If we didn't have open source, how could all of these different fields of science activate AI? They have to develop their own domain-specific AI, using open-source models to create it. Open-source models and AI are related but, again, not the same; having an open-source model doesn't mean you have AI, but you need the open-source model to create your AI. Financial services, healthcare, transportation, a long list of industries and fields of science have now been enabled because of open source.

Brad Gerstner:

Unbelievable. Do you see much demand for your open source model?

Jen-Hsun Huang:

Our open-source model? First, look at Llama's downloads. Obviously, the work Mark and his team have done is incredible, beyond belief. It has completely activated and engaged every industry and every field of science.

Okay, sure. The reason we built Nemotron was synthetic data generation. Intuitively, the idea of an AI sitting there looping and generating data to learn from itself sounds fragile; how many trips around that infinite loop can you really make? It's questionable. The picture I have in my head is like taking a super-smart person, putting them in a padded room, and closing the door for a month: what comes out is probably not a smarter person. But you can have two or three people sit together; we have different AIs with different distributions of knowledge, and they can go back and forth doing quality assurance, and all three can get smarter.

So the idea that AI models could swap, interact, pass things back and forth, debate, do reinforcement learning, synthetic data generation, and so on, makes intuitive sense. And our Nemotron 340B model is the best reward model in the world; it's the best at critique.

Interesting. It's a great model to enhance other people's models. So no matter how good someone else's model is, I would recommend using the Nemotron 340B to enhance and improve it. We have seen Llama get better and make all other models better.
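A minimal sketch of the generate-then-critique loop Huang describes, with a stand-in generator and reward function (the function names and threshold are hypothetical, not a real API):

```python
# Generate-then-critique synthetic data curation: one model proposes
# answers, a reward model (Nemotron-340B in Huang's example) scores them,
# and only high-scoring samples become synthetic training data.
# Everything below is a placeholder implementation.

def propose(prompt):
    return [f"{prompt} -> draft {i}" for i in range(4)]   # stand-in generator

def reward(answer):
    return len(answer) % 5                                # stand-in reward model

def curate(prompt, threshold=3):
    return [a for a in propose(prompt) if reward(a) >= threshold]

synthetic_dataset = curate("explain flash attention")
print(synthetic_dataset)    # kept samples would be fed back into training
```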

Brad Gerstner:

As someone who took delivery of a DGX-1 in 2016, it really has been an incredible journey. Your journey is both improbable and remarkable; it's remarkable that you even survived the early days. You delivered the first DGX-1 in 2016, and by 2022 we had the Cambrian moment.

So I'll ask a question I've often wanted answered: how long can you keep doing your current job, with 60 direct reports? You're everywhere; you're driving this revolution. Are you having fun? Is there anything else you'd rather be doing?

Jen-Hsun Huang:

You mean the last hour and a half? The answer is: I enjoyed it. Great time. I can't imagine anything else I'd rather be doing. But let's see. I don't think it's right to leave the impression that our work is always fun. My work isn't always fun, and I don't expect it to always be fun. Did I ever expect it to be? I think it's always important.

Yes, I don't take myself too seriously. I take my job very seriously. I take our responsibilities very seriously. I take our contributions and our moments very seriously.

Is it always fun? No, it's not. But have I always enjoyed it? Yes. Like everything, whether it's family, friends or kids. Is it always fun? No, it wasn't. Did we always enjoy it? Absolutely.

So I think, how long can I do this? The real question is: how long can I stay relevant? That's what matters, and the answer comes down to how I continue to learn. Today I'm more optimistic, and I'm not just saying that because of our topic: I'm more optimistic about my ability to stay relevant and keep learning because of AI. I use it every day. I'm sure you all do too; I use it almost every day.

There isn't a piece of research I do that doesn't involve AI, and there isn't a question I don't double-check with AI, even when I know the answer. Surprisingly, the next two or three questions I ask reveal things I didn't know. Pick your topic, pick your subject. I consider AI a mentor.

AI is an assistant; AI is a partner that can brainstorm with me and check my work. Guys, it's totally revolutionary. I'm an information worker; my output is information. So its contribution to society is amazing. If I can stay relevant that way and keep contributing, and I know this work is important enough to keep pursuing, and I have an incredible quality of life, then I will.

Brad Gerstner:

I can't imagine missing this moment. You and I have worked in this field for decades, and this is the most important moment of our careers. We're very grateful for the partnership.

Jen-Hsun Huang:

Nostalgia for the next decade.

Brad Gerstner:

And for the partnership of ideas. Yes, you make these things smarter. Thank you. I think it's really important that you're part of the leadership guiding this forward optimistically and safely. So thank you.

Jen-Hsun Huang:

Happy to be with you guys. Really. Thanks for that.
