NVIDIA Part III: The Dawn of the AI Era (2022-2023)
Over an 18-month stretch spanning 2022 and 2023, NVIDIA weathered one of the steepest stock crashes in history, with more than $500 billion of market cap wiped away peak-to-trough. It then staged an even more fantastical rise, becoming the platform powering the emergence of perhaps a new form of intelligence itself, and a trillion-dollar company in the process.
Today we dive into another chapter in the amazing NVIDIA saga: the dawn of the AI era.
Company Overview
Company Name: NVIDIA Corporation
Founding Year: 1993
Headquarters Location: Santa Clara, California
Core Business: NVIDIA designs graphics processing units (GPUs), central processing units (CPUs), and networking solutions, primarily for data centers and gaming (manufacturing is outsourced to foundries such as TSMC), with its Compute Unified Device Architecture (CUDA) platform enabling accelerated computing for artificial intelligence and scientific workloads.
Significance: NVIDIA has transformed from a graphics card manufacturer to the leading platform for AI computing, powering generative AI models like ChatGPT and achieving a trillion-dollar market cap by leveraging its hardware and software ecosystem in the data center market.
Timeline
1993: NVIDIA is founded by Jensen Huang, focusing on graphics processing for gaming.
2006: NVIDIA launches CUDA, a parallel computing platform, enabling GPUs for scientific computing beyond graphics.
2012: The AlexNet breakthrough at the University of Toronto, using NVIDIA GTX 580 GPUs, marks the “big bang” for AI, demonstrating GPUs’ power for neural network training.
2015: OpenAI is founded by Elon Musk, Sam Altman, and others, with Ilya Sutskever as chief scientist, aiming to break the Google-Facebook AI duopoly.
2017: Google publishes the Transformer paper (“Attention is All You Need”), introducing a parallelizable architecture for language models, significantly boosting AI capabilities on NVIDIA GPUs.
2019: NVIDIA acquires Mellanox for $7 billion, gaining high-speed InfiniBand networking critical for AI data centers. OpenAI pivots to a for-profit entity, securing a $1 billion investment from Microsoft.
2020: NVIDIA releases Megatron, an 8.3 billion-parameter Transformer model, showcasing AI training scale. Microsoft licenses GPT-3 for commercial use.
September 2022: NVIDIA’s H100 GPU (Hopper architecture), announced earlier that year at GTC, enters full production alongside the Grace CPU, optimizing for AI data centers. US export controls limit H100/A100 sales to China, leading to the reduced-performance A800/H800 variants.
November 2022: OpenAI launches ChatGPT, becoming the fastest app to reach 100 million users, driving massive demand for NVIDIA GPUs.
January 2023: Microsoft invests $10 billion in OpenAI, integrating GPT into its products.
May 2023: NVIDIA reports Q1 FY24 earnings with $7.2 billion revenue, forecasting $11 billion for Q2 due to AI demand. CUDA reaches 4 million developers.
August 2023: NVIDIA’s Q2 FY24 earnings reveal $13.5 billion total revenue, with $10.3 billion from the data center segment, up 171% year-over-year.
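The parallel-computing idea behind CUDA in the timeline above (the same small program applied independently to millions of data elements) can be sketched in plain Python, with NumPy's vectorized math standing in for a GPU kernel. This is an illustration for context, not code from the episode:

```python
import numpy as np

def saxpy(a, x, y):
    """a*x + y, the canonical data-parallel kernel.

    Each output element depends only on the matching input elements, so a GPU
    can assign one thread per element and compute them all at once; NumPy's
    vectorized arithmetic plays that role here.
    """
    return a * x + y

n = 1_000_000
x = np.ones(n, dtype=np.float32)
y = np.full(n, 2.0, dtype=np.float32)
out = saxpy(3.0, x, y)   # one logical operation over a million elements
print(out[:3])           # [5. 5. 5.]
```

An actual CUDA kernel expresses the same computation as a per-thread function; the key property is that no element's result depends on any other's, which is what makes GPU acceleration possible.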
Narrative
NVIDIA’s journey from a graphics card manufacturer to the epicenter of the AI revolution is a story of resilience, foresight, and seizing an unprecedented opportunity. Founded in 1993, NVIDIA initially focused on graphics processing, surviving near-death experiences against Intel’s dominance in the early 2000s. The introduction of CUDA in 2006 marked a pivotal shift, enabling GPUs to handle scientific computing tasks beyond graphics. Ben and David highlight the 2012 AlexNet breakthrough as the “big bang” for AI, where University of Toronto researchers used NVIDIA GTX 580 GPUs to achieve a step-change in image recognition, reducing error rates from 25% to 15%. This moment, powered by CUDA, revealed GPUs’ potential for massively parallel neural network training, setting NVIDIA on a path to dominate AI compute. The episode emphasizes NVIDIA’s strategic patience in waiting for the right moment to challenge Intel’s CPU-centric data center architecture, which NVIDIA spent the past five years rearchitecting.
The narrative accelerates with the 2017 Google Transformer paper, which introduced a parallelizable architecture for language models, leveraging NVIDIA’s GPUs to process vast datasets efficiently. Ben explains the attention mechanism, allowing models to weigh entire input contexts, making them ideal for large language models (LLMs) like GPT. This shift required immense computational power, which NVIDIA was uniquely positioned to provide. The 2019 acquisition of Mellanox for $7 billion gave NVIDIA InfiniBand, a high-speed networking standard critical for linking GPU clusters in data centers. By 2020, NVIDIA’s Megatron model demonstrated the scale of Transformer-based AI, trained on 512 GPUs, foreshadowing the compute demands of future LLMs. The episode’s tone grows exuberant as it recounts ChatGPT’s November 2022 launch, described as the “AI heard around the world,” which triggered explosive demand for NVIDIA’s H100 GPUs and DGX systems. Ben and David marvel at NVIDIA’s preparation, having built a full-stack data center platform (Hopper GPUs, Grace CPUs, Mellanox networking) over five years, perfectly timed for the generative AI boom.
NVIDIA’s financial trajectory underscores this narrative. In 2022, the company faced a $500 billion market cap decline due to crypto’s collapse and inventory write-downs, which Ben and David note were ironically tied to over-ordering AI hardware just before ChatGPT’s breakthrough. By May 2023, NVIDIA’s Q1 FY24 earnings showed a 19% revenue increase to $7.2 billion, with a Q2 forecast of $11 billion, driven by AI demand. The August 2023 Q2 earnings, described as a “historic event,” reported $13.5 billion in revenue, with the data center segment alone at $10.3 billion, up 171% year-over-year. This growth, fueled by hyperscalers, consumer internet companies, and enterprises, reflects NVIDIA’s strategic shift to a platform company, akin to Microsoft or IBM, with CUDA as its developer ecosystem. Ben and David’s tone is both awestruck and analytical, noting NVIDIA’s ability to capitalize on generative AI’s compute demands while questioning whether the market’s trillion-dollar valuation is sustainable.
The episode also weaves in NVIDIA’s cultural and leadership strengths, centered on CEO Jensen Huang. Huang, now 60, sets NVIDIA’s pace with a relentless drive (he describes finding relaxation in solving problems), a six-month product cycle, and a lean 26,000-employee workforce. Ben and David contrast this with larger tech firms like Microsoft (220,000 employees), highlighting NVIDIA’s efficiency ($46 million in market cap per employee). The narrative concludes with NVIDIA’s trillion-dollar total addressable market (TAM) vision, reframed as capturing a share of the $1 trillion data center installed base, which is refreshed with roughly $250 billion in annual CapEx. This vision, coupled with strategic moves like DGX Cloud and a robust developer ecosystem, positions NVIDIA as the “modern IBM” for AI, though Ben and David caution about future competition and market volatility.
Notable Facts
Market Leadership: NVIDIA’s data center segment, nearly nonexistent five years ago, generated $10.3 billion in Q2 FY24, surpassing gaming as the company’s primary revenue driver.
CUDA Ecosystem: CUDA, launched in 2006, now has 4 million registered developers, growing from 100,000 in 2010 to 3 million in 2022, creating a significant moat.
Mellanox Acquisition: The $7 billion acquisition of Mellanox in 2019 provided InfiniBand, enabling NVIDIA to network GPU clusters for AI training, a critical differentiator.
H100 Innovation: The H100 GPU, launched in 2022, features 16,896 CUDA cores, 528 Tensor Cores, and 80GB of on-package high-bandwidth memory (HBM3), purpose-built for LLM training and costing roughly $40,000 per unit.
Cultural Efficiency: NVIDIA operates with 26,000 employees, achieving $46 million in market cap per employee, compared to Microsoft’s $9 million per employee.
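The efficiency comparison in the last bullet works out as quick back-of-the-envelope arithmetic (the market-cap inputs below are rough mid-2023 figures assumed for illustration, consistent with the episode's per-employee numbers but not exact quotes):

```python
# Back-of-the-envelope check of the market-cap-per-employee comparison.
# Market caps are approximate figures assumed for this illustration.
nvidia_market_cap = 1.2e12      # ~$1.2 trillion (August 2023)
nvidia_employees = 26_000
microsoft_market_cap = 2.0e12   # rough figure implied by the ~$9M/employee claim
microsoft_employees = 220_000

nvidia_per_employee = nvidia_market_cap / nvidia_employees
microsoft_per_employee = microsoft_market_cap / microsoft_employees
print(f"NVIDIA:    ${nvidia_per_employee / 1e6:.0f}M per employee")    # ~$46M
print(f"Microsoft: ${microsoft_per_employee / 1e6:.0f}M per employee") # ~$9M
```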
Financial / User Metrics
Q1 FY24 Revenue (ended April 2023): $7.2 billion, up 19% quarter-over-quarter.
Q2 FY24 Revenue (ended July 2023): $13.5 billion, up 88% quarter-over-quarter and 101% year-over-year.
Data Center Segment Revenue (Q2 FY24): $10.3 billion, up 141% quarter-over-quarter and 171% year-over-year.
Gross Margin (Q2 FY24): 70%, with a forecast of 72% for Q3, compared to 24% pre-CUDA.
Market Cap (August 2023): Over $1 trillion, up from $660 billion in April 2022 and a late-2022 low of roughly $300 billion.
CUDA Developers: 4 million registered developers as of May 2023.
Data Center TAM: $1 trillion installed base, with $250 billion annual CapEx, per Jensen Huang.
H100 Pricing: $40,000 per GPU; DGX H100 system starts at $500,000; DGX GH200 SuperPOD priced in the hundreds of millions.
DGX Cloud Pricing: Starts at $37,000/month for an A100-based system.
Bear Case and Bull Case
Bear Case
Competition from Big Tech: Companies like Amazon (Trainium/Inferentia), Google (TPUs), and Meta (PyTorch) are incentivized to develop alternatives to NVIDIA’s GPUs, leveraging their resources to reduce dependency. PyTorch’s move to an independent foundation could aggregate developers and enable disintermediation.
Market Overhype: The generative AI market may face a “crisis of confidence” if applications underdeliver, slowing enterprise spending. Ben notes a potential “crypto-like” bubble burst, impacting NVIDIA’s growth.
Inference Shift: As AI workloads shift to inference, which is less NVIDIA-dominated, competitors with cheaper solutions could erode market share.
China Restrictions: US export controls limit NVIDIA’s access to China (25% of 2022 revenue), and Chinese firms are developing homegrown alternatives, potentially closing off a major market.
Market Size vs. Valuation: NVIDIA’s trillion-dollar market cap assumes sustained AI growth, which may not materialize if GPT-like experiences falter.
Bull Case
Accelerated Computing Dominance: Jensen Huang’s vision of shifting workloads to accelerated computing (from 5–10% to 50%+) favors NVIDIA, as most compute remains CPU-bound.
Generative AI Growth: Real economic value from ChatGPT ($1–3 billion run rate) and Google’s Bard suggests broad adoption, driving NVIDIA’s hardware demand.
Execution Speed: NVIDIA’s six-month product cycle and cultural efficiency (26,000 employees) enable it to outpace competitors.
Data Center TAM: NVIDIA could capture a significant share of the $1 trillion data center market, with $250 billion annual CapEx, already achieving 15–18% with $40 billion annualized data center revenue.
Platform Differentiation: Unlike Intel or Cisco, NVIDIA’s full-stack platform (hardware, software, services) mirrors Microsoft or IBM, with CUDA’s 4 million developers creating a durable moat.
Tech Trends
Ben and David discuss several technological trends, using terminology from the episode:
Accelerated Computing: NVIDIA’s vision, articulated by Jensen Huang, of shifting workloads from CPU-bound to GPU-accelerated architectures. GPUs’ parallel processing (e.g., the H100’s 16,896 CUDA cores) enables faster, more efficient AI training, delivering speedups “hundreds or thousands of times” beyond what Moore’s Law alone provides.
Transformer Architecture: The 2017 Google paper introduced a parallelizable model for language processing, enabling LLMs like GPT by using attention mechanisms to process large contexts efficiently. This trend, running on NVIDIA GPUs, unlocked generative AI.
Data Center as the Computer: Jensen’s concept that entire data centers function as single compute units, requiring high-speed networking (Mellanox’s InfiniBand) and integrated systems (Hopper GPUs, Grace CPUs). This redefines data center architecture for AI workloads.
Generative AI: The emergence of user-facing AI products like ChatGPT, driven by LLMs, requiring massive GPU compute. Ben and David call this the “iPhone moment” for AI, suggesting a new interaction paradigm (English as a programming language).
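The attention mechanism behind the Transformer trend above can be written down compactly. A minimal NumPy sketch of scaled dot-product self-attention (toy sizes, not code from the episode) shows why the architecture is so GPU-friendly: the work reduces to a few large, parallelizable matrix multiplies:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention from "Attention is All You Need".

    Every query scores against every key simultaneously in one matrix
    multiply, exactly the dense parallel workload GPUs excel at.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted mix of values

# Toy self-attention: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape)   # (4, 8): each token's output mixes context from all tokens
```

Because every token attends to the whole context in one shot (rather than sequentially, as in earlier recurrent models), training parallelizes across the entire sequence, which is what made scaling on NVIDIA GPUs practical.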
These trends strengthen NVIDIA’s market position by aligning its hardware and software with AI’s compute demands. CUDA’s compatibility across 500 million GPUs ensures developer lock-in, while InfiniBand and CoWoS packaging technology provide competitive advantages in training LLMs. Risks include competitors developing alternative architectures (e.g., Google’s TPUs), but NVIDIA’s lead in parallel computing and developer ecosystem mitigates this.
Powers
Scale Economies: CUDA’s 4 million developers and 10,000 person-years of investment create a moat, as NVIDIA amortizes fixed costs (1,600+ CUDA engineers) across a vast ecosystem. This mirrors Apple’s iOS, with NVIDIA’s tight hardware-software integration outpacing open-source alternatives like PyTorch (supported by “dozens” of engineers).
Switching Costs: Enterprises adopting NVIDIA’s DGX systems and CUDA-based code face high switching costs due to organizational momentum and data center architecture lock-in (5–10 years). Ben and David note that current AI enthusiasm drives purchases, locking in NVIDIA’s platform.
Cornered Resource: NVIDIA’s exclusive access to TSMC’s 2.5D CoWoS packaging capacity (10–15% of TSMC’s footprint) limits competitors like AMD. This capacity, secured partly for crypto, became critical for H100 production.
Network Economies: CUDA’s ecosystem benefits from developers building libraries, reducing coding complexity. The 500 million CUDA-capable GPUs since 2006 make NVIDIA’s platform the default for AI development.
Branding: NVIDIA’s reputation as the “modern IBM” ensures CIOs face no risk in choosing NVIDIA, reinforced by its consumer graphics legacy and enterprise AI dominance.
Playbook
iPhone Moment for AI: Jensen’s analogy frames generative AI as a new computing paradigm, akin to the iPhone, with NVIDIA as the vertically integrated platform (hardware, software, services) like Apple. This drives developer and customer adoption, targeting B2B enterprises for higher margins.
Systems Company: NVIDIA’s shift from chips to full-stack solutions (Hopper GPUs, Grace CPUs, Mellanox networking) redefines competition, focusing on integrated performance over chip-to-chip comparisons.
Do What Others Can’t: Jensen’s philosophy, articulated as “You build a great company by doing things other people can’t do. You don’t build a great company by fighting others to do something everybody can do.”
Strike When Timing is Right: NVIDIA’s patience in challenging Intel’s data center dominance, waiting for AI-driven demand, maximized its impact with DGX and SuperPOD solutions.
What Would It Take to Compete with NVIDIA?
Ben and David discuss the formidable barriers to competing with NVIDIA, emphasizing the near-impossibility of a head-on challenge due to its integrated hardware-software platform and ecosystem dominance. To compete effectively, a rival would need to overcome the following hurdles, as outlined in the episode:
Design Comparable GPUs: A competitor must develop GPU chips matching or surpassing NVIDIA’s H100, which features 16,896 CUDA cores, 528 Tensor Cores, and 80GB of high-bandwidth memory. While AMD, Google (TPUs), and Amazon (Trainium/Inferentia) are designing AI accelerators, none match NVIDIA’s performance or ecosystem integration.
Build Advanced Networking: Competitors need chip-to-chip (like NVLink) and server-to-server networking capabilities equivalent to Mellanox’s InfiniBand, which NVIDIA owns. InfiniBand’s 3200 Gbps bandwidth is critical for AI training across racks, and no other provider matches this scale.
Secure Manufacturing Capacity: Access to TSMC’s 2.5D CoWoS packaging (10–15% of TSMC’s capacity) is essential for high-performance GPUs with integrated high-bandwidth memory. NVIDIA’s reserved capacity, initially for crypto, gives it a “cornered resource” unavailable to competitors like AMD.
Develop a Software Ecosystem: A rival must replicate CUDA’s 4 million-developer ecosystem, which has 10,000 person-years of investment and supports 500 million GPUs. Alternatives like AMD’s ROCm or Meta’s PyTorch lack comparable adoption, with PyTorch supported by only “dozens” of engineers versus NVIDIA’s thousands.
Convince Customers and Developers: Competitors must offer a solution that is significantly better or cheaper to sway CIOs and developers from NVIDIA’s “nobody gets fired for buying NVIDIA” brand. Ben and David note this requires a “10X better” product to overcome NVIDIA’s trust and lock-in.
Match NVIDIA’s Pace: NVIDIA’s six-month product cycle (e.g., two GTCs annually) demands that competitors innovate at a similar speed, which Ben compares to Apple hypothetically hosting two WWDCs yearly—an infeasible task for most.
Integrate Full-Stack Solutions: NVIDIA’s DGX systems and SuperPODs combine GPUs, CPUs, networking, and software into a seamless package. Competitors like Foxconn could assemble similar servers, but lack NVIDIA’s integrated software and networking expertise.
Ben and David conclude that competing head-on is “nearly impossible” due to NVIDIA’s lead in hardware, software, and ecosystem scale. They suggest that any successful challenge would likely come from an “unknown flank attack” (e.g., a novel computing architecture) or a future where accelerated computing and AI are less dominant, though they deem this unlikely given current trends. Jensen’s philosophy—“You build a great company by doing things other people can’t do. You don’t build a great company by fighting others to do something everybody can do”—underscores NVIDIA’s focus on unique, high-margin opportunities, making replication exceptionally difficult.
Carveouts
Ben’s Carveout: Alias, a 2000s TV show starring Jennifer Garner, described as “campy” but enjoyable “junk food” viewing. Ben notes its explicit, on-the-nose style contrasts with modern TV’s subtlety.
David’s Carveout: Moana, a Disney animated film, praised as a return to form for Disney animation. Having watched it with his daughter and family, David highlights its universal appeal and the voice work of Dwayne “The Rock” Johnson.
Both carveouts continue Acquired’s tradition of personal recommendations, reflecting Ben and David’s approachable style.
Additional Notes
Episode Metadata:
Number: Season 13, Episode 3
Title: NVIDIA Part III: The Dawn of the AI Era (2022-2023)
Duration: 2:53:40
Release Date: September 5, 2023
Related Episodes:
NVIDIA Part I: The GPU Company (1993-2006) (Season 10, Episode 5)
NVIDIA Part II: The Machine Learning Company (2006-2022) (Season 10, Episode 6)
TSMC (Season 9, Episode 3)