GPUs: What are they, where did they come from, and why do I need one for AI?

The history of Graphics Processing Units (GPUs) is a fascinating journey that begins with their humble origins as tools for enhancing 3D graphics in video games. These early GPUs (then called 3D accelerators), such as the 3dfx Voodoo2 released in 1998, laid the foundation for what would become a transformative force in computing technology.

Early Beginnings and Gaming Acceleration

In the late 1990s, the gaming industry was on the cusp of a revolution. The demand for more realistic and immersive gaming experiences was growing, and traditional CPUs struggled to keep up with the computational demands of rendering complex 3D scenes in real time. This demand for enhanced graphics led to the birth of dedicated GPUs, designed to offload the intricate calculations required for 3D graphics rendering from the CPU to specialized hardware.

The NVIDIA GeForce 256, released in 1999, was a game-changer, introducing hardware support for transform and lighting (T&L). This innovation marked a significant milestone in the evolution of GPUs. By taking over the transformation and lighting of objects in 3D scenes, GPUs freed the CPU to handle other tasks, resulting in smoother and more detailed graphics. This shift fundamentally transformed the gaming experience and paved the way for further advancements in the GPU space. That process of offloading work from the CPU, a general-purpose processor, has continued ever since, and has created additional market segments for FPGAs, ASICs, and DPUs.

NVIDIA's Rise and 3dfx Acquisition

Whilst 3dfx was busy vertically integrating, most notably acquiring STB to bring the manufacturing and sales of its cards in-house, NVIDIA was emerging as the dominant player, and it acquired 3dfx's core graphics assets in late 2000. The acquisition not only expanded NVIDIA's intellectual property portfolio but also folded 3dfx's innovative technologies into NVIDIA's GPUs, driving further advancements in graphics rendering. *cough* SLI *cough* NVLink.

Subsequent GeForce models, including the GeForce 2, 3, and 4, introduced features such as programmable pixel and vertex shaders (debuting with the GeForce 3), further enhancing visual quality and realism in games.

The Shift Towards General-Purpose Computing

As the computational power of GPUs continued to grow, researchers and developers recognized their potential for more than just graphics rendering. This realization led to the development of NVIDIA's Compute Unified Device Architecture (CUDA) in the mid-2000s. CUDA opened the doors to General-Purpose GPU Computing (GPGPU), enabling GPUs to perform a wide range of parallel computing tasks beyond graphics rendering.

With CUDA, developers could leverage the massive parallel processing capabilities of GPUs to accelerate scientific simulations, data analytics, and other compute-intensive tasks. This marked a significant step toward realizing GPUs as versatile processing units capable of tackling complex computational challenges beyond gaming.
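To give a flavour of what GPGPU programming looks like, here is a minimal CUDA sketch, with each GPU thread adding one element of two vectors. The kernel, array size, and values are purely illustrative, not taken from any particular application:

#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread adds one pair of elements -- this is the parallelism CUDA exposes.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;                 // 1M elements (illustrative size)
    size_t bytes = n * sizeof(float);

    // Allocate and initialise host (CPU) memory.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device (GPU) memory and copy the inputs across.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back and spot-check it.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);         // expect 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}

The key idea is that thousands of these lightweight threads run concurrently across the GPU's streaming multiprocessors, which is exactly the structure that scientific simulations and, later, deep learning workloads exploit.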

With CUDA now having been in development for almost 20 years, and many developers using it exclusively, NVIDIA enjoys quite the moat in the AI space.

AMD's Challenge and Expansion

While NVIDIA was making strides in the GPU space, AMD (Advanced Micro Devices) emerged as a notable contender. The Radeon series, originally developed by ATI, competed directly with NVIDIA's GeForce line, offering gamers and professionals an alternative. AMD's acquisition of ATI Technologies in 2006 brought that graphics technology in-house, enabling the company to compete more effectively with NVIDIA.

This competition between NVIDIA and AMD drove rapid innovation in the GPU industry, leading to continuous improvements in graphics rendering, compute capabilities, and overall performance. Gamers and professionals alike benefited from the ongoing advancements, as GPUs became more powerful and capable with each new generation.

The AI Revolution and GPUs

One of the most transformative developments in recent years has been the role of GPUs in powering artificial intelligence (AI) applications. The parallel processing architecture of GPUs is well-suited to the matrix calculations and deep learning algorithms that underpin AI training and inference. This realization led to the creation of specialized GPUs optimized for AI workloads.
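As a concrete illustration of why that architecture fits so well, here is a deliberately naive CUDA sketch of the matrix multiplication at the heart of neural-network training, with one GPU thread computing each output element. The function and parameter names are illustrative; real frameworks call heavily tuned libraries such as cuBLAS and cuDNN and, on recent chips, Tensor Cores:

// Naive matrix multiply: C = A * B, one GPU thread per output element.
// A is M x K, B is K x N, C is M x N, all row-major.
// Illustrative only -- production AI code uses tuned libraries (cuBLAS/cuDNN).
__global__ void matmul(const float *A, const float *B, float *C,
                       int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // row of C
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // column of C
    if (row < M && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < K; ++k) {
            sum += A[row * K + k] * B[k * N + col];   // dot product of row and column
        }
        C[row * N + col] = sum;
    }
}

// Launch example (assuming device buffers d_A, d_B, d_C are already allocated):
//   dim3 threads(16, 16);
//   dim3 blocks((N + 15) / 16, (M + 15) / 16);
//   matmul<<<blocks, threads>>>(d_A, d_B, d_C, M, N, K);

Because every output element can be computed independently, the GPU can keep tens of thousands of threads busy at once, which is why this workload maps so much better to a GPU than to a handful of CPU cores.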

NVIDIA's Tesla line of data-center GPUs dates back to 2007, but the Pascal-based Tesla P100, released in 2016, marked the company's focused entry into the AI and high-performance computing markets. These GPUs boasted impressive specifications: high TFLOPS (teraFLOPS) of compute, large memory configurations, and high memory bandwidth. Successors like the Tesla V100 and the A100 (and now the H100) pushed the boundaries even further, enabling researchers and data scientists to train complex AI models at unprecedented speeds.
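To put those TFLOPS figures in context, a rough back-of-the-envelope formula (the vendor spec sheets remain the authoritative source) is: peak FP32 throughput ≈ FP32 cores × boost clock × 2, since each core can retire one fused multiply-add (two floating-point operations) per cycle. For the A100, that works out to roughly 6,912 × 1.41 GHz × 2 ≈ 19.5 TFLOPS, in line with NVIDIA's quoted FP32 number; the dedicated Tensor Cores push the deep-learning (FP16/TF32) figures far higher still.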

A Look at Generations: Performance and Innovation

Here's a detailed look at select NVIDIA GPU generations and their key specifications:

In the table, white denotes the consumer cards and yellow the data-center chips.

It's been quite a journey since that 220 nm GeForce 256!

I've intentionally stopped at the A100 and will cover other chips (the MI300X, H100, etc.) in a follow-up article.

A Glimpse into the Future

The journey of GPUs from dedicated graphics enhancers to versatile computing powerhouses is far from over. As chip fabrication technology continues to advance, we can expect ever smaller process nodes, greater power draw (alongside better energy efficiency), and higher performance in the coming years. GPUs are poised to reshape industries beyond gaming and AI, impacting areas such as scientific research, healthcare, autonomous vehicles, and more.

In conclusion, the evolution of GPUs is a testament to human ingenuity and the relentless pursuit of innovation. From powering captivating gaming experiences to propelling the frontiers of AI, GPUs have proven to be transformative tools that continue to shape our technological landscape. As we look to the future, the potential applications of GPUs are boundless, and their journey promises to be one of ongoing discovery and advancement.
