Featured Posts
NVIDIA's NVLink Fusion: True Innovation or Strategic Lock-in?
Earlier this week at Computex, Jensen Huang introduced NVLink Fusion, positioning it as a means to "democratize scale-up" by allowing customers to mix and match compute architectures. On the surface, this suggests flexibility: integrating CPUs, GPUs, and specialized silicon, all interconnected via NVIDIA's high-performance NVLink. However, upon closer examination, this appears to be more of an illusion of choice.
From Silicon to Token
The world of large‑language‑model inference moves fast. Meta’s Llama 4 and DeepSeek's range of models turn yesterday’s “good enough” hardware into today’s bottleneck, so picking the right platform is more strategic than ever. I compared eight options that keep popping up in various engineering and sales conversations, including consumer RTX GPUs, Apple Silicon, NVIDIA’s H‑series, Groq’s purpose‑built LPU, Cerebras’ wafer‑scale engine, and turnkey DGX workstations. Each proves valuable in th
Our Latest Insights
OCP Global Summit 2024 (part 1)
It’s been a busy conference season, with the AI Hardware and Edge AI Summit, Yotta 2024, and OCP’s Global Summit all taking place in the past month or so. The OCP Global Summit has become a personal favorite of mine; the diversity of presenters and industry verticals is unmatched, along with more focus on technical and engineering talks, rather than sales pitches. As in previous years, I’ve been reflecting on the conference and poring over the 22 tracks and more than 430 presentations — thousan
AI for real life
As I’ve been busy with my day job(s) and various projects, like the Tech Insider Podcast, I haven’t put my hands to the keyboard for an article in a while. In this piece, while I’m still talking about AI, I want to demonstrate a few use cases that have significantly impacted my daily life. It’s easy to get caught up in the news cycle and hype about 100,000 GPUs being deployed, requiring nuclear power, and dealing with sustainability challenges—not to mention the cooling requirements now and in
To InfiniBand, maybe beyond?
Nvidia's latest roadmap was teased at Computex in Taiwan last month. Whilst details were a little light on PFLOPS and TDP for either the GPU or CPU, we did get some interesting information for the next-gen products.
* GPU: Rubin (HBM3e to HBM4 memory) - TSMC 3N process
* CPU: Vera (NVIDIA's 2nd gen ARM processor) - TSMC 3N process
* Interconnect: NVLink6 (2x performance to 3600 GB/sec)
* NIC: ConnectX9 (2x speed to 1.6Tb/sec)
* Switch: SpectrumX1600 (2x speed to support CX9 NICs)
NVIDIA
Apple, not Artificial, Intelligence
Just last month, Apple hosted their yearly WWDC - an event where they showcase all the updates to their platforms. Whilst a lot of it is very interesting, and AI centric, I'm going to mostly focus on Private Cloud Compute. But first…the first half.
WWDC Regular Programming
The first hour of the keynote provided great updates for the Apple ecosystem. I'm personally excited about Siri getting a huge kick in the pants, and into this decade, plus a bunch of quality-of-life upgrades across each