Featured Posts
NVIDIA's NVLink Fusion: True Innovation or Strategic Lock-in?
Earlier this week at Computex, Jensen Huang introduced NVLink Fusion, positioning it as a means to "democratize scale-up" by allowing customers to mix and match compute architectures. On the surface, this suggests flexibility: integrating CPUs, GPUs, and specialized silicon, all interconnected via NVIDIA's high-performance NVLink. However, upon closer examination, this appears to be more of an illusion of choice.
From Silicon to Token
The world of large‑language‑model inference moves fast. Meta’s Llama 4 and DeepSeek's range of models turn yesterday’s “good enough” hardware into today’s bottleneck, so picking the right platform is more strategic than ever. I compared eight options that keep popping up in various engineering and sales conversations, including consumer RTX GPUs, Apple Silicon, NVIDIA’s H‑series, Groq’s purpose‑built LPU, Cerebras’ wafer‑scale engine, and turnkey DGX workstations. Each proves valuable in…
Our Latest Insights
Apple, not Artificial, Intelligence
Just last month, Apple hosted their yearly WWDC - an event where they showcase all the updates to their platforms. Whilst a lot of it is very interesting, and AI-centric, I'm going to mostly focus on Private Cloud Compute. But first…the first half. WWDC Regular Programming The first hour of the keynote provided great updates for the Apple ecosystem. I'm personally excited about Siri getting a huge kick in the pants, and into this decade, plus a bunch of quality-of-life upgrades across each…
Oh great, another podcast...
As you may have seen (or heard my "Ausmerican" accent) recently, I've started a podcast, and I wanted to share a little insight into why. I created this, "Infrastructure as a Newsletter", *checks calendar* just shy of a year ago, and am grateful to now have over 1300 subscribers. For something I was quite hesitant to start, I have definitely seen the fruits of my labor, and I'm thankful my wife Victoria Hume encouraged me to put hands to keyboard, and to wade into the uncomfortable. I have quite…
OCP 2024 Regional Summit wrap
The Open Compute Project (OCP) Regional Summit was hosted in Lisbon, Portugal last month, the 5th (and largest) regional summit the group has hosted. Whilst I wasn't able to make it in person, I’d be remiss if I didn't write a (very) quick summary about the conference, and pertinent updates to scaling digital infrastructure in a sustainable way. The hot topic continues to be GenAI, such that OCP has created a new track for Artificial Intelligence, and a strategic initiative for Open AI Systems…
Here come the Inferencing ASICs
The tidal wave of Generative AI (GenAI) has mostly consisted of training large language models (LLMs), like GPT-4, and the huge amount of compute needed to process these enormous datasets - GPT-4, for example, is reported to have 1.76 trillion parameters. This compute has mainly looked like NVIDIA's GPUs, but you also need... 1. power, 2. networking, 3. capital, AND 4. a nice cool place to host them (a data center). The looooooong tail of AI Inferencing will dictate that compute is installed closer to where it's needed…