Where AI Inference Hits the Memory Wall
Episode Date: April 28, 2026
ZeroPoint’s Nilesh Shah explores why data movement, compression, and memory bandwidth now shape AI inference performance, and where heterogeneous syst...
In The Arena by TechArena is your source for authentic conversations with the leading innovators in technology. The show is hosted by longtime industry insider Allyson Klein.
225 episodes transcribed
AI architect Tapan Shah joins the Data Insights podcast to discuss scaling AI in healthcare, governance, AI agents, and how technology can improve pat...
Maher Hanafi of Betterworks joins TechArena Data Insights to discuss AI in enterprise SaaS, why many AI proof-of-concepts fail, and how engineering le...
Cloud expert Venkata Gopi Kolla joins Allyson Klein to discuss the CDN "single point of failure" and a new IETF protocol for sub-second edge recovery...
Ishween Kaur, Generative AI Lead at Salesforce, shares practical lessons on building multilingual AI systems, closing language gaps, and scaling with...
Fermilab’s Silvia Zorzetti explains how quantum computing and sensing are evolving, where they outperform classical systems, and what’s next for the f...
Hedgehog CEO Marc Austin joins Data Insights to break down open-source, automated networking for AI clusters—cutting cost, avoiding lock-in, and keepi...
Rose-Hulman Institute of Technology shares how Azure Local, AVD, and GPU-powered infrastructure are transforming IT operations and enabling device-agn...
From SC25 in St. Louis, Nebius shares how its neocloud, Token Factory PaaS, and supercomputer-class infrastructure are reshaping AI workloads, enterpr...
Runpod head of engineering Brennen Smith joins a Data Insights episode to unpack GPU-dense clouds, hidden storage bottlenecks, and a “universal orches...
Allyson Klein talks with author and Google/Intel alum Wanjiku Kamau on moving past AI skepticism, learning fast, and using new tools with intention—so...
Recorded at #OCPSummit25, Allyson Klein and Jeniece Wnorowski sit down with Giga Computing’s Chen Lee to unpack GIGAPOD and GPM, DLC/immersion cooling...
From #OCPSummit25, this Data Insights episode unpacks how RackRenew remanufactures OCP-compliant racks, servers, networking, power, and storage—turnin...
From CPU orchestration to scaling efficiency in networks, leaders reveal how to assess your use case, leverage existing infrastructure, and productize...
DataraAI CTO and co-founder Durgesh Srivastava unpacks a data-loop approach that powers reliable edge inference, captures anomalies, and encodes techn...
From the OCP Global Summit, hear why 50% GPU utilization is a “civilization-level” problem, and why open standards are key to unlocking underutilized...
Allyson Klein and co-host Jeniece Wnorowski sit down with Arm’s Eddie Ramirez to unpack Arm Total Design’s growth, the FCSA chiplet spec contribution...
Midas Immersion Cooling CEO Scott Sickmiller joins a Data Insights episode at OCP 2025 to demystify single-phase immersion, natural vs. forced convect...
From hyperscale direct-to-chip to micron-level realities: Darren Burgess (Castrol) explains dielectric fluids, additive packs, particle risks, and how...
From OCP in San Jose, PEAK:AIO’s Roger Cummings explains how workload-aware file systems, richer memory tiers, and capturing intelligence at the edge...