Orchestrate all the Things - AI chips in 2020: Nvidia and the challengers. ZDNet Article

Episode Date: June 4, 2020

Now that the dust from Nvidia's unveiling of its new Ampere AI chip has settled, let's take a look at the AI chip market behind the scenes and away from the spotlight. Few people, Nvidia's competitors included, would dispute the fact that Nvidia is calling the shots in the AI chip game today. The announcement of the new Ampere AI chip in Nvidia's main event, GTC, stole the spotlight. Let's put the new architecture into perspective by comparing against the competition in terms of performance, economics, and software. Article published on ZDNet in May 2020.

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. This is episode 4 of the podcast, featuring an analysis of NVIDIA's latest AI chip architecture, Ampere. It tries to put it into perspective by comparing against the competition in terms of performance, economics and software. This is based on an article published on ZDNet in May 2020. A range of AI-generated voices have been used to turn it into a podcast.
Starting point is 00:00:32 I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook. AI Chips in 2020: NVIDIA and the Challengers. Few people, NVIDIA's competitors included, would dispute the fact that NVIDIA is calling the shots in the AI chip game today. The announcement of the new Ampere AI chip in NVIDIA's main event, GTC, stole the spotlight last week. There's been ample coverage, including here on ZDNet. Tiernan Ray provided an in-depth analysis of the new and noteworthy with regards
Starting point is 00:01:05 to the chip architecture itself. Andrew Brust focused on the software side of things, expanding on NVIDIA's support for Apache Spark, one of the most successful open-source frameworks for data engineering, analytics, and machine learning. Let's pick up from where they left off, putting the new architecture into perspective by comparing against the competition in terms of performance, economics, and software. NVIDIA's Double Bottom Line. The gist of Ray's analysis is on capturing NVIDIA's intention with the new generation of chips: to provide one chip family that can serve both for training of neural networks, where the neural network's operation is first developed on a set of examples, and for inference, the phase where predictions are made based on new incoming data. Ray notes this is a departure from today's situation where different
Starting point is 00:01:49 NVIDIA chips turn up in different computer systems for either training or inference. He goes on to add that NVIDIA is hoping to make an economic argument to AI shops that it's best to buy an NVIDIA-based system that can do both tasks. "You get all of the overhead of additional memory, CPUs, and power supplies of 56 servers, collapsed into one," said NVIDIA CEO Jensen Huang. "The economic value proposition is really off the charts, and that's the thing that is really exciting." Jonah Alben, NVIDIA's senior VP of GPU engineering, told analysts that NVIDIA had already pushed Volta, NVIDIA's previous generation chip, as far as it could without catching fire. It went even further with Ampere, which features 54 billion transistors,
Starting point is 00:02:30 and can execute 5 petaflops of performance, or about 20 times more than Volta. So, NVIDIA is after a double bottom line: better performance and better economics. Let us recall that recently NVIDIA also added support for ARM CPUs. Although ARM processor performance may not be on par with Intel at this point, its frugal power needs make it an attractive option for the data center, according to analysts. On the software front, besides Apache Spark support, NVIDIA also unveiled Jarvis, a new application framework for building conversational AI services. To offer interactive, personalized experiences, NVIDIA notes, companies need to train their language-based applications on data that is
Starting point is 00:03:10 specific to their own product offerings and customer requirements. However, building a service from scratch requires deep AI expertise, large amounts of data and compute resources to train the models, and software to regularly update models with new data. Jarvis aims to address these challenges by offering an end-to-end deep learning pipeline for conversational AI. Jarvis includes state-of-the-art deep learning models, which can be further fine-tuned using NVIDIA NeMo, optimized for inference using TensorRT, and deployed in the cloud and at the edge using Helm charts available on NGC, NVIDIA's catalog of GPU-optimized software. Intel and GraphCore: High-profile challengers.
Starting point is 00:03:49 Working backward, this is something we have noted time and again for NVIDIA. Its lead does not just lie in hardware. In fact, NVIDIA's software and partner ecosystem may be the hardest part for the competition to match. The competition is making moves too, however. Some competitors may challenge NVIDIA on economics, others on performance. Let's see what the challengers are up to. Intel has been working on its Nervana technology for a while. At the end of 2019, Intel made waves when it acquired startup Habana Labs for $2 billion. As analyst Karl Freund notes, after the acquisition Intel has been working on switching its AI acceleration from Nervana technology to Habana Labs. Freund also highlights the importance of
Starting point is 00:04:30 the software stack. He notes that Intel's AI software stack is second only to NVIDIA's, layered to provide support, through abstraction, of a wide variety of chips, including Xeon, Nervana, Movidius, and even NVIDIA GPUs. Habana Labs features two separate AI chips: Gaudi for training, and Goya for inference. Intel is betting that Gaudi and Goya can match NVIDIA's chips. The MLPerf inference benchmark results published last year were positive for Goya. However, we'll have to wait and see how it fares against NVIDIA's Ampere and NVIDIA's ever-evolving software stack. Another high-profile challenger is GraphCore.
Starting point is 00:05:04 The UK-based AI chip manufacturer has an architecture designed from the ground up for high performance and unicorn status. GraphCore has been keeping busy too, expanding its market footprint and working on its software. From Dell's servers to Microsoft Azure's cloud and Baidu's PaddlePaddle hardware ecosystem, GraphCore has a number of significant deals in place. GraphCore has also been working on its own software stack, Poplar. In the last month, Poplar has seen a new version and a new analysis tool. If Intel has a lot of catching up to do, that certainly also applies to GraphCore. Both vendors seem to be on a similar trajectory.
Starting point is 00:05:44 Both are aiming to innovate on the hardware level, hoping to be able to challenge NVIDIA with a new and radically different approach, custom-built for AI workloads, while at the same time working on their software stacks and building their market presence. Fractionalizing AI hardware with a software solution by Run:AI. Last but not least, there are a few challengers who are less high-profile and have a different approach. Startup Run:AI recently exited stealth mode, with the announcement of $13 million in funding for what sounds like an unorthodox solution. Rather than offering another AI chip, Run:AI offers a software layer to speed up machine learning workload execution, on-premises and in the cloud. The company works closely with AWS and is a VMware technology partner. Its core value proposition is to act as a management
Starting point is 00:06:26 platform to bridge the gap between the different AI workloads and the various hardware chips and run a really efficient and fast AI computing platform. Run:AI recently unveiled its fractional GPU sharing for Kubernetes deep learning workloads. Aimed at lightweight AI tasks at scale such as inference, the fractional GPU system gives data science and AI engineering teams the ability to run multiple workloads simultaneously on a single GPU, thus lowering costs. Omri Geller, Run:AI co-founder and CEO, told ZDNet that NVIDIA's announcement about fractionalizing GPU, or running separate jobs within a single GPU, is revolutionary for GPU hardware.
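To make the idea of fractional GPU sharing more concrete, here is a minimal Python sketch. It is an illustration only, not Run:AI's or NVIDIA's actual scheduler: it assumes each workload declares the fraction of one GPU it needs, and it packs those requests with a simple first-fit-decreasing heuristic. The function name pack_workloads and the workload names are hypothetical.

```python
from typing import Dict, List


def pack_workloads(requests: Dict[str, float]) -> List[Dict[str, float]]:
    """Assign fractional GPU requests to GPUs using first-fit decreasing.

    `requests` maps a (hypothetical) workload name to the fraction of one
    GPU it needs, in the range (0, 1]. Returns one dict per GPU used.
    """
    gpus: List[Dict[str, float]] = []
    # Place the largest requests first (first-fit decreasing heuristic).
    for name, share in sorted(requests.items(), key=lambda kv: -kv[1]):
        if not 0 < share <= 1:
            raise ValueError(f"{name}: share must be in (0, 1], got {share}")
        for gpu in gpus:
            # Small tolerance guards against floating-point rounding.
            if sum(gpu.values()) + share <= 1 + 1e-9:
                gpu[name] = share
                break
        else:
            # No existing GPU has room, so allocate a new one.
            gpus.append({name: share})
    return gpus


if __name__ == "__main__":
    # Illustrative inference jobs, each needing only part of a GPU.
    jobs = {"chatbot": 0.5, "ranker": 0.25, "tagger": 0.25, "ocr": 0.5}
    placement = pack_workloads(jobs)
    print(f"{len(jobs)} workloads placed on {len(placement)} GPU(s)")
    for index, gpu in enumerate(placement):
        print(index, gpu)
```

In this made-up example, four lightweight workloads that would otherwise each occupy a whole GPU share two, which is the kind of cost saving the fractional approach is after. A real system also has to isolate memory and compute between the workloads, which this sketch does not model.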
Starting point is 00:07:03 Geller said Run:AI has seen many customers with this need. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
