The AI Daily Brief: Artificial Intelligence News and Analysis - The LLM for Coding Competition Heats Up!

Episode Date: August 9, 2023

On the Brief: Stability AI releases StableCode as Google opens the waitlist for its in-browser AI-powered coding environment; Google is in talks with major record labels around AI music; and AI discovers an asteroid. On our main episode: NVIDIA announces its state-of-the-art Grace Hopper superchip is coming next year, alongside a slew of other generative AI updates. NLW explores how the AI chip competitive space is developing, including the latest from AMD, startups like Tenstorrent, and efforts from the big tech giants. Today's Sponsor: Supermanage - AI for 1-on-1's - https://supermanage.ai/breakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:01 Today on the AI breakdown, we're looking at the latest from NVIDIA and the state of the AI chip wars. Before that on the brief, some new competition in the LLM for coding space. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our newsletter, our Discord, and our YouTube channel. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Today we kick off with one of the hottest parts of the AI space, which is, of course, the competition around LLMs for coding. Now, it is perhaps no surprise that developers are some of the earliest adopters of artificial
Starting point is 00:00:43 intelligence. I mean, it's the cohort that built the tools, so it's not surprising that they're using them. What might be surprising is just how prolific they already are. In June, GitHub released the results of a developer survey where they found that 92% of developers in the U.S. are already using AI coding tools both in and outside of work. On top of that, 70% of developers say AI coding tools will offer them an advantage at work, thinking that it will lead to better code quality, faster completion time, and an easier ability to resolve incidents, and four out of five developers also think that AI coding tools will
Starting point is 00:01:15 help make their team more collaborative. Given all that, it's not a surprise that there is a lot of competition around which LLM is best for coding. Two interesting announcements on that front. The first comes from Stability AI, who keep up their absolutely relentless release schedule with the announcement of StableCode. The company claims that StableCode is the first LLM product entirely focused on coding. The announcement post writes, StableCode offers a unique way for developers to become more efficient by using three different models to help in their coding. The base model was first trained on a diverse set of programming languages from The Stack dataset from BigCode, and then trained further with popular languages like Python, Go, Java, JavaScript, C, Markdown, and C++.
Starting point is 00:01:53 After the base model had been established, the instruction model was then tuned for specific use cases to help solve complex programming tasks. 120,000 code instruction response pairs in Alpaca format were trained on the base model to achieve this result. Still, the big sell for StableCode is what they claim is the biggest token context window for any open LLM coding model. The version of StableCode that is available as their long context window model has a context window of 16,000 tokens, which they say can handle two to four times more code than previously
Starting point is 00:02:23 released open models. Now, of course, as soon as this was released, developers started looking at how it compared in terms of coding benchmarks. Prince Canuma tweets, the instruction-tuned StableCode variant performs competitively with the Llama 2 70 billion params variant on the HumanEval benchmark. It also outperforms all other Llama 2 variants while significantly smaller at only 3 billion parameters. Interestingly, Stability AI CEO Emad Mostaque didn't love what evaluation models the company had presented. He said, HumanEval isn't the best benchmark for a code completion model, team are working on better ones for this and other code tasks as well as specific models for use cases. Still, can only beat the benchmark in front of you. Clem, the CEO of Hugging Face, however,
Starting point is 00:03:03 responded and suggested that Stability AI add the StableCode model to Hugging Face's multilingual code evals, which Emad dutifully agreed to. That said, Loubna Ben Allal, an ML engineer at Hugging Face, wrote, I added StableCode-Completion-Alpha-3B to the leaderboard. It's competitive with smaller size models on Python, Java, JavaScript, and C++, but has poor performance on the rest. Now, this is less than 24 hours old, so I think it's not exactly clear where this sits relative to other LLMs that are used for coding, but what is clear is just how big an area of competition this is. Another AI-related coding announcement yesterday came from Google. Lior at AlphaSignal AI writes, just in, Google announces a new browser-based code environment. It will bring the entire
Starting point is 00:03:44 full stack and app development workflow to the cloud. It also includes generative AI features based on PaLM 2, code generation, code completion, translating code between languages, code explanation. To learn more about this, you can go to idx.dev. Google describes the project by saying, These days, launching applications means navigating an endless sea of complexity. We felt this pain at Google so we started Project IDX, an experimental new initiative aimed at bringing your entire full-stack multi-platform app development workflow to the cloud. And while this isn't just about AI, that is definitely one of the big features they're pushing. They write, work quickly with AI assistance from Google built in, including code generation, code completion, translating
Starting point is 00:04:23 code between programming languages, explaining code, and more, all powered by Codey, a foundational AI model trained on code and built on PaLM 2. Project IDX is currently available by waitlist only. Next up on the brief, what has to be one of the most obvious and predictable about-faces in history. After throwing an absolute hissy fit around the Drake track that came out earlier this year, Heart on My Sleeve, which was, of course, an AI track, now it appears that labels including Universal Music and Warner are in conversations with Google about licensing artists' IP, including their voices, melodies, and more, to create an IP-approved platform for people to develop AI music. The reporting comes from the Financial Times. The way they characterize it is
Starting point is 00:05:03 discussions between Google and Universal Music are at an early stage and no product launch is imminent, but the goal is to develop a tool for fans to create these tracks legitimately and pay the owners of the copyrights for it, said people close to the situation. Artists would have the choice to opt in, the people said. Now, this reminds me quite a bit of an idea posted by Product Hunt founder Ryan Hoover in April. Back then, he wrote, free startup idea that will likely get you sued. AI Spotify. How it works. AI Spotify hosts AI-generated music of your favorite artists. Anyone can submit music, and the best songs surface based on listens and likes. Music with the most listens earns a pro rata share of subscription revenue reserved for the
Starting point is 00:05:38 original artists. For example, Drake could claim money generated from his likeness on the platform. Artists that do not want to participate can opt out entirely, banning any music that uses their likeness, or individually allow songs they endorse. Of course, there are many ethical and legal issues with this model, especially with labels, but maybe this is a germ of a shower thought that has potential. Now, this to me seems absolutely like the only path that makes any sort of sense. TLDR, you can't put the genie of AI music back in the lamp once it's released. There's simply going to be too many models, too many tools, and too much publicly available content to train those models on, to not see just a ton of AI tracks. In that context, to use the classic phrase,
Starting point is 00:06:16 If you can't beat them, join them. It appears that Warner and Universal are thinking in exactly the same way and are out to get their cut. A quick one from the world of policy and health AI. Reports say that Google has been testing its Med-PaLM 2 model in hospitals for some of this year, and at least one senator, Senator Mark Warner, is not happy about it. The Virginia Democrat sent a letter to the CEO of Google on Tuesday, basically warning them off a further rollout.
Starting point is 00:06:43 The letter said, while the technology has shown some promising results, there are also concerning reports of repeated inaccuracies and of Google's own senior researchers expressing reservations about the readiness of the technology. Warner basically accuses Google of sort of the same thing that Geoffrey Hinton has accused the entire big tech space of, which is racing to get AI models that aren't fully tested and aren't fully safe out because of the pressure of competition. So far, Google is standing its ground, saying that the rollout is extremely limited and that the company doesn't control any private data. Lastly, a nice positive one to end the AI Breakdown Brief today, a new AI model called HelioLinc3D has been used to spot asteroids that
Starting point is 00:07:21 could in the future pose a threat to Earth. The team from the University of Washington claims that the new AI model can identify asteroids with just half the observations that were needed before. Mario Jurić, the director of the DiRAC Institute at the University of Washington that developed the model, writes, the solar system is home to millions of rocky bodies ranging from small asteroids a few feet in diameter to dwarf planets the size of our moon. Most of them are distant, but a number orbit close to the Earth. These are known as near-Earth objects. The closest of these, whose trajectories take them within 5 million miles of the Earth's orbit, warrant special attention. The large, very nearby objects are known as potentially hazardous asteroids, PHAs. They're systematically
Starting point is 00:07:58 searched for and monitored to ensure they won't collide with Earth. Astronomers search for PHAs using specialized telescope systems. The NASA-funded ATLAS survey is a prime example. To find asteroids, ATLAS takes images of parts of the sky at least four times every night. A discovery is made when they notice a point of light moving unambiguously in a straight line over the image series. Now from there, Jurić explains the development of the Rubin Observatory, which is an observatory that will be based in the Chilean Andes and which could increase the discovery rate of PHAs. Jurić writes, to be even more efficient, it will visit spots on the sky just twice each night rather than the four times needed by present telescopes.
Starting point is 00:08:30 But with this novel observing cadence, we need a new type of discovery algorithm to reliably spot space rocks. And that is, of course, where the HelioLinc3D model came in. The big deal, Jurić says, is that it can identify asteroids with much fewer, up to 50%, and more dispersed observations than required by today's methods. Now, Jurić says that the asteroid they discovered, 2022 SF289, does come close to the Earth: its closest approach brings it within 140,000 miles of Earth's orbit, which is closer than the moon. Even so, it currently poses no danger of hitting Earth for the foreseeable future.
Starting point is 00:09:00 Still, the urgency is that while we've identified 2,350 PHAs, scientists believe there are more than 3,000 yet to be found. And so perhaps AI is not just for stealing our jobs, but for avoiding a real-life replay of Michael Bay's 1990s classic Armageddon. Thanks as always for listening to or watching the AI Breakdown Brief. I'll be back soon with the main AI Breakdown. Before we get into the main AI Breakdown, I want to tell you about today's sponsor, Supermanage. If you work in a professional setting, you probably have some version of a one-on-one meeting, either with the people that work for you or the people that you work with. Unfortunately, all too often those one-on-one meetings become glorified catch-up calls.
Starting point is 00:09:42 Don't you wish you could jump right to the stuff that really matters? That's where Supermanage comes in. Supermanage AI magically distills your team's public Slack channels into a real-time brief on any employee, any time. Catch up on contributions, work in progress, challenges they're facing, sentiment, everything you need to show up ready for a truly meaningful conversation. And it's completely free. Visit supermanage.ai/breakdown today to start making the most of your one-on-ones. And thanks again to Supermanage for sponsoring the AI Breakdown. Welcome back to the AI Breakdown. Today, we are talking
Starting point is 00:10:17 all about the state of the AI chip wars. And obviously, any conversation about said chip wars has to start with Nvidia. Yesterday, Nvidia's CEO, Jensen Huang, made a set of announcements in L.A. about new generative AI initiatives from Nvidia. And so let's start by looking at what was announced and how it relates to this larger AI chip battle. First up, for those of you who think the metaverse is dead and just a hype cycle gone by, it's alive enough that a company as big as Nvidia is still investing resources in it. Nvidia had some updates about its Omniverse platform yesterday. Specifically, they say they're advancing the development of OpenUSD. USD stands for Universal Scene Description, and Nvidia describes it as a 3D framework enabling interoperability between
Starting point is 00:11:00 software tools and data types for the building of virtual worlds. And they actually make an analogy to help understand the comparison point for these initiatives. Jensen said at the event, just as HTML ignited a major computing revolution of the 2D internet, OpenUSD will spark the era of collaborative 3D and industrial digitalization. NVIDIA is putting our full force behind the advancement and adoption of OpenUSD through our development of NVIDIA Omniverse and generative AI. The second big generative AI announcement from NVIDIA was their AI Workbench. The NVIDIA AI Workbench is, they say, a unified, easy-to-use toolkit that allows
Starting point is 00:11:34 developers to quickly create, test, and customize pre-trained generative AI models on a PC or workstation, then scale them to virtually any data center, public cloud, or Nvidia cloud. So effectively, this is another entrant into the enterprise AI workspace. If you're a regular listener of the show, you will have heard me talk about, for example, Amazon's Bedrock, which is making a bet that there won't be one winner-take-all model, but that enterprises who have a keen sense of needing to control their data end-to-end, and a real fear of losing that data for training purposes to some third party, are likely to opt for something that's much more customized
Starting point is 00:12:06 and built on either A, open source tools, or B, enterprise-grade AI platforms that are built by partners they already trust with their data. That's the play that Amazon is going for, and it seems like that might be something that Nvidia is trying to do as well. Manuvir Das, the vice president of Enterprise Computing at Nvidia, said, enterprises around the world are racing to find the right infrastructure and build generative AI models and applications. Nvidia AI Workbench provides a simplified path for cross-organizational teams to create the AI-based applications that are increasingly becoming essential in modern business.
Starting point is 00:12:37 Giving more detail about what it actually does, they write, accessed through a simplified interface running on a local system, Nvidia AI Workbench allows developers to customize models from popular repositories like Hugging Face, GitHub, and Nvidia NGC using custom data. Nvidia's Dr. Jim Fan, one of the mainstays of AI Twitter, tweeted yesterday, happy to share that Nvidia is partnering with Hugging Face, we love open source software community. Nvidia DGX Cloud will be accessible with Hugging Face to create and customize generative AI models for the enterprise. Yes, we do have a cloud. The official announcement reads,
Starting point is 00:13:09 integration of Nvidia DGX Cloud and Hugging Face platform to speed LLM training and tuning simplifies customizing models for nearly every industry. CEO Jensen Huang again said, Researchers and developers are at the heart of generative AI that is transforming every industry. Hugging Face and Nvidia are connecting the world's largest AI community with Nvidia's AI computing platform in the world's leading clouds. Now, as part of this, Hugging Face is spinning up a new service they call Training Cluster as a Service, which will help simplify the process of creating new custom models for the enterprise.
Starting point is 00:13:37 But maybe the biggest announcement, at least when it came to what Wall Street was thinking about, was around the forthcoming next-generation GH200 Grace Hopper Superchip. Basically, we had gotten information about the chip in the past, but at yesterday's event, we learned more about the platform built around it. Nvidia says that this platform will deliver 3.5x more memory capacity and 3x more bandwidth than the current generation offering. The other big thing that we got was information about availability. Nvidia says leading system manufacturers are expected to deliver systems based on the platform in Q2 of calendar year 2024. Now, one of the big use cases that NVIDIA is looking at is of course the next generation of data centers. In fact, Huang said in the
Starting point is 00:14:18 address, this processor is designed for the scale-out of the world's data centers. Now, Nvidia is by far the dominant player in the AI chip space. Estimates I've seen suggest that they have between 80 and 83% of the AI chip market. Their biggest competitor is, of course, AMD. And just a couple months ago, AMD announced its own new chip, the MI300X, which is meant to challenge Nvidia's place at the top of the AI heap. Now, not surprisingly, AMD has said that this new chip and the architecture surrounding it was specifically designed with large language models and other AI models in mind.
Starting point is 00:14:51 When AMD premiered the chip, they pointed out that it can use up to 192 gigabytes of memory compared to NVIDIA H100's 120 gigabytes of memory. The argument is that this would improve inference and that by adding memory on AMD chips, developers might not need as many GPUs in total. Importantly, AMD also added a software package around its AI chips, which is something that NVIDIA offers and has historically given them a lead. Now, outside of these chip giants, there are also a number of startups that are trying to elbow their way into the space.
Starting point is 00:15:20 One that made news recently for a nine-figure investment from Hyundai and Samsung was Tenstorrent. Part of what makes Tenstorrent notable is the extremely high-profile team that they brought on, including CEO Jim Keller, who has previously worked on AI chips at places like Apple. And one of the interesting things about this most recent funding announcement is that the participation of Hyundai was more than just financial. Hyundai Motor put in $30 million and Kia put in $20 million, and part of their plans are to partner with Tenstorrent to jointly develop chips that can be built into future vehicles. Other chip startups that have recently raised money include SiMa.ai, Ayar Labs, and Ethernovia.
Starting point is 00:15:56 However, the other big chip player might be the big tech companies themselves. In April, reports came out that Microsoft had been working on its own AI chips for years. From The Verge, Microsoft is reportedly working on its own AI chips that can be used to train large language models and avoid a costly reliance on Nvidia. The Information reports that Microsoft has been developing the chips in secret since 2019, and some Microsoft and OpenAI employees already have access to them to test how well they perform for the latest large language models like GPT-4. The project is apparently codenamed Athena. Google is also apparently making a push for its own
Starting point is 00:16:29 chips. Just a couple weeks ago, the Wall Street Journal published a piece called, In Race for AI Chips, Google DeepMind Uses AI to Design Specialized Semiconductors. Google has been working on their Tensor Processing Unit chips since at least 2016, and the new report had researchers from Google DeepMind claiming that they had, using AI, discovered a more efficient and automated way of designing these chips. Amazon has also discussed making its own AI chips. In a conversation with CNBC, CEO Andy Jassy said, I think of generative AI as having three macro layers and they are all really big and important. The bottom layer is the compute, all the machine learning training and inference. What matters
Starting point is 00:17:04 in the compute is the chip in there. There has really been one chip provider. Supply is more scarce and it's expensive. It's why we've invested over the last few years in our own customized training chips and inference chips, which will have much better price performance than anywhere else. We are quite optimistic that a lot of the machine learning training and inference will be done on AWS chips and compute. So TLDR, Nvidia is the 800-pound gorilla in this space, but everyone is coming after them. There are established competitors like AMD, novel startups trying to use partnerships to get a leg up like Tenstorrent, and of course all the biggies like Amazon, Google, Microsoft, and OpenAI, who don't want to be at the behest of any other company.
Starting point is 00:17:43 And frankly, right now, it appears that there may be enough market to go around for everyone. Just three days ago, CNN wrote, The crushing demand for AI has also revealed the limits of the global supply chain for powerful chips used to develop and field AI models. CNN points to Microsoft's annual report, which identifies for the first time the availability of GPUs as a risk factor for investors. During his testimony before the Senate in May, OpenAI CEO Sam Altman joked about how ChatGPT was struggling to keep up with all of the demand for its service. At that time, Altman said, we're so short on GPUs, the less people that use the tool, the better. Now, part of the issue isn't just that the demand for AI increased so fast, but that the makers of GPUs also have bottlenecks around a number of the key supplies for actually manufacturing more chips. And finally, there is the political dimension of all of this.
Starting point is 00:18:30 Three weeks ago, representatives from Nvidia, Qualcomm, and Intel were at the White House to discuss U.S.-China policy when it came to the export of AI chips. Reuters writes, the chip industry is keen to protect its profits in China as the Biden administration considers another round of restrictions on chip exports to China. Last year, China accounted for $180 billion in semiconductor purchases, more than a third of the worldwide total of $555.9 billion, and the largest single market. Now, in the wake of the existing export restrictions, Nvidia started selling a tweaked and depowered AI chip specifically designed for the Chinese market, but now some politicians in Washington are saying even that should be restricted. As wonky and technical as it might seem at the outset, the story of the chip space, from an industry perspective, a competitive perspective, and a political perspective, is going to be absolutely integral to how AI develops and evolves. So hopefully this gave you a better sense about where the chips lie right now, pun absolutely intended,
Starting point is 00:19:25 and I, of course, will keep you posted as more developments happen. If you're enjoying the AI breakdown, please click that like or subscribe button. And if you're listening to the podcast, I would so appreciate it if you would take the time to leave a five-star rating. Those ratings go a long way to helping people discover the show, and I appreciate you for taking the time to do so. Appreciate you guys, as always.
Starting point is 00:19:46 Until next time, peace.
