The AI Daily Brief: Artificial Intelligence News and Analysis - The LLM for Coding Competition Heats Up!
Episode Date: August 9, 2023
On the Brief: Stability AI releases StableCode as Google opens the waitlist for its in-browser AI-powered coding environment; Google is in talks with major record labels around AI music and AI discovers an asteroid. On our main episode: NVIDIA announces its state of the art Grace Hopper superchip is coming next year, alongside a slew of other generative AI updates. NLW explores how the AI chip competitive space is developing, including the latest from AMD, startups like Tenstorrent, and efforts from the big tech giants. Today's Sponsor: Supermanage - AI for 1-on-1's - https://supermanage.ai/breakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at the latest from NVIDIA and the state of the AI chip wars.
Before that on the brief, some new competition in the LLM for coding space.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to Breakdown.network for more information about our newsletter, our Discord, and our YouTube channel.
Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes.
Today we kick off with one of the hottest parts of the AI space, which is, of course, the competition
around LLMs for coding.
Now, it is perhaps no surprise that developers are some of the earliest adopters of artificial
intelligence.
I mean, it's the cohort that built the tools, so it's not surprising that they're using them.
What might be surprising is just how prolific they already are.
In June, GitHub released the results of a developer survey where they found that 92% of
developers in the U.S. are already using AI coding tools both in and outside of work.
On top of that, 70% of developers say AI coding tools will offer them an advantage at work,
thinking that it will lead to better code quality, faster completion time, and an easier
ability to resolve incidents, and four out of five developers also think that AI coding tools will
help make their team more collaborative. Given all that, it's not a surprise that there is a lot
of competition around which LLM is best for coding. Two interesting announcements on that front.
The first comes from Stability AI, who keep up their absolutely relentless release schedule with the announcement of StableCode.
The company claims that StableCode is the first LLM product entirely focused on coding.
The announcement post writes,
StableCode offers a unique way for developers to become more efficient by using three different models to help in their coding.
The base model was first trained on a diverse set of programming languages from The Stack dataset from BigCode,
and then trained further with popular languages like Python, Go, Java, JavaScript, C, Markdown, and C++.
After the base model had been established, the instruction model was then tuned for specific
use cases to help solve complex programming tasks.
120,000 code instruction-response pairs in Alpaca format were trained on the base model to achieve
this result.
Still, the big sell for StableCode is what they claim is the biggest token context window
for any open LLM coding model.
The version of StableCode that is available as their long context window model has a context
window of 16,000 tokens, which they say can handle two to four times more code than previously
released open models. Now, of course, as soon as this was released, developers started looking at how
it compared in terms of coding benchmarks. Prince Canuma tweets, the instruction-tuned StableCode variant
performs competitively with the Llama 2 70 billion params variant on the HumanEval benchmark.
It also outperforms all other Llama 2 variants while significantly smaller at only 3 billion
parameters. Interestingly, Stability AI CEO Emad Mostaque didn't love what evaluation models the company
had presented. He said, HumanEval isn't the best benchmark for a code completion model,
team are working on better ones for this and other code tasks as well as specific models for
use cases. Still, can only beat the benchmark in front of you. Clem, the CEO of Hugging Face, however,
responded and suggested that Stability AI add the StableCode model to Hugging Face's multilingual
code evals, which Emad dutifully agreed to. That said, Loubna Ben Allal, an ML engineer at Hugging Face,
wrote, I added StableCode Completion Alpha 3B to the leaderboard. It's competitive with
smaller size models on Python, Java, JavaScript, and C++, but has poor performance on the rest.
Now, this is less than 24 hours old, so I think it's not exactly clear where this sits relative
to other LLMs that are used for coding, but what is clear is just how big an area of competition
this is. Another AI-related coding announcement yesterday came from Google. Lior at Alpha Signal AI
writes, just in, Google announces a new browser-based code environment. It will bring the entire
full stack and app development workflow to the cloud. It also includes generative AI features based
on PaLM 2: code generation, code completion, translating code between languages, code explanation.
To learn more about this, you can go to IDX.dev. Google describes the project by saying,
These days, launching applications means navigating an endless sea of complexity. We felt this pain at
Google, so we started Project IDX, an experimental new initiative aimed at bringing your entire full-stack
multi-platform app development workflow to the cloud. And while this isn't just about AI, that is definitely
one of the big features they're pushing. They write, work quickly in a
with AI assistance from Google built-in, including code generation, code completion, translating
code between programming languages, explaining code, and more, all powered by Codey, a foundational
AI model trained on code and built on PaLM 2. Project IDX is currently available by waitlist only.
Next up on the brief, what has to be one of the most obvious and predictable about-faces in history.
After throwing an absolute hissy fit around the Drake track that came out earlier this year,
Heart on My Sleeve, which was, of course, an AI track, now it appears that labels including
Universal Music and Warner are in conversations with Google about licensing artists' IP,
including their voices, melodies, and more to create an IP-approved platform for people to develop
AI music. The reporting comes from the Financial Times. The way they characterize it is
discussions between Google and Universal Music are at an early stage and no product launch is imminent,
but the goal is to develop a tool for fans to create these tracks legitimately and pay the
owners of the copyrights for it, said people close to the situation. Artists would have the choice to opt in
the people said. Now, this reminds me quite a bit of an idea posted by Product Hunt founder
Ryan Hoover in April. Back then, he wrote, free startup idea that will likely get you sued.
AI Spotify. How it works. AI Spotify hosts AI generated music of your favorite artists.
Anyone can submit music and the best songs surface based on listens and likes.
Music with the most listens earns a pro rata share of subscription revenue reserved for the
original artists. For example, Drake could claim money generated from his likeness on the
platform. Artists that do not want to participate can opt out entirely banning any music that
uses their likeness, or individually allow songs they endorse. Of course, there are many ethical and
legal issues with this model, especially with labels, but maybe this is a germ of a shower thought that
has potential. Now, this to me seems absolutely like the only path that makes any sort of sense.
TLDR, you can't put the genie of AI music back in the lamp once it's released. There's simply
going to be too many models, too many tools, and too much publicly available content to train those
models on, to not see just a ton of AI tracks. In that context, to use the classic phrase,
If you can't beat them, join them.
It appears that Warner and Universal are thinking in exactly the same way and are out to get
their cut.
A quick one from the world of policy and health AI.
Reports have been that Google has been testing its Med-PaLM 2 model in hospitals for
some of this year, and at least one senator, Senator Mark Warner, is not happy about it.
The Virginia Democrat sent a letter to the CEO of Google on Tuesday, basically warning them
off a further rollout.
The letter said, while the technology has shown
some promising results, there are also concerning reports of repeated inaccuracies and of Google's
own senior researchers expressing reservations about the readiness of the technology. Warner basically
accuses Google of sort of the same thing that Geoffrey Hinton has accused the entire big tech space of,
which is racing to get AI models that aren't fully tested and aren't fully safe out because of the
pressure of competition. So far, Google is standing its ground, saying that the rollout is extremely
limited and that the company doesn't control any private data. Lastly, a nice positive one to end the
AI Breakdown Brief today: a new AI model called HelioLinc3D has been used to spot asteroids that
could in the future pose a threat to Earth. The team from the University of Washington claims that
the new AI model can identify asteroids with just half the observations that were needed before.
Mario Jurić, the director of the DiRAC Institute at the University of Washington that developed the model,
writes, the solar system is home to millions of rocky bodies ranging from small asteroids
a few feet in diameter to dwarf planets the size of our moon. Most of them are distant, but a number
orbit close to the Earth. These are known as near-Earth objects. The closest of these, whose trajectories
take them within 5 million miles of the Earth's orbit, warrant special attention. The large, very
nearby objects are known as potentially hazardous asteroids, PHAs. They're systematically
searched for and monitored to ensure they won't collide with Earth. Astronomers search for PHAs
using specialized telescope systems. The NASA-funded ATLAS survey is a prime example. To find asteroids,
ATLAS takes images of parts of the sky at least four times every night. A discovery is made when they
notice a point of light moving unambiguously in a straight line over the image series.
Now from there, Jurić explains the development of the Rubin Observatory, which is an observatory
that will be based in the Chilean Andes and which could increase the discovery rate of PHAs.
Jurić writes, to be even more efficient, it will visit spots on the sky just twice each night rather
than the four times needed by present telescopes.
But with this novel observing cadence, we need a new type of discovery algorithm to reliably spot
space rocks.
And that is, of course, where the HelioLinc3D model came in.
The big deal, Jurić says, is that it can identify asteroids with
far fewer (up to 50% fewer) and more dispersed observations than required by today's methods.
Now, Jurić says that the asteroid they discovered, 2022 SF289, does come close to the Earth:
its closest approach brings it within 140,000 miles of Earth's orbit, which is closer than the moon.
But it currently poses no danger of hitting Earth for the foreseeable future.
Still, the urgency is that while we've identified 2,350 PHAs, scientists believe there are
more than 3,000 yet to be found.
And so perhaps AI is not just for stealing our jobs, but for avoiding a real-life replay of Michael Bay's 1990s classic Armageddon.
Thanks as always for listening to or watching the AI Breakdown Brief.
I'll be back soon with the main AI Breakdown.
Before we get into the main AI Breakdown, I want to tell you about today's sponsor, Supermanage.
If you work in a professional setting, you probably have some version of a one-on-one meeting, either with the people that work for you or the people that you work with.
Unfortunately, all too often those one-on-one meetings become glorified catch-up calls.
Don't you wish you could jump right to the stuff that really matters?
That's where Supermanage comes in.
Supermanage AI magically distills your team's public Slack channels into a real-time brief on any employee, any time.
Catch up on contributions, work in progress, challenges they're facing, sentiment, everything you need to show up ready for a truly meaningful conversation.
And it's completely free.
Visit supermanage.ai/breakdown today to start making the most of your one-on-ones. And thanks again to
Supermanage for sponsoring the AI Breakdown.
Welcome back to the AI Breakdown. Today, we are talking
all about the state of the AI chip wars. And obviously, any conversation about said chip wars
has to start with Nvidia. Yesterday, Nvidia's CEO, Jensen Huang, made a set of announcements in
L.A. about new generative AI initiatives from Nvidia. And so let's start by looking at what was
announced and how it relates to this larger AI chip battle. First up, for those of you who think
the metaverse is dead and just a hype cycle gone by, it's alive enough that a company as big as
Nvidia is still investing resources in it. Nvidia had some updates about its Omniverse platform yesterday.
Specifically, they say they're advancing the development of OpenUSD. USD stands for Universal
Scene Description, and Nvidia describes it as a 3D framework enabling interoperability between
software tools and data types for the building of virtual worlds. And they
actually make an analogy to help understand the comparison point for these initiatives.
Jensen said at the event, just as HTML ignited a major computing revolution of the 2D internet,
OpenUSD will spark the era of collaborative 3D and industrial digitalization.
NVIDIA is putting our full force behind the advancement and adoption of OpenUSD through our
development of NVIDIA Omniverse and generative AI.
The second big generative AI announcement from NVIDIA was their AI Workbench.
The NVIDIA AI Workbench is, they say, a unified, easy-to-use toolkit that allows
developers to quickly create, test, and customize pre-trained generative AI models on a PC
or workstation, then scale them to virtually any data center, public cloud, or Nvidia cloud.
So effectively, this is another entrant into the Enterprise AI workspace.
If you're a regular listener of the show, you will have heard me talk about, for example,
Amazon's Bedrock, which is making a bet that there won't be one winner-take-all model,
but that enterprises who have a keen sense of needing to control their data end-to-end,
and a real fear of losing that data for training purposes to some third party
are likely to opt for something that's much more customized
and built on either A, open source tools,
or B, Enterprise-grade AI platforms that are built by partners they already trust with their data.
That's the play that Amazon is going for,
and it seems like that might be something that Nvidia is trying to do as well.
Manuvir Das, the vice president of Enterprise Computing at Nvidia, said,
enterprises around the world are racing to find the right infrastructure
and build generative AI models and applications.
Nvidia AI Workbench provides a simplified path for cross-organizational teams to create the AI-based applications that are increasingly becoming essential in modern business.
Giving more detail about what it actually does, they write, accessed through a simplified interface running on a local system,
Nvidia AI Workbench allows developers to customize models from popular repositories like Hugging Face, GitHub, and Nvidia NGC using custom data.
Nvidia's Dr. Jim Fan, one of the mainstays of AI Twitter, tweeted yesterday, happy to share that Nvidia is partnering with Hugging Face,
we love open source software community.
Nvidia DGX Cloud will be accessible with Hugging Face to create and customize generative
AI models for the enterprise.
Yes, we do have a cloud.
The official announcement reads,
integration of Nvidia DGX Cloud and Hugging Face platform to speed LLM training and tuning
simplifies customizing models for nearly every industry.
CEO Jensen Huang again said,
Researchers and developers are at the heart of generative AI that is transforming every industry.
Hugging Face and Nvidia are connecting the world's largest AI community
with Nvidia's AI computing platform in the world's leading
clouds. Now, as part of this, Hugging Face is spinning up a new service they call Training Cluster as a
Service, which will help simplify the process of creating new custom models for the enterprise.
But maybe the biggest announcement, at least when it came to what Wall Street was thinking
about, was around the forthcoming next-generation GH200 Grace Hopper Superchip. Basically,
we had gotten information about the chip in the past, but at yesterday's event, we learned more
about the platform built around it. Nvidia says that this platform will deliver 3.5x more memory
capacity and 3x more bandwidth than the current generation offering. The other big thing that we got
was information about availability. Nvidia says leading system manufacturers are expected to deliver
systems based on the platform in Q2 of calendar year 2024. Now, one of the big use cases that
NVIDIA is looking at is of course the next generation of data centers. In fact, Huang said in the
address, this processor is designed for the scaleout of the world's data centers. Now,
Nvidia is by far the dominant player in the AI chip space.
Estimates I've seen suggest that they have between 80 and 83% of the AI chip market.
Their biggest competitor is, of course, AMD.
And just a couple months ago, AMD announced its own new chip, the MI300X,
which is meant to challenge Nvidia's place at the top of the AI heap.
Now, not surprisingly, AMD has said that this new chip and the architecture surrounding it
was specifically designed with large language models and other AI models in mind.
When AMD premiered the chip, they pointed out that it can use up to 192 gigabytes of memory compared to
NVIDIA H100's 120 gigabytes of memory.
The argument is that this would improve inference and that by adding memory on AMD chips, developers
might not need as many GPUs in total.
Importantly, AMD also added a software package around its AI chips, which is something
that NVIDIA offers and has historically given them a lead.
Now, outside of these chip giants, there are also a number of startups that are trying to
elbow their way into the space.
One that made news recently for a nine-figure investment from Hyundai and Samsung was Tenstorrent.
Part of what makes Tenstorrent notable is the extremely high-profile team that they brought on,
including CEO Jim Keller, who has previously worked on AI chips at places like Apple.
And one of the interesting things about this most recent funding announcement is that the participation
of Hyundai was more than just financial.
Hyundai Motor put in $30 million and Kia put in $20 million,
and part of their plan is to partner with Tenstorrent to jointly develop chips that can be built into future vehicles.
Other chip startups that have recently raised money include SiMa.ai, Ayar Labs, and Ethernovia.
However, the other big chip player might be the big tech companies themselves.
In April, reports came out that Microsoft had been working on its own AI chips for years.
From The Verge, Microsoft is reportedly working on its own AI chips that can be used to train
large language models and avoid a costly reliance on Nvidia.
The Information reports that Microsoft has been developing the chips in secret since 2019,
and some Microsoft and OpenAI employees already have
access to them to test how well they perform for the latest large language models like GPT-4.
The project is apparently codenamed Athena. Google is also apparently making a push for its own
chips. Just a couple weeks ago, the Wall Street Journal published a piece called,
In Race for AI Chips, Google DeepMind uses AI to design specialized semiconductors.
Google has been working on their tensor processing unit chips since at least 2016,
and the new report had researchers from Google DeepMind claiming that they had, using AI,
discovered a more efficient and automated way of designing these chips.
Amazon has also discussed making its own AI chips. In a conversation with CNBC, Amazon CEO Andy Jassy said,
I think of generative AI as having three macro layers and they are all really big and important.
The bottom layer is the compute, all the machine learning training and inference. What matters
in the compute is the chip in there. There has really been one chip provider. Supply is more scarce
and it's expensive. It's why we've invested over the last few years in our own customized training
chips and inference chips, which will have much better price performance than anywhere else.
We are quite optimistic that a lot of the machine learning training and inference will
be done on AWS chips and compute. So TLDR, Nvidia is the 800-pound gorilla in this space,
but everyone is coming after them. There are established competitors like AMD, novel startups
trying to use partnerships to get a leg up like Tenstorrent, and of course all the biggies like
Amazon, Google, Microsoft, and OpenAI, who don't want to be at the mercy of any other company.
And frankly, right now, it appears that there may be enough market to go around for everyone.
Just three days ago, CNN wrote,
The crushing demand for AI has also revealed the limits of the global supply chain for powerful chips used to develop and field AI models.
CNN points to Microsoft's annual report, which identifies for the first time the availability of GPUs as a risk factor for investors.
During his testimony before the Senate in May, OpenAI CEO Sam Altman joked about how ChatGPT was struggling to keep up with all of the demand for its service.
At that time, Altman said, we're so short on GPUs, the less people that use the tool, the better.
Now, part of the issue isn't just that the demand for AI increased so fast, but that the makers of GPUs also have bottlenecks around a number of the key supplies for actually manufacturing more chips.
And finally, there is the political dimension of all of this.
Three weeks ago, representatives from Nvidia, Qualcomm, and Intel were at the White House to discuss U.S.-China policy when it came to the export of AI chips.
Reuters writes, the chip industry is keen to protect its profits in China as the Biden administration considers another round of restrictions on chip exports to China.
Last year, China accounted for $180 billion in semiconductor purchases, more than a third of the worldwide total of $555.9 billion, and the largest single market.
Now, in the wake of the existing export restrictions, Nvidia started selling a tweaked and depowered AI chip specifically designed for the Chinese market, but now some politicians in Washington are saying even that should be restricted.
As wonky and technical as it might seem at the outset, the story of the chip space, from an industry perspective, a competitive perspective, and a political perspective,
is going to be absolutely integral to how AI develops and evolves.
So hopefully this gave you a better sense about where the chips lie right now,
pun absolutely intended,
and I, of course, will keep you posted as more developments happen.
If you're enjoying the AI Breakdown,
please click that like or subscribe button.
And if you're listening to the podcast,
I would so appreciate it if you would take the time to leave a five-star rating.
Those ratings go a long way to helping people discover the show,
and I appreciate you for taking the time to do so.
Appreciate you guys, as always.
Until next time, peace.
