Tech Brew Ride Home - Thu. 03/28 – Claude 3 Opus Surpasses GPT-4 For The First Time
Episode Date: March 28, 2024
There’s a new king of the AI hill as Anthropic bests OpenAI for the first time. Amazon invests more in Anthropic and is investing a TON more in datacenters. Is that GPT sort of App Store not exactly... catching fire? A big acquisition in gaming. And the tiny Caribbean island nation that is one of the biggest winners of the AI moment so far.
Links:
“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time (Ars Technica)
Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet (CNBC)
Amazon Bets $150 Billion on Data Centers Required for AI Boom (Bloomberg)
OpenAI’s app store draws investors and students seeking artificial aids (FT)
Oregon’s governor signs right-to-repair law that bans ‘parts pairing’ (The Verge)
Take-Two Buys Gearbox From Embracer, Confirms Development on New Borderlands Game (IGN)
The A.I. Boom Makes Millions for an Unlikely Industry Player: Anguilla (NYTimes)
Transcript
On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco.
Hey, who did this to you?
What happened next turned the story into a political firestorm.
Reports have identified the victim as Bob Lee, the founder of Cash App.
From Bloomberg Podcasts, this is Foundering, the Killing of Bob Lee, beginning April 16.
Welcome to the Techmeme Ride Home for Thursday, March 28th,
2024. I'm Brian McCullough. Today, there's a new king of the AI hill as Anthropic bests
OpenAI for the first time. Amazon invests more in Anthropic and is investing a ton more in data
centers. Is that GPT sort-of app store not exactly catching fire? A big acquisition in gaming.
And the tiny Caribbean nation that is one of the biggest winners of the AI moment so far.
Here's what you missed today in the world of tech. Something, something, these are the headlines
that we didn't used to do, but which matter in this AI era. There's a new king of the hill
in the AI model wars. Anthropic's Claude 3 Opus has surpassed OpenAI's GPT-4 for the first time
on Chatbot Arena, a crowdsourced leaderboard used by AI researchers for LLM evaluations.
Quoting Ars Technica, The king is dead, tweeted software developer Nick Dobos in a post comparing
GPT-4 Turbo and Claude 3 Opus that has been making the rounds on social media. RIP GPT-4, end quote.
Since GPT-4 was included in Chatbot Arena around May 10th, 2023 (the leaderboard launched May 3rd
of that year), variations of GPT-4 have consistently been on the top of the chart until now,
so its defeat in the Arena is a notable moment in the relatively short history of AI language models.
One of Anthropic's smaller models, Haiku, has also been turning heads with its performance on the
leaderboard. For the first time, the best available models, Opus for advanced tasks, Haiku for cost
and efficiency, are from a vendor that isn't OpenAI, independent AI researcher Simon Willison told
Ars Technica. That's reassuring. We all benefit from a diversity of top vendors in this space.
But GPT-4 is over a year old at this point, and it took that year for anyone else to catch up,
end quote.
Chatbot Arena is run by Large Model Systems Organization (LMSYS Org), a research organization
dedicated to open models that operates as a collaboration between students and faculty at
University of California, Berkeley, UC San Diego, and Carnegie Mellon University. We profiled how the site
works in December, but in brief, Chatbot Arena presents a user visiting the website with a chat
input box and two windows showing output from two unlabeled LLMs. The user's task is to rate which
output is better based on any criteria the user deems most fit. Through thousands of these
subjective comparisons, Chatbot Arena calculates the, quote, best models in aggregate and
populates the leaderboard, updating it over time.
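Editor's sketch: the episode does not say how those pairwise votes become a ranking. One common way to do it is an Elo-style update, shown here purely as an assumption and not as the site's actual method. The function name `elo_leaderboard`, the model names, the starting rating of 1000, and the K-factor of 32 are all hypothetical choices for illustration.

```python
from collections import defaultdict


def elo_leaderboard(votes, k=32.0, start=1000.0):
    """Rank models from (model_a, model_b, winner) votes.

    winner is "a", "b", or "tie". The Elo-style rule is an assumed
    aggregation method, not one confirmed by the episode.
    """
    ratings = defaultdict(lambda: start)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Probability that model_a wins, given the current ratings.
        expected_a = 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))
        # Move both ratings by the surprise in the observed result.
        delta = k * (outcome[winner] - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    # Highest rating first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes between three unnamed models.
votes = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "tie"),
    ("model-z", "model-x", "b"),
]
board = elo_leaderboard(votes)
for name, rating in board:
    print(f"{name}: {rating:.1f}")
```

Each vote shifts rating points from the loser to the winner, and an upset moves more points than an expected result, so the order settles as votes accumulate.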
Chatbot Arena is important to researchers because they often find frustration in trying to
measure the performance of AI chatbots whose wildly varying outputs are difficult to quantify.
In fact, we wrote about how notoriously difficult it is to objectively benchmark LLMs in our
news piece about the launch of Claude 3.
For that story, Willison emphasized the important role of vibes, or subjective feelings,
in determining the quality of an LLM.
Yet another case of vibes as a key concept in modern AI, he said.
The vibes sentiment is common in the AI space,
where numerical benchmarks that measure knowledge or test-taking ability
are frequently cherry-picked by vendors to make their results look more favorable.
Just had a long coding session with Claude 3 Opus,
and man does it absolutely crush GPT-4.
I don't think standard benchmarks do this model justice,
tweeted AI software developer Anton Bacaj on March 19.
Claude's rise may give OpenAI pause, but as Willison mentioned, the GPT-4 family itself,
although updated several times, is over a year old.
Currently, the Arena lists four different versions of GPT-4, which represent incremental updates of the
LLM that get frozen in time because each has a unique output style, and some developers using them
with OpenAI's API need consistency, so their apps built on top of GPT-4's outputs don't break, end quote.
Speaking of, Amazon has invested another $2.75 billion in Anthropic, the second tranche of its planned $4 billion investment in the AI startup after investing $1.25 billion in September 2023.
Quoting CNBC, Amazon will maintain a minority stake in the company and won't have an Anthropic board seat, the company said.
The deal was struck at the AI startup's last valuation, which was $18.4 billion, according to a source.
Over the past year, Anthropic closed five different funding deals worth about $7.3 billion.
The Amazon move is the latest in a spending blitz among cloud providers to stay ahead in the AI race.
And it's the second update in a week to Anthropic's capital structure.
Late Friday, bankruptcy filings showed crypto exchange FTX struck a deal with a group of buyers to sell the majority of its stake in Anthropic,
confirming a CNBC report from last week, end quote.
This is not unrelated to that.
It's not exactly Sam Altman suggesting $7 trillion would be required to make next-generation AI chips,
but this is still an eye-popping number and it's due to AI. According to an analysis,
in the past two years alone, Amazon committed to spending $148 billion over 15 years on data
centers, including in Mississippi, Saudi Arabia, and Malaysia, quoting Bloomberg. The spending spree
is a show of force as the company looks to maintain its grip on the cloud services market,
where it holds about twice the share of number two player Microsoft. Sales growth at Amazon Web Services
slowed to a record low last year as business customers cut costs and delayed modernization projects.
Now spending is starting to pick up again, and Amazon is keen to secure land and electricity
for its power-hungry facilities. We're expanding capacity quite significantly, said Kevin Miller,
an AWS vice president who oversees the company's data centers. I think that just gives us the
ability to get closer to customers, end quote. Amazon's
planned outlay on server farms dwarfs the public commitments from Microsoft and Alphabet's Google,
though neither company discloses data center-related spending as consistently as Amazon. Microsoft and
Google spokespeople declined to provide comparable figures and added that each company
likely includes different costs in their estimates. Amid broader cost cutting at Amazon,
AWS's capital expenditures on data centers shrank 2% in 2023 for the first time, even as Microsoft
boosted its own spending by more than 50%, according to the research firm Dell'Oro Group.
But Amazon's chief financial officer said last month that capital expenditures would increase this year to support AWS growth, including AI-related projects.
Much of Amazon's data center expansion is geared toward meeting a rise in demand for corporate services like file storage and databases,
but the facilities, along with advanced and expensive chips, will also provide the massive computing power required for an expected boom in generative artificial intelligence.
Microsoft, a close partner of OpenAI, and Google are widely seen as leaders in commercializing
software capable of generating text and insights, but Amazon is building its own tools to rival
OpenAI's ChatGPT and has partnered with other companies to power AI services with its servers.
As a result, Amazon expects to reap tens of billions of dollars in AI-related revenue, end quote.
Meanwhile, over to OpenAI. According to Similarweb, GPTs created by subscribers on OpenAI's GPT Store
accounted for just one and a half percent of desktop visits to ChatGPT's site in February,
suggesting limited appeal for this attempt to go the platform-play route. Quoting the Financial Times,
In some ways, OpenAI is following a very standard how-to-build-a-platform template
that's so predictable that it might have used ChatGPT to write it, said Benedict Evans,
an independent technology analyst.
So we have a developer conference, an API, and an App Store, but it's not clear to me
whether this really has traction, end quote.
The Microsoft-backed startup has allowed paying users to create custom versions of ChatGPT since November,
with other subscribers then able to access these so-called GPTs through an online store.
According to new data from analytics group Similarweb,
some of the most popular GPTs serve educational purposes,
with the second most used app being Consensus,
a tool to search and summarize academic papers.
Other apps have surged in usage this year,
including design tools that can instantly generate images,
translate between languages,
or help with job applications by reviewing CVs and cover letters.
But a Financial Times analysis also found many popular GPTs could be
in breach of OpenAI's usage policies, which have rules against chatbots that provide financial, legal, or medical
advice that has not been reviewed by a qualified professional.
Five of the most viewed GPTs are described as being able to produce content that can bypass
detection tools employed by schools and universities to determine if students had produced
essays and answers using AI. These tools were viewed at least three million times in total,
despite OpenAI barring apps that engage in or promote academic dishonesty. Another app called
Finance Wizard, which has been used more than 200,000 times, claims to reveal future stock
movements. Its creator told the FT, the app makes predictions based on historical data and contains
disclaimers warning against using it as financial advice, end quote.
Oregon has become the first state in the nation with a right to repair law that bans manufacturers
from using parts pairing to dictate what replacement components can be used. Quoting The Verge,
Oregon Governor Tina Kotek has now signed one of the strongest U.S. right-to-repair bills into law
after it passed the state legislature several weeks ago by an almost three-to-one margin.
Oregon's SB 1596 will take effect next year, and like similar laws introduced in Minnesota and
California, it requires device manufacturers to allow consumers and independent electronics businesses
to purchase the necessary parts and equipment required to make their own device repairs.
Oregon's rules, however, are the first to ban parts pairing, a practice manufacturers use
to prevent replacement components from working unless the
company's software approves them. These protections also prevent manufacturers from using parts
pairing to reduce device functionality or performance or display any misleading warning messages
about unofficial components installed within a device. Current devices are excluded from the ban,
which only applies to gadgets manufactured after January 1st, 2025. Much like Minnesota's
and California's laws, Oregon's other right-to-repair rules apply only to phones sold
after July 1st, 2021, or to other consumer electronics equipment sold after July 1st, 2015.
Some products, like devices powered by combustion engines, medical equipment, farming equipment,
HVAC equipment, video game consoles, and energy storage systems are excluded from Oregon's
rules entirely. According to iFixit, quote, the exemption list is a map of the strongest
anti-repair lobbies and also of the next frontier of the movement. However, iFixit CEO Kyle
Wiens also said in the statement, quote, by applying to most products made after 2015, this law will
open up repair for the things Oregonians need to get fixed right now. And by limiting the
repair-restricting practice of parts pairing, it protects fixing for years to come. We won't stop
fighting until everyone everywhere has these rights, end quote. Another similarity between Oregon's and
California's right to repair laws is that both push manufacturers to make any documentation,
tools, parts, and software required to fix their devices available to consumers and repair shops without
overcharging for them. But while California's law requires this support to be available for
seven years after production for devices over $100, Oregon hasn't mandated any such duration, end quote.
Games maker Take-Two has acquired Borderlands developer Gearbox Entertainment from the Embracer Group for
$460 million in stock. The deal is expected to close by the end of June, quoting IGN.
In a press release, Embracer shared that it is divesting Gearbox Software, Gearbox Montreal,
Gearbox Studio Québec, and the franchises Borderlands, Tiny Tina's Wonderlands, Homeworld, Risk of Rain,
Brothers in Arms, and Duke Nukem. Embracer will retain rights to Gearbox Publishing San Francisco,
formerly Perfect World Entertainment, and which it plans to rename, the publishing rights to
Remnant, Hyper Light Breaker, and other unannounced games, Cryptic Studios, Lost Boys Interactive,
and Captured Dimensions. All of its retained
assets will be integrated into other parts of Embracer Group. The sale is expected to close by
the end of June. Gearbox will join Take-Two's 2K division and will continue to be led by CEO and founder
Randy Pitchford. Currently, Gearbox has both a new Borderlands and a new Homeworld game in
development as well as at least one exciting new intellectual property per a separate press
release. Notably, the full purchase price of $460 million will be paid to Embracer Group in
Take-Two shares rather than cash. For comparison, Embracer originally
purchased Gearbox for $363 million, half in cash, half in newly issued Embracer Group shares,
with an additional consideration of $1.015 billion, also partially in shares, to be paid out if
Gearbox hit certain targets within six years. Gearbox and Take-Two had a longstanding
relationship, with Take-Two serving as the publisher of the Borderlands franchise via its 2K label.
The two also have partnered on an upcoming Borderlands film, as well as Gearbox's 2016 game
Battleborn. Embracer Group has been gradually shedding a number of its
many, many studios after a multi-year acquisition spree fell apart last year, end quote.
Finally, today, back to AI for the moment.
Lots of AI startups have that dot AI domain name, naturally.
Well, it turns out that's very good news for, checks notes here, Anguilla. Quoting The New York Times,
In Anguilla, a tiny Caribbean island to the east of Puerto Rico, the AI boom has made the country a fortune.
The British territory collects a fee from every registration for internet addresses that end in .ai, which happens to be the domain name assigned to the island, like .fr for France and .jp for Japan.
With companies wanting internet addresses that communicate they are at the forefront of the AI boom, like Elon Musk's X.ai website for his artificial intelligence company, Anguilla has recently received a huge influx in requests for domain names.
For each domain registration, Anguilla's government gets anywhere from $140 to
thousands of dollars from website names sold at auctions, according to government data. Last year,
Anguilla's government made about $32 million from those fees. That amounted to more than 10%
of gross domestic product for the territory of almost 16,000 people and 35 square miles. Some people
call it a windfall, Anguilla's premier, Ellis Webster, said. We just call it God smiling down on us.
Mr. Webster said the government used the money to provide free health care for citizens 70 and
older, and it has committed millions of dollars to finish building a school and a vocational
training center. The government has also allocated funds to improve its airport, doubled its budget
for sports activities, events, and facilities, and increased the budget for citizens seeking
medical treatment overseas, he said. The island, which relies heavily on tourism, had been
hard hit by the pandemic's restrictions on travel and a devastating hurricane in 2017. The .ai domain
income was the boost the country needed. We never thought that it would have this potential,
Mr. Webster said. Anguilla's control of .ai dates back to the early days of the internet, when nations
and territories were assigned their slice of cyberspace. Anguilla received .ai, and its government, whose own site is
www.gov.ai, did not make much of it until the domain names started bringing in millions.
Officials are uncertain how long the boon will last, but they predicted 2024 would bring in similar
income as last year from domain names. It's not the first bonanza to make a big difference to a grateful
domain owner. Tuvalu, a string of islands northeast of Australia, sold the rights to its suffix,
.tv, to a Canadian entrepreneur for $50 million and used the money to put electricity
on the outer islands, create scholarships, and finance the process to join the United Nations,
end quote. I've got a quick ask for you, hive mind. Can anyone put me in touch with Balaji
Srinivasan? Want to see if I can get him on the pod. If you can help, please at me on Twitter or
Threads or email me at brian at techmeme.com. Thanks in advance.
