Tech Brew Ride Home - Thu. 03/28 – Claude 3 Opus Surpasses GPT-4 For The First Time

Episode Date: March 28, 2024

There’s a new king of the AI hill as Anthropic bests OpenAI for the first time. Amazon invests more in Anthropic and is investing a TON more in datacenters. Is that GPT sort of App Store not exactly... catching fire? A big acquisition in gaming. And the tiny Caribbean island nation that is one of the biggest winners of the AI moment so far.

Links:
“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time (Ars Technica)
Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet (CNBC)
Amazon Bets $150 Billion on Data Centers Required for AI Boom (Bloomberg)
OpenAI’s app store draws investors and students seeking artificial aids (FT)
Oregon’s governor signs right-to-repair law that bans ‘parts pairing’ (The Verge)
Take-Two Buys Gearbox From Embracer, Confirms Development on New Borderlands Game (IGN)
The A.I. Boom Makes Millions for an Unlikely Industry Player: Anguilla (NYTimes)

Transcript
Starting point is 00:00:00 On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco. "Hey, who did this to you?" What happened next turned the story into a political firestorm. Reports have identified the victim as Bob Lee, the founder of Cash App. From Bloomberg Podcasts, this is Foundering: The Killing of Bob Lee, beginning April 16. Welcome to the Techmeme Ride Home for Thursday, March 28th, 2024. I'm Brian McCullough. Today, there's a new king of the AI hill as Anthropic bests OpenAI for the first time. Amazon invests more in Anthropic and is investing a ton more in data
Starting point is 00:00:47 centers. Is that GPT sort-of app store not exactly catching fire? A big acquisition in gaming. And the tiny Caribbean nation that is one of the biggest winners of the AI moment so far. Here's what you missed today in the world of tech. Something, something. These are the headlines that we didn't use to do, but which matter in this AI era: there's a new king of the hill in the AI model wars. Anthropic's Claude 3 Opus has surpassed OpenAI's GPT-4 for the first time on Chatbot Arena, a crowdsourced leaderboard used by AI researchers for LLM evaluations. Quoting Ars Technica: "The king is dead," tweeted software developer Nick Dobos in a post comparing GPT-4 Turbo and Claude 3 Opus that has been making the
Starting point is 00:01:35 rounds on social media. "RIP GPT-4," end quote. Since GPT-4 was included in Chatbot Arena around May 10th, 2023 (the leaderboard launched May 3rd of that year), variations of GPT-4 have consistently been on the top of the chart until now, so its defeat in the arena is a notable moment in the relatively short history of AI language models. One of Anthropic's smaller models, Haiku, has also been turning heads with its performance on the leaderboard. "For the first time, the best available models, Opus for advanced tasks, Haiku for cost and efficiency, are from a vendor that isn't OpenAI,"
Starting point is 00:02:11 independent AI researcher Simon Willison told Ars Technica. "That's reassuring. We all benefit from a diversity of top vendors in this space. But GPT-4 is over a year old at this point, and it took that year for anyone else to catch up," end quote.
Starting point is 00:02:26 Chatbot Arena is run by Large Model Systems Organization (LMSYS Org), a research organization dedicated to open models that operates as a collaboration between students and faculty at University of California, Berkeley, UC San Diego, and Carnegie Mellon University. We profiled how the site works in December, but in brief, Chatbot Arena presents a user visiting the website with a chat input box and two windows showing output from two unlabeled LLMs. The user's task is to rate which output is better based on any criteria the user deems most fit. Through thousands of these subjective comparisons, Chatbot Arena calculates the, quote, best models in aggregate and
Starting point is 00:03:05 populates the leaderboard, updating it over time. Chatbot Arena is important to researchers because they often find frustration in trying to measure the performance of AI chatbots, whose wildly varying outputs are difficult to quantify. In fact, we wrote about how notoriously difficult it is to objectively benchmark LLMs in our news piece about the launch of Claude 3. For that story, Willison emphasized the important role of vibes, or subjective feelings, in determining the quality of an LLM. "Yet another case of vibes as a key concept in modern AI," he said.
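The episode doesn't say how Chatbot Arena turns those head-to-head votes into a ranking. One common way to aggregate pairwise comparisons, assumed here purely for illustration, is an Elo-style rating update: each vote nudges the winner's rating up and the loser's down, by more when the result is an upset. The model names, the starting rating of 1000, and the step size `k` below are all hypothetical, and this is a sketch of the general idea rather than the leaderboard's actual method.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rate(votes, k=32.0, start=1000.0):
    """Fold a sequence of (model_a, model_b, winner) votes into ratings.

    `winner` is one of the two model names, or any other value for a tie.
    The values of `k` and `start` are illustrative assumptions.
    """
    ratings = defaultdict(lambda: start)
    for model_a, model_b, winner in votes:
        e_a = expected_score(ratings[model_a], ratings[model_b])
        if winner == model_a:
            s_a = 1.0
        elif winner == model_b:
            s_a = 0.0
        else:
            s_a = 0.5  # tie
        # Move each rating toward the observed result; total rating is conserved.
        ratings[model_a] += k * (s_a - e_a)
        ratings[model_b] += k * ((1.0 - s_a) - (1.0 - e_a))
    return dict(ratings)


# Hypothetical votes between three made-up models.
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-z", "model-x"),
    ("model-x", "model-y", "tie"),
]
leaderboard = sorted(rate(votes).items(), key=lambda kv: kv[1], reverse=True)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Because every vote only compares two models, a scheme like this can rank many models without any single user having to score all of them, which fits the subjective, one-comparison-at-a-time setup described above.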
Starting point is 00:03:37 The vibes sentiment is common in the AI space, where numerical benchmarks that measure knowledge or test-taking ability are frequently cherry-picked by vendors to make their results look more favorable. "Just had a long coding session with Claude 3 Opus, and man does it absolutely crush GPT-4. I don't think standard benchmarks do this model justice," tweeted AI software developer Anton Bacaj on March 19. Claude's rise may give OpenAI pause, but as Willison mentioned, the GPT-4 family itself,
Starting point is 00:04:07 although updated several times, is over a year old. Currently, the arena lists four different versions of GPT-4, which represent incremental updates of the LLM that get frozen in time because each has a unique output style, and some developers using them with OpenAI's API need consistency so their apps built on top of GPT-4's outputs don't break, end quote. Speaking of, Amazon has invested another $2.75 billion in Anthropic, the second tranche of its planned $4 billion investment in the AI startup after investing $1.25 billion in September 2023. Quoting CNBC: Amazon will maintain a minority stake in the company and won't have an Anthropic board seat, the company said. The deal was struck at the AI startup's last valuation, which was $18.4 billion, according to a source. Over the past year, Anthropic closed five different funding deals worth about $7.3 billion.
Starting point is 00:05:06 The Amazon move is the latest in a spending blitz among cloud providers to stay ahead in the AI race. And it's the second update in a week to Anthropic's capital structure. Late Friday, bankruptcy filings showed crypto exchange FTX struck a deal with a group of buyers to sell the majority of its stake in Anthropic, confirming a CNBC report from last week, end quote. This is not unrelated to that. It's not exactly Sam Altman suggesting $7 trillion would be required to make next-generation AI chips, but this is still an eye-popping number, and it's due to AI. According to an analysis, in the past two years alone, Amazon committed to spending $148 billion over 15 years on data
Starting point is 00:05:54 centers, including in Mississippi, Saudi Arabia, and Malaysia. Quoting Bloomberg: The spending spree is a show of force as the company looks to maintain its grip on the cloud services market, where it holds about twice the share of number-two player Microsoft. Sales growth at Amazon Web Services slowed to a record low last year as business customers cut costs and delayed modernization projects. Now spending is starting to pick up again, and Amazon is keen to secure land and electricity for its power-hungry facilities. "We're expanding capacity quite significantly," said Kevin Miller, an AWS vice president who oversees the company's data centers. "I think that just gives us the ability to get closer to customers," end quote. Amazon's
Starting point is 00:06:35 planned outlay on server farms dwarfs the public commitments from Microsoft and Alphabet's Google, though neither company discloses data center-related spending as consistently as Amazon. Microsoft and Google spokespeople declined to provide comparable figures and added that each company likely includes different costs in their estimates. Amid broader cost cutting at Amazon, AWS's capital expenditures on data centers shrank 2% in 2023 for the first time, even as Microsoft boosted its own spending by more than 50%, according to the research firm Dell'Oro Group. But Amazon's chief financial officer said last month that capital expenditures would increase this year to support AWS growth, including AI-related projects. Much of Amazon's data center expansion is geared toward meeting a rise in demand for corporate services like file storage and databases,
Starting point is 00:07:17 but the facilities, along with advanced and expensive chips, will also provide the massive computing power required for an expected boom in generative artificial intelligence. Microsoft, close partner OpenAI, and Google are widely seen as leaders in commercializing software capable of generating text and insights, but Amazon is building its own tools to rival OpenAI's ChatGPT and has partnered with other companies to power AI services with its servers. As a result, Amazon expects to reap tens of billions of dollars in AI-related revenue, end quote. Meanwhile, over to OpenAI. According to Similarweb, GPTs created by subscribers on OpenAI's GPT Store accounted for just one and a half percent of desktop visits to ChatGPT's site in February, suggesting limited appeal for this attempt to go the platform play route. Quoting the Financial Times:
Starting point is 00:08:14 "In some ways, OpenAI is following a very standard 'how to build a platform' template that's so predictable that it might have used ChatGPT to write it," said Benedict Evans, an independent technology analyst. "So we have a developer conference, an API, and an app store, but it's not clear to me whether this really has traction," end quote. The Microsoft-backed startup has allowed paying users to create custom versions of ChatGPT since November, with other subscribers then able to access these so-called GPTs through an online store. According to new data from analytics group Similarweb,
Starting point is 00:08:44 some of the most popular GPTs serve educational purposes, with the second most used app being Consensus, a tool to search and summarize academic papers. Other apps have surged in usage this year, including design tools that can instantly generate images, translate between languages, or help with job applications by reviewing CVs and cover letters. But a Financial Times analysis also found many popular GPTs could be
Starting point is 00:09:05 in breach of OpenAI's usage policies, which have rules against chatbots that provide financial, legal, or medical advice that has not been reviewed by a qualified professional. Five of the most viewed GPTs are described as being able to produce content that can bypass detection tools employed by schools and universities to determine if students had produced essays and answers using AI. These tools were viewed at least three million times in total, despite OpenAI barring apps that engage in or promote academic dishonesty. Another app called Finance Wizard, which has been used more than 200,000 times, claims to reveal future stock
Starting point is 00:09:36 movements. Its creator told the FT the app makes predictions based on historical data and contains disclaimers warning against using it as financial advice, end quote. Oregon has become the first state in the nation with a right-to-repair law that bans manufacturers from using parts pairing to dictate what replacement components can be used. Quoting The Verge: Oregon Governor Tina Kotek has now signed one of the strongest U.S. right-to-repair bills into law after it passed the state legislature several weeks ago by an almost three-to-one margin. Oregon's SB 1596 will take effect next year, and like similar laws introduced in Minnesota and California, it requires device manufacturers to allow consumers and independent electronics businesses
Starting point is 00:10:26 to purchase the necessary parts and equipment required to make their own device repairs. Oregon's rules, however, are the first to ban parts pairing, a practice manufacturers use to prevent replacement components from working unless the company's software approves them. These protections also prevent manufacturers from using parts pairing to reduce device functionality or performance or display any misleading warning messages about unofficial components installed within a device. Current devices are excluded from the ban, which only applies to gadgets manufactured after January 1st, 2025. Much like Minnesota's and California's laws, Oregon's other right-to-repair rules apply only to phones sold
Starting point is 00:11:05 after July 1st, 2021, or to other consumer electronics equipment sold after July 1st, 2015. Some products, like devices powered by combustion engines, medical equipment, farming equipment, HVAC equipment, video game consoles, and energy storage systems, are excluded from Oregon's rules entirely. According to iFixit, quote, the exemption list is a map of the strongest anti-repair lobbies and also of the next frontier of the movement. However, iFixit CEO Kyle Wiens also said in the statement, quote, by applying to most products made after 2015, this law will open up repair for the things Oregonians need to get fixed right now. And by limiting the repair-restricting practices of parts pairing, it protects fixing for years to come. We won't stop
Starting point is 00:11:50 fighting until everyone everywhere has these rights, end quote. Another similarity between Oregon's and California's right-to-repair laws is that both push manufacturers to make any documentation, tools, parts, and software required to fix their devices available to consumers and repair shops without overcharging for them. But while California's law requires this support to be available for seven years after production for devices over $100, Oregon hasn't mandated any such duration, end quote. Games maker Take-Two has acquired Borderlands developer Gearbox Entertainment from the Embracer Group for $460 million in stock. The deal is expected to close by the end of June. Quoting IGN: In a press release, Embracer shared that it is divesting Gearbox Software, Gearbox Montreal,
Starting point is 00:12:40 Gearbox Studio Quebec, and the franchises Borderlands, Tiny Tina's Wonderlands, Homeworld, Risk of Rain, Brothers in Arms, and Duke Nukem. Embracer will retain rights to Gearbox Publishing San Francisco (formerly Perfect World Entertainment, and which it plans to rename), the publishing rights to Remnant, Hyper Light Breaker, and other unannounced games, Cryptic Studios, Lost Boys Interactive, and Captured Dimensions. All of its retained assets will be integrated into other parts of Embracer Group. The sale is expected to close by the end of June. Gearbox will join Take-Two's 2K division and will continue to be led by CEO and founder Randy Pitchford. Currently, Gearbox has both a new Borderlands and a new Homeworld game in
Starting point is 00:13:20 development, as well as at least one exciting new intellectual property, per a separate press release. Notably, the full purchase price of $460 million will be paid to Embracer Group in Take-Two shares rather than cash. For comparison, Embracer originally purchased Gearbox for $363 million, half in cash, half in newly issued Embracer Group shares, with an additional consideration of $1.015 billion, also partially in shares, to be paid out if Gearbox hit certain targets within six years. Gearbox and Take-Two had a longstanding relationship, with Take-Two serving as the publisher of the Borderlands franchise via its 2K label. The two also have partnered on an upcoming Borderlands film, as well as Gearbox's 2016 game
Starting point is 00:14:00 Battleborn. Embracer Group has been gradually shedding a number of its many, many studios after a multi-year acquisition spree fell apart last year, end quote. Finally, today, back to AI for the moment. Lots of AI startups have that .ai domain name, naturally. Well, it turns out that's very good news for, checks notes here, Anguilla. Quoting The New York Times: In Anguilla, a tiny Caribbean island to the east of Puerto Rico, the AI boom has made the country a fortune. The British territory collects a fee from every registration for internet addresses that end in .ai, which happens to be the domain name assigned to the island, like .fr for France and .jp for Japan. With companies wanting internet addresses that communicate they are at the forefront of the AI boom, like Elon Musk's X.ai website for his artificial intelligence company, Anguilla has recently received a huge influx in requests for domain names.
Starting point is 00:15:02 For each domain registration, Anguilla's government gets anywhere from $140 to thousands of dollars from website names sold at auctions, according to government data. Last year, Anguilla's government made about $32 million from those fees. That amounted to more than 10% of gross domestic product for the territory of almost 16,000 people and 35 square miles. "Some people call it a windfall," Anguilla's premier, Ellis Webster, said. "We just call it God smiling down on us." Mr. Webster said the government used the money to provide free health care for citizens 70 and older, and it has committed millions of dollars to finish building a school and a vocational training center. The government has also allocated funds to improve its airport, doubled its budget
Starting point is 00:15:43 for sports activities, events, and facilities, and increased the budget for citizens seeking medical treatment overseas, he said. The island, which relies heavily on tourism, had been hard hit by the pandemic's restrictions on travel and a devastating hurricane in 2017. The .ai domain income was the boost the country needed. "We never thought that it would have this potential," Mr. Webster said. Anguilla's control of .ai dates back to the early days of the internet, when nations and territories were assigned their slice of cyberspace. Anguilla received .ai, and its government, whose own site is www.gov.ai, did not make much of it until the domain names started bringing in millions. Officials are uncertain how long the boon will last, but they predicted 2024 would bring in similar
Starting point is 00:16:26 income as last year from domain names. It's not the first bonanza to make a big difference to a grateful domain owner. Tuvalu, a string of islands northeast of Australia, sold the rights to its suffix, .tv, to a Canadian entrepreneur for $50 million and used the money to put electricity on the outer islands, create scholarships, and finance the process to join the United Nations, end quote. I've got a quick ask for you, hive mind. Can anyone put me in touch with Balaji Srinivasan? Want to see if I can get him on the pod. If you can help, please @ me on Twitter or Threads or email me at brian at techmeme.com. Thanks in advance.
