The AI Daily Brief: Artificial Intelligence News and Analysis - The Latest AutoGPT and AI Agent Developments

Starting point is 00:00:00 Today on the AI breakdown, we're looking at the latest from AutoGPT, including the results of a recent hackathon. Before that on the brief, Microsoft releasing its own chips, plus the global geopolitical battle around AI regulations and alignment. The AI breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our Discord, our YouTube channel, and our newsletter. Welcome back to the AI breakdown brief, all the AI headline news you need in around five minutes. We start with a story that is something that we've been watching and getting little rumors about for some time now. Of course, with the dominance of Nvidia, the high cost of compute, the challenges of just getting access to enough compute, basically every big tech company is trying

Starting point is 00:00:46 to figure out different solutions to AI chips. Last week, we discussed reports that OpenAI is even considering building their own chips. And OpenAI partner Microsoft has long been thought to be developing their own AI chips. According to a report from The Information last Friday, Microsoft is now finally planning to unveil their first official chip at their annual developer conference, which is coming up next month. Now, importantly, whereas a lot of today's chip startups and alternatives are focused on more specific or different use cases, this chip is purportedly going directly after Nvidia GPUs. In other words, it is designed for use in data centers and for the training of these massive LLMs. The project was codenamed Athena, and earlier in the year,

Starting point is 00:01:26 when reporting suggested that AMD might be involved, just the rumor alone was enough to send AMD's stock soaring. Microsoft later denied that AMD had any involvement. Now, according to The Information's source, a small handful of employees from both Microsoft and OpenAI have been testing the chips. It also isn't clear how the performance compares to, for example, Nvidia's H100. Now, interestingly, this project apparently started back in 2019, right around the same time Microsoft invested a billion dollars in OpenAI. According to The Information, quote, when Microsoft began working more closely with OpenAI, it determined that the cost of buying GPUs to support the startup, Azure customers, and its own products would be too high. Now, of course,

Starting point is 00:02:04 Microsoft is far from alone in this thinking. Google has its tensor processing units, and Amazon just announced a big deal with Anthropic that was in large part focused on their chips as well. Now, interestingly, although most of Wall Street has been super bullish on AI all year, one analyst right now is making news based on an assessment that it's overhyped and that actual use cases will lag in the coming year, in large part because of the increasing costs around AI. The report is from CCS Insight and comes as part of the company's annual roundup of predictions for the future of the technology industry. The main forecast that they list for next year is that generative AI, quote, gets a cold shower in 2024. Said CCS Insight's chief analyst Ben Wood, quote, we are big advocates for

Starting point is 00:02:46 AI, we think it's going to have a huge impact on the economy, we think it's going to have big impacts on society at large, we think it's great for productivity. But the hype around generative AI in 2023 has been just so immense that we think it's overhyped, and that there's lots of obstacles that need to get through to bring it to market. Specifically, Wood continued, just the cost of deploying and sustaining generative AI is immense. It's all very well for these massive companies to be doing it, but for many organizations, many developers, it's just going to become too expensive. Now, on that front, another piece of related news is that Lambda Labs, which is effectively a startup competing with AWS for renting out servers of Nvidia chips to AI

Starting point is 00:03:21 companies, is on the verge of closing a $300 million venture round. Lambda is far from the only company providing that type of service. Another rival, CoreWeave, is also out in the market trying to sell around a half billion dollars of employee shares. The CoreWeave sale would value the company at around $6 billion, while the Lambda financing would see a valuation of $1.5 billion. Now, in addition to the cost of compute, the CCS report also thought that regulation is going to be a challenge, although not just for the companies themselves, but also for the regulators. Basically, their argument is that things are moving so fast on the technology front that it's likely that even passed legislation like the EU AI Act

Starting point is 00:03:57 gets updated and refined over and over again as it tries to catch up. Meanwhile, one of the things that looms large for those regulators in different countries around the world is the extent to which China is focused on AI as a technology to get out ahead of geopolitical rivals. The country announced a new plan on Monday of this week that targets an increase of the country's computing power

Starting point is 00:04:16 by 50% by 2025. The announcement came from six aligned government departments and set the target computing capacity at 300 exaflops. Now for China, the calculus is pretty simple. Akshara Bassi, a senior research analyst at Counterpoint, said, China has found that traditionally, every one yuan invested in computing power has driven 3 to 4 yuan in economic output. And yet, it's not as easy as just buying more compute.

Starting point is 00:04:39 The U.S. right now has strict sanctions on the country when it comes to critical AI inputs like Nvidia GPUs. Indeed, Bassi called access to the latest and best-in-class AI GPUs the primary obstacle that the country faces. Now, we reported a couple weeks ago about how the U.S. was expanding some of those bans and restrictions to a number of different Middle Eastern countries with the concern that those countries were conduits for China to actually surreptitiously get access to the AI chips the U.S. government was trying to deny them access to.

Starting point is 00:05:08 The Financial Times reported about this again this week, writing, Saudi-China collaboration raises concerns about access to AI chips. The article profiles recent moves from the Gulf region to release Arabic-focused large language models and other bespoke AI tools for an Arabic-speaking audience, and discusses the fine line that the U.S. is walking as it relates to the Chinese-Saudi collaboration. FT writes, the U.S. has expanded export license requirements

Starting point is 00:05:30 for graphics processing units made by Nvidia and AMD, preventing Chinese entities from accessing the chips that are vital in building generative AI models, but the Biden administration has stopped short of blocking exports to the Middle East. Meanwhile, another relationship that is changing based on the need to provide a counterweight and bulwark to China is the relationship between the EU and Japan. Reuters recently reported comments from the EU, saying that they see a convergence with Japan on AI regulations,

Starting point is 00:05:56 said the European Commission Vice President for Values and Transparency, Věra Jourová, I was recently in China and it's a totally different thing. I could discuss with our Japanese partners because we do not have to explain to each other basic, basic things. I see a lot of convergence in how we, the EU and Japan, look at AI and generative AI. Now, reinforcing this idea, the Japanese prime minister said on Monday that he expects that we will have international AI regulations by Christmas this year. Now, he's referring specifically to a coordinated approach that's known as the Hiroshima AI process that involves the G7 nations, including the United States, Canada, Japan, the United Kingdom, France,

Starting point is 00:06:30 Germany, and Italy, and which might touch on topics ranging from governance to IP rights, to disinformation, to responsible use. Over in Germany, the person in charge of antitrust has delivered a warning that AI has the potential to increase big tech's dominance. Andreas Mundt told Reuters in an interview on Friday, For us as a competition authority, it is crucial that this new technology does not further strengthen the dominance of the large corporations. The danger is very great because you need two things above all for AI,

Starting point is 00:06:55 powerful servers, and vast amounts of data. Big internet corporations have both. One additional piece that we've noted here on the show that has emerged as another reason that incumbents have an edge, is that enterprises are more likely to trust companies they already work with and who already have access to or visibility into some of their data, as opposed to smaller startups, who they might not trust with that privileged information. Lastly, as the UK government prepares for its AI summit at the beginning of November,

Starting point is 00:07:19 they are trying to pin down participating nations to agree to some set of statements about AI risks. Writes The Guardian, Rishi Sunak's advisors are trying to thrash out an agreement among world leaders on a statement warning about the risks of artificial intelligence as they finalize the agenda for the AI Safety Summit next month. Now, in addition to statements, there's also been reporting that the UK is thinking about establishing something they call an AI Safety Institute, although that reporting has been downplayed somewhat. Now, remember, we talked recently about how the UK was under some amount of international

Starting point is 00:07:48 pressure, given that they had invited China to participate in this summit. But the UK's counterpoint was that when it comes to AI safety, there were fairly severe limitations if principles weren't something that China agreed to. It seems quite clear to me that we are going to have a lot more interesting conversations about the geopolitical dimension of AI in the months to come. So keep your eyes peeled for that. That wraps the brief. Next up, the main AI breakdown. And now a word from today's sponsor. Are you interested in how two top-of-mind trends, AI and crypto, can work together?

Starting point is 00:08:20 If so, I have the perfect podcast recommendation for you. Web3 with a16z Crypto, the chart-topping show brought to you by venture firm Andreessen Horowitz. Web3 with a16z Crypto is your definitive resource for the future of the internet, whether you're already building in these spaces or simply curious about what's next. If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh and former Google X-er Ali Yahya in conversation with host Sonal Chokshi about the intersection of AI and crypto. From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all.

Starting point is 00:08:55 I highly recommend checking it out, especially if you'd like to learn more about how AI and crypto will impact our everyday lives. Beyond Crypto and AI, this show is for creators seeking more ways to truly own their work, for business leaders trying to prepare for the future today and for innovators exploring trending tech topics. So go ahead, listen to Web3 with A16Z Crypto wherever you get your podcasts. Welcome back to the AI breakdown. One of the interesting things about following artificial intelligence is that there are very different conversations that are going on under that banner in different parts of the internet at any given time. There is, of course, the AI safety conversation in one sector, the AI policy conversation in another.

Starting point is 00:09:35 And then there are the much more practical conversations, people re-skilling, figuring out how to use different tools and where they actually fit into their jobs. And then there are the developers, the builders, the entrepreneurs, the people who are hacking at AI systems and exploring the frontiers of what's possible. One of the biggest themes for that cohort this year has, of course, been AI agents. If ChatGPT got the regular consumer's imagination going about the possibilities of artificial intelligence, I feel like something similar happened for a lot of developers and hackers around the beginning of April when it came to AI agents. The dream is to move towards systems that can actually not only spit back information,

Starting point is 00:10:13 but figure out how to solve problems without a lot of guidance along the way. Now, one of the big progenitors of this whole movement, or at least one of the first things that got people excited this year, was, of course, AutoGPT, which was published to GitHub by Significant Gravitas back in April, raced to become one of the most active projects on that site within the first couple weeks, and has seen, if not the same sort of parabolic increase that it saw in those first couple weeks, still a really steady increase. The project is up over 140,000 stars on GitHub right now, for example.

Starting point is 00:10:44 Now, there are two really interesting things going on right now in the world of AutoGPT. The first is something they're calling their AutoGPT Arena Hacks. It's a virtual hackathon that comes not only with cash prizes, but the inducement that the winning agent will become the default AutoGPT in that 150,000 star repo that we were just discussing. The hackathon has four categories: scrape and synthesize, data mastery, coding excellence, and open-ended agent protocol. They write, Your task is to develop an agent that takes natural language input and can handle tasks, either generalists like Auto-GPT or specific as with Task GPT. In terms of prizes, the top-performing

Starting point is 00:11:21 generalist agent gets that position as the primary AutoGPT in the repository plus $15,000 in cash, and then the top project in scrape and synthesize, which means extracting data from the web and creating datasets, summaries, and plans as requested, wins $3,500, which is the same for data mastery, which they characterize as perform essential data science tasks, including imputation, labeling, and sorting. Coding excellence has two prizes, a first place prize of $4,000 and a second place prize of $1,000, and refers to, quote, master the art of coding by building functions, CLI games, password shorteners, web servers, and more. And finally, a $3,000 prize for open-ended agent protocol. This is their catch-all for any big innovation that might not fit cleanly into one

Starting point is 00:12:00 of the other categories. Because the project is run on public benchmarks, there's also a leaderboard showing which teams are ahead at any given time. Now, in the same way that an in-person hackathon might have a set of speakers and mentors, this event, although held online, has something similar. There is a group of speakers and mentors and an entire schedule of different events and conversations that include keynotes, Q&As, and more. And yet what specifically prompted this episode today was that in addition to this online virtual event, AutoGPT also held an in-person hackathon over last weekend. Alex Reibman tweeted, this weekend AI engineers from around the world flew to SF to beat, build, and break state-of-the-art AI agents. Powered by AutoGPT, these hackers

Starting point is 00:12:39 spent 24 hours testing the limits of what's possible. Now, interestingly, Alex also shared the finalists at the event. One finalist was called Auto Arena, and Alex summed it up as better observability in session replays for agents using the WebArena dataset. That was also the team that Alex was on. A second finalist was improving the tutorial, a series of recommendations and bug fixes to make it easier to build applications with AutoGPT Forge, which he summed up as basically an AutoGPT bug bash. TravelMate was created by a team who came all the way from Argentina and was basically a use-case specific agent that could research, plan, and answer questions about different vacation itineraries. Interestingly, in TravelMate's demo of what they had built, they showed off why this could be

Starting point is 00:13:17 more valuable than just a standard travel website. The example they used was the user asking a question, I heard talk about a parallel exchange rate for dollars in Argentina. What is that about? Now, what they're referring to is what's called the Blue Rate in Argentina, which is very different from the official rate published by the government, and which you can find at basically a variety of back alley brokers in cities like Buenos Aires. And the interesting thing was that the team said that they actually programmed it to search the web, but to look specifically for blog posts rather than official sources of information, as it was the type of insight that might not come easily from your standard style of travel websites. Another finalist was called Stim. Alex summed them up as contextually aware, asynchronous alert agents that listen for news, conversations, and other

Starting point is 00:13:56 real-time ingress events. The team behind it wrote, Our system stim, short for stimulus, enables LLM agents to take in real-time information from multiple sources and effectively batch, summarize, and prioritize incoming info. Now, Sam's explanation actually gets at some of the challenges that we've talked about before on the show as it relates to AI agents. He continues, current LLM agents are stuck in a linear call-and-response structure. You give an agent a task, it runs off and does it and returns when it's done. This is fine for one-off tasks, but as we get more embodied agents and generalist passive computing interfaces, it's insufficient. Now, Sam points out that human brains do a really

Starting point is 00:14:30 good job of filtering out extra or extraneous stimulus, and that if AI agents like AutoGPT could do this, it would give them a lot more potential to become real-time assistants. They decided that a first step in this problem was this sort of information batching. They write, Stim can monitor your messages and intelligently help you handle them based on relevance to what you're currently doing. Messages come in from anywhere. They're given an initial priority and a topic, then batched with relevant existing topics if available. The main agent examines topics, decides which are relevant based on its real-time conversation with its user, then surfaces relevant or high-priority items. Now, let me pause here to actually zoom out around the state of the development here. A lot of the content that you saw

Starting point is 00:15:07 around AutoGPT, call it back in April or May, was excited posts and videos about the first exploratory use cases that people were hacking together. Now, the second wave of content tended to be disappointment about how those tools actually broke and a reassessment of just how production-ready this technology was. What we've seen so far in the finalists at this AutoGPT hackathon includes, yes, some end-user applications like the TravelMate agent that we were just discussing, but a lot more infrastructure and intermediary solutions that are trying to solve for very specific problems that relate to discrete workflows that are designed to help AI agents ultimately achieve their full potential. So again, the Stim team here has recognized information filtering as a big challenge

Starting point is 00:15:46 for an AI agent to actually make for a good assistant and is going specifically at that. Now, interestingly, they do have some examples of use cases that help bring this idea to life. One that they give is a Discord developer relations agent. In other words, an agent that sits inside a Discord server, batches the conversations going on by topic, and assigns priority scores based on relevance, and has the ability to elevate things that seem really relevant. Sam writes, e.g., single bug report comes in, given a low priority, but 50 come in at once so that priority is elevated and the user is interrupted to highlight the problem. At the same time, someone posts 50 memes about the Roman Empire, which gets deprioritized. Now, I'm not going to read his

Starting point is 00:16:24 full assessment about what remained difficult about this build. I'll include a link in the show notes. But basically, you see a real maturation here, where Sam and the other teams are able to assess the barriers that stand in their way as they're trying to build these various applications and tooling. Now, the last finalist that Alex mentioned was called Engagement Farmers. It was an AI agent that acted like Redditors, and that could engage with community posts and leave comments. It didn't take long, however, for the IP address to be banned by Reddit. I think a big takeaway here is just how much interest in these tools there really is. Developers literally flew in from all over the world to participate in this hackathon,

Starting point is 00:16:56 and by all accounts came away energized and ready to work on their various startups addressing different parts of this equation. Speaking of, let's do a quick breeze through of some of the other big projects in the space and where they are right now. Div Garg, who I had on the show a couple months ago, and who is the founder of MultiOn, tweeted a couple days ago, we unlocked a big achievement in AI. Very excited to announce that our web AI agent in default mode now runs faster than the human speed of clicking and typing. The implications are potentially huge. This makes it irrelevant

Starting point is 00:17:24 for you to manually interact with the web, as our AI can just do it faster and better. Now, as you know, one of the big things that I often talk about on this show, especially when it comes to my skepticism around AI assistant tools, is the question of what it would take for the average user to shift behaviors, such as ordering food or ordering an Uber, from the current way that they do things now through specific apps or web interfaces to a more natural language AI-assisted AI agent type of interface. I still don't know what the answer to that question is, and I don't think that speed alone is an answer, but I do think that Div is right to recognize that AI agents being able to perform tasks faster than humans manually could is, if not a sufficient condition,

Starting point is 00:18:02 absolutely a necessary one for AI agents to really be personal assistants in the way that many of these entrepreneurs are imagining. So I continue to be excited about what they are building over there, and of course how fast they're moving. Div's worth the follow because in a lot of ways what they're doing over there at MultiOn is just trying out different use cases of their own tool. This is classic entrepreneurial dogfooding, but done in public. SuperAGI is another open source framework for building autonomous agents, and just a few days ago, they announced something called SuperAGI models. Basically, this gives users the ability to select from multiple different open source models in the design of their specific AI agents. Indeed, SuperAGI also released a playground feature that allows

Starting point is 00:18:40 developers to experiment with those models without having to set up a local dev environment. The next thing that they're working on is something called large agentic models, or LAMs. These are open source models that are fine-tuned on domain-specific knowledge, and in that way are specialized to work within specific use cases that require understanding in that particular area. SuperAGI is also participating in something called Hacktoberfest, which is a month-long virtual hackathon, as well as an upcoming autonomous AI agent hackathon in Sydney, Australia, on October 20th through 22nd. So, TLDR, lots of energy continues to flow into this AI agent space, and, importantly for those of you who care about actually having outputs that are useful, a lot less of it is cool theoretical applications, and a lot more of it is building the necessary infrastructure, tooling, and intermediate solutions that give other developers the ability to actually build the tools that people want.

Starting point is 00:19:29 This continues to be one of the most dynamic parts of the AI space, and so I will continue to cover it as makes sense. For now, I appreciate you guys listening or watching, and until next time, peace.
