The AI Daily Brief: Artificial Intelligence News and Analysis - Is the Future of AI "Cheating on Everything?"

Starting point is 00:00:00 Today on the AI Daily Brief, is the future of AI cheating on everything? Before that in the headlines, is Amazon fumbling the Anthropic bag? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. Well, it appears that Amazon is fumbling their deployment of Anthropic's AI models. Last year, you might remember the company invested about $8 billion into a partnership with Anthropic

Starting point is 00:00:34 to provide their models through AWS Bedrock infrastructure. The Information, however, reports that the service leaves a lot to be desired, writing, customers who have used Bedrock say its API puts arbitrary limits on how much they can use Anthropic's models and lacks features they want. Now, AWS claims that usage limits are common in the industry, but the issues imply that they either don't have enough server capacity or are reserving too much for certain large customers. Sources familiar with the internal conversation told The Information that some senior

Starting point is 00:01:01 executives are referring to the situation as a disaster. A consulting firm executive said that AWS is at risk of losing its standing with startups over the issue. Several customers told The Information, in fact, that problems with Bedrock are so bad that they've reverted to using Anthropic's own APIs to get reliable access to the models. An Amazon spokesperson pushed back on the story, claiming that AWS has tens of thousands of customers using Anthropic's models through Bedrock, and that the service is seeing unprecedented demand. They also defended the practice of rate limiting, stating that AWS uses rate limits to ensure fair access. They shared a statement that said,

Starting point is 00:01:34 The Information's suggestion that rate limits are a response to capacity constraints or that Amazon Bedrock is not equipped to support customers' needs is false. However, none of those comments really strike at the heart of the issue, which is that AWS risks losing startup customers if they can't offer reliable service. The AI coding companies in particular are some of the fastest scalers in recent memory, so losing them at this early stage could be a disaster. Sources said that AWS leaders recently discussed concerns that Lovable would defect from Bedrock due to its inability to keep up with fast-growing demand. Lovable CEO Anton Osika confirmed

Starting point is 00:02:05 the company is now using Anthropic's direct API, commenting, I know that we want to get the best possible terms, and we need the most recent features, and the most recent features come to Anthropic first. Anton declined to comment on whether the company is still using Bedrock, though AWS employees insisted that Lovable is still an active and growing customer. Over the past few months, the issues have reportedly been getting worse. During one incident in April, AWS was rate limited to five requests per minute. Anthropic's native API, on the other hand, was still handling 50 requests per minute. Another problem has been the delay to implement new features. Anthropic introduced prompt caching in December, which allows repeated prompts to return

Starting point is 00:02:41 an automatic response without being processed by the model. That feature wasn't implemented into Bedrock, however, until earlier this month. If there are any takeaways, it's that AI cloud is still very much up for grabs. One of Amazon's suggested solutions to rate limits is a service called Provisioned Throughput, which charges per hour for inference time rather than per token. Consultants, however, said that startups find it difficult to forecast how much usage they need and have a much easier time with per-token pricing. Is this all great for Google? Time will tell, but it's certainly not great for Amazon. Now, speaking of Google, they are also in the news for reasons they probably don't want to be. According to DOJ prosecutors in their ongoing antitrust case,

Starting point is 00:03:18 Google paid Samsung a, quote, enormous sum of money to pre-install the Gemini app on their phones and other devices. The contract provided fixed monthly payments for each device and was set to run for two years. Samsung would also receive a cut of ad revenue from advertisements displayed in the app. Prosecutors called this the monopolist's playbook at work. That said, these types of paid-for pre-installs are a pretty common practice in tech. And yet, this is always an area where antitrust regulators have a big issue. Microsoft famously suffered one of the largest antitrust penalties in modern history after pre-installing Internet Explorer for Windows in the late 90s. Google themselves received the largest European antitrust fine in history over pre-installs of Chrome on Android devices.

Starting point is 00:03:57 Seven years on, the company is still appealing the $4.9 billion fine. What's interesting to me about this is that it seems pretty clear that AI and AI positioning is getting swept up into these ongoing antitrust suits that started before AI. These things really could have impact on how the industry shakes out. Interestingly, Google's upstart search rival Perplexity has been asked to testify in the remedy phase of the DOJ's antitrust case against Google and is actually arguing that they don't want Google to be broken up. Perplexity CEO Aravind Srinivas said that he intends to testify that Chrome should remain within and continue to be run by Google, but that Android should become more open to consumer choice. He said, we don't believe anyone else can run a browser at that scale without a

Starting point is 00:04:36 hit on quality, nor the business model to be able to serve that many users profitably by keeping the browser free. At the same time, however, he took issue with Android. He said that the way that Android is set up, products like Perplexity's AI assistant have a hard time competing. In any case, all of these antitrust questions are clearly coming home to roost for AI and will impact how the industry evolves. Lastly today, ChatGPT search appears to be growing at a rapid clip in Europe. In their latest update on EU usage, OpenAI disclosed that monthly active users have grown to 41.3 million as of the end of March. That's almost four times more than the 11.2 million they reported at the end of October. These numbers are part of mandatory reporting under the EU Digital Services Act,

Starting point is 00:05:17 so we don't have any insight into growth in other regions. Now, for comparison, Google Search has an estimated 332 million monthly active users, at least as of 2023, which implies that ChatGPT search is already in the ballpark of 8% market share if it were included in standard search engine statistics. The math is a little bit fuzzy, but Bing and Yandex, the closest competitors to Google Search, each command less than 4.5% of the market. And indeed, looking around the recent takes on AI search, it does feel like we've hit an inflection point.

Starting point is 00:05:46 In a recent blog post, Simon Willison wrote about how LLMs recently crossed the threshold where they're now reliable enough to trust for low-stakes research. He commented, why visit websites if you can get your answers directly from the chatbot instead? I can feel my usage of Google search taking a nosedive already. He also added that this isn't all necessarily exclusively positive, saying, I expect a bumpy ride as a new economic model for the web lurches into view. So, friends, interesting things happening in the world of AI, but that's going to do it for today's AI Daily Brief Headlines edition. Next up, the main episode. Today's episode is brought to you by Plumb. If you're building agentic

Starting point is 00:06:19 workflows for clients or colleagues, it's time to take another look at Plumb. Plumb is where AI experts create, deploy, manage, and monetize complex automations. With features like one-click updates that reach all your subscribers, user-level variables for personalization, and the ability to protect your prompts and workflow IP, it's the best place to grow your AI automation practice. Serve twice the clients in half the time with Plumb. Sign up today at useplumb.com. That's U-S-E-P-L-U-M-B dot com. Vanta is a trust management platform that helps businesses automate security and compliance, enabling them to demonstrate strong security practices and scale. In today's business landscape, businesses can't just claim security, they have to prove it. Achieving compliance with a framework

Starting point is 00:07:04 like SOC 2, ISO 27001, HIPAA, GDPR, and more is how businesses can demonstrate strong security practices. And we see how much this matters every time we connect enterprises with agent services providers at Superintelligent. Many of these compliance frameworks are simply not negotiable for enterprises. The problem is that navigating security and compliance is time-consuming and complicated. It can take months of work and use up valuable time and resources. Vanta makes it easier and faster by automating compliance across 35-plus frameworks. It gets you audit-ready in weeks instead of months and saves you up to 85% of associated costs. In fact, a recent IDC white paper found that Vanta customers achieve $535,000 per year in benefits, and the platform pays for itself in just

Starting point is 00:07:46 three months. The proof is in the numbers. More than 10,000 global companies trust Vanta, including Atlassian, Quora, and more. For a limited time, listeners get $1,000 off at vanta.com/nlw. Today's episode is brought to you by Superintelligent, and I am very excited today to tell you about our consultant partner program. The new Superintelligent is a platform that helps enterprises figure out which agents to adopt, and then with our marketplace, go and find the partners that can help them actually build, buy, customize, and deploy those agents. At the key of that experience is what we call our agent readiness audits.

Starting point is 00:08:25 We deploy a set of voice agents which can interview people across your team to uncover where agents are going to be most effective in driving real business value. From there, we make a set of recommendations which can turn into RFPs on the marketplace or other sort of change management activities that help get you ready for the new agent-powered economy. We are finding a ton of success right now with consultants bringing the agent readiness audits to their client as a way to help them move down the funnel towards agent deployments, with the consultant playing the role of helping their client hone in on the right opportunities based on what we've recommended and helping manage the partner selection process. Basically, the audits are dramatically

Starting point is 00:09:02 reducing the time to discovery for our consulting partners, and that's something we're really excited to see. If you run a firm and have clients who might be a good fit for the agent readiness audit, reach out to agent@besuper.ai with Consultant in the title, and we'll get right back to you with more on the consultant partner program. Again, that's agent@besuper.ai, and put the word consultant in the subject line.

Starting point is 00:09:24 Welcome back to the AI Daily Brief. Every once in a while in AI, we get one of these companies that just absolutely acts as a lightning rod for fears and concerns about what the future that we're heading into might look like. And the latest of those is this new company, Cluely, that has just emerged with a $5.3 million funding round.

Starting point is 00:09:44 To some, this company is an abomination, everything that they are concerned about in AI. While to others, this is just a straight-up preview of the future and one that they think there's much to be excited about. So what the heck is this company and why are people freaking out about this? The company comes from Roy Lee, a 21-year-old founder who went viral last month telling the story of how he got suspended from Columbia for, quote-unquote, cheating on tech internship interviews. Roy wrote a long viral thread about his experience.

Starting point is 00:10:16 Roy and his co-founder had developed an AI tool that could ace LeetCode testing commonly used in the interview process. The way that Roy described it was a cheating tool for LeetCode-style technical interviews. Now, Roy points out that they had read the Columbia student handbook first and argued that the tool didn't violate any policies. The goal was to go viral. As he put it, use the tool to get offers from top tech companies, film it, and ride the shock factor. Now, there's a whole element about this on the nature of virality and distribution, which we'll get into in a little bit, but where Roy is coming from is very clearly a belief that you have to have distribution first.

Starting point is 00:10:50 Now, the tool worked. Lee got offers from Meta, TikTok, Capital One, and Amazon. He even recorded the entire Amazon interview and published it, which led to Amazon leaning on Columbia to, quote, take proper action against Lee, which led to him getting suspended from the university for a year. Now, back then, even before this new company, Cluely, the entire episode was extremely divisive. Some condemned Lee for hacking his way through this interview process, others scolded him for throwing away his Ivy League education to go viral. But as I mentioned, Roy had come to the

Starting point is 00:11:18 conclusion that virality and distribution were more important to his future success than an Ivy League education. And now he has put that logic right back into Cluely, which announced itself this week with a highly viral product ad and a manifesto on the future of AI titled, We Want to Cheat on Everything. The ad features Lee going on a date while using the product, which is depicted as a digital assistant appearing through AR glasses, although Cluely is not actually AR glasses. It helps the young man featured in the ad pretend to be an art aficionado and lie about his age. The tagline is invisible AI to cheat on everything. The actual product itself is intended to be used on a desktop rather than in a real-life scenario, but it's being promoted as being

Starting point is 00:11:57 undetectable during screen sharing. It listens to audio and can feed the user the information they need in real time to navigate exams, sales calls, and of course job interviews. Along with the ad, Cluely released a manifesto. It's short enough that I'm just going to read it. They write, We want to cheat on everything. Yep, you heard that right. Sales calls, meetings, negotiations. If there's a faster way to win, we'll take it. We built Cluely so you never have to think alone again. It sees your screen, hears your audio, feeds you answers in real time. While others guess, you're already right. And yes, the world will call it cheating. But so was the calculator, so was spellcheck, so was Google. Every time technology makes us smarter, the world panics,

Starting point is 00:12:36 then it adapts, then it forgets, and suddenly it's normal. But this is different. AI isn't just another tool. It will redefine how our world works. Why memorize facts, write code, research anything when a model can do it in seconds? The best communicator, the best analyst, the best problem solver, is now the one who knows how to ask the right question. The future won't reward effort, it'll reward leverage. So start cheating, because when everyone does, no one is. Investor Nikunj Kothari wrote, Cluely launches this generation's Rorschach test, and boy is he right. This has generated extremely strong feelings, and I think it's actually a really useful conversation. On the one side of the conversation were people like Twitter user Cody Blakeney, who wrote, imagine making a Black Mirror short as a product ad.

Starting point is 00:13:22 Twitter user Chris, who works on subsea robotics, wrote, I hate this with every fiber of my being, and I think investors should be ashamed of themselves. Cluely is a grotesque distortion of what technology is supposed to enable. It doesn't celebrate innovation, it glamorizes laziness, intellectual theft, and a nihilistic worldview that mistakes shortcuts for success. Referring to the line, we want to cheat on everything, he says this is not a bold vision, it's surrender. Rather than using technology to elevate human capability, Cluely tells you to stop trying altogether. It weaponizes defeatism. Don't strive, don't learn, don't build, just win by outsourcing your mind to a machine. It's not ambition, it's an abdication of

Starting point is 00:13:56 responsibility. When it comes to feeding you answers in real time, Chris argues that, while this pretends to be about performance, it's built on fear. Fear of being wrong, fear of thinking, fear of wrestling with hard problems. On their assertion that the future won't reward effort, it'll reward leverage. Chris writes, this is Silicon Valley rot at its worst. Strip away the euphemism and what you're left with is a doctrine that effort, integrity, and expertise are just obsolete. Just find the newest shiny object and ride it to the top. Ultimately, he says this isn't visionary, it's cynical. It asks nothing of the user, no growth, no improvement, no principles, just passive consumption. And in doing so, it turns the human being into a vessel for software prompts.

Starting point is 00:14:31 If Cluely wins, the future will be filled with people who can parrot perfect answers they didn't write, understand, or care about. A world of hollow competence, no thinking, no originality, no soul. Quick-Soddy sword writes, You are moral imbeciles. You compare tools that render certain skills unnecessary, the calculator, to lying about who you are and lying about your own abilities. Why take a walk when you could look at things through VR glasses? Why prepare your own food when you could order a meal delivery? Why raise your own kids when you could pay someone else?

Starting point is 00:14:57 Why even have kids when you could pay someone else to have them? Now, this one was on the original thread, and Roy bit back, saying every single example you cited is something positive that should and does exist in the real world. And he actually goes through point by point. Why take a walk when you could look at things through VR glasses? He says, this is good. Everyone loves that VR and computers let you experience nature. This tech is a net positive for society. Even though the tech exists, people still go outside. On food and meal delivery, he says DoorDash and Uber Eats is a net positive for society. And even though the tech exists, people still prepare their own food. On raising your own kids, he says nannies have existed

Starting point is 00:15:29 forever and they are net positive for society, etc., etc., etc. He concludes every single thing here is a positive and proof that just because tech enables something different doesn't mean the end of society. Now, part of the issue here, as you can probably tell, was articulated by Colin D, who writes, genuinely curious why you chose this go-to-market route versus targeting knowledge workers with a near real-time audio video screen share AI interface. To me, this would make way more sense for things like pair programming, research, due diligence, accounting, etc. Roy responds, this is actually our go-to-market. This video is just a viral company launch video. And I think that people are definitely responding to the fact that the example they chose

Starting point is 00:16:04 to highlight in this video is a guy who's just straight up lying about who he is to a potential date. I think very reasonably people feel like there's something much more sacred about lying to a potential relationship partner than there is about closing a deal on a sales call. Now, as I mentioned before, Roy is very convinced that virality and distribution are the keys to startup success. In fact, in a LinkedIn post, he talks about that video that got him kicked out of Columbia.

Starting point is 00:16:30 He said, my core thesis is that the only thing that works at this stage is organic virality. You simply don't have the capital or recognition to get attention any other way. Nobody's watching your podcast, nobody's reading your newsletter, nobody's reading your lame tweets, and you can't afford ads. The only thing that matters is top of the funnel, so you must do something that goes viral. Ultimately, he says, build with virality in mind, trust your instincts and take more risks. And so clearly the bet here is that this very personal example, which they knew would be much more controversial, was engineered specifically to generate the type of conversation it has.

Starting point is 00:17:01 And the video has been viewed nearly 10 million times at this point, so it's hard to argue that it wasn't effective, at least if the metric of success is just attention. Now, there were also people who came to Cluely's defense. Signal writes, what people don't understand is that cheating in its rawest form is just an exploit. And exploits exist because systems are imperfect. When people cheat, they are not necessarily breaking rules. They're revealing where the rules were already broken. It's a stress test. It's adversarial R&D. In school, if everyone's cheating, maybe the system is incentivizing memorization over understanding. Cheating is an important feedback loop for society, for humanity.

Starting point is 00:17:33 The real danger isn't cheating. It's pretending the system is just fine when it clearly isn't. This is why software like this is insanely interesting. It reveals societal truths and helps us, paradoxically, be a better version of ourselves. Sam Green writes, many will call real-time access to LLMs cheating. Eventually, we'll call it what it really is, intelligence amplification. It starts with an AI that can process visuals and audio from your computer, but it'll eventually escape that containment barrier and bleed into most situations in life like the

Starting point is 00:18:00 internet did over the last few decades. For years, folks have flocked to search engines for their daily work and life challenges. For example, think of the joke about coding actually being 90% Googling. Today, thanks to LLMs, we've already witnessed the time gap between what would have been a Googled problem and a search result solution shrink to almost instantaneous. I expect this trend to accelerate. Intelligence amplification is often defined as the augmentation of human cognitive abilities through the seamless integration of technology. Much like the world adapted to computers and later smartphones, we'll adapt to intelligence-amplifying technology

Starting point is 00:18:28 that instantly orients us to anything we're looking at or listening to. But he says, until you can download skills and memories directly into your brain, approaching intelligence amplification through the lens of cheating has limitations. You can't cheat real expertise, discerning taste, great communication skills, and critical thinking ability. In fact, those qualities become even more valuable as this technology becomes widely adopted.

Starting point is 00:18:48 And I think the key to having a sort of middle-of-the-line take about Cluely is to be able to do a thing that they're not actually doing in their manifesto and in their ad, which is distinguishing between different contexts and different examples of where this sort of so-called cheating might and might not be appropriate. Writer Jon Stokes wrote a piece called In Defense of Cluely.com. And he sums up his thoughts in the section called The Demo is Dumb, but the Promise is Real. He writes, so the kid in the demo was trying to use his AI-powered smart glasses to solve a problem that apparently he has, a total lack of what the kids call rizz. I, a father of three, do not have this young man's particular problem. I do not lack rizz, obviously, and even if I did, rizz is no longer the critical path of anything I'm trying to obtain. But I do have other, more dad-specific problems. For instance, I am a dad who sometimes has to do things to cars, like change the windshield wiper or check a fluid level or interpret a dash light,

Starting point is 00:19:37 or put some chains on tires. This usually involves holding my phone in one hand, with YouTube playing and doing the car thing with the other hand. I also have to do things around the house that involve hot water heaters, breaker boxes, generators, and many other tasks that, again, involve a mixture of YouTube, PDFs from manufacturer websites, tools, and frustration. I want the Cluely glasses real bad for these kinds of tasks. I want to cheat on every single upgrade, repair, replacement, hack, and monkey patch that I do as a homeowner and automobile driver. But then it gets really interesting, and Jon continues, if I owned a factory, I would want employees to cheat, quote unquote, with Cluely as they worked the machines and did all

Starting point is 00:20:09 their troubleshooting and QA. If I owned a restaurant, I would want newly hired waiters to use Cluely to identify by name any repeat customers and their recent orders. And I would also want to have some scripts ready for them, just like in the video, about the specials and what to do in situations where the food is late and the customer is mad and so on. I could make up examples all day. It doesn't even take much imagination. AI augmented AR has tons of obvious potential to yield massive savings in employee training and onboarding in many categories of businesses. Cluely is ultimately just one example of a new category of AI-enabled tools that will change how we live and work. Maybe Cluely itself will go bust. I have no idea if the founder knows what he's doing or not.

Starting point is 00:20:44 But something like this has the potential to make all of our lives a little easier and to instantly upskill many types of employees. So I think Cluely is kind of great, or at least points the way to something great. I could certainly do without all the obvious scumbaggery on display in the launch, but all that aside, there's definitely something here. So here's to Cluely, or at least to the virtuous practical version of it that's inevitably coming to market. And I think this is really key here.

Starting point is 00:21:08 One, I completely agree with Jon that this category of tooling is absolutely inevitable. Real-time immediate information accessibility is just going to be a part of our lives. It will be incumbent upon society to decide where and in what ways we think that's appropriate. Obviously, it's going to exist on a spectrum. On the one end of the spectrum, let's take sales calls. Do we really think that there are a lot of people out there who have some huge moral issue with a sales representative having better real-time information about the customer they're trying to convert? Even if the customer knew they had access to that,

Starting point is 00:21:40 the whole thing that they're looking for out of a sales relationship is being understood. If there are tools that allow them to be understood faster, without them gabbing as much, that's probably a net positive for everyone involved in the exchange. On the other end of the spectrum is the example chosen by Cluely to highlight, which is real important human relationships. And I think this is the one where if you surveyed 100 people, the vast majority of them would not be super down with this. Reducing important human relationships to trivial games to be won or lost

Starting point is 00:22:09 is a pretty sure sign that someone hasn't had really important relationships that go beyond those first couple interactions. In other words, getting to a second date may be a win, but if you think like that, you probably don't realize that you're not actually just playing a game. It's the middle that's going to be harder, honestly. Now, certainly Roy and his co-founders might have a different take on those human relationships,

Starting point is 00:22:30 and I think that we can't discount how wildly generational value changes could happen in this new world. But given all of Roy's comments on virality, let's take it for granted that he is just trying to engineer conversation and that he might even agree that on the spectrum of use cases, this isn't really where Cluely wants to focus. Okay, so we have large consensus that sales are fine, but that relationship cheating is not, at least to the extent that anything can have consensus in this world.

Starting point is 00:22:55 What's really going to be complicated is everything in the middle. The original cheating that inspired this whole thing, these coding exams, are a great example of where the messy middle exists. On the one hand, these firms are genuinely trying to understand someone's capability to do the type of work that they're going to be called upon to do. However, the person who's doing the cheating is basically demonstrating how they're going to do that work in practice with the help of LLMs. And in that case, we get to the very complicated area of how much it matters, not just to have the skill, but to have learned the skill the hard way. I guess my sense is that over time, it seems

Starting point is 00:23:30 unlikely to me that we'll fully lose the value of skills acquisition, but it is going to change in ways that I don't fully understand yet. What's also true, though, is that society and markets are self-correcting, and that to the extent all these things that some people are freaking out about become norms, there will be other market actors who run in completely the opposite direction, and give people the option, at least, to go at it in a different way. Completely separate from the Cluely conversation, I noticed investor Gokul Rajaram

Starting point is 00:23:58 post on both LinkedIn and Twitter recently a request for startups for interview centers. Gokul writes, here's a prediction. The need to remotely screen scores of candidates, coupled with the increasingly rampant cheating on remote interviews, is going to lead to a startup or startups building out clean-room interview centers where candidates will not have access to any aids, tools, or AI agents to help them. There will be human proctors overseeing the interviews. They will crop up in major metros to start with, but then there will be one within a few miles or kilometers of every town, similar to SAT or GRE test centers.

Starting point is 00:24:27 Pricing will be pretty simple: a per-candidate fee, which might be higher for higher-pay technical roles, since cheating will be more sophisticated for those roles. This is an extremely successful investor, Gokul's on the board of Pinterest and DoorDash, who's basically watching this whole conversation around Cluely and saying, while all you yokels debate about whether this is happening, I'm just going to make money from the startup that comes to help calibrate the downside from this. So I don't know, man. The world is going to get weirder.

Starting point is 00:24:50 There are going to be more, not fewer, Cluelys. Ultimately, I'm optimistic that we'll figure it out. But I do think being loud about our opinions about these things is a part of that figuring it out. So whether you love this or loathe this, keep saying so publicly. And hey, if nothing else, maybe you'll end up on this show one day. For now, that's going to do it for today's AI Daily Brief. Until next time, peace.
