Big Technology Podcast - Too Many AI Companies, Amazon's Alexa Upgrade Awaits, RIP Humane Pin

Episode Date: February 21, 2025

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover 1) Satya Nadella's criticism of AI benchmark hacking 2) Ex-OpenAI CTO Mira Murati's new Thinking Machines Lab startup 3) There are too many AI startups 4) Why foundation models have commoditized 5) Did Google 'DeepSeek' itself? 6) Grok 3 arrives 7) How do you evaluate whether models are good? 8) Grok 3 at the top of Chatbot Arena 9) Benedict Evans on Deep Research 10) Does using AI tools make our brains atrophy? 11) Amazon's incoming Alexa upgrade 12) Actually, voice AI helps during marital disputes 13) RIP Humane Pin. Join the Big Technology Discord here: https://www.bigtechnology.com/p/lets-talk-deepseek-ai-etc-on-big

Transcript
Starting point is 00:00:00 There are probably way too many AI companies, but why not? Let's add another. Benedict Evans critiques OpenAI's Deep Research. Amazon has an Alexa upgrade on the way, and the Humane Pin is dead. Off to HP to check your toner. That's coming up right after this. Welcome to Big Technology Podcast Friday edition, where we break down the news in our traditional cool-headed and nuanced format. We have a wild show for you today, because there's a whole lot of news in the world of artificial intelligence and the tech world in general. We've got a new AI company. It is from the ex-OpenAI CTO, Mira Murati. We also have new models from Grok. We have big news from Microsoft, which we're going to touch on right at the top. And we'll look at the death of the Humane Pin. Oh, by the way, there's also an Amazon Alexa upgrade coming. Joining us, as always, to talk about it on Fridays is Ranjan Roy of Margins. Ranjan, great to see you. Welcome to the show. RIP Humane Pin. What could have been? I know. We're both feeling a state of loss. We're both feeling a state of hurt and harm.
Starting point is 00:01:01 We're going to try to do today's show with respect, but also holding back some of those feelings till the very end, so we can analyze what's happening with the appropriate nuance, the appropriate tone. We'll analyze, and we'll emote at the end for the Humane Pin. I want to start this week with a quote from Satya Nadella, who's obviously been listening to this show, because he's channeling Ranjan in a recent interview with Dwarkesh Patel. He's on the Dwarkesh podcast talking about Microsoft's new announcements,
Starting point is 00:01:39 one in Quantum, another world model. And he says this, us self-claiming some AGI milestone. That's just nonsensical benchmark hacking to me. The real benchmark is the world growing at 10%. Very interesting statement for him to make quite provocative, especially because we keep hearing from the company that Microsoft has funded OpenAI that they keep getting closer and closer to these AGI benchmarks.
Starting point is 00:02:05 Yet, Satya is coming out there and I think rightly saying, hey, listen, it's not about getting those benchmarks, it's about can you actually have practical impact on the world. Just one statement in the long podcast that I think is worth listening to, but pretty revealing in terms of where Microsoft stands today, where the discussion is going today, and also So maybe how Satya feels about Open AI? Am I reading too much into it, Ron John? I don't think you're reading too much into it. I think we could break this down.
Starting point is 00:02:32 There's two main levels you can look at this. One is actually the legal and contractual side. Because remember, Microsoft and OpenAI's entire deal actually hinges on when AGI is achieved. And that's a key part of it. And we've been joking for weeks or even months now that all Sam has to do is say, AGI and we're there. So I think him recognizing this and openly kind of calling out this self-claiming of AGI Milestones is ridiculous is important in whatever is going to happen in the future of OpenAI Microsoft's relationship.
Starting point is 00:03:08 But the other is nonsensical benchmark hacking is my favorite line. I feel like, I mean, I want to put that on my own LinkedIn right now because Satya is just, after his, we're good for the $80 billion. and now him coming up with phrases like this, this guy, he's got some words. He's got some lines that I would not have expected from him. Yeah, Sanya is a low-key baller. I mean, he has some sharp elbows.
Starting point is 00:03:35 Yeah, I think put him in the cage match. Elon never challenged him, but I think Sotia actually is currently now my favorite in the octagon that never happened. And it's also just so interesting to me that he's talking about it, the nonsensical benchmark hacking and the context. of Open AI, which is very proudly proclaiming how they're doing against the benchmarks. All right. So speaking of Open AI, they have lost a lot of people recently, shall we say.
Starting point is 00:04:03 2024 was a year of exodus for them. Where are these people gone? Well, we know Ilyssut Skever is doing super safe superintelligence or safe superintelligence. I keep forgetting exactly what it's called, but they're raising billions and billions of dollars without building anything yet. And now we see the emergence of another former Open AI executive, former OpenA. AI CTO Miramiramiradi. She was the CEO for the weekend that Sam Altman was fired. She is out with a new company called Thinking Machines Labs. But it really is true that we have so
Starting point is 00:04:34 many companies that are working on, I think, the same thing. And you're getting to the point where you're like, what are we all doing here? So let me just read from TechCrunch. It's called Thinking Machines Lab. The startup intends to build tooling to make AI work for people's unique needs and goals, and to create AI systems that are more widely understood, customizable, and generally capable than those currently available. Roddy is heading the lab as the CEO. She's also brought over, cast of characters from her former company. The co-founder of OpenAI John Shulman is the chief scientist at this new company,
Starting point is 00:05:10 and Barrett Zof. The former OpenAI chief research officer is the CTO. So they've got the former royalty. from Open AI over there. Outside of Ilya, who's doing his own thing. Let me just read you one more bit from the tech crunch story because I do think I got to the point this week where I was reading words and I was just like, what does this all mean? Thinking Machines Lab plans to focus on building multimodal systems that work with people collaboratively that can adapt to the full spectrum of human expertise and enable a broader spectrum of applications.
Starting point is 00:05:45 I you know we talk a lot about PR on the show and about marketing and positioning because we do think that oftentimes that does reveal the inner soul of a company or at least the inner direction of a company and I can't for the life of me figure out what this word salad meets Ron John and maybe you have some clarification here because I'm scratching my head what was what's funny is I was just about to ask you as you were reading it what does that actually mean what does thinking machines lab do So I'm a little disappointed that you already preempted me by acknowledging it's a word salad because I think you also had left out this part from their blog post. The scientific community's understanding of frontier AI systems lags behind rapidly advancing capabilities. Knowledge of how these systems are trained as concentrated within the top research labs, limiting both the public discourse on AI and people's abilities to use AI effectively. Again, more word salad.
Starting point is 00:06:43 I don't want to say an AI wrote this, but it certainly seems like there's no really concrete, tangible thing that this company will do. It's a tough one because you have a lot of very smart people. We have very regularly said Open AI is effectively a research house that had a business that kind of accidentally walked into a business. Sam Altman himself has described his company as this kind of institute. So to me, it's almost odd that you look at how Ilius Etzkever and safe superintelligence, which again has raised or is raising some ungodly amount of money to achieve some nebulous goal where they're essentially saying, I think they said that they're not even planning on having a product anytime soon.
Starting point is 00:07:34 Here, you're having Miramirati, who I'm sure is going to be able to raise a bunch of money, launching a company with a name that's kind of hard to say, like thinking machine. lab. I feel it should be labs. I lost it pretty good right at the beginning. By the way, it's also a copy of a name of another company that was established back in the day. Oh, really? Yeah. It's a good. I mean, yeah, thinking machines is a good one.
Starting point is 00:07:57 Call it labs, though, Mira, please. But overall, the whole alumni ecosystem from Open AI, you can tell it's a problem because they can raise money so easily without any plan product. or anything simply on promise that I think this whole, you know, like the PayPal Mafia ecosystem of companies did quite well. I don't think the Open AI alumni ecosystem is going to do as well. Maybe they're going to produce some of the greatest research out there, but I don't think they're going to build the actual businesses of tomorrow. Yeah, look, I think we're going to also I have to talk about how there are so many AI companies and what are they going to actually offer that's differentiated? Like, do we need another one working on? It seems like that's going to
Starting point is 00:08:48 work on foundational models. I don't know. We still don't really know. By the way, I've invited Miradi on the show. I think I got her email right. I tried about like 17,000 different permutations until I figured it out. So Mir, if you're listening or if Mir's representation is listening, please come on. But before we go on to the fact that there are so many of these companies and how's this going to shake out? I just want to ask one simple question because you did mention Elias Satskever. He's raising this money.
Starting point is 00:09:16 He has this safe super intelligence lab. It's not going to release a product. I mean, Ranjan, to me, I'm just wondering, how are you going to have safe super intelligence if you have to build a product that delivers returns that VCs anticipate? Like, you have,
Starting point is 00:09:34 taking that VC money, the contract is growth. And if you're going to actually, actually do the job, you're going to have to grow the way that they expected. They expect. And I try to put this to read Hoffman when he was here and he seemed to brush it off. But it just doesn't seem to me like this is compatible. And it's sort of been baked into the entire AI mythology. We need more money, so therefore we grow big, but also we're concerned about the impact of this technology. But then again, like you're raising unprecedented rounds. How is that possibly compatible with safety and slow rolling things if you're actually afraid of them. Yeah, just confirming here, the safe super
Starting point is 00:10:13 intelligence reporting was they're looking to raise over $1 billion in capital at a valuation of $30 billion. Now, how you even come up with some kind of math to get to a valuation on something where they're not even announcing or even pitching a product, I don't understand. But I think if I'm to try to see the other side of this, let's look at Open AI is generating revenue. They're losing a lot of money, but they're still, they're growing quite fast. Claude came out of this model. Anthropic, I mean, Anthropic has been playing this same game. Mistral, I don't know how they're doing in terms of revenue, but they clearly played the same game.
Starting point is 00:10:57 So I guess if we take Anthropic and Open AI, the game presented, this. really specific way of working has had some reasonable success stories. I hesitate calling them success stories because they haven't proven themselves as actual sustainable businesses, but they're generating revenue pretty quickly in terms of like VC growth expectations. So maybe the assumption here is we're just going to get more of that exact pathway that these two companies got or three or the early entrance to this. Just kind of basically, it's clear in my mind like there is no safe path to developing this i don't know i just kind of oh the safety i think is a whole different conversation yeah it's just like i don't know i don't see i don't see how it's
Starting point is 00:11:45 feasible and i understand there's like the promise but i don't know if you're taking a billion dollars of venture capital and you're telling me that like you're you're i'm sure you're in the in the meeting with the vc talking to them about how they're going to deliver the returns um you're not really working for society you're working you're working to deliver that money back i I would love to have seen that pitch from Ilya and Save Super Intelligence. Like, it almost feels like an episode of HBO Silicon Valley where you can imagine him sitting across the table. There's some VCs on the other side and him looking them in the eye and saying, we are not going to build a product.
Starting point is 00:12:23 We are not. And then be like, holy shit. Oh, my God. This guy knows something. What do you need? What do you need? Yeah, it's totally, it's totally bizarre. I think this sort of brings us home to this piece that Casey Newton wrote. There are probably too many AI companies now. It's a great piece and there's a really hilarious subhead. Everyone has a model. Almost no one has a business. I'm just going to read a little bit from it. He says, I've been talking with tech executives about the likelihood of a bubble in artificial intelligence. Everyone I've spoken to describe the experience of seeing another AI company come along with a slightly better or cheaper model than they've been using and quickly swapping it in to replace the previous one. There's a reason foundation models,
Starting point is 00:13:04 have become so commoditized. Most of the original research was published openly for anyone to make free use of. Catching up to the state of the art is often just a matter of acquiring the necessary hardware and a smattering of talent. All the big AI labs are losing money. As models improve and prices fall, it seems increasingly certain that a wave of consolidation and even outright failures will follow. And that is kind of, you know, when I think about what I really is doing what I think about what Mira is doing, what I think about what, even the fact that we have all these established ones, the open AIs, the Anthropics, and GROC now, and DeepSeek, I think Casey has a point. Well, I think he's missing one big part of this. Currently, you kind of have two categories.
Starting point is 00:13:49 You have the model builders, and then you have the model users, the productized side. I don't know what we want to call it. But like, on the more product oriented side, there's still endless products coming out like granola AI have you been using it for a note taking in meetings it's really good it's like it's otter and others it's like you know are all competing in this space but it basically both transcribes your meeting it does it without i don't know if anyone's ever used to otter but there's this like kind of really creepy the otter assistant joins the meeting and shows up on zoom or google meet as like a separate guest and everyone gets freaked out and is uh versus granola uses your sound on your computer, takes notes. However, they've trained the model to actually summarize the
Starting point is 00:14:37 notes and give you action items is good. So here's a product that's probably worth paying like 10 bucks a month or whatever their pricing is. And it could work. Maybe it'll be a sustainable business or maybe Open AI will somehow, or any other large player will be able to kind of just abstract away that specific service and kill their business. But there's that side of it. But then on the actual foundation model side, as you said, we have the Open AIs and the Anthropics and the Mistrals and now GROC from XAI. All of these, their products all look kind of similar. There's a chat interface. Obviously, Open AIs operator was a completely new interface and new looking thing.
Starting point is 00:15:25 Gemini and OpenAI all have good voice modes. you know, it still all looks the same. So that's the side where I do believe Casey's right that that part of the entire stack will get completely commoditized. And the idea that they're the ones continuing to raise the most money and have the richest valuations
Starting point is 00:15:46 does not make sense to me. Let's go to Benedict Evans. He's talking about this commoditization issue that Casey pointed out. Open AI and all other foundation model labs have no moat or defense accept access to capital. They don't have product market fit outside of coding and marketing,
Starting point is 00:16:03 and they don't really have products either, just text boxes and APIs for other people to build products. Deep research, which we talked about last week, is one attempt amongst many, both to create a product with some stickiness and to instantiate a use case. But on one hand, perplexity claimed
Starting point is 00:16:20 to launch the same thing a few days earlier, and on the other, the best way to manage error rates today seems to be abstract the LM away as an API call inside software that can manage it, which of course makes the foundation models themselves even more of a commodity. I think this is such a good point from Benedict Devons, basically saying like what is defensible if the best way to use these things is to sort of implement them in your software and put the proper controls in
Starting point is 00:16:51 so that you cannot, so you don't have the same errors that they do, out, you know, off the shelf. And I think he makes this other good point, which is that, and it's also something that Casey pointed to, look how many deep researches we have at this point. I mean, there's four of them. There's four deep researches. Open AI has one. Google has one. Google is first. Uh, uh, Grock has one and now perplexity has one. And so if you ask me, like, what are the modes for these companies and where are these, you know, trillion dollar valuations coming from? I don't know. I think some of these skeptics have a point. I welcome Benedict to team. It's the product, not the model. We still haven't gotten our t-shirts made, but if and when we do, I'll make sure to send you one because, yeah, he laid it out perfectly. And having built on this stuff, it's so easy often to just switch whatever model you're looking at. Like you build something and then you just change an API call and that's it. And as you said, deep research, what's almost
Starting point is 00:17:55 terrifying and amusing is when I read the words deep research in this quote, I actually didn't know what he was talking about because I've actually started using perplexities deep research this week, which they launched off of deep seek R1 within days. So yeah, I think it's very clear that those kind of interfaces, those kind of use cases are going to be completely commoditized. I still give Google an advantage here because like the distribution side of it is going to become even more important. Gemini is still getting there in terms of the integration in Gmail and other areas. It's still not great. But like I think as those get commoditized, the distribution becomes even more important because if you can inject these type of features into places people already are, it becomes a lot more useful than having to get more people to, to your platform though chat GPT actually just came out today and said now they have 400 million
Starting point is 00:19:00 so speaking again of this commoditization of everything and has deep seek been commoditized at this point so this is actually coming from the big technology discord and look folks I'm not gonna I won't spend too much time hitting you all over the head with the discord pitch but it has been pretty fun in there we have 54 people in there and it's for big technology paid subscribers and if you want to join, you can just go to big technology.com, find the story that says, let's talk deep seek, AI, et cetera, on big technology's new Discord server, sign up for the paid tier and then join us. I think it's been awesome, Ron John. The signal to noise has been insane. Like it is some of the highest value conversations I'm having on AI already.
Starting point is 00:19:45 Oh, no, I completely agree. Signal to noise ratio, I think, is how my other chats certainly do not equal the same quality in signal to noise. But yeah, I've been, I've been learning a good deal. It's been cool. So we got, we have some real builders in there. And I just started a meme's channel. So, I apologize for what I'm going to put in there, but it's going to be fun. As all chats go. So yeah, sign up for the big technology premiere, the premium subscription. It's just $8 a month or 80 a year. And you can join us in the Discord. And if not, no worries, we'll just talk with you on the podcast here. So this is from the Discord, though. Someone wrote, it looks like Google just deepseeked itself. Deepseek, of course, dropped the price of a lot of
Starting point is 00:20:28 stuff. And here we go. This is Gemini 2.0 flash thinking. An input token is seven and a half cents per million on flash thinking and it's 55 cents for Deepseek R1. Output token 30 cents per million and deep seek R1, $2.19. It is, I mean, I, you know, whether, the quality is the same or not, it is just amazing to me that, like, talk, what happens when it's a commodity, right? The margin compresses. I gave a talk about this at Web Summit two years ago. The margin will compress when everything is a commodity. Margins are flying out of the game right now for these foundational model companies. Well, you wanted Javon's paradox, everybody? You're getting it. Because if, when we're all two weeks or three weeks ago now talking about Javon's
Starting point is 00:21:19 paradox and the idea that when things are commoditized, they will be used more. And that's the like really, I don't want to say desperate, but it's the argument that, okay, if this goes down from what used to be like three or four bucks down to seven cents, but we're going to now distribute this at a much larger scale because people are going to build a lot more with it, so we'll make more money. I don't know. I don't know. I think it's the product, not the model, is just getting even more real for this. Gemini 2.0 flash thinking, not the greatest name, which we've discussed many times. They get worse. It's amazing. They get worse. But you know what? The models are getting better. I've started using Gemini a lot more. Gemini voice is really, really good. The latency on it,
Starting point is 00:22:07 the voices they use on it, the conversational ability of it. So I think that, again, this going back from a cost standpoint, and also remember, everyone who is already sitting in Google Cloud and Microsoft Azure, if these models and the actual building capabilities within those get really good, they're always going to beat an OpenAI. If you already have a contract and an entire customer success team with Google, why would you then waste your time with Open AI if it's more expensive and not as good? I mean, you wouldn't. So this is, yeah, it's an argument for Google and an argument for Microsoft also.
Starting point is 00:22:49 And there was this hilarious moment in the Dwarkesh interview with Sadia, where he goes to Satya, he says, all right, you tweeted about Jevin's Paradox after Deep Seek. What is so expensive about artificial intelligence today that I would use more of it if it was cheaper? Because for my vantage point, it's already pretty cheap. It was a great question. Of course, you and I just paid $200 for unlimited Chachipiti. So this is like a stupid thing for me to be saying in that context. But I think the big problem is, again, like, what are people going to do with it? Not they want to use it so much that it has to be cheaper for them to use it.
Starting point is 00:23:25 Now, maybe on the enterprise side, it makes sense to, you know, this Jevin's paradox thing makes sense, especially in the eyes of Satya Nadella, who's selling to enterprises and selling that compute. But it was a really good question. And Satya was basically like, listen, he's like, the price needs to come down and the model needs to be better also. I would disagree on that because it's still such, even when OpenAI, let's say it's not 400 million, but I think whatever the other numbers, 200, 300 million of users of Chat ChbT, it's still tiny relative to the overall population.
Starting point is 00:23:59 And the number of use cases for the average person are still tiny. So I do think that most people, the idea of spending $30 a month on ChatGPT Pro or 20 bucks or whatever it is. They're not ready to do that. And also knowing that most of these companies are losing gobs of money, so they're essentially subsidizing our, meaning you and me, our use. So I do think that there is a world that more use cases that like the average people a person adopts, it'll be important the actual cost to the individual.
Starting point is 00:24:36 I accept that. Ron John for Microsoft CEO. That was even a better answer. than Sanya gave. I don't know. After his nonsensical, what was it? Benchmark hacking. Satya gets to stay.
Starting point is 00:24:50 He'll stay. He'll be the game. There's another, Lord Almighty, another AI model to talk about this week. We'll go quick through this. Elon Musk's XAI has released its latest flagship model. GROC 3, this is according to tech crunch. GROC is XAI's answer to models like OpenAIs GPT4O and Google's Gemini. It can analyze images and respond to questions.
Starting point is 00:25:10 and powers a number of features on X. XAI has used an enormous data center in Memphis, which we've talked about here on the show, containing around 200,000 GPUs to train GROC3. XAI claims GRO3 beats GAPT-4-O on some benchmarks. An early version of GROC3 also scored competitively in chatbot arena. I used it. It spells out its chain of thought in some really interesting ways.
Starting point is 00:25:34 It is supposed to be kind of edgy and real, but I always find it to be, I want it to be good, but I find it to be cringe and I feel like it's like the Steve Buscemi thing, like, how do you do fellow kids when it responds to me? But I'll just say this about Grock. I want to know how well this performs. I think that Elon is such a polarizing figure that the people that like Elon say it's the next generation of model and better than everything else.
Starting point is 00:26:02 And the people that hate Elon say that it's filled with nonsensical benchmark hacking. I mean, it's really, really tough together. a solid read on how it's performing, you know, maybe outside of chatbot arena. What do you think about this, Ron John? Well, I think the overall space of benchmarking, for me personally, I don't want to call it nonsensical, and Satya is using his words, but like, to me, the way, like the benchmarking is always done at such a theoretical level using whatever test has been developed or using, and we saw this with like, what was it, 4.0 was 90% or 80% of the way to AGI based on a test that was defined in
Starting point is 00:26:47 2021. I think to me, like, you see this stuff and you feel it when you're actually using these products. And it's so hard. And I don't know, like, I feel there needs to just be like a normie benchmarking where you just ask, you don't try to like fool it with a math problem or you don't try to like have it create some like PhD level thing, but you just ask it some pretty straightforward questions and see if it gets it wrong or right. I mean, I've even done that where if you're trying to test some type of rag tool, like retrieval augmented generation, put in a bunch of documents and ask it like five questions that you already know the answer to, see if it gets it right or not. And a lot of the times, like open AI connecting
Starting point is 00:27:29 to Google chat GPT, connecting to Google Drive, even with Claude, you ask questions and it just doesn't get stuff right. So, like numbers, really specific facts. I don't know what benchmark would actually show that, but I think it exists, though. I mean, that's, isn't that chatbot arena where basically they put two outputs of chatbot side by side and you let, they let you vote on what the better one is? That's the closest we have. That's the, I agreed.
Starting point is 00:27:54 That's the close. But who is using chatbot arena? Come on. Like, people who care about this stuff. Exactly. But the real, the value is going to be accrued to people who don't currently care and who just want to use it and not spend their time in chatbot arena. No, I'm not saying chat bot arena is a valuable program in and of itself, but I think we can
Starting point is 00:28:13 rely on it as a pretty good evaluation. And, uh, you know, I, I, I, as we're talking, I'm taking a look at chatbot arena. And guess what's number one on chatbot arena? It is groc three. So there's an answer. Yeah, but you don't think that can get gamed pretty well by, I guess it could. I mean, come on. I mean, come on. Elon's fans go into Chat Bot Arena and they say, let's find the edgiest, realist answers and vote them up. Yeah, ask some questions where you want something slightly more not politically correct. And then you'll guess pretty easily which one is GROC 3. And again, if the type of user that would spend time on Chatbot Arena, I think in terms of persona would be an Elon fan.
Starting point is 00:28:59 So I think, like, this stuff can get gameed, basically. I want to know when my mom is using chat GPT versus clot or whatever it is and whatever questions she wants to ask, who's going to give the best answer? Let's go back to Casey Newton. He says, if you accept that GROC is a state-of-the-art model, not a single person working in AI believes it will stay there for long. Leading AI labs push out new models every few days, and any innovations are almost all, are almost,
Starting point is 00:29:29 almost all quickly copied and absorbed by their rivals. Speaking to people who work inside these labs, you get the sense that none of that really matters. To the true believers, AI is the final technology, the one that will invent all the others, and almost all the rewards lay simply in getting their first. They seem not to be building traditional moats for their businesses out of a sense that they don't really need them, that once superintelligence arrives, the world will shift from scarcity to abundance, and the need for money will disappear. For now, that vision remains strictly in the realm of science fiction, but a lot of bills are going to come due between now and then.
Starting point is 00:30:11 I think that is a really great analysis and sort of really ties together a lot of what we've been talking about. Where's the ROI on the Iliya thing? Can Mira Muradi's company exists? where's all the products coming from these labs? I think Casey puts it really well and really succinctly. The people that believe in this, and I've heard this before,
Starting point is 00:30:32 either believe it goes to infinity or zero. And so therefore they're investing it. And therefore they're betting on it. And basically like it doesn't, as long as it survives long enough to the point where it gets to be good enough that it can invent other things, then they're good.
Starting point is 00:30:48 But this is, and again, I was speaking about this with Reid Hoffman a couple weeks ago, that's a big bet. That is a very big and dangerous bet to make. It's a big and dangerous bet. I'm guessing a lot of the investment community either made paper returns on OpenAI already or has FOMO for not being in the early rounds of OpenAI. So I think, yeah, this feeling or thesis, and me sitting here in New York, I don't come across that many people. I feel like if you're in Silicon Valley, maybe you'll hear this more. But I think Casey put it really well. There is this almost religious belief that the
Starting point is 00:31:27 game here is to just build this either AGI or superintelligence or whatever you want to call it, and that will magically solve all the business challenges. Everyone will pay you for the product. Everyone will pay you untold amounts of money for the product. And I guess I don't look at it like that. It's kind of like a science fiction investment thesis. I mean, I guess you have to be believing in science fiction in some sense when you're investing in technology, but this is pretty bold. Let's go back to that pitch scenario of Ilya and the VCs, because I wish Silicon Valley on HBO was still around. Right now, I can never watch that show. It's way too close to home for me. Just couldn't. I loved it. I lived it when I was living in the Bay Area. I was good. That's true. That's true. I saw some
Starting point is 00:32:15 crazy stuff. There's a scene in Kevin Roose's book about some young kid telling him that he's building an automation company and calling it a boomer remover. Yes, and I was there when he was telling both of us this. I'm good with Silicon Valley, man. I mean, I'm happy I spent time there. I'll probably live there again at some point, but I don't need to watch that show. All right, that's fair. That's too much. And what about research removers? So, you know, let's go back to Benedict Evans just for a minute.
Starting point is 00:32:56 So we've wrapped our too-many-AI-companies segment. Probably too many. But if they're right about this science fiction vision, then the joke's on us. As I was reading through, I actually read through the full Benedict Evans post, and it was quite interesting, because Benedict Evans is a researcher, an analyst, and he put deep research through its paces, basically trying to see if it was capable. And what he found was quite interesting. He found that basically it does decent research, but it's often quoting from surface-level sources, often misses obvious sources that are more definitive, and it's just incomplete. Let me read from him. Are you telling me that today's model gets this table, one that he had it produce, 85% right, and the next version will get it 85.5% or 91% correct?
Starting point is 00:33:44 That doesn't help me. If there are mistakes in the table, it doesn't matter how many there are. I can't trust it. If, on the other hand, you think that these models will go to being 100% right, that would change everything. But that would also be a binary change in the nature of these systems, not a percentage change. And we don't know if that's even possible. We don't know if the error rate will go away. And so we don't know whether we should be building products that presume the model will sometimes be wrong or whether in a year or two we will be building products that presume we can rely on the model by itself. That's quite different to the limitations of other important
Starting point is 00:34:20 technologies, from PCs to the web to smartphones, where we knew in principle what could change and what couldn't. Excellent analysis here. Basically, he's saying these models keep improving, so they get like 85% right, 90% right, 95%. He's like, why am I going to use a research tool that I know may never get to 100% right? And if it does get to 100% right, then it's a totally different tech. What do you think about this? I like this. I think this is a really smart take on it,
Starting point is 00:34:51 especially the part around, like, should we be building products with the assumption that this is 85% right or 91% right or whatever it is? Should we be educating people today on how to use a research product that is, call it 85% right? There's tremendous value in that.
Starting point is 00:35:13 I do believe, I've been using it more and more myself. And again, I'm not going to pay ChatGPT or OpenAI 200 bucks now, because Perplexity's is pretty good. It's good enough that I'm okay not spending 200 bucks. Is it a starting point? Is that 85% good? How should I look at the data being presented? How should I look at the sources? Me spending time on how to use it at this error rate is actually valuable for me, and then it makes me a better user of it. But these companies are not going to do that, though. They're not going to build a product for an 85%, because they have to promise that it's going to get to 100%. So now they're going to keep pushing. They're not going to improve
Starting point is 00:35:57 the product. And now this is what I worry about. And I've been saying AI has a brand problem over and over again, because they're going to keep building for that 100% accuracy. Maybe they get there in six months, a year, two years. But in the meantime, more and more people are going to be like, oh, well, this is useless, because they're going to see wrong things. It's like if you use Google search or Wikipedia, you learned to use it. You learn that if I use Google search, the first result might not be the best one. That's okay. I'm going to work my way through the list. This is a process. So I think the smartest point on this is the way we actually release these products into the
Starting point is 00:36:41 wild, there's a big disconnect here right now. Yep, definitely. And speaking of AI's branding problem, you've got to think about Humane, but we'll talk about that, well, it's not the second half because we're in the second half already, but after the break. But before we get to the break, I want to talk about one more piece of the consequences of deep research, which is handing tasks off to AI. There's already some early research that it's atrophying people's brains. And this is from 404 Media. It's about a Microsoft study.
Starting point is 00:37:14 Microsoft study finds AI makes human cognition atrophied and unprepared. A new paper from researchers at Microsoft and Carnegie Mellon University finds that as humans increasingly rely on generative AI in their work, they use less critical thinking, which can result in the deterioration of cognitive faculties that ought to be preserved. A key irony of automation is that by mechanizing routine tasks and leaving exception handling to the human user, you deprive the user of the routine opportunities to practice their judgment and strengthen their cognitive musculature, leaving them atrophied and unprepared when the exceptions do arise. Interesting that it's coming out of Microsoft
Starting point is 00:37:54 research. But this makes sense to me, right? Like, people say that when you use GPS as opposed to a map, that part of your brain kind of goes away. Lord help me, I couldn't get around with a map today. And so maybe we're going to experience the same thing on a wider scale when it comes to these products like deep research, or even just the ChatGPTs. What do you think about this, Ranjan? I think it's both correct but not worrying. I think, as you said, GPS. I mean, I honestly have outsourced a part of my brain, where I remember where I keep things with AirTags. And now, literally, before I would look around for my phone or AirPods or keys, and now I go straight to my phone and find the item and beep
Starting point is 00:38:33 it. Like, I've completely outsourced that part of my brain. So I think there's always good and bad in these kinds of things. But I do think, and you have a Paul Graham piece a bit later, and it's going back to what I was saying a second ago, the process of researching in itself is valuable. Understanding how to look through what you're presented, if it's 85% right, is valuable. So I do think that certain parts, but the actual act of having 20 tabs open and having to click on and copy-paste into a document, that skill will probably go away. Maybe that's not the worst thing in the world. So I think this one's okay. Okay, let's hear from Paul Graham being quoted in The Economist, which uses the most Economist word ever to start this paragraph, and that is ineluctably.
Starting point is 00:39:21 Ineluctably? Why do people write like this? No one speaks like this. If you speak like this, you're a jerk. No, no, the Economist editors, they speak like that. They do. I worked at the Financial Times. I can guarantee you the Economist editors speak like that. Well, I'm going to just skip this word. Paul Graham, a Silicon Valley investor, has noted that AI models, by offering to do people's writing for them, risk making them stupid. Writing is thinking, he has said. In fact, there's a kind of thinking that can only be done by writing. The same is true for research. For many jobs, researching is thinking: noticing contradictions and gaps in the conventional wisdom.
Starting point is 00:40:08 The risk of outsourcing all your research to a super-genius assistant is that you reduce the number of opportunities to have your best ideas. I completely agree with this about writing in particular. I kind of see your yes-and-no perspective, but I'm also probably a bit more concerned than you are, because this stuff all seems pretty real to me. It's not that I have zero concern on this. But again, I think that over time people will figure out how to use it. There's going to be some number of people that do use this in a lazy way and do lose the ability to think through writing. But I think overall, I actually had the most random deep research experience, also related to last week when we were talking about model picking.
Starting point is 00:40:55 And do you know what the opposite of a cake donut is? There's two main kinds of donuts. Is it a potato donut? No, no, no. Is it, you know, like the kind of fluffier ones that are, uh, okay. Anyway, this came up. A bread donut? The answer, I'm not a huge donut guy, but this just came up. Someone was like, I'm a little ashamed of myself now, but sorry, yeah, let's hear it. You can become an expert thanks to deep research. So I was asking, because I use not Google for most of my searches nowadays, you know, what is the other type of donut other than a cake donut?
Starting point is 00:41:36 I asked Perplexity. I accidentally had deep research still clicked. And rather than giving me a simple answer, I received an entire term paper, essentially a PhD paper: the opposite of cake donuts, a comprehensive exploration of yeast-raised donuts and their contrasting characteristics. And it goes on and provides like 6,000 words around this. It's a yeast donut, by the way. And this was where I was also like,
Starting point is 00:42:05 how much random stuff is going to get generated by people asking dumb questions, asking simple questions and picking the wrong model. But these research tools are, they're quite something. Okay, I changed my mind. I'm into this. This will expand the human capacity for cognition. I mean, I might,
Starting point is 00:42:22 You might become a donut expert now. I've learned that the opposite, defined by contrasting leavening methods, textures, preparation techniques, and culinary applications, is unequivocally the yeast-raised donut. That's pretty good. I think that these deep research tools are going to be a godsend to stoners everywhere, who, instead of browsing Wikipedia, will start to read long, half-true
Starting point is 00:43:00 talking with their Amazon echoes all day long about the meaning of life. And I think they're going to have some competition because the Echo and Alexa are about to get better. We'll talk about that right after the break. Hey, everyone. Let me tell you about the Hustle Daily Show, a podcast filled with business, tech news, and original stories to keep you in the loop on what's trending. More than 2 million professionals read The Hustle's daily email for its irreverent and informative takes on business and tech news.
Starting point is 00:43:28 Now, they have a daily podcast called The Hustle Daily Show, where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them. So search for The Hustle Daily Show in your favorite podcast app, like the one you're using right now. And we're back here on Big Technology Podcast Friday edition. A couple of news items to hit before we get out of here for the weekend. Next week, Amazon is going to have a big revamp of Alexa that it will be announcing on Wednesday. This is from Reuters. The AI service will be able to respond to multiple prompts in a sequence and, as company executives have said, even act as an agent on behalf of users by taking actions for them without their direct involvement.
Starting point is 00:44:14 This contrasts with the current iteration, which generally handles only a single request at a time. So it looks like, at long last, we're going to see an Alexa update. I am planning to be in attendance, and I'm hoping that we'll be podcasting in or around the event. So stay tuned for that. But I'm personally looking forward to this. I have three Echoes in my apartment here in New York. And I really want them to work well, to work at all, to even just play some music when I ask them to, where sometimes they do and sometimes they don't. Am I getting over my skis being excited about the Amazon Alexa overhaul, Ranjan?
Starting point is 00:44:55 And what are you looking out for here? Well, I don't want to rain on your parade as an Apple Home user with multiple HomePods everywhere who just gets more and more disappointed by Siri with every new release. But I do want to root for you. I want to believe that Amazon is going to figure it out. I do think, and I'm almost confused by how good voice has become. Gemini on the Gemini app, ChatGPT advanced voice mode. They're so good that there's no reason these voice assistants should not be good.
Starting point is 00:45:34 I don't understand. Like, I'm genuinely baffled why. Even the latency with Gemini and stuff like that, it's really good. So I've been very confused as to what the holdup with all these companies is. And honestly, Amazon brought us voice assistants. Alexa was one of the most revolutionary devices, I think, of the 2010s. So I think if anyone can do it, they have a shot. Well, it's coming.
Starting point is 00:46:02 And it looks like it's going to potentially be a subscription product. This is all speculation. But the Post says Amazon previously planned to launch the improved Alexa with a free trial period, after which customers would have to pay a subscription fee. It is going to be delayed, the Post says, so they're going to announce it on the 26th, but we might not see it until the end of March. And the opportunity is massive for Amazon here.
Starting point is 00:46:27 Also from the Post, Alexa is free on the more than 500 million Echo devices, or Alexa-enabled devices, around the world that can play music, dim lights, and read the headlines. So if they get this right, massive opportunity. But if they don't, it's just more disappointment. But the cool thing about having a piece of hardware that is software-enabled is you can update it at any time, and take something that's mildly disappointing and make it shockingly beautiful and wonderful. And so maybe that is
Starting point is 00:46:58 what Amazon will do. The thing that would worry me here is, and I'm curious if you have use cases beyond, I don't know if you have it turn the lights off and on, but ask the weather, ask sports scores, maybe ask some, like, recipe information. Like, do you have any more advanced use cases currently? I used to do lights. I don't do lights anymore. I think it stopped working as well as I wanted it to, and I sort of gave up on the whole smart home concept.
Starting point is 00:47:25 I do music a lot. I do timers. But every now and again, when my wife and I have an argument about something, I will summon Alexa and say, what's the answer here? And it does get it right every now and again. And I just think that that is going to be like one of the cool use cases. If you have it, you know, be somewhat conversational, start conversations, facilitate things, and be able to answer accurately and sort of actually get the context of your question when you ask something.
Starting point is 00:47:58 Do you know what I will never do? Do you know what I will never do while in an argument with my wife? Turn to my voice assistant and ask who is right. Because does that really go well? It's a huge miss. It's a huge miss. Honestly, it really does defuse some of the tension, because you're just like, well, at least we're not as dumb as this Echo here.
Starting point is 00:48:21 And then, you know, just. All right. All right. But if it gets smart. What if it gets smart and starts saying, Alex, you're wrong, you're wrong? You're like, is that okay? Look, then I'm owned. I have to admit it. Right. That's true. Two against one. Yeah. I just think, I mean,
Starting point is 00:48:33 I have to admit it. Right. That's true. Two against one. Yeah. I just think to, I mean, And this is not marriage advice or relationship advice, but I think everybody out there, if you're trying to find ways to bring common ground with those that you love, I think that you might want to just try bringing a voice AI into the conversation and seeing what happens. Maybe it will be great.
Starting point is 00:49:03 Maybe it will be terrible, but at least you tried, and that's what matters. If any listeners do go down this path, write in. Let us know how it went. You can find us at Ranjan Roy at gmail.com. I don't even know if that's your email address. But yeah, please don't do this in any high-stakes moment. No. Okay. No.
Starting point is 00:49:25 So that will not be our fault. Speaking of high stakes. Just keeping it in the low-stakes arguments. You know, what's the capital of Greece? When you say Athens, she says Sparta. And then you call in your voice assistant to tell you the capital of Macedonia. You know, it's just how it goes. Okay, that is fair.
Starting point is 00:49:44 I'll give you, if it's purely information or factual, maybe. But if it's like... Can you believe that? Full blown. You never pick up the kids. We're always at your parents. Alexa, who's right? Who's right?
Starting point is 00:49:56 I think that Alex is wrong. Well, I've been tracking both of your movements regularly for the last two years, and I can tell you exactly who's right. Yeah. Honestly, it's the true innovation that we need. It's going to herald the new American century and, once again, take the AI discussion away from models and toward products. See, just trying to help you, Ranjan. It's the product.
Starting point is 00:50:18 All right. A few minutes on the Humane Pin before we leave. The Humane Pin is dead. If you all remember, it was this pin that you would wear on your shirt, basically, and you could touch it and it could give you AI stuff. And it would project stuff on your hand when you needed some information. HP will acquire the assets from Humane, the maker of the wearable AI Pin
Starting point is 00:50:41 introduced in late 2023 for $116 million. The deal will include the majority of Humane's employees in addition to its software platform and intellectual property. However, it will not include Humane's AI pin, which will be wound down. I saw a great meme about this. I should drop it in the Discord meme section
Starting point is 00:51:00 about someone hitting the Humane Pin and projecting on their hand that their printer is low on toner. Toner, yeah. But, by the way, they raised $230 million. So it really is an ignominious end. Is that the right way to pronounce it? For one of the AI era's worst-conceived and worst-marketed products.
Starting point is 00:51:24 Rest in peace, Humane Pin. We will not miss you. We will remember you with the likes of Quibi and Quixter and other terrible failures that began with Q. But you began with an H, Humane Pin. HP. HP now stands for Humane Pin. That's the rules. I don't make them up. That's the rules. I think it's not the worst outcome. Again, as MG Siegler pointed out, they're selling for $116 million. So for something that feels like this dramatic a failure, they somehow got something out of it, at least for the initial investors. But I was thinking about, remember we came up with the concept of pincasts
Starting point is 00:52:04 instead of podcasts, specific to the pin? That's where I'm a little disappointed. Where you're like, it's some kind of interactive podcast that you're listening to, and you're talking to your pin and it's projecting some information. I'm a little disappointed.
Starting point is 00:52:18 I did not bet it all on making pincasts. That's right. We should have done that, but we didn't. And we could have really, I think, saved Humane with either pincasts or, just imagine, you're in a dispute with a loved one. It all comes back to solving arguments.
Starting point is 00:52:41 Things are going rough, like real rough, and you're like, sorry, honey, let's bring in the pin. You tap your pin twice, and then it settles the matter for you. And it's right above my heart as well. You ask who is right, and it projects the answer on your hand in green. Let's read MG Siegler just before we go. He goes, a regular person might read that headline, that the company sold for $130 million, and think, wow, a startup sold for nine figures. Impressive. Of course, it's not impressive in this use case. It's a fire sale for a company that has been under duress for months after their product, the AI Pin, failed to catch fire in the market. Actually, that's not technically true. There was a literal risk of fire when charging the device, which led to a recall. I completely forgot the Humane Pin lights-on-fire side
Starting point is 00:53:46 of this story. Never good, but apparently that's what happened. I do think, I do wonder about these kinds of things. If those launch videos weren't so bad, did it stand a chance? Like, sometimes you wonder, was it really the technology? Also, it was released at a point where voice interaction with generative AI was, again, it was only two years ago, a year and a half ago. Voice mode and voice interaction have, like,
Starting point is 00:54:10 exponentially jumped in quality. So it's a timing issue for one, but the marketing and those launch videos will live in history as some of the worst I've ever seen. And could this have stood a chance if their launch videos weren't just memed into oblivion? Yeah, that's what MG Siegler is basically saying here. He says the PR strategy was obviously a disaster from the get-go. This is easy to say in hindsight,
Starting point is 00:54:29 but many people were saying it in real time. From their grandiose, nebulous introduction video to their first product unveiling on stage at TED, it was seemingly less of a sound strategy than a vignette of clichés. And by God, we've heard a lot of vignettes of clichés lately. This is me, not MG. But he says it also includes a $699 price point that was obviously never going to work for this product, and on top of that, a monthly subscription fee. Like, cool. Everyone was like, oh, that might be cool. Maybe we weren't. But just the dynamics of this business were never going to work. And that kind of leads me to our final question of the
Starting point is 00:55:05 week, which is, there is something to be said that you kind of have to hand it to these people for actually going ahead and building. But the other side of that is, like, I don't know, do you have to applaud everybody that tries, or can you just say, that sucked and I'm disappointed that happened? And I'm kind of in the second category on this one. Well, I think I have a lot more sympathy or empathy for people who weren't able to raise $240 million before even launching the product. And then the production value, and just knowing how expensive those launch videos must have been to make and how much they must have spent on the agencies. So, no, I don't have sympathy for them.
Starting point is 00:55:47 I think, like, come on, you can make a good, engaging video pretty easily nowadays. You should have made one. And maybe we would all be wearing our pins as we argue with loved ones and listen to our pincasts, and the world would actually be okay right now. Everything would have been okay if Humane didn't screw it up. They should have called in Kantrowitz and Roy. We would have told them. Got to show the use case. Got to show the product. And got to show the humanity of your pin. And instead, they burned through $240 million. Could have saved
Starting point is 00:56:23 democracy. Could have saved it all. Instead, there's no more pin. It's with the printers now. That is the ultimate graveyard of innovation, I think, that one can ever imagine: Hewlett-Packard's printers division. There goes any chance that HP will ever sponsor this podcast, but it was worth it. So thank you. Thank you for that, Ranjan. But if you're listening out there, HP, good luck with Humane. Maybe you'll do something cool, and we're waiting for it. Bring back the pin. Bring back the pin. In the meantime, we're wishing you all love and success and happiness and tranquility at home. And remember, if anything goes wrong, just ask Alexa. So, Ranjan, great to see you as always. Thanks for coming on the show.
Starting point is 00:57:08 All right. Hopefully my marriage survives the weekend and I don't ask Alexa to resolve anything. Okay, everybody, thank you for listening. Chris Hayes is coming on next week. We actually have a very lively and fun conversation about how social media and publishing differ, and all this other stuff that he's been talking about on his tour for The Sirens' Call, his number one best-selling book. So I hope you stay tuned for that. Thank you for listening. Thank you to Ranjan. And we'll see you next time on Big Technology Podcast.
