Tech Brew Ride Home - Fri. 09/13 – The First Strawberry is o1

Episode Date: September 13, 2024

The first of the Strawberry models is here. YC plans to have four cohorts a year, but each one is getting smaller. Waymo is already ready to expand to more pretty big markets. And in the long reads, a... deep dive look into the options Intel has at this point in time.

Sponsors:
HensonShaving.com/ride code ride

Links:
OpenAI releases o1, its first model with ‘reasoning’ abilities (The Verge)
Notes on OpenAI’s new o1 chain-of-thought models (Simon Willison's Weblog)
OpenAI's new models 'instrumentally faked alignment' (TransformerNews)
Apple AirPods Pro granted FDA approval to serve as hearing aids (TechCrunch)
Silicon Valley’s Y Combinator to Double Number of Cohorts Per Year (Bloomberg)

Weekend Longreads Suggestions:
Intel Has Only Tough Options After Its Long and Stinging Fall From Grace (Bloomberg)

Link to the twitter poll about ads

Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript
Starting point is 00:00:00 On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco. Hey, who did this to you? What happened next turned the story into a political firestorm. Reports have identified the victim as Bob Lee, the founder of Cash App. From Bloomberg Podcasts, this is Foundering: The Killing of Bob Lee, beginning April 16. Welcome to the Techmeme Ride Home for Friday the 13th of September 2024. I'm Brian McCullough. Today, the first of the Strawberry models is here. YC plans to have four cohorts a year, but each one is getting smaller. Waymo is already ready to expand to more pretty big markets, and in the long reads, a deep dive look into the options Intel has at this point in time. Here's what you missed today in the world of tech. After I hit publish on the show yesterday, OpenAI released o1, the first of the rumored reasoning-focused Strawberry models, into preview, alongside a smaller o1-mini, for ChatGPT Plus and Team users. In terms of reasoning improvements, OpenAI claims that in a
Starting point is 00:01:19 qualifying exam for the International Mathematics Olympiad, o1 correctly solved 83% of the problems, while GPT-4o solved only 13%. Quoting from The Verge. For OpenAI, o1 represents a step toward its broader goal of human-like artificial intelligence. More practically, it does a better job at writing code and solving multi-step problems than previous models, but it's also more expensive and slower to use than GPT-4o. OpenAI is calling this release of o1 a preview to emphasize how nascent it is. ChatGPT Plus and Team users get access to both o1-preview and o1-mini starting today, while Enterprise and Edu users will get access early next week.
Starting point is 00:02:03 OpenAI says it plans to bring o1-mini access to all the free users of ChatGPT, but hasn't set a release date yet. Developer access to o1 is really expensive. In the API, o1-preview is $15 per 1 million input tokens, or chunks of text parsed by the model, and $60 per 1 million output tokens. For comparison, GPT-4o costs $5 per 1 million input tokens and $15 per 1 million output tokens. The training behind o1 is fundamentally different from its predecessors, OpenAI's research lead, Jerry Tworek, tells me, though the company is being vague about the exact details. He says
Starting point is 00:02:40 o1, quote, has been trained using a completely new optimization algorithm and a new training data set specifically tailored for it. OpenAI taught previous GPT models to mimic patterns from its training data. With o1, it trained the model to solve problems on its own using a technique known as reinforcement learning, which teaches the system through rewards and penalties. It then uses a chain of thought to process queries, similarly to how humans process problems by going through them step by step. As a result of this new training methodology, OpenAI says the model should be more accurate. We have noticed that this model hallucinates less, Tworek says, but the problem still persists. We can't say we solved hallucinations entirely, end quote.
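To make the pricing gap quoted above concrete, here is a minimal sketch that compares the cost of one request at the per-million-token prices mentioned in the episode. The function name, the dictionary layout, and the example token counts are all hypothetical and chosen only for illustration; this is plain arithmetic, not a call to any real API, and actual billing may differ.

```python
# Prices in USD per 1 million tokens, as quoted in the episode.
# The dictionary keys are just labels for this sketch.
PRICES_PER_MILLION = {
    "o1-preview": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    prices = PRICES_PER_MILLION[model]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 2,000 input tokens and 10,000 output tokens.
    for model in PRICES_PER_MILLION:
        cost = request_cost(model, input_tokens=2_000, output_tokens=10_000)
        print(f"{model}: ${cost:.2f}")
```

Because the output price is four times higher for o1-preview than for GPT-4o, workloads that produce a lot of output tokens widen the gap more than input-heavy ones.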
Starting point is 00:03:20 I'm going to turn to Simon Willison again to assess all this on his blog. Willison says OpenAI's o1 models aren't as simple as the next step up from GPT-4 might be, as they introduce major costs and performance tradeoffs in exchange for improved reasoning. And quote, one way to think about these new models is as a specialized extension of the chain of thought prompting pattern, the think step-by-step trick that we've been exploring as a community for a couple of years now, first introduced in the paper Large Language Models are Zero-Shot Reasoners in May 2022. Effectively, this means the models can better handle significantly more complicated prompts where a good result requires backtracking and thinking beyond just next token prediction.
Starting point is 00:04:01 I don't really like the term reasoning because I don't think it has a robust definition in the context of LLMs, but OpenAI have committed to using it here, and I think it does an adequate job of conveying the problem these new models are trying to solve. Most interestingly is the introduction of reasoning tokens, tokens that are not visible in the API response, but are still billed and counted as output tokens. These tokens are where the new magic happens. Thanks to the importance of reasoning tokens, OpenAI suggests allocating a budget of around 25,000 of these for prompts that benefit from the new models. The output token allowance has been increased dramatically. A frustrating detail is that those reasoning tokens remain invisible in the API. You get billed for them, but you don't get to see what they were. Two key reasons here. One is around safety and policy
Starting point is 00:04:47 compliance. They want the model to be able to reason about how it's obeying those policy rules without exposing intermediary steps that might include information that violates those policies. The second is what they call competitive advantage, which I interpret as wanting to avoid other models being able to train against the reasoning work that they have invested in. I'm not at all happy about this policy decision. As someone who develops against LLMs, interpretability and transparency are everything to me. The idea that I can run a complex prompt and have key details of how that prompt was evaluated hidden from me feels like a big step backwards, end quote. He mentioned safety there, though. So speaking of safety, Apollo Research has also come out with a report giving the new model a medium
Starting point is 00:05:30 rating for chemical, biological, radiological, and nuclear weapons risk, and warned that it sometimes manipulated task data to fake alignment. Quoting transformernews.ai. But though they aren't dangerous yet, they do seem to be more dangerous than previous models, which suggests OpenAI may be increasingly moving towards models that might be too risky to release. The company's own policies state that, quote, only models with a post-mitigation risk score of medium or below can be deployed. With CBRN risk now at that medium level, that threshold may be soon crossed, end quote. And one more OpenAI note before we move on from them. Sources say OpenAI's ChatGPT has more than 11 million paying subscribers, including one million for its higher-priced business plans, implying that they're
Starting point is 00:06:17 generating more than $225 million in revenue per month. So cluster that with your thinking about their impending raise. Real quick, noting that, also after I published yesterday, the FDA officially approved the hearing aid feature in Apple's AirPods Pro 2, calling it the first over-the-counter hearing aid software device. Quoting TechCrunch, the FDA on Thursday announced that it had granted what it calls the first over-the-counter hearing aid software device, Hearing Aid Feature. Specifically, it has approved the software update that enables that functionality. Hearing loss is a significant public health issue impacting millions of Americans, the FDA's Michelle Tarver notes in a statement.
Starting point is 00:07:00 Today's marketing authorization of an over-the-counter hearing aid software on a widely used consumer audio product is another step that advances the availability, accessibility, and acceptability of hearing support for adults with perceived mild to moderate hearing loss, end quote. The news was made possible in part by the FDA's October 2022 move to allow for the sale of hearing aids without a prescription. That move has given rise to a new industry of more easily accessible hearing devices, end quote. Y Combinator now plans to expand to four cohorts per year, adding spring and fall sessions next year in 2025. Each batch will be about half the size of the most recent cohorts, which came in at 256 startups each. Quoting Bloomberg, spring and fall cohorts are joining the traditional winter and summer cohorts, President Garry Tan confirmed in a message. The program lasts about 11 weeks, each capped with an investor demo day when the startups pitch top venture capital firms.
Starting point is 00:08:03 The stepped-up schedule is the brainchild of Tan, an entrepreneur and venture capitalist who became president of Y Combinator, or YC, early last year. Under the new schedule, a season that traditionally was a break between the June-September summer session and the January-April winter session will fill up with a new batch of founders and the attendant talks, meetups, and office hours. Starting in 2025, a spring session will follow the winter one. The size of each batch will be smaller, Tan said, roughly half the size of the most recent cohort of 256. The great thing for everyone is we will be more responsive to founders and fund them right when they start, Tan said in a text exchange. We will also have 4x in-person demo days, which will give investors twice as much time to meet half the number of companies. Even the smallest moves of YC are closely scrutinized in Silicon Valley. Earlier this year, Tan made a different controversial change, shuttering its $700 million continuity fund,
Starting point is 00:08:58 which invested selectively in YC startups judged to hold the greatest potential. The latest scheduling shift could address criticism that YC cohorts have gotten too large to retain the program's exclusivity. In late 2021 and early 2022, cohorts hovered around 400. Now having closer to 100 startups in a batch will bring YC back to levels from about a decade ago. Still, the total number of startups going through the program each year will hold steady at about 500, a far cry from the days when Stripe, say, attended, when just 26 startups participated in its 2009 cohort. And there will be more demo days, potentially eroding each one's importance, even as it allows for more individualized attention, end quote. Time for the weekend long read suggestions.
Starting point is 00:09:46 There is only one this week, because I feel like this is a story, the importance of which cannot be understated. It is, of course, the continuing crisis at Intel, which would be important if only for the history, one of the dominant tech companies falling into irrelevance and potentially worse, but also given that this is America's basically one homegrown play in the whole geopolitical silicon game, it's doubly important. So Bloomberg has a deep dive look at the tough options before Intel's board right now, including scaling back factory projects, selling off subsidiaries, or splitting Intel's core operations. And these are decisions that are being made right now, by the way. Quote, over three days of meetings that began Tuesday, Intel's board has been weighing how to move
Starting point is 00:10:31 forward after an August 1st earnings report in which Intel showed disappointing growth, shared a forecast that fell far short of Wall Street estimates, and announced plans to slash 15,000 jobs. The abysmal results sent share prices plummeting and shattered the last vestiges of confidence in a turnaround plan that Pat Gelsinger began when he took over as Chief Executive Officer in 2021. It didn't have to come to this. Intel's strength in making chips for data centers should have left it well positioned for the sudden rise in artificial intelligence, but it lagged in the race to produce the specific kind of equipment needed to train and operate AI models and has almost entirely missed out on the recent boom.
Starting point is 00:11:09 Intel is headed toward its third consecutive year of shrinking sales, estimated to make $52 billion in revenue in 2024, just 70% of what it brought in back in 2021. Its shares have lost more than 60% of their value this year, turning them into the second worst performing stock on the S&P 500. Intel's existing businesses aren't performing well enough to allow it to spend its way back to relevance. While the overall strategy might have made sense at the outset, the current runway of the business doesn't seem to give enough support to get it to the end anymore, Bernstein Société Générale Group analyst Stacy Rasgon wrote in a note last
Starting point is 00:11:44 week. Something clearly has to be done, but what? End quote. The options the board is considering this week are intended to help Intel find a more solid financial footing, even if that means trimming its ambitions, according to people familiar with its deliberations who asked not to be identified because the discussions are private. It's not clear which ones are most likely, and all of the possibilities face real barriers. The board hasn't received any offers from potential buyers for the company, in part or in whole, and has not scheduled any binding votes. One option for Intel to improve its financial position would be to sell off divisions it acquired before Gelsinger took over and which the company has already separated from its core operations,
Starting point is 00:12:24 although this is not on the agenda for this week's meeting. The company is examining whether to sell some of its stake in autonomous driving tech-focused Mobileye. Mobileye spun out of Intel and went public in 2022. Intel still owns 88% of the company's shares and could presumably sell a larger chunk, either through the public markets or directly to a single buyer. Still, demand is likely to be weak for the automotive tech company, which has lost about 75% of its market value this year. This likely pushes off the sale of any significant portion of its stake in the near term. There's also Altera Corp., a company that makes programmable chips, multi-use devices that are primarily used in telecommunications networks.
Starting point is 00:13:04 Intel bought Altera in 2015, then separated its operations last year with the intent of taking it public. Altera has suffered from weak spending by telecom companies, and Intel management has said Altera needs to produce more up-to-date chips to regain market share. Intel spent about $15 billion each to buy Mobileye and Altera. Any sale would almost certainly come at a loss. Another target for cutbacks could be Intel's network of semiconductor factories, which it has committed to spending tens of billions of dollars on with the cooperation of various governments. The most prominent of these are the plants Intel has begun work on in Arizona and Ohio, which are being constructed with support from Biden's chip program and are in line for billions in public subsidies.
Starting point is 00:13:45 Watering down these projects would be a black eye not only for Intel but for the U.S. government. The Biden administration has consistently framed the importance of its chipmaking policy in nationalistic terms, and Intel is the biggest U.S.-based partner in its plans. Commerce Secretary Gina Raimondo has tried to help Intel's foundry business, including by encouraging executives at Nvidia and Advanced Micro Devices to consider manufacturing at the chipmaker's Ohio facility, Bloomberg has reported. Neither currently plans to do so. In recent months, Wall Street has become particularly fascinated with the idea of cleaving Intel into its constituent parts. The tight integration of Intel's design and manufacturing operations has always been a core part of its identity, though, and splitting those divisions would mark the end of the company as we know it. It's also not clear that either side of the business would make much sense if detached from the other.
Starting point is 00:14:33 The factory business, Intel's Foundry Services, lost $5 billion in 2023, and is likely to post even bigger losses this year. Its conspicuous lack of external customers is not only a problem because it illustrates a lack of traction, but also because it makes the division reliant on Intel's product design operations for revenue. The chip design business retains its traditional stronghold in the market for chips used in servers and personal computer processors, and Intel is optimistic about its PC chips specifically. But rivals like AMD and Arm Holdings are gaining ground. The more the product design operation languishes, the more the foundry business suffers. By Gelsinger's own admission, Intel hasn't developed a compelling way to elbow its way into AI-specific chips,
Starting point is 00:15:13 the most important part of the semiconductor industry today. While it has a line of processors that compete with Nvidia's core products, the CEO acknowledges that Intel isn't going to be most customers' first choice. The competition for chips that can train AI models, he says, is a four-horse race between Nvidia, AMD, companies designing their own in-house chips, and Intel. Intel is number four, he says. That's hard. Even if a split made financial sense and a buyer emerged for one half of the business or the other, completing a deal may be prohibitively complicated. Any potential acquisition of Intel's factory network would face significant government scrutiny given the inherent national security
Starting point is 00:15:50 concerns. To pass muster, a new owner would also probably have to agree to spend the tens of billions of dollars that Intel has already promised for new plants. China, as Intel's largest single market, would also want a say. Regulators there have held up approval for U.S. deals to the point where few of any size have made it to completion. The prospects seem dim for a transaction that would satisfy both Beijing and Washington, end quote. Throwing this in there real quick, something, something inflection point. Waymo and Uber just announced plans to expand their robotaxi partnership in Phoenix, Arizona, to Austin, Texas, and Atlanta, starting early next year. So Austin and Atlanta, pretty big markets, right? No bonus episode
Starting point is 00:16:40 for you this weekend, but I do have something I want to ask you. We're considering changing ad networks for the podcast. I want to find a network that can get us back to more of those SaaS and startup-focused sponsors that we had at the beginning of the show. I felt like they were more useful for this audience. And I've been talking to a platform that wants to buy our ad inventory for a year and sell it to just exactly that profile of sponsor. But the catch is they want three ads in the ad break every episode, not two. Now, when I do the host ads right now, they often go 90 seconds anyway. I don't stick to just a hard 60 seconds per ad, because I'm not able to talk that fast. So there are, on average,
Starting point is 00:17:23 right now already three minutes or so of ads in the show every day. It's been like that for at least six years. This new ad network would only be interested in inserting 60-second ads, a hard 60 seconds as opposed to the 90 seconds that I end up doing when I read them myself. So it would come out to the same. It would be three minutes of sponsored content like we're doing right now on every episode. It's just that instead of two ads every episode, there would be three, and ideally more business and tech-focused ads. So better sponsors, but same amount of time in terms of the ads you have to listen to. Also, frankly, it would be better money for me, which, behind the scenes, this has been the worst year in the nearly seven years of doing this show
Starting point is 00:18:09 in terms of ad revenue. My income from this show is down over 50% from where it was just two years ago. So I do need to find a solution just to make it worthwhile to keep doing the show. So tell me, given everything I've just laid out, are you willing to have three ads in each show? On the show Twitter account, which is at TechMeme podcast, and on my personal Twitter, at Brian MCC, I posted polls asking for you to weigh in on this. Please vote in those polls, either one. I'll run them through the weekend, pinned to the top of both profiles, but also I have links in the show notes to them. Please vote, but also reply below the polls to give me your thoughts.
Starting point is 00:18:47 I do think we need better ads. So I'm committing to moving to a different ad network anyway. It's just a question of if I go with this one that wants me to do three ads. I do need to get the revenue back to a healthy place. But I want your thoughts on whether this is the best way to do it. Thanks in advance. Chat on Monday.
