Tech Brew Ride Home - Tue. 05/14 – GPT-4o
Episode Date: May 14, 2024
OpenAI unveils GPT-4o which makes Siri look like the technical cul-de-sac it very much is. But what does it mean that this was NOT GPT-5? What does it mean for the gaming industry that the PS5 might be underperforming? More streaming bundles. And the 2024 iPad refresh reviews.
Sponsors: YahooFinance.com, ConstantContact.com
Links:
OpenAI debuts GPT-4o ‘omni’ model now powering ChatGPT (TechCrunch)
OpenAI debuts new model with enhanced real-time voice abilities (Axios)
Tom Warren's PS5 sales Tweet
Comcast to Launch Peacock, Netflix and Apple TV+ Bundle at a ‘Vastly Reduced Price’ (Variety)
The new Apple iPad Air is great — but it’s not the one to get (The Verge)
Apple iPad Pro (2024) review: the best kind of overkill (The Verge)
Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco.
Hey, who did this to you?
What happened next turned the story into a political firestorm.
Reports have identified the victim as Bob Lee, the founder of Cash App.
From Bloomberg Podcasts, this is Foundering, the Killing of Bob Lee, beginning April 16.
Welcome to the Techmeme Ride Home for Tuesday, May 14th, 2024.
I'm Brian McCullough.
Today, OpenAI unveils GPT-4o, which makes Siri look like the technical cul-de-sac it very much is.
But what does it mean that this was not GPT-5?
What does it mean for the gaming industry that the PS5 might be underperforming?
More streaming bundles and the 2024 iPad refresh reviews.
Here's what you missed today in the world of tech.
It wasn't GPT-5, which we'll get to in a second,
but OpenAI yesterday unveiled GPT-4o, as in the letter O, a new flagship generative AI model that is
faster and natively multimodal, rolling out for free to all ChatGPT users in the coming weeks.
Quoting TechCrunch, the O stands for omni, referring to the model's ability to handle text, speech, and video.
GPT-4o is set to roll out iteratively across the company's developer and consumer-facing products over the next few weeks.
OpenAI's CTO Mira Murati said that
GPT-4o provides GPT-4-level intelligence, but improves on GPT-4's capabilities across multiple
modalities and media.
GPT-4o reasons across voice, text and vision, Murati said during a streamed presentation at OpenAI's
offices in San Francisco on Monday.
And this is incredibly important because we're looking at the future of interaction between
ourselves and machines.
GPT-4 Turbo, OpenAI's previous most advanced model, was trained on a combination of
images and text and could analyze images and text to accomplish tasks like extracting text from
images or even describing the content of those images, but GPT-4o adds speech into the mix.
What does this enable?
A variety of things.
GPT-4o greatly improves the experience in OpenAI's AI-powered chatbot, ChatGPT.
The platform has long offered a voice mode that transcribes the chatbot's responses using
a text-to-speech model, but GPT-4o supercharges this, allowing users to interact with ChatGPT
more like an assistant.
For example, users can ask the GPT-4o-powered ChatGPT
a question and interrupt ChatGPT while it's answering. The model delivers real-time responsiveness,
OpenAI says, and can even pick up on nuances in a user's voice, in response generating voices
in a range of different emotive styles, including singing. GPT-4o also upgrades ChatGPT's
vision capabilities. Given a photo or a desktop screen, ChatGPT can now quickly answer related questions
from topics ranging from what's going on in this software code to what brand of shirt is this person
wearing. These features will evolve further in the future, Murati says. While today, GPT-4o can look at a
picture of a menu in a different language and translate it, in the future, the model could allow
ChatGPT to, for instance, watch a live sports game and explain the rules to you.
GPT-4o is more multilingual as well, OpenAI claims, with enhanced performance in around 50 languages,
and in OpenAI's API and Microsoft's Azure OpenAI Service, GPT-4o is twice as fast as, half the price of,
and has higher rate limits than GPT-4 Turbo, the company says.
At present, voice isn't a part of the GPT-4o API for all customers.
OpenAI, citing the risk of misuse, says that it plans to launch support for GPT-4o's new audio
capabilities to a small group of trusted partners in the coming weeks.
GPT-4o is available in the free tier of ChatGPT starting today and to subscribers to
OpenAI's premium ChatGPT Plus and Team plans with 5x higher message limits.
OpenAI notes that ChatGPT will automatically switch to GPT-3.5, an older and less capable model, when users hit the rate limit.
The improved ChatGPT voice experience underpinned by GPT-4o will arrive in alpha for Plus users in the next month or so, alongside enterprise-focused options, end quote.
So one quick thing to note: that mysterious gpt2-chatbot that we were talking about recently was GPT-4o, so mystery solved there.
Also, OpenAI opened its GPT Store to all users for free,
including the ability to create custom GPTs. OpenAI debuted the GPT Store for paid users back on January 10th.
There is also a desktop version of ChatGPT, initially just for the Mac, and, you know, quoting Tom Warren,
Microsoft has invested more than $10 billion into OpenAI, and the first desktop app it releases is for macOS because it's, quote, prioritizing where our users are.
Ouch. It plans to launch a Windows version later this year, end quote.
But what all of social media has been focused
on are the demos. Quoting Axios, OpenAI showed off real-time interactions with the voice assistant
in ChatGPT, including faster responses and the ability to interrupt the AI assistant. In one demo,
OpenAI showed one of its workers getting a real-time tutorial on taking deep breaths. Another showed
ChatGPT reading an AI-generated story in different voices, including super-dramatic recitals,
robotic tones, and even singing. In a third demo, a user asked ChatGPT to look at an algebra
equation and help the person solve it rather than simply providing the answer.
In all the demos, GPT-4o showed considerably greater personality and conversational skills than it has previously had.
OpenAI showed the new chatbot working simultaneously across languages, in this case, helping translate between English and Italian.
The demos highlighted ChatGPT's multimodal capabilities across visual, audio, and text interactions, with the AI assistant able to use a phone's camera to read written notes and to attempt to detect the emotion of a person, end quote.
I highly recommend checking out the demo video where they got two AI bots to talk to each other, one describing the room to the other, which couldn't see, even with full understanding of events that had happened previously. Also, the video of translating languages in real time has been shared a lot. Now, that is something that has existed for a while, but that's the thing. When you watch these new demo videos, it's not just the new stuff this bot can do, especially with the voice stuff and the natural conversational interactions. It's the fact that it's just so much better, faster.
It's as close to the bot from the movie Her as we've gotten yet.
You tell the bot to be flirty, it flirts, you tell it to be sarcastic, it is.
It makes Siri look like Clippy from Windows 95, though, because it happens almost instantaneously,
again, at least in these video demos, though people online have been reporting similar speediness.
Which brings us to all the stories about Apple negotiating with OpenAI to integrate something into the next version of iOS.
I assume what happened here is that Apple saw early demos of this thing
and realized there's no way we can launch something that is half as good as this, so let's just join forces.
If Apple doesn't kill Siri next month, expect the drumbeat to grow louder to ditch the branding at the very least around a product that, let's face it, is pretty much a failure at this point.
I think we can all agree.
But also, Google I/O is today. You have to imagine they are going to demo something similar.
Something at least at parity with this, or for their sake, one would hope something beyond this, which is why
OpenAI probably announced first. If Google isn't at least as impressive with whatever multimodal bot they
demo, look out. Though there are screenshots going around of Alphabet stock rising during the exact minutes
the OpenAI demo was going on, so I guess Wall Street expects something from Google today.
Should also note that access to the new GPT-4o API for developers is, I think we mentioned,
half the price and twice as fast as GPT-4 Turbo. But let's circle back to GPT-5. In the AI
industry for the last few months, there's been a bit of a waiting game. When is GPT-5 coming? Will it be
2x better or 10x? There are a lot of startups holding their raises because investors are waiting
to see if GPT-5 is such a step change that a lot of startups are suddenly obviated, but also,
who knows? What if they did drop almost artificial general intelligence? This was, let's be clear,
not GPT-5, not in the sense of that level of expectation I just described. So what does that mean? What does it
mean that we didn't get GPT-5. We know that they're working on it. What does it mean that they still
haven't felt like it's ready to release yet? Did they do this release yesterday just to specifically
keep abreast of Google and the Gemini team? Maybe they'll release GPT-5 later in the summer and flat-foot
everybody after Google and Apple have essentially shot their wads. There's one more bit of whispering
around GPT-5, though, that we should make note of. There are a lot of people waiting for
GPT-5 because they want to know, are we maybe approaching the technical ceiling of what this current
generation of AI can do. There are lots of at least theoretical alternatives to the transformer model
that are rising to the surface lately. The transformer model is what this generation of LLMs is largely
built upon. The fact that this wasn't GPT-5, let's say the model that this was, if it was supposed
to be GPT-5, came out as like 2x better, not 10x in terms of improvement. So OpenAI was like,
we can't name this one GPT-5. This is only meh. So let's go with GPT-4o. There is a segment of the AI
community that, the longer GPT-5 doesn't arrive, will start wondering if we have, in fact, reached
the ceiling of what this generation of AI tech can do? What would that do to investing in the
AI space if that proved out? And keep in mind, the gating functions on this tech could also be
economical, even if there are no technical limits to what this current tech can scale to.
There could be limits in terms of how much it costs and the energy costs, etc. So until
GPT-5 ever arrives, we kind of remain in limbo with a lot of key questions
unanswered.
Sony had earnings yesterday, and no, we don't usually cover them, but I wanted to make
note of this, as Tom Warren tweeted, Sony's PS5 sales are down 29% year over year in the recent quarter.
Sony also says it sold 20.8 million PS5 consoles for the full 2023 fiscal year, a slight
miss on its revised downward target of 21 million units. Sony now expects PS5 sales to decline
to 18 million over the fiscal year 24 calendar, end quote.
The chart that accompanies that tweet shows clear plateauing in PS5 sales and then declining.
I'm not steeped enough in the gaming space to know if this is declining earlier in the console
lifecycle than is traditional, but I mean this points to the continued sort of nuclear winter
in gaming right now. Hot on the heels of the Sony earnings, Square Enix stock fell 16%, the biggest
decline in 13 years, after its president said sales of its big budget games, largely exclusive to
PlayStation, fell short. Square Enix is pivoting
to get their games on as many platforms as possible, fewer exclusives, but it's clear that being
so dependent on the PS5 hurt them.
Bundling, reconstituting the cable bundle example number 97. Comcast CEO Brian Roberts said
Comcast will launch StreamSaver, a bundle with Peacock, Netflix, and Apple TV Plus, later this month
at a, quote, vastly discounted price. Quoting Variety, dubbed StreamSaver, the bundle will be available
to all Comcast broadband, TV, and mobile customers, Roberts said, speaking Tuesday at
MoffettNathanson's 2024 Media, Internet and Communications Conference in New York.
The three streaming services, Peacock, Netflix, and Apple TV Plus will, quote,
come at a vastly reduced price to anything available today, Roberts said, although he didn't
detail any pricing.
The goal is to, quote, add value to customers and take dollars out of other companies'
streaming businesses, he added, while reinforcing Comcast's broadband service offerings.
This will be a pretty compelling package, Roberts promised.
The cheapest way to get all three streamers separately today is with the ad-supported Peacock
Premium at $5.99 a month, going up to $7.99 a month in July. Netflix Basic with ads comes in at $6.99 a month,
and the standard Apple TV Plus plan is at $9.99 a month. Comcast's impending launch of the
StreamSaver bundle comes as other media companies have been assembling similar offerings. Last week,
Disney and Warner Bros. Discovery announced a three-way bundle comprising Max, Disney Plus, and
Hulu to be available starting this summer in the U.S. with pricing TBA. In addition, Disney, WBD,
and Fox have formed a joint venture to launch a streaming sports bundle slated to debut this fall.
Critics have alleged the venture, which some have dubbed Spulu, a combination of sports and Hulu,
is anti-competitive and violates antitrust law.
Like the other streaming bundling strategies, Comcast's forthcoming Peacock, Netflix,
and Apple TV Plus package is an effort to reduce cancellation rates, aka churn,
and provide a more efficient means of subscriber acquisition,
coming as the traditional cable TV business continues to deteriorate, end quote.
Finally today, reviews. First, the iPad Air. The Verge says it's great, but it's not the one to get. In fact, what is the iPad Air even for anymore? Quote, ultimately, I think I can answer the Air versus iPad debate in two questions. Do you want a big screen? Do you use the crap out of your Apple Pencil? If so, buy the Air. The 13-inch model is the cheapest big screen in Apple's lineup, a whopping $500 less than the comparable iPad Pro. And the 11-inch model
is the least expensive way to get access to the Pencil Pro. Done and done. Otherwise, buy the plain old
iPad, which is already a terrific tablet at a newly terrific price. There's even a better way to
upgrade. I'd urge you to spend $150 upgrading the base iPad to the cellular model rather than
$250 upgrading to the Air. Having an iPad that is just always connected without having to think about
it is a game changer for tablet life. My standard buying advice is to buy the best stuff you can
afford and then keep it as long as possible. But I'm confident that even
a two-year-old 10th-generation iPad is capable enough to do most things really well for a long time.
So is the Air, obviously, but the bad news for Apple, and the good news for you, is that every iPad is a great iPad, including the cheapest one, end quote.
And then the new iPad Pros. iPads Pro. Also from The Verge, they say: gorgeous screen, blazing performance with the M4 chip, thin.
Front camera is in the right spot, finally, but iPad OS can't keep up with the hardware.
There are basically two types of iPad users.
This is an oversimplification, but go with me.
The first type wants a simple way to send emails, read news, do the crossword, look at photos, and browse the web.
For those people, the new iPad Pro is total overkill.
Everything about it is a little better than the new Air or even the newly cheaper base iPad,
but not so much better that I'd recommend splurging unless you really want that OLED screen.
If you do, please know I get it.
I'm with you.
The other type of iPad user does all those things, but also has an iPad-specific feature or two that really
matters to them. Musicians love it for turning sheet music, students for handwriting notes,
filmmakers for quickly reviewing footage, designers for showing interactive renders to clients.
When Apple talks about how versatile the iPad is, I think this is what the company means.
The iPad is not all things to all people, but it should have something for everyone.
By putting ever more power into the device, Apple is trying to expand the number of those
features that might appeal to you. In my own use, my iPad hardly ever leaves the keyboard case.
I use the Magic Keyboard for journaling, emailing, and just as a stand while I'm cooking and watching shows.
Having a better keyboard and a smaller package matters a lot to me, but it won't to a lot of people,
especially at $299.
With both of its accessories, Apple is making the Pro more appealing to the people who might already have a Pro
and not doing much to win over those who don't.
There is, I should at least note, the possibility that AI could change the whole equation.
Maybe generative AI will make photos so much better that everybody suddenly wants a big, beautiful screen.
Maybe Siri will get so good that the iPad will become a smart home controller.
Maybe the camera software will be so spectacular that you'll use your tablet for all your video calls forever.
Maybe, maybe, maybe.
WWDC is in a few weeks, and I expect Apple to aggressively try to convince you that advances in AI make the iPad Pro
more than just an iPad.
If it can make the argument that a super-powerful, super-portable jack-of-all-trades device is what you need in the future,
I'll probably be running to buy an iPad Pro.
For now, it's just an iPad.
The best iPad ever, I think, maybe even the best iPad you could reasonably ask for.
But the story of the iPad, the magic pane of glass, as Apple is so fond of calling it,
is actually all about software.
The iPad software has let its hardware down for years.
Apple has led us to believe that's about to change, that this year's WWDC will be the great turning point for AI and iPads and everything.
We'll see.
Until then, the iPad Pro is almost too good for its own good, end quote.
Google I/O should be going on right as this episode drops.
So more on that tomorrow.
