The AI Daily Brief: Artificial Intelligence News and Analysis - Is Zuckerberg the New Jobs?
Episode Date: September 27, 2024
Meta’s Connect Developer Conference showcased the latest advancements in AI and AR, with a special focus on their new Orion augmented reality glasses. As Meta’s open-source AI models like Llama 3.2 make waves, Mark Zuckerberg is being compared to Steve Jobs. Could Meta’s new AR glasses and AI innovations cement Zuckerberg’s place as the next big tech visionary? Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'youtube' for 50% off your first month. Concerned about being spied on? Tired of censored responses? AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit https://venice.ai/nlw and enter the discount code NLWDAILYBRIEF. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief: is Zuckerberg the new Jobs?
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief.
For the second time in two days, Descript has lost an entire podcast file,
so I actually won't be sharing the update about OpenAI's leadership shakeup.
I'm already on my way to the airport, so I can't re-record it and actually had to use ElevenLabs just to get this disclaimer.
Sorry again, and I'll provide the full update on OpenAI tomorrow.
Today, we are heading to One Hacker Way where Meta held its Connect developer conference
and has unveiled a number of new things ranging from AI to AR.
As you'll see, even in a day where we got a set of new models, it was really quite secondary
to Zuckerberg's main presentation, and I think was a reminder to some that as all in on
AI as Zuckerberg and Meta have become, they have not left entirely the vision of the future
which led them to rename the company Meta in the first place.
First, though, let's talk about the AI updates.
The TL;DR is that Meta showed off the latest version of their open source models, Llama 3.2.
There were Llama 3.2 1B and 3B, which were designed specifically for on-device use cases,
but the more standard for developers were the Llama 3.2 11B and 90B vision models.
11B and 90B were effectively multimodal alternatives to Llama 3.1 8B and 70B.
And indeed, the multimodality is probably the biggest update.
Meta said that the models, quote, support image reasoning
use cases such as document-level understanding, including charts and graphs, captioning of images,
and visual grounding tasks such as directly pinpointing objects in images based on natural
language descriptions.
Two of the examples they gave: a small business owner could feed Llama 3.2 a chart of their last
year of revenue and ask the model to highlight the months with the best sales.
They could also show Llama a map and ask the model when a hike might become steeper or
the distance of a winding trail.
The models can be deployed with or without Meta's new safety tool called Llama Guard
Vision, which can detect potentially harmful text or images.
The models are available to download from Llama directly as well as from Hugging Face,
and they're also being integrated into Meta's wide range of cloud partners.
As you would expect, these models are also being used to power AI features across
Meta's social platforms.
Now, of course, what we didn't get yesterday was any sort of update to Meta's frontier-level
model 405B, which was released back in August.
Meta said, while these models, referring to the 405B, are incredibly powerful, we recognize
that building with them requires significant compute resources and expertise.
We've also heard from developers who don't have access to these resources and still want
the opportunity to build with Llama.
Point being that the competition is not just at the state of the art, but to build a full
ecosystem of tools that are applicable for different cost contexts and different needs.
When it comes to benchmarking, Meta presented Llama 3.2 as competitive with leading foundation
models.
For image recognition and visual reasoning tasks, Meta's results indicated their models are on
par with Anthropic's Claude 3 Haiku and OpenAI's GPT-4o mini.
They also claim that the smaller text-only 3B model outperformed Google's Gemma 2 2.6B and Microsoft's
Phi-3.5-mini.
Now, of course, part of the place that Meta has staked out in the AI ecosystem is as the great defender of open source.
And while the vast majority of developers that I see interacting on Twitter slash X take Zuckerberg and Meta at their word and with their actions as truly committed to open source, there are some who are more skeptical.
TechCrunch, for example, wrote, implicit in Meta's rhetoric is a desire that these tools and models be of Meta's making.
Spending on models that Meta can then commoditize forces the competition to lower prices, spreads Meta's version of AI broadly, and lets Meta incorporate improvements from the open source community.
Make no mistake, Meta's playing for keeps. It's spending millions lobbying regulators to come
around to its preferred flavor of open AI, and it's plowing billions into servers, data centers,
and network infrastructure to train future models. To be honest, though, all that really says
is that there is a business strategy behind open source as well. And of course there is.
Meta is a huge company. Zuckerberg's an incredibly savvy entrepreneur. The fact that there are
benefits to open source, such as the mass proliferation of Meta-related models,
doesn't to me a priori suggest that the intention behind open source is just a cold calculating
It was also clear that Meta isn't just thinking about tech specs. They're also thinking about
how they integrate these models into products. In their presentation, they showcased a number of
new use cases enabled by Llama 3.2 across their platforms. One of them, which ironically or perhaps not ironically
comes just days after OpenAI fully rolled out its Advanced Voice Mode, is that, as Meta writes,
you can now use your voice to talk to Meta AI on Messenger, Facebook, WhatsApp, and Instagram
DM, and it'll respond back to you out loud. Basically, Meta confirms that this sort of voice-based interaction
where the AI talks back to you is going to be a part of the future,
at least assuming consumers actually respond well to it.
One specific strategic decision they made was to include celebrity AI voices
as some of the options, including Awkwafina, Dame Judi Dench, John Cena, Keegan-Michael Key,
and Kristen Bell.
But A, why not?
And B, if nothing else, it shows the continued intersection of the entertainment industry
and the AI space.
Meta also highlighted the power of AI to interact with image processing.
They gave examples like being able to identify a flower a user sees
during a hike, or returning a recipe based on an image of a cake. Like all of the big consumer-facing
AI, they're also showing basic image editing use cases like removing backgrounds. They're also starting
to test integrated translation. I speak about this frequently as one of the most obvious but still
incredibly cool uses of Gen AI. They write, with automatic dubbing and lip syncing, Meta AI will
simulate a speaker's voice in another language and sync their lips to match. They are initially
rolling out small tests on Instagram and Facebook, translating between English and Spanish, with an
intent to go broader as soon as they can. While a lot of the focus was on the consumer side,
they didn't ignore businesses either. One of their main business use cases is click-to-message ads on
WhatsApp and Messenger, allowing small businesses to set up business AIs that can talk to customers.
These tools are seeing uptake. Meta says that more than a million advertisers use the tools
and created more than 15 million ads with them in just the last month. They also said that
ad campaigns that use Meta's Gen AI features resulted in an 11% higher click-through rate and 7.6% higher
conversion rate, which is incredibly meaningful at mass scale.
Today's episode is brought to you by Plumb. Generative AI promises to supercharge your productivity
and give you superpowers, but if you're not an engineer, trying to harness AI can be
incredibly frustrating. Hours wasted wrestling with complex tools only to give up when they
don't work. We all have tedious tasks we'd love to automate and challenges AI could solve,
but few of us have the skills to fully leverage these game-changing technologies. That's where
Plumb comes in. The mission? To make automating your work feel
like magic. Imagine typing out, AI, read my Gmail and ping me in Slack when something critical comes in,
and watching it come to life before your eyes. No coding required. Whether you're a marketer, salesperson,
or founder, Plumb enables you to create custom AI workflows in minutes, not hours. Check out useplumb.com,
that's Plumb with a B, for early access to the future of workflow automation. Today's episode
is brought to you by Venice. The leading AI companies store your entire conversation history and
attach it to your identity forever. That's every question you asked, every answer you received,
every image you generate, every thought you share with the machine. It's all being spied on.
If you trust all the companies, hackers, and NSA board members that will ever have access to
your AI conversations, then rejoice, for you are well served. For the rest of us, Venice is an
alternative. Venice is a powerful AI app for text, image, and code generation that respects
you as a sovereign individual and believes privacy and free speech are not only human rights,
but necessary for civilizational advancement. Private, permissionless, and uncensored, you can try it
for free without an account. AI Daily Brief listeners receive a 20%
discount on Venice Pro. Visit venice.ai/nlw and enter the discount code NLWDAILYBRIEF. That's
NLW Daily Brief, all one word. Today's episode is brought to you by Superintelligent, which is, of course,
our platform that helps you learn how to use AI tools and perhaps even more importantly,
gives you ideas on the best use cases that are actually going to help you achieve whatever it is
you want to achieve. To recognize the end of summer and back to school slash back to work,
we are running our best promotion ever: when you sign up for Superintelligent using code so back,
your first month will be 100% free. The platform features over 600 fun, highly practical AI
tutorials that get you using AI fast and with an eye to actually transforming how you get things
done. We've just launched Super for Teams, so if you have a group of people at your company that
want to figure out how to use AI together, I highly suggest you check it out. But for those of you
who are using Superintelligent as an individual, once again, if you sign up for Superintelligent
between now and the end of the month using code so back, you will get your first month 100% free.
Go to besuper.ai and check it out today.
Now, when it comes to community reactions, people are still just wrapping their heads around
it. As I mentioned before, there's a lot of positivity around the open source side of things.
Dr. Jim Fan from Nvidia writes, I just pulled the numbers on vision language benchmarks for
Llama 3.2 11B.
Surprisingly, the open source community at large isn't behind in the lightweight model class.
Pixtral, Qwen2-VL, Molmo, and InternVL2 all stand strong.
OSS AI models have never been stronger.
Never bet against OSS.
Never underestimate the combined firepower of so many talents distributed all over the world.
Now, speaking of all over the world, one place this won't be distributed is the EU.
This is not a surprise.
It's something that Meta had announced.
But as Jonas writes, it's still disappointing to see the EU excluded from accessing a
promising open source model. The big announcement wasn't about AI. It was about AR, augmented reality,
and specifically, what Meta is calling its first true augmented reality glasses, which it calls
Orion. Sorry, OpenAI. If you choose that for GPT-5's name, it seems like there might be some confusion.
Meta presented Orion as the next great leap in human-oriented computing. Similar to the Apple Vision
Pro, Orion uses a pair of holographic displays to place 2D and 3D content and experiences into
your physical surroundings. The glasses feature eye, hand, and neural tracking. They also have a
very cool new feature where a wrist monitor can understand hand motions so that you can instruct
your glasses what to do without having to talk to yourself in public or be waving your hands
around like a crazy person. Now, by way of differentiation, especially from other products that
have come out in this similar vein, the big one is that many people feel like these are the
first AR glasses that don't totally look like dog poop. Yes, they are a little bit clunky, but
ultimately they are still in the realm of normal glasses. That's quite different than some other
approaches we've seen, but does reflect something that Facebook has seen a lot of success with,
which is their Meta Ray-Ban glasses, which have been extremely popular.
Zuckerberg was very deliberate in calling these glasses, not a headset: no wires, and lightweight
enough to wear all day. To the extent that there was a catch, it's that these are actually
not yet available for sale. It's still a prototype which will be available for Meta employees
and select external audiences, although Meta was clear that, quote, this is not a research prototype.
It's one of the most polished product prototypes we've ever developed,
and it is truly representative of something that could ship to consumers.
Rather than rushing to put it on shelves,
we decided to focus on internal development first,
which means that we can keep building quickly
and continue to push the boundaries of the technology
helping us arrive at an even better consumer product faster.
So it seems like there should not be years between when we've seen this
and when we can get our hands on them.
Still, there were a few people who did have a chance to test these out.
The Verge wrote a very, very long and comprehensive review,
basically saying that while, yes, this product was still being demoed on guardrails
and wasn't fully baked from a consumer perspective, it was very impressive.
And frankly, given how much critique we've seen recently of people releasing products
that were half-baked just to get them out, I think that although people will be disappointed
that they can't get their hands on these right now, they also might respond to the better
product ultimately that becomes available.
Because, of course, people haven't had a chance to get their hands on these things,
a lot of the conversation was more speculative in general.
And one of the biggest things that I saw over and over again was summed up
by Bilawal Sidhu, who writes,
Meta is onto a winning formula pairing neural wristbands with these next-gen
AR glasses.
It's funny.
In an alternative universe, this is what you'd think Apple would have shipped versus
the Vision Pro.
Nikita Bier put it even more simply: Zuck is the new Jobs.
Of course, that remains to be seen, but I think for many, he got a lot closer yesterday.
For now, that is going to do it for today's AI Daily Brief.
Until next time, peace.
