The AI Daily Brief: Artificial Intelligence News and Analysis - Is Zuckerberg the New Jobs?

Starting point is 00:00:00 Today on the AI Daily Brief: is Zuckerberg the new Jobs? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief. For the second time in two days, Descript has lost an entire podcast file, so I actually won't be sharing the update about OpenAI's leadership shakeup. I'm already on my way to the airport, so I can't re-record it and actually had to use ElevenLabs just to get this disclaimer. Sorry again, and I'll provide the full update on OpenAI tomorrow.

Starting point is 00:00:36 Today, we are heading to One Hacker Way, where Meta held its Connect developer conference and unveiled a number of new things ranging from AI to AR. As you'll see, even on a day where we got a set of new models, it was really quite secondary to Zuckerberg's main presentation, and I think it was a reminder to some that as all in on AI as Zuckerberg and Meta have become, they have not entirely left the vision of the future which led them to rename the company Meta in the first place. First, though, let's talk about the AI updates. TLDR is that Meta showed off the latest version of their open source models, Llama 3.2.

Starting point is 00:01:12 There was Llama 3.2 1B and 3B, which were designed specifically for on-device use cases, but the more standard for developers were the Llama 3.2 11B and 90B vision models. 11B and 90B were effectively multimodal alternatives to Llama 3.1 8B and 70B. And indeed, the multimodality is probably the biggest update. Meta said that the models, quote, support image reasoning use cases such as document level understanding, including charts and graphs, captioning of images and visual grounding tasks such as directly pinpointing objects in images based on natural language descriptions.

Starting point is 00:01:43 Two of the examples they gave: a small business owner could feed Llama 3.2 a chart of their last year of revenue and ask the model to highlight the months with the best sales. They could also show Llama a map and ask the model when a hike might become steeper or the distance of a winding trail. The models can be deployed with or without Meta's new safety tool called Llama Guard Vision, which can detect potentially harmful text or images. The models are available to download from Llama directly as well as from Hugging Face, and they're also being integrated into Meta's wide range of cloud partners.

Starting point is 00:02:10 As you would expect, these models are also being used to power AI features across Meta's social platforms. Now, of course, what we didn't get yesterday was any sort of update to Meta's frontier-level model 405B, which was released back in August. Meta said, while these models, referring to the 405B, are incredibly powerful, we recognize that building with them requires significant compute resources and expertise. We've also heard from developers who don't have access to these resources and still want the opportunity to build with Llama.

Starting point is 00:02:35 Point being that the competition is not just at the state of the art, but to build a full ecosystem of tools that are applicable for different cost contexts and different needs. When it comes to benchmarking, Meta presented Llama 3.2 as competitive with leading foundation models. For image recognition and visual reasoning tasks, Meta's results indicated their models are on par with Anthropic's Claude 3 Haiku and OpenAI's GPT-4o mini. They also claim that the smaller text-only 3B model outperformed Google's Gemma 2 2.6B and Microsoft's Phi-3.5-mini.

Starting point is 00:03:01 Now, of course, part of the place that Meta has staked out in the AI ecosystem is as the great defender of open source. And while the vast majority of developers that I see interacting on Twitter slash X take Zuckerberg and Meta at their word and with their actions as truly committed to open source, there are some who are more skeptical. TechCrunch, for example, wrote, implicit in Meta's rhetoric is a desire that these tools and models be of Meta's making. Spending on models Meta can then commoditize forces the competition to lower prices, spreads Meta's version of AI broadly, and lets Meta incorporate improvements from the open source community. Make no mistake, Meta's playing for keeps. It's spending millions lobbying regulators to come around to its preferred flavor of open AI, and it's plowing billions into servers, data centers, and network infrastructure to train future models. To be honest, though, all that really says is that there is a business strategy behind open source as well. And of course there is.

Starting point is 00:03:48 Meta is a huge company. Zuckerberg's an incredibly savvy entrepreneur. The fact that there are benefits to open source, such as the mass proliferation of Meta-related models, doesn't to me a priori suggest that the intention behind open source is just a cold calculating... It was also clear that Meta isn't just thinking about tech specs. They're also thinking about how they integrate these models into products. In their presentation, they showcased a number of new use cases enabled by Llama 3.2 across their platforms. One, which ironically or perhaps not ironically comes just days after OpenAI fully rolled out its Advanced Voice Mode, is that, as Meta writes, you can now use your voice to talk to Meta AI on Messenger, Facebook, WhatsApp, and Instagram

Starting point is 00:04:25 DM, and it'll respond back to you out loud. Basically, Meta confirms that this sort of voice-based interaction where the AI talks back to you is going to be a part of the future, at least assuming consumers actually respond well to it. One specific strategic decision they made was to include celebrity AI voices as some of the options, including Awkwafina, Dame Judi Dench, John Cena, Keegan-Michael Key, and Kristen Bell. But A, why not? And B, if nothing else, it shows the continued intersection of the entertainment industry

Starting point is 00:04:53 and the AI space. Meta also highlighted the power of AI to interact with image processing. They gave examples like being able to identify a flower a user sees during a hike, or returning a recipe based on an image of a cake. Like all of the big consumer-facing AI, they're also showing basic image editing use cases like removing backgrounds. They're also starting to test integrated translation. I speak about this frequently as one of the most obvious but still incredibly cool uses of Gen AI. They write, with automatic dubbing and lip syncing, Meta AI will simulate a speaker's voice in another language and sync their lips to match. They are initially

Starting point is 00:05:25 rolling out small tests on Instagram and Facebook, translating between English and Spanish, with an intent to go broader as soon as they can. While a lot of the focus was on the consumer side, they didn't ignore businesses either. One of their main business use cases is click-to-message ads on WhatsApp and Messenger, allowing small businesses to set up business AIs that can talk to customers. These tools are seeing uptake. Meta says that more than a million advertisers use the tools and created more than 15 million ads with them in just the last month. They also said that ad campaigns that use Meta's Gen AI features resulted in an 11% higher click-through rate and 7.6% higher conversion rate, which is incredibly meaningful at mass scale.

Starting point is 00:06:03 Today's episode is brought to you by Plumb. Generative AI promises to supercharge your productivity and give you superpowers, but if you're not an engineer, trying to harness AI can be incredibly frustrating. Hours wasted wrestling with complex tools only to give up when they don't work. We all have tedious tasks we'd love to automate and challenges AI could solve, but few of us have the skills to fully leverage these game-changing technologies. That's where Plumb comes in. The mission? To make automating your work feel like magic. Imagine typing out: AI, read my Gmail and ping me in Slack when something critical comes in, and watching it come to life before your eyes. No coding required. Whether you're a marketer, salesperson,

Starting point is 00:06:38 or founder, Plumb enables you to create custom AI workflows in minutes, not hours. Check out useplumb.com, that's Plumb with a B, for early access to the future of workflow automation. Today's episode is brought to you by Venice. The leading AI companies store your entire conversation history and attach it to your identity forever. That's every question you ask, every answer you receive, every image you generate, every thought you share with the machine. It's all being spied on. If you trust all the companies, hackers, and NSA board members that will ever have access to your AI conversations, then rejoice, for you are well served. For the rest of us, Venice is an alternative. Venice is a powerful AI app for text, image, and code generation that respects

Starting point is 00:07:15 you as a sovereign individual and believes privacy and free speech are not only human rights, but necessary for civilizational advancement. Private, permissionless, and uncensored, you can try it for free without an account. AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit venice.ai slash NLW and enter the discount code NLW Daily Brief. That's NLW Daily Brief. All one word. Today's episode is brought to you by Superintelligent, which is, of course, our platform that helps you learn how to use AI tools and, perhaps even more importantly, gives you ideas on the best use cases that are actually going to help you achieve whatever it is you want to achieve. To recognize the end of summer and back to school slash back to work,

Starting point is 00:07:59 we are running our best promotion ever. When you sign up for Superintelligent using code so back, your first month will be 100% free. The platform features over 600 fun, highly practical AI tutorials that get you using AI fast and with an eye to actually transforming how you get things done. We've just launched Super for Teams, so if you have a group of people at your company that want to figure out how to use AI together, I highly suggest you check it out. But for those of you who are using Superintelligent as an individual, once again, if you sign up for Superintelligent between now and the end of the month using code so back, you will get your first month 100% free. Go to besuper.ai and check it out today.

Starting point is 00:08:40 Now, when it comes to community reactions, people are still just wrapping their heads around it. As I mentioned before, there's a lot of positivity around the open source side of things. Dr. Jim Fan from Nvidia writes, I just pulled the numbers on vision language benchmarks for Llama 3.2 11B. Surprisingly, the open source community at large isn't behind in the lightweight model class. Pixtral, Qwen2-VL, Molmo, and InternVL2 all stand strong. OSS AI models have never been stronger. Never bet against OSS.

Starting point is 00:09:06 Never underestimate the combined firepower of so many talents distributed all over the world. Now, speaking of all over the world, one place this won't be distributed is the EU. This is not a surprise. It's something that Meta had announced. But as Jonas writes, it's still disappointing to see the EU excluded from accessing a promising open source model. The big announcement wasn't about AI. It was about AR, augmented reality, and specifically, what Meta is calling its first true augmented reality glasses, which it calls Orion. Sorry, OpenAI. If you choose that for GPT-5's name, it seems like there might be some confusion.

Starting point is 00:09:39 Meta presented Orion as the next great leap in human-oriented computing. Similar to the Apple Vision Pro, Orion uses a pair of holographic displays to place 2D and 3D content and experiences into your physical surroundings. The glasses feature eye, hand, and neural tracking. They also have a very cool new feature where a wrist monitor can understand hand motions so that you can instruct your glasses what to do without having to talk to yourself in public or be waving your hands around like a crazy person. Now, by way of differentiation, especially from other products that have come out in this similar vein, the big one is that many people feel like these are the first AR glasses that don't totally look like dog poop. Yes, they are a little bit clunky, but

Starting point is 00:10:14 ultimately they are still in the realm of normal glasses. That's quite different than some other approaches we've seen, but does reflect something that Facebook has seen a lot of success with, which is their Meta Ray-Ban glasses, which have been extremely popular. Zuckerberg was very deliberate in calling these glasses, not a headset, no wires, and lightweight enough to wear all day. To the extent that there was a catch, it's that these are actually not yet available for sale. It's still a prototype which will be available for Meta employees and select external audiences, although Meta was clear that, quote, this is not a research prototype. It's one of the most polished product prototypes we've ever developed,

Starting point is 00:10:49 and it is truly representative of something that could ship to consumers. Rather than rushing to put it on shelves, we decided to focus on internal development first, which means that we can keep building quickly and continue to push the boundaries of the technology helping us arrive at an even better consumer product faster. So it seems like there should not be years between when we've seen this and when we can get our hands on them.

Starting point is 00:11:07 Still, there were a few people who did have a chance to test these out. The Verge wrote a very, very long and comprehensive review, basically saying that while yes, this product was still being demoed on guardrails and wasn't fully baked from a consumer perspective, that it was very impressive. And frankly, given how much critique we've seen recently of people releasing products that were half-baked just to get them out, I think that although people will be disappointed that they can't get their hands on these right now, they also might respond to the better product ultimately that becomes available.

Starting point is 00:11:35 Because, of course, people haven't had a chance to get their hands on these things, a lot of the conversation was more speculative in general. And one of the biggest things that I saw over and over again was summed up by Bilawal Sidhu, who writes, Meta is onto a winning formula pairing neural wristbands with these next-gen AR glasses. It's funny. In an alternative universe, this is what you'd think Apple would have shipped versus

Starting point is 00:11:54 the Vision Pro. Nikita Bier put it even more simply, Zuck is the new Jobs. Of course, that remains to be seen, but I think for many, he got a lot closer yesterday. For now, that is going to do it for today's AI Daily Brief. Until next time, peace.
