Tech Brew Ride Home - Wed. 05/17 – Open Source Vs. Centralized AI, Part II
Episode Date: May 17, 2023
If you’ve been letting some of your Google accounts sit fallow, you better look into that cause Google is gonna start deleting things. Why some new top level domains have people concerned. Why tech companies are racing to put generative AI on your phone. And part two of the open source vs. centralized AI debate.
Sponsors: OregonState.edu
Links:
Google will delete accounts, including Gmail & Photos, that haven’t logged on in 2 years (9to5Google)
New ZIP domains spark debate among cybersecurity experts (BleepingComputer)
StableStudio is Stability AI’s latest commitment to open-source AI (The Verge)
The race to bring generative AI to mobile devices (Financial Times)
Google I/O and the Coming AI Battles (Stratechery)
Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Welcome to the Techmeme Ride Home for Wednesday, May 17th, 2023. I'm Brian McCullough. Today: if you've been letting some of your Google accounts sit fallow, you'd better take a look into that because Google is going to start deleting things soon. Why some new top-level domains have people concerned, why tech companies are racing to put generative AI on your phone, and part two of the open source versus centralized AI debate. Here's what you missed today in the world of tech. This is perhaps urgent news for some of you out there. Google has updated
its inactivity policy, and going forward, accounts inactive for at least two years will be deleted,
except for those with YouTube videos attached. This will start happening in December. So,
if you have old Google accounts that, I don't know, have emails or photos or whatnot from some
earlier period in your life and you've just been letting them sit there dormant because you've been
assuming it's fine, yeah, you're going to want to make arrangements. Quoting 9to5Google:
If a Google account has not been used or signed into for at least two years,
Google will delete that personal account and its contents.
In addition to the email address becoming inaccessible,
Gmail messages, Calendar events, Drive, Docs, and other Workspace files,
as well as Google Photos backups, will be removed.
At the moment, Google is not planning to delete accounts with YouTube videos.
That would be tricky, as some old abandoned clips might have historical relevance.
Google will start deleting inactive accounts in December at the earliest and take a phased approach,
starting with what they say are, quote, accounts that were created and never used again.
The company says it is, quote, going to roll this out slowly and carefully, end quote.
Before deleting an account, we will send multiple notifications over the months leading up to the
deletion to both the account email address and the recovery email if one has been provided,
says Google.
Meanwhile, this only applies to free Google
accounts and not those managed by a business or school. What keeps an account active? Besides signing
in periodically, being logged in and performing basic actions counts as activity. For example,
reading or sending an email, like viewing an inactivity alert, using Google Drive,
watching a YouTube video, downloading an app on the Google Play Store, using Google search,
using sign-in with Google to sign into a third-party app or service. Additionally,
Google tells us that using a signed-in Android device is
considered activity. Google Photos already has a separate two-year sign-in and usage policy to be
considered active. Meanwhile, accounts with active Play Store subscriptions like Google One or third-party
apps are considered active. Today, Google recommends users assign a recovery email, and the
company points users toward the inactive account manager to decide what happens to their account
and data when it becomes inactive for a period of up to 18 months. Options include sending
files to trusted contacts, setting a Gmail auto-responder,
or account deletion. In making this change, Google cites security, as inactive accounts, often with old or
reused passwords that may have been compromised, are more likely to be compromised. This also, quote,
limits the amount of time Google retains your unused personal information with this time frame
considered to be an industry standard. Unlike other services with different security slash privacy
implications, Google will not free up Gmail addresses to reclaim with deletions, end quote.
Speaking of Google, cybersecurity researchers and IT admins have been raising concerns over two new Google top-level domains.
The domains in question are dot zip and dot MOV, and the concern is that, you know, those look like file extensions and threat actors could use them for phishing and malware schemes.
Quote, earlier this month, Google introduced eight new top-level domains or TLDs that could be purchased for hosting websites or
email addresses. The new domains are dot dad, dot esq,
dot prof for professor, dot phd, dot nexus,
dot foo, and for the topic of our article, the dot zip and dot MOV domain TLDs.
While the zip and MOV TLDs have been available since 2014, it wasn't until this
month that they became generally available, allowing anyone to purchase a domain like
bleepingcomputer.zip for a website. However, these domains could be perceived as risky,
as the TLDs are also extensions of files commonly shared in forum posts, messages, and online
discussions, which will now be automatically converted into URLs by some online platforms or applications.
Two common file types seen online are zip archives and MPEG-4 videos, whose file names end in dot zip
for a zip archive or dot MOV for a video file. Therefore, it's very common for people to post instructions
containing file names with the dot zip and dot MOV extensions. However, now that they are TLDs,
some messaging platforms and social media sites will automatically convert file names with dot zip
and dot MOV extensions into URLs. For example, on Twitter, if you send someone instructions on
opening a zip file and accessing a MOV file, the innocuous file names are converted into a URL
as shown below. When people see URLs and instructions, they commonly think that the URL
can be used to download the associated file and may click on the link. For example, linking file names
to downloads is how we usually provide instructions on bleeping computer in our articles,
tutorials, and discussion forums. However, if a threat actor owned a dot-zip domain with the same
name as a linkified file name, a person may mistakenly visit the site and fall for a phishing
scam or download malware thinking the URL is safe because it came from a trusted source.
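To make the linkification concern concrete, here is a minimal sketch of how an automatic link converter might behave before and after dot zip and dot MOV count as valid top-level domains. The TLD sets, the regular expression, and the linkify function are all hypothetical and heavily simplified; they are assumptions for illustration, not how Twitter or any real platform implements this.

```python
import re

# Hypothetical, abbreviated TLD lists (assumption: a real linkifier
# would consult a complete, regularly updated registry).
TLDS_BEFORE = {"com", "org", "net"}
TLDS_AFTER = TLDS_BEFORE | {"zip", "mov"}

# Matches a bare "name.suffix" token such as "setup.zip" or "example.com".
TOKEN = re.compile(r"\b([a-z0-9-]+(?:\.[a-z0-9-]+)*)\.([a-z]{2,})\b", re.IGNORECASE)


def linkify(text, tlds):
    """Wrap tokens whose suffix is a known TLD in an HTML anchor."""
    def repl(match):
        if match.group(2).lower() in tlds:
            host = match.group(0).lower()
            return f'<a href="https://{host}">{match.group(0)}</a>'
        return match.group(0)  # unknown suffix: leave the text alone
    return TOKEN.sub(repl, text)


message = "Unzip setup.zip and then play intro.mov"
before = linkify(message, TLDS_BEFORE)  # file names stay plain text
after = linkify(message, TLDS_AFTER)    # file names become clickable links
print(before)
print(after)
```

With the older list the message passes through untouched; with the newer list both file names turn into links pointing at whoever registered those domains, which is the scenario the article describes.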
While it's very unlikely that threat actors will register thousands of domains to capture a
few victims, you only need one corporate employee to mistakenly install malware for an entire network
to be affected. Abuse of these domains is not theoretical, with cyber intel firm Silent Push Labs
already discovering what appears to be a phishing page at microsoft-office.zip attempting to steal
Microsoft account credentials. These developments have sparked a debate among developers,
security researchers, and IT admins, with some feeling the fears are not warranted and others
feeling that the Zip and MOV TLDs add unnecessary risk to an already risky online environment.
Open source developer Matt Holt also requested that the zip TLD be removed from Mozilla's Public
Suffix List, a list of all public top-level domains to be incorporated in applications and browsers.
However, the PSL community quickly explained that while there may be a slight risk associated with
these TLDs, they are still valid and should not be removed from the PSL as it would affect the
operation of legitimate sites. Removing existing TLDs from the PSL for this reason would
just be wrong. This list is used for many different reasons, and just because these entries are
bad for one very specific use case, they are still needed for almost all others, explained
software engineer Felix Fontein, end quote. Now for the day's AI news, it has become news,
really, for our purposes when things like this get released. Stability AI took the wrappers off
StableStudio, an open-source version of DreamStudio, its commercial interface for the
text-to-image Stable Diffusion model. Quoting The Verge, making an open-source version of
DreamStudio carries benefits for Stability AI. It allows community developers to improve and experiment with
the interface with the company potentially reaping the rewards conferred by these improvements.
Stability AI stressed community building in its press release, noting how from enabling
local-first development to experimenting with a new plugin system, we've tried hard to make things
extensible for external developers, end quote.
Stability AI has previously leaned hard on its open source approach to create interest in its products.
Various versions of Stable Diffusion have been freely available to download and tinker with since
the model was publicly released back in August 2022, and last month, the company released a suite
of open source large language models collectively called StableLM.
Stability AI's founder and CEO Emad Mostaque has been outspoken about the importance of making AI
tools open source in order to increase public trust, claiming that open models will be essential
for private data in a Zoom call with the press last month. However, the company's approach sometimes
seems to lack direction. For example, StableStudio will be available alongside DreamStudio and potentially
compete with it. The company has previously said it plans to generate revenue by creating
customized versions of DreamStudio for corporate clients, but it's not clear how successful
this strategy has been. Recent reports suggest the firm is burning through cash and note that
its most important models like Stable Diffusion were built in collaboration with other parties, end
quote. We've been talking about the open versus centralized AI debate, but another sort of angle to
that debate is going to some sort of cloud provider to use generative tools versus being
able to run such tools locally on your own hardware. So interesting article from the Financial
Times looking at how tech companies are racing to put generative AI natively on mobile devices.
Quote, as tech companies rush to embed generative AI into their software and services,
they face significantly higher computing costs.
The concern has weighed in particular on Google, with Wall Street analysts warning that
the company's profit margins could be squeezed if internet search users come to expect
AI generated content in standard search results.
Running generative AI on mobile handsets rather than through the cloud on servers operated
by big tech groups could answer one of the biggest economic questions raised by the latest
tech fad.
Google said last week
that it had managed to run a version of PaLM 2, its latest large language model, on a Samsung
Galaxy handset. Though it did not publicly demonstrate the scaled-down model, called Gecko,
the move is the latest sign that a form of AI that has required computing resources only found
in a data center is quickly starting to find its way into more places. The shift could make
services such as chatbots far cheaper for companies to run and pave the way for more transformative
applications using generative AI. You need to make the AI hybrid, running both in the
data center and locally, otherwise it will cost too much money, Cristiano Amon, chief executive of
mobile chip company Qualcomm, told the Financial Times. Tapping into the unused processing power on mobile
handsets was the best way to spread the cost, he said. Handsets lack the memory, though, to hold large
models like the one behind ChatGPT, as well as the processing power required to run them.
Generating a response to a query on a device rather than waiting for a remote data center
to produce a result could reduce the latency or delay from using an application. When a user's
personal data is used to refine the generative responses, keeping all the processing on the
handset could also enhance privacy. More than anything, generative AI could make it easier to
carry out common activities on a smartphone, for instance, when it comes to things that
involve producing text. You could embed the AI in every office application. You get an email.
It suggests a response, said Amon. You're going to need the ability to run those things
locally as well as on the data center, end quote. Rapid advances in some of the underlying models
have changed the equation. The biggest and most advanced, such as Google's PaLM 2 and OpenAI's
GPT-4, have hogged the headlines. But an explosion of smaller models has made some of the same
capabilities available in less technically demanding ways. These have benefited in part from
new techniques for tuning language models based on a more careful curation of the data sets they
are trained on, reducing the amount of information they need to hold. According to Arvind Krishna,
chief executive of IBM, most companies that look to use generative AI in their own services will get
much of what they need by combining a number of these smaller models. Speaking last week, as IBM
announced a technology platform to help its customers tap into generative AI, he said that many
would opt to use open source models where the code was more transparent and could be adapted,
in part because it would be easier to fine-tune the technology using their own data. Some of the
smaller models have already demonstrated surprising capabilities. They include LLaMA, an open-source language
model released by Meta, which is claimed to have matched many of the features of the largest systems.
LLaMA comes in various sizes, the smallest of which has only 7 billion parameters, far fewer
than the 175 billion of GPT-3, the breakthrough language model OpenAI released in 2020.
The number of parameters in GPT-4, released this year, has not been disclosed.
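A rough back-of-the-envelope calculation shows why parameter count matters so much for handsets. The sketch below multiplies the parameter counts quoted above by an assumed storage width per parameter; the 16-, 8-, and 4-bit widths and the weights_gb helper are assumptions for illustration (the article does not say how these models are stored), and the figures cover the weights only, not the extra memory needed while running the model.

```python
def weights_gb(num_params, bits_per_param):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts quoted in the article.
models = {
    "LLaMA (smallest)": 7_000_000_000,
    "GPT-3": 175_000_000_000,
}

for name, params in models.items():
    for bits in (16, 8, 4):  # assumed storage widths, not from the article
        print(f"{name}: {bits}-bit weights ~ {weights_gb(params, bits):.1f} GB")
```

Under these assumptions the 7 billion parameter model needs about 14 GB at 16 bits and 3.5 GB at 4 bits, while the 175 billion parameter model needs about 350 GB at 16 bits, far beyond the memory of any phone.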
A research model based on LLaMA and developed at Stanford University has already been shown
running on one of Google's Pixel 6 handsets. As well as their far smaller
size, the open source nature of models such as this has also made it easier for researchers
and developers to adapt them for different computing environments. Qualcomm earlier this year showed off
what it claimed was the first Android handset running Stable Diffusion's image generation model,
which has about one billion parameters. The chipmaker had quantized or cut down the model
size to run it more easily on a handset without losing any of its accuracy, said Ziad Asghar,
a senior vice president at Qualcomm, end quote.
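For a sense of what quantizing a model means in general, here is a minimal sketch of one common approach, symmetric 8-bit quantization: each weight is stored as a small integer plus one shared scale factor. This is not Qualcomm's method, which the article does not describe; the function names and the randomly generated weights are assumptions for illustration.

```python
import random


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


random.seed(0)
# Made-up weights standing in for one layer of a model (assumption).
weights = [random.gauss(0.0, 0.02) for _ in range(1000)]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"scale: {scale:.6f}")
print(f"largest rounding error: {max_error:.6f}")
```

Each stored value drops from 32 or 16 bits to 8, and the rounding error per weight is bounded by half the scale factor. Whether that error affects a model's output quality depends on the model and the technique used, which this sketch does not measure.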
And finally today, as promised, here's Ben Thompson's big recent essay on this debate between open-source AI and let's call it platform-based AI.
He says that Google's recent I/O suggests to him that AI is going to be a sustaining innovation for big technology, not a disruptor.
The true fight will be between the major players' centralized models and the open source models.
Quote, over the past seven years, Google's primary business model innovation
has been to cram ever more ads into search, a particularly effective tactic on mobile,
and, to be fair, the sort of searches where Google makes the most money, travel, insurance,
etc., may not be well suited for chat interfaces anyways.
That, though, ought only increase the concern for Google's management that generative AI may
in the specific context of search represent a disruptive innovation instead of a sustaining one.
Disruptive innovation is, at least in the beginning, not as good as what already exists.
That's why it is easily dismissed by managers, who
can avoid thinking about the business model challenges by correctly telling themselves that their
current product is better. The problem, of course, is that the disruptive product gets better,
even as the incumbent's product becomes ever more bloated and hard to use, and that certainly
sounds a lot like Google Search's current trajectory. I tend to believe that disruptive innovations
are actually quite rare, but when they come, they are basically impossible for the incumbent
company to respond to. Their business models, shareholders, and most important customers
make it impossible for management to respond. If that is true, though,
then an incumbent responding is in fact evidence that innovation is actually not disruptive, but sustaining.
To that end, I take this Google I/O as evidence that AI is in fact a sustaining technology for all of big tech,
including Google. Moreover, if that is the case, then that is a reason to be less bearish on the
search company because all of the reasons to expect them to have a leadership position,
from capabilities to data to infrastructure, to a plethora of consumer touchpoints, remain.
Open source models running locally might be a big boon to Apple, but they are the truly disruptive
threat to centralized companies like Google and OpenAI. I think it is meaningful, though,
that Google made clear it views AI as a sustaining innovation and that it intends to fully
implement generative AI across its business, including search. Of course, that means there
are battles to come within that context. The aggressiveness and competitiveness we've seen from these
large tech companies is a refreshing change from the stasis of the previous decade. At the same
time, the fact that all of big tech is on board and, given their supernatural, supranational,
I should say, nature, will inevitably be incentivized to be a helpful and engaged partner to regulators
all around the world, suggesting that the true fight will be between centralized models,
which regulators will more easily work with, and open source. In this view, the recently proposed
EU regulations for AI and the threatened crackdown on OpenAI via GDPR are simply the first salvo in what may be
the defining war of the digital era.
Will centralized and thus controllable entities win,
or will there be a flowering on the fringe of open models
that truly explore the potential of AI for better or for worse?
End quote.
This is a long essay.
It's expansive.
I only dipped in and out to give you a taste.
I highly recommend you read the whole thing.
Final link in today's show notes.
Beep, beep.
Who's got the keys to the Jeep vroom?
I'm driving to the beach.
Top down, loud sounds, see my peeps.
Oddly, that's a Zelda reference, although of course it's Missy Elliott.
If you're playing the new Zelda game, then you know to a surprising degree,
Tears of the Kingdom is a truck and cart, and also maybe airship and rocket ship building sim.
Replace peeps in those lyrics with Koroks, and you get the point.
Talk to you tomorrow.
