Tech Brew Ride Home - Mon. 08/21 – Twit Pics Disappear
Episode Date: August 21, 2023
More X shenanigans over the weekend. Some solid evidence that some major LLMs have in fact been trained on copyrighted material. A ton of it, in fact. As Arm prepares to IPO, who might join them, depending on how things go? Bad news for Adyen is probably bad news for Stripe. And the rise of high tech sailing ships.
Sponsors: Collective.com/ride, TryNom.com/ride
Links:
Twitter Deletes All User Photos And Links From 2011-2014 (Forbes)
REVEALED: THE AUTHORS WHOSE PIRATED BOOKS ARE POWERING GENERATIVE AI (The Atlantic)
Silicon Valley start-ups revive listing plans as Arm reignites IPO market (Financial Times)
Europe’s Stripe rival Adyen saw $20 billion wiped off its value in a single day. Here’s what’s going on (CNBC)
UK to spend £100m in global race to produce AI chips (The Guardian)
A cargo ship that harnesses wind power has set sail on its maiden journey (Quartz)
Transcript
On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco.
Hey, who did this to you?
What happened next turned the story into a political firestorm.
Reports have identified the victim as Bob Lee, the founder of Cash App.
From Bloomberg Podcasts, this is Foundering, the Killing of Bob Lee, beginning April 16.
Welcome to the Techmeme Ride Home for Monday, August 21st, 2023. I'm Brian McCullough. Today:
More X shenanigans over the weekend. Some solid evidence that major LLMs have in fact been trained on copyrighted material, a ton of it, perhaps.
As Arm prepares to IPO, who might join them, depending on how things go? Bad news for Adyen is probably bad news for Stripe as well. And the rise of high-tech sailing ships.
Here's what you missed today in the world of tech. There was some drama at the artist formerly known
as Twitter over the weekend. First, Elon Musk tweeted that the block feature is going to go away,
except for in the case of DMs, to be replaced with, I guess, just better muting or something.
Again, starting to really feel like Elon is almost daring certain users to abandon the platform at this point.
But then again, Elon tweets a lot of things on X, and some of those do happen.
Some of them do not come to pass.
And yes, I'm aware that I just called it a tweet.
Then Saturday or Sunday came word that many images uploaded directly to Twitter between
2011 and 2014 are not loading, and links from those years that use Twitter's native
shortening service are broken too. Quoting Forbes:
Twitter, the social media platform officially known as X, appears to have deleted all
images from the website that were posted between 2011 and 2014.
Links that use Twitter's native shortening service are also broken.
It's not immediately clear if this was an intentional act or an error, but whatever's
happening is causing concern among users who've been on the site for over a decade.
News about the photo deletions on Twitter first went viral on Saturday after user Tom Coates
tweeted about it. I confirmed that my own photos on the platform from mid-2011 to 2014 have been
deleted and links no longer work, as you can see in the tweet below. It appears that Twitter's
link-shortening domain, t.co, the new URL that Twitter generates so it can track user activity,
is the likely culprit behind why images no longer display and links no
longer work. Twitter launched in 2006, but didn't support native image uploads until the summer of 2011.
Several image hosting services sprung up to support Twitter, like TwitPic, but that service shut
down in 2014, and many images from those early days are lost. But it now seems images that were
posted to Twitter directly from 2011 to 2014 could be in danger as well, since they're no longer
loading on the site. Some users on the Reddit forum DataHoarder, which tracks data preservation
from the internet age, speculate that Twitter has broken something in an effort to migrate the site
to X.com, which Twitter owner Elon Musk has held for a number of years. But that's simply a logical
guess at this point and hasn't been confirmed. Another popular theory is that Twitter is attempting
to save money on image hosting fees, another guess that hasn't been confirmed by anyone
officially at Twitter. End quote.
According to an analysis from The Atlantic, Books3, a dataset used to
train Meta's LLaMA, BloombergGPT, and EleutherAI's GPT-J, among others,
contains more than 170,000 books from authors like Stephen King and Junot Díaz.
Quote: In a lawsuit filed in California last month, the writers Sarah Silverman,
Richard Kadrey, and Christopher Golden allege that Meta violated copyright laws by using
their books to train LLaMA, a large language model similar to OpenAI's GPT-4, an algorithm that
can generate text by mimicking the word patterns it finds in sample text. But neither the lawsuit
itself nor the commentary surrounding it has offered a look under the hood. We have not previously
known for certain whether LLaMA was trained on Silverman's, Kadrey's, or Golden's books, or any
others, for that matter. In fact, it was. I recently obtained and analyzed a dataset used by
Meta to train LLaMA. Its contents more than justify a fundamental aspect of the authors'
allegations. Pirated books are being used as inputs for computer programs that are changing how we
read, learn, and communicate. The future promised by AI is written with stolen words.
As a writer and computer programmer, I've been curious about what kinds of books are used to
train generative AI systems. Earlier this summer, I began reading online discussions among
academic and hobbyist AI developers on sites such as GitHub and Hugging Face. These eventually
led me to a direct download of The Pile, a massive cache of training text created by EleutherAI
that contains the Books3 dataset, plus material from a variety of other sources: YouTube video
subtitles, documents and transcriptions from the European Parliament, English Wikipedia, emails sent
and received by Enron Corporation employees before its 2001 collapse, and a lot more. Upwards of 170,000
books, the majority published in the past 20 years, are in LLaMA's training data. In addition to
work by Silverman, Kadrey, and Golden, nonfiction by Michael Pollan, Rebecca Solnit, and Jon
Krakauer is being used, as are thrillers by James Patterson and Stephen King, and other fiction by
George Saunders, Zadie Smith, and others. These books are part of a dataset called
Books3, and its use has not been limited to LLaMA. Books3 was also used to train
BloombergGPT, EleutherAI's GPT-J, a popular open source model, and likely other generative
AI programs now embedded in websites across the internet. A Meta spokesperson declined to comment
on the company's use of Books3. Bloomberg did not respond to emails requesting comment, and
Stella Biderman, EleutherAI's executive director, did not dispute that the company used
Books3 in GPT-J's training data. Of the 170,000 titles, roughly one-third are fiction, two-thirds
non-fiction. They're from big and small publishers. To name a few examples, more than 30,000
titles are from Penguin Random House and its imprints, 14,000 from HarperCollins, 7,000 from
Macmillan, 1,800 from Oxford University Press, and 600 from Verso. The collection includes
fiction and non-fiction by Elena Ferrante and Rachel Cusk. It contains at least nine books by
Haruki Murakami, five by Jennifer Egan, seven by Jonathan Franzen, nine by bell hooks, five by David
Grann, and nine by Margaret Atwood. Also of note: 102 pulp novels by L. Ron Hubbard, 90 books by the
Young Earth creationist pastor John F. MacArthur, and multiple works of aliens-built-the-pyramids
pseudo-history by Erich von Däniken. In an emailed statement, Biderman wrote in part, quote:
We work closely with creators and rights holders to understand and support their perspectives and
needs. We are currently in the process of creating a version of the pile that exclusively contains
documents licensed for that use. End quote.
Word is that the dam might break later today, when, you'd
imagine sometime after the close of trading, probably, Arm is expected to file its much-anticipated
IPO prospectus. Now, the question is, will the dam really break after this? Will the IPO be
successful enough that other tech startups might test public market waters? And if so, who? Well,
quoting the Financial Times: A group of Silicon Valley's biggest private tech companies are dusting off
long-delayed plans to list their shares, with the upcoming initial public offering of chip designer
Arm set to provide a new gauge for market sentiment. Grocery delivery group Instacart,
software company Databricks, and identity verification startup Socure are among those considered
candidates to launch stock market debuts by next year, according to people familiar with their thinking.
They would follow Arm's blockbuster public offering, which is expected as soon as next month, according
to people familiar with the plans. That IPO provides an unusual test of investor thinking.
The UK-based chip designer was public for 18 years before being taken private by SoftBank for
24 billion pounds in 2016. That should ease its passage back onto the public market,
according to investors, but it also makes it harder for other startups to draw conclusions from.
Arm is among the first big tech companies to attempt an IPO in 18 months,
with several well-funded startups such as Stripe having put off float plans during a turbulent period
for public tech stocks. Instacart could be among the first to follow with an IPO before the end of this
year, according to two people close to the matter. It first filed its intention to list in New York
last May, but delayed plans due to market conditions. The grocery delivery company's valuation
has plunged from a peak of $39 billion in March 2021 to $12 billion in May of this year,
according to people with direct knowledge of the company's financial details. It will make a decision
depending on whether public markets stabilize later in the year, said the people. The Nasdaq, the
stock exchange on which Arm plans to list, has in 2023 recovered the bulk of last year's losses,
and investors are increasingly confident that a small number of startups that shelved plans to list in
2021 could soon revive them. Josh Wolfe, co-founder of venture capital firm Lux Capital, said,
a slim sliver of an IPO window may open later this year. When it does, singular category
defining companies would be strong, standalone public new listings, he added. Databricks,
which posted revenues of more than $1 billion in June and acquired OpenAI competitor MosaicML for
$1.3 billion, is a candidate to IPO, according to Wolfe, who is one of its venture capital backers.
ID verification company Socure, which is valued at $4.5 billion, hinted at an IPO in 2021, but
pulled plans when the market soured. Socure this year secured a $95 million credit facility
from J.P. Morgan, has hired a new chief financial officer with IPO experience, and is preparing
for an IPO as soon as next year, according to founder Johnny Ayers. End quote.
Now, you just heard Stripe mentioned in that last piece, but I'm wondering if Stripe might not be among those lining up to go out the IPO door.
That's because Stripe's rival Adyen just saw $20 billion wiped off its valuation in a single day after a bad earnings report.
So if you're a Stripe investor and you're looking at comps, this ain't good.
Quoting CNBC: The company's shares plummeted 39% on Thursday,
erasing 18 billion euros, or $20 billion, from Adyen's market capitalization as investors dumped the
stock after the firm reported its slowest revenue growth on record. Identified as one of the top
200 fintech companies globally by CNBC and Statista, Adyen is a payment services firm that
works with customers including Netflix, Meta, and Spotify. It also sells point-of-sale
systems for physical stores and handles payments online and in-store. More than a processor, Adyen is what is
known as a payment gateway, meaning it uses technology to enable merchants to take card payments
and transactions through online stores. The company takes a small cut off of every deal that runs
through its platform. Adyen last week reported results for the first half of the year that came in
well below expectations. The company's revenue of 739.1 million euros for the period was
up 21% year over year, but also showed Adyen's slowest sales growth on record. Analysts had expected
853.6 million euros of revenue and 40% year-on-year growth, according to Refinitiv Eikon forecasts.
Adyen has typically been viewed as a growth stock after consistently reporting revenue growth of
26% each half-year period since its 2018 stock market debut. Adyen said in a letter to shareholders
last week that its EBITDA margin fell to 43% in the first half of 2023 from 59% in the same
period a year ago. Adyen has historically been a lean business, opting to hire fewer people overall
than its main competitor, Stripe, which has roughly doubled its staff.
Simon Taylor, head of strategy at Sardine AI, said Adyen might face a natural ceiling to what
business size it can reach before having to reduce its margins to grow again.
Ultimately, they're subject to the same macro headwinds as everyone in e-commerce is, Taylor
told CNBC, and they still grew 21%.
Incumbents would kill for that, end quote.
Sources say the UK is in talks with Nvidia, AMD, and Intel to buy up
to 100 million pounds' worth of GPUs for what's being called a national AI research resource,
though some officials want to spend far more, especially given that that's a fraction of what, say,
Saudi Arabia recently purchased. But again, this is a trend of nation states trying to buy
GPUs. Quoting The Guardian: Taxpayer money will be used as part of a drive to build a national
AI resource in Britain, similar to those under development in the U.S. and elsewhere. It is understood
that the funds will be used to order key components from major chipmakers Nvidia, AMD, and Intel.
But an official briefed on the plans told the Guardian that the 100 million pound offer by the government
is far too low relative to investments by peers in the U.S., EU, and China.
The official confirmed in a move first reported by the Telegraph, which also revealed the investment,
that the government is in advanced stages to order up to 5,000 graphics processing units from Nvidia.
Rishi Sunak's government revealed plans in May to invest
1 billion pounds over 10 years in semiconductor research, design, and production, a step dwarfed by the
U.S.'s $52 billion CHIPS Act and the EU subsidies in the neighborhood of 43 billion euros, or
37 billion pounds. A holdup in progress triggered by relatively weak investment could leave the
UK exposed amid mounting geopolitical tensions over AI chip technology, end quote.
So again, this is a theme at this point. Nvidia itself, all by its lonesome, as a single
company is a key geopolitical bottleneck, especially if you believe, as these governments do, that
AI will usher in a new computing era for the 21st century, and you don't want to be left behind.
Just thinking idly now, but how do you imagine the U.S. government thinks about that, thinks about
Nvidia now?
Finally today, a cargo ship that harnesses wind power has made its maiden voyage.
Quoting Quartz: The first vessel of its kind to be retrofitted with the technology, called WindWings,
the Pyxis Ocean has set sail from China with a lofty goal of helping the maritime industry decarbonize.
Agribusiness giant Cargill chartered the Mitsubishi Corporation vessel.
The WindWings, described as an advanced wind-assisted propulsion and route optimization system
in today's press release, have been developed by UK-based design and engineering firm BAR Technologies
and manufactured by Yara Marine Technologies.
Harnessing wind alone along the journey could lead up to a 30% reduction in fuel consumption,
also cutting the ship's carbon emissions. If the ship can stay the course, it could open doors to a
greener future for the polluting industry, retrofitting a solution to decarbonize existing vehicles,
while offering new ones a sustainable edge design. End quote. Some more facts in the typical Quartz house
style. 30%: reduction in fuel consumption and CO2 emissions that WindWings can achieve on average
trading patterns, according to simulations. This could be even higher if used in combination with
alternative fuels, Cargill and BAR Technologies said in their press release.
37 meters: that's the size of the solid wing sails, made from the same materials as wind turbines,
featured in the system, which are fitted to the deck of the bulk cargo ships.
751 feet: the length of the ship Cargill has chartered, equivalent to two American football fields,
so this isn't some sort of tiny yacht. 23: large ships currently equipped with some form of wind-assist
technology over the past 12 years. Gavin Allwright, Secretary of the International Windship Association,
in a statement in July said the figure is expected to double over the next 12 months. Half: that's the share
of new-build ships, quote, that will be ordered with wind propulsion, according to BAR Technologies,
end quote. We were promised flying cars, but instead we're returning to sailing ships,
which, if it works like it looks like it does, why not? Talk to you tomorrow.
