Moonshots with Peter Diamandis - Why We Need New AI Benchmarks, Which Industries Survive AI, and Recursive Learning Timelines | #218
Episode Date: December 23, 2025. Get access to metatrends 10+ years before anyone else - https://qr.diamandis.com/metatrends Matthew Fitzpatrick is the CEO at Invisible Technologies Learn about Invisible Salim Ismail is the founder of OpenExO Dave Blundin is the founder & GP of Link Ventures Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified – My companies: Apply to Dave's and my new fund: https://qr.diamandis.com/linkventureslanding Go to Blitzy to book a free demo and start building today: https://qr.diamandis.com/blitzy Grab dinner with MOONSHOT listeners: https://moonshots.dnnr.io/ _ Connect with Peter: X Instagram Connect with Matthew: LinkedIn Connect with Dave: X LinkedIn Connect with Salim: X Join Salim's Workshop to build your ExO Connect with Alex: Website LinkedIn X Email Listen to MOONSHOTS: Apple YouTube – *Recorded on December 16th, 2025 *The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice. Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Most of the public focus to date has been on the large public benchmarks for things like coding.
I think the problem is though.
I'm Matt Fitzpatrick, CEO of Invisible Technologies.
We're an infrastructure company.
What that infrastructure allows us to do is what I would call hyper-personalized software at scale.
It sounds like your position is we need thousands of new narrow benchmarks to capture maybe every labor category, every industry vertical.
That is an interesting second part of this, which is
We're going to see the largest disruption ever in 2026 from companies that don't make this change.
There are many sectors where the structure of what the industry does is going to change if you think about knowledge work, the production of large amounts of documentation, where these technologies are very disruptive.
I think the question is which parts of your business can really change with AI?
What are you seeing most companies get wrong on their mission to implement AI?
You've got two kind of different challenges.
One is,
now that's the Moonshot, ladies and gentlemen.
Everybody, welcome to Moonshots.
In today's episode, we're going to be discussing
why all companies need to become AI companies in 2026.
How they do that, what happens if they don't.
We'll discuss whether or not big legacy companies
can even make such a dramatic change
and how they can best do it.
we're going to be going over some fun and meaningful AI use cases. So listen up. I think there are
ones that are going to sort of get you excited about what you can do. And we'll dive into some
predictions from our guest for 2026. Today, joining the moonshot mates is a friend of the pod,
Matt Fitzpatrick, who for more than a decade was at McKinsey, rising to the position of global
head of QuantumBlack Labs. I love that name, QuantumBlack. It's so cool. Leading the firm's AI software development, R&D, and global AI products. A year ago, Matt joined as the CEO of
Invisible Technologies, a company started by a brilliant friend of mine, Francis Pedraza. For those
you don't know Invisible, the company is a modular AI software platform that provides AI training for most of the large language model providers out there, building
custom workflows and agents for enterprise. They anchor their work in creating
clean data and human-in-the-loop delivery to ensure measurable business results. Matt, welcome. Good to have
you here. Hey, Matt. Thank you for having me. We got DB2, AWG, Salim, almost happy holidays, guys.
It feels like we're on this pod, you know, every other day. I think we should just move into a large,
you know, sort of podcast house. We will have fully documented the singularity.
Well, you know, I'm really looking forward to hearing from Matt because on Thursday we have to do
our predictions for next year, and Matt is going to give us a ton of insight today. One of my
predictions out of the gate is that enterprises are going to move super stupidly slowly compared to
AI capabilities, and Matt is the world-leading expert on the intersection between AI and
enterprise, so I cannot wait for this. You cannot, you can't cheat this way, Dave, and sort of
use Matt's predictions as yours. I can't. No. You won't be there. Everybody listening to this
pod will know that.
All right. Well, take good notes, nothing else. You guys ready to jump in? Matt, I'm going to kick it off with a question, sort of a broad question for you. And here it is. So within the past year, you know, we've heard from like every company out there and every CEO that we're going to be pivoting to become an AI company. Salim, in the last pod, you said something like, we're going to see the largest disruption ever in 2026 from companies that don't make this change.
And I think, Alex, the term you used is they're going to be cooked if they don't.
I said knowledge work is cooked. Not knowledge workers, not companies, knowledge work as we
currently know it. Uh-huh. Okay. So you don't think that companies are going to be cooked that don't
make the transition to AI? I think we're going to see many more companies over time and many more
smaller companies as well. Okay. Well, we're going to dive into that.
Look, in an earlier episode, we pointed out that you're, you know, when you thought you had product market fit and you're scaling a SaaS company, you're like toast because everything needs to be rethought now given AI.
So this is now applying to big companies also.
All right.
So, Matt, the question to kick this off is, can every company truly become an AI company and how?
And then which companies and industries do you think need to disrupt themselves now before they become basically irrelevant?
So that's a softball question to kick us all off here.
Peter's saving the hard balls for me, Matt.
Always.
I think your second question relates to your first question in some ways, which is I don't
think, and I think all the data that has come out so far bears this out, that all industries are
going to be impacted equally by this.
I think there are some sectors and areas where you're going to see materially different
impacts.
I think areas like media, legal services, business process outsourcing.
There are many sectors where the structure of what the industry does is going to change, if you think about knowledge work,
the production of large amounts of documentation, where these technologies are very disruptive.
I think where the hype has been a bit overblown is if you take a lot of sectors like oil and gas or
real estate, the function of what they do is going to stay pretty consistent.
And I think actually most of the good analytics of how job dynamics will change over the next
several years will get at this. The decision on which apartment building or which office building
to buy is going to function pretty similar to what it did five, six years ago. And so I think the
question is which parts of your business can really change with AI. It's not all of them. And some
sectors will be more or less. And then the second part of your question, which is also an
interesting one, which is can everyone actually become an AI company? And that is an interesting
second part of this, which is
there are not that many people
that know how to build these sorts of models
or deploy these sorts of models well.
And so one of the big challenges is
do you have the expertise in-house to do this?
How do you think about adjusting
the operating functions of your company to do it?
Is it the same team you have in your IT function
doing it now? And particularly, Peter,
I know the set of folks
that you and I have spoken with in the past, like small
businesses, if you're a 50-person company,
It's hard to deploy a lot of this stuff at scale if you don't even have a CTO in-house.
And so I think there's a mix of, is your industry going to fundamentally change?
And then what are the actual core competencies your company has to implement it?
And so I think what you're going to end up finding, yeah.
Yeah.
Yeah. So, I mean, do you end up bringing a chief AI officer into your company?
Are you going to bring that capability or are you basically renting it?
I mean, part of the other thing that's going on right now, we've talked about this on the pod a lot, is your competition isn't really the large, you know, multinational, it's the AI-native startup that came out of no place
that's reinvented themselves from the ground up as an AI-first company, right?
Yeah, I mean, like, down and dirty, Matt, like, which happens first?
I can get a mortgage by talking to an AI and get it done in under an hour, or we're walking
on Mars with our own two feet. Like, which of those two things is going to happen in the real world
first. Yeah. So the way I've often heard the question asked is do the start-ups get distribution
before the big companies build the technology? And I do think that will be the tension in a lot of
ways. And look, I think there's a lot of big established companies that are going to figure out
how to do this really well. Like, I think if you take a sector like legal services, I do think
the big law firms will figure out how to use a lot of this over time. You know, I think there are
sectors where I think banking is a really interesting one to look at right now. You know,
if you look at the age of the application footprints in banking, like most of the tech
that exists in banking is north of 20 years old. And so you do have a bunch of very fast-moving
newer fintechs that are approaching it in different ways, companies like Revolut. I don't know
how that plays out. But I do think that becomes the question in a lot of ways, is which moves
faster, the emerging entrants or the modernization of the existing. I think, Peter, to hit on what you were asking also as a second part of that, do you buy or rent? I think that's something
you've got to be really honest with yourself about as a company, right? And I think the idea
that everyone can buy, everyone can hire people to do this is challenging. The challenge of
trying to adapt an existing IT function to do this is many of the skill sets that people hire
for, even like, do they know Python, things like that, there are gaps in that. And so I think
the answer for most companies I've seen who don't have the resources in-house, or aren't being just directive about how to push that, is they are finding ways to rent or buy this externally
and to partner with folks that can allow them to do it. Every week, my team and I study the top
10 technology metatrends that will transform industries over the decade ahead. I cover trends
ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity,
and more. There's no fluff, only the most important stuff that matters, that impacts our lives, our companies, and our careers. If you want me to share these metatrends with you, I write a newsletter twice a week, sending it out as a short two-minute read via email. And if you want to discover the most important metatrends 10 years before anyone else, this report is for you. Readers include
founders and CEOs from the world's most disruptive companies and entrepreneurs building the
world's most disruptive tech. It's not for you if you don't want to be informed about what's coming,
why it matters and how you can benefit from it.
To subscribe for free, go to diamandis.com slash metatrends
to gain access to the trends 10 years before anyone else.
All right, now back to this episode.
I think legal and accounting are really, really cool case studies.
And I know you know more about this, you know, your McKinsey time, Quantum Black.
You're like the guy understanding and parsing all of this.
But they're really cool because they can be replaced by a startup, you know, like Harvey.
Dave, two things I'd say about that.
I think one challenge of the implementation of GenAI in the enterprise setting is having a statistically validatable baseline to compare against.
As an example, if you take something like mortgage underwriting, it has made huge progress, in a very positive way, actually. The percentage of mortgage underwriting that's now done by a very
guardrailed and very effective set of algorithms developed by the banks is pretty high because
they can back test and say, this is a correct credit decision that has no redlining, anything else.
But if you think about a document, like the reason contact centers have been one of the cases where we've seen a lot of adoption of this is you do have a clear baseline, right?
Like time per call, CSAT, costs per call.
You've got a set of metrics you can compare against.
Something like, let me generate an investment memo, which is different in format at every firm.
It could be 10 pages versus 40 pages.
The content is different.
It's been harder for folks to build baselines.
I do think that's why legal services is an interesting one, is there are certain areas of legal where those baselines are clear, like you can look at what documents are really good for a given type of agreement.
But where I would, I think you're going to see this in a lot of different segments.
I think the high end of that market still persists in a really differentiated way,
which is if you're doing a large M&A transaction,
you're still going to want a really good lawyer's advice.
And where it changes, I think, is in the more very basic, like produce-an-NDA type of work, the kind of basic work. And I think that's going to be one of the shifts again,
is that really good human guidance is going to persist forever.
It's the basic commodity information that right now a lot of people are paid,
you know, probably excessive amounts of money to do.
Yeah, well, you know, the NDA is pretty extreme.
That's it.
But I'll tell you, the venture fundings that we do, you know,
we do tons of these every year.
And the term sheets always say, you, the company that we're investing in, will bear the costs of the legal, capped at $50,000.
And then the documents are freaking identical every single time. There's like eight knobs.
And you could store all combinations on like the smallest thumb drive in the world.
And how is this like a $50,000 thing? And it always runs up to $49,999.99. It's like, whoa, what a miracle.
So I don't know
That to me feels like
You know that would be on sort of the mid to hard end of the scale
Yet it's still so doable
You know an NDA is a no-brainer
Mortgages are no-brainers
I completely agree
I think what's been interesting though is how slow
The actual adoption curve has been in say contact centers
Because contact centers should have had
I mean, generally, CSAT scores show people don't really like most contact center interactions. The kind of general customer feedback you get is pretty unhappy.
And that's been true for a decade.
So you would have expected that this technology would have.
Walk us through the whole Klarna thing, actually.
I know you're an expert on this.
Like the Klarna thing has been really interesting to watch.
Wait, tell us the story.
What is the Klarna thing?
Yeah, by the way, I was not involved in Klarna in any way,
but I can say at least what I know from reading about it
and what my hypothesis would be.
I mean, basically Klarna announced that they were going to move
entirely towards a fully end-to-end agentic contact center.
And then a couple of months later, and the interesting thing was, at that time,
they were the most frequently cited example of agentic success in deployments.
And then about eight to 12 months later, they basically announced they were rolling the
whole thing back and moving back entirely to human contact center agents.
And I found the entire evolution kind of interesting because if you think of how these
systems should be defined, like, you know,
deployed. Like a multi-agent system, the way it should work is you'd have kind of an orchestration
of what the types of calls are. You'd have a set of validations on which calls could go well or
badly, and you'd have some sense of where you need escalations to human agents versus where
you don't. So you actually would never want to move everything. And I think this is a thesis Invisible has held forever: you're never going to want to move to doing everything agentic. You're
going to want humans in the loop in almost every industry in almost any topic. Because I think
actually that's where a lot of the, if these models are trained off of precedent data and then
you can train them really well to then kind of continue that logic, you're going to want humans
for some of the things where you don't really have precedent data, or you need them to work
through complex things where you don't have enough historical information. And so I found the entire
structure of how the change happened quite confusing because you would always want to keep
a contact center to be a mix of humans and agents and then evolve the mix between those across topics. And so the whole movement from all humans, to all agents, back to all humans was confusing, I think, from end to end. Salim, you are pregnant
with a question. No, no, I just wanted to give out some details here. So the Klarna situation,
they rolled out an AI to do customer service calls and the claim was that in the first month it did
the work of 700 full-time agents, handled 2.3 million calls a month. And they projected that it
would save them 40 million a year. And they were like really proudly saying this was
like month one and it's only ever going to get better from here the when I saw that I was like
okay if I was doing that this sounds like a PR exercise more than anything real because you'd
never put that out in the first month you'd wait a couple of months to see what exactly happened
and Matt you may be able to give a little more color on why did they roll it back in the end
did they find the hard cases were too many the exception handling was too much or was it a cultural
backlash? What was it exactly that had them undo the whole thing?
I don't know in the sense I haven't worked with Klarna, but, you know, I think you hear a variety
of different pieces of feedback on why folks have struggled in contact centers. I think one
reason is there are cases in which humans just want to talk to another human. And so I think
some of the PR saying we're moving to only agents has its challenges. I think two, a lot of the
challenges and where contact centers are most sensitive is non first line call resolution
topics. So it's not something like check your balance. It might be something like process a
refund, right? Something that's pretty complex, you have to write back to the source systems.
It was surprising to me how quickly they rolled that out. And I wonder how well it was able to deal with some of the more complex functionality in that example.
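To make that concrete, here is a minimal sketch of the kind of routing and escalation logic being described, where simple, well-precedented intents go to an AI agent and complex or low-confidence cases escalate to a human. The intent names, the confidence threshold, and the upstream classifier are hypothetical placeholders, not anything Klarna or Invisible has published.

```python
# Minimal sketch: route simple calls to an AI agent, escalate complex ones to humans.
# Intent names, the confidence floor, and the classifier feeding this are assumptions.

from dataclasses import dataclass

SIMPLE_INTENTS = {"check_balance", "store_hours", "order_status"}
COMPLEX_INTENTS = {"process_refund", "dispute_charge", "account_closure"}  # write back to source systems

@dataclass
class Call:
    transcript: str
    intent: str        # assumed output of an upstream intent classifier
    confidence: float  # classifier confidence, 0..1

def route(call: Call, confidence_floor: float = 0.85) -> str:
    """Return 'ai_agent' or 'human_agent' for a given call."""
    if call.confidence < confidence_floor:
        return "human_agent"      # no precedent / unsure, keep a human in the loop
    if call.intent in COMPLEX_INTENTS:
        return "human_agent"      # non-first-line resolution stays with humans
    if call.intent in SIMPLE_INTENTS:
        return "ai_agent"
    return "human_agent"          # default conservative for unknown intents

if __name__ == "__main__":
    print(route(Call("where is my order", "order_status", 0.93)))          # ai_agent
    print(route(Call("I want my money back", "process_refund", 0.97)))     # human_agent
```

The point of the sketch is simply that the human/agent mix is a routing decision per topic, which can be tuned over time rather than flipped all at once.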
Right. Can we get, can we get back? You escalate from level one to level two and three very quickly on those support calls. And then, and then you do not
want an AI dealing with you. Can we get back to the main question here, which is you've got,
you know, 2026 is coming up. If you're listening to this in 2026, it's here now. So here's the
question. You're a medium-sized company or a large-sized company. And your board of directors has
just said to Mr. CEO or CTO, guys, what's your AI plan? What are you doing? I mean,
we're seeing that over and over and over again. What is their first reaction typically, and what
should they do? I mean, I want to just get some of the fundamentals here because I want to
serve our listener base in that fashion. Yeah, so I think if you're that CEO, you've got two
kind of different challenges. One is what are the things I should focus on? And then two is
who should do them? And do I have those skills in-house? And so the first thing I would start
with is making sure you know the first question. And I do think this is a question of following the
value. So I'd go down a list. I would not start with letting a thousand flowers bloom. I would
start with, what are two, three things that if you do them well,
materially move the needle for your business?
Maybe it's, you know, we're just talking on customer service.
Maybe that's one example.
Maybe it's forecasting in your FP&A function.
Maybe it's inventory management.
But there's definitely two to three things that almost any business on earth,
even as a small company, has. Digital marketing probably is another one that
you see pretty frequently.
And you focus on one or two of those, and you make sure you get to a pilot stage in that one or two,
meaning not a strategy document.
I do think the one thing that anyone who spent real time in the space will tell you is,
if you take the paradigm of how machine learning is deployed,
where you spend months and months building something,
and then it works and you can underwrite statistically that it works.
This is kind of the exact opposite paradigm,
in that you can get a prototype up and running in a month, but you have to do a lot of testing and validation to make sure you can trust it.
And so it is really a function of making sure you can get something up and running and testing and validating.
Peter, the question I was asking, would you bet your annual bonus that whatever use case you deploy works?
And that's a complicated thing.
If it's like, let's say, generate a claims processing review.
And you have to do 10,000 of them.
Most companies don't know how to say that works or it doesn't.
And so what I would do is, just to summarize, make sure you have a list of two to three things that move the needle.
Make sure you get to do a proof of concept of one of them.
And I would probably do that first use case as an RFP to a third-party vendor that gets compensated based on results.
And I say that very specifically because I think if you do it in-house, the odds are the in-house team has not had a lot of experience with this.
And so you also can't hold them accountable in the same way of you get paid if it works.
And so I do think tying into outcomes limits your risk.
I mean, that is still the business model for Invisible, right?
You're paid by money saved.
Correct.
Which I love that.
Yeah, outcomes in various ways.
Yeah.
Alex, I want to bring you into the game here.
Much appreciated.
So maybe just as a preliminary matter, in the interest of full disclosure, I have no financial interest in Matt's company, Invisible. I do have a number of questions, though.
I do have a number of questions, though.
First question, maybe pulling the thread on testing.
One of the things that we talk about here on the pod all the time is benchmarks, the
importance of benchmarking. I'm curious. We all talk about that constantly, Alex.
That is all we talk about. We talk about nothing else. That is all we talk about. Oh, wait,
maybe that's you. Okay. Given that's all we talk about, as Dave just mentioned,
and given that Invisible is also in the business of training so many models, what benchmarks do you
think most need to be brought into existence in the world? What's most missing? Top three benchmarks you'd like to see come into existence? Yeah, look.
I think, and you've seen a bunch of these start to get publicized in the past
couple of months, but most of the public focus to date has been on the large public
benchmarks for things like coding.
And I think those are very useful as metrics for are the models improving broadly?
And I think, you know, that is a way you've been able to see by any standard.
If you look at the last three years, the models have seen 50% to 100% improvement on most dimensions
that you can look at.
I think the problem is, though, if you think about like enterprises or small businesses,
your benchmark for most cases is not a broad-based, kind of cognitive benchmark.
It's accuracy or human equivalence on a specific task.
And so what I think you're going to see more and more need for is kind of custom evals on highly
specific topics.
So if you go back to the contact center example, the benchmark you want to build, if you're
going to roll this out for a contact center is a series of expert agents that are in your
contact center and how they perform, and then how the AI agents perform similarly. Same with
claims processing. But basically, most businesses are going to have to get comfortable with doing
what's called an eval or a custom benchmark for the tasks they're trying to modernize.
Because an 80% accurate, very smart deployment is not, you know, there's still too much risk
in that rollout framework. And so I think a lot of this is actually the way that we think
about benchmarking will evolve from broad-based benchmarks to hyper-specific benchmarks.
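As an illustration of what such a hyper-specific eval can look like in practice, here is a minimal sketch that scores an AI agent against an expert-human baseline on the same cases and reports the gap. The task, the toy grader, and the sample data are hypothetical; a real eval would use rubric scoring, an LLM judge, or domain metrics like CSAT or claim accuracy.

```python
# Minimal sketch of a task-specific eval: same cases, human baseline vs. AI agent.
# The grader and cases are placeholders for a real custom benchmark.

from statistics import mean

def grade(output: str, reference: str) -> float:
    """Toy grader: exact match. Swap in rubric scoring or domain metrics in practice."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0

def run_eval(cases, respond):
    """cases: list of (input, reference); respond: callable mapping input -> output."""
    return mean(grade(respond(x), ref) for x, ref in cases)

cases = [("What is my deductible?", "$500"), ("Is dental covered?", "no")]

human_baseline = run_eval(cases, lambda q: {"What is my deductible?": "$500",
                                            "Is dental covered?": "no"}[q])
agent_score = run_eval(cases, lambda q: "$500")  # stand-in for the AI agent under test

print(f"expert humans: {human_baseline:.0%}, AI agent: {agent_score:.0%}")
```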
I freaking love that because I can immediately see 10,000 listeners right now just found a calling in life based on what you just said.
Because all this benchmarking within these domains is really, really hard to figure out unless you know the, like, you know, title insurance.
You know, what's the benchmark for successful AI and title insurance?
Well, somebody in that industry listening to this pod right now is going to be like, you know what?
I was an early adopter of AI and I know this space inside and out.
That's my benchmark to own.
And if you declare yourself the owner of it and then broadcast the benchmark, the evidence so far,
is you become an instant star.
Like, nobody's grabbing topic ownership
in all these topics.
And if you just get there first,
you become an instant star.
I completely agree with that.
You just have to be a serious geek, like an Alex type or a Matt type or a Dave type.
I don't even agree with that.
In this era of post-training as a commodity,
if you own the benchmark,
often it's the case I think
that the benchmark is the hard part
and you can leverage existing resources
to post-train an off-the-shelf model.
I am curious, though, Matt,
maybe following up on this.
So it sounds like your position is we need thousands of new narrow benchmarks to capture maybe every labor
category, every industry vertical, assuming that's correct.
Is that something that Invisible is working on, can be working on, should be working on?
Yeah, we do spend quite a bit of time working on that.
In fact, a lot of the time what we're building is customer-specific benchmarks for an individual task.
So that is a lot of what we think about is actually how to test equivalence for a given task.
And, you know, I think one of the things that folks have not fully realized is, let's say you take a really high-performing LLM and you want to tailor it to your individual context, that process of actually fine-tuning it off of your data.
So an example I would give is, and I think one of the challenges is that people were hoping this would be a SaaS buyer's paradigm, meaning like I could just buy something that off the shelf would just solve everything I needed.
So, like, I wanted to buy a sales agent.
I wouldn't have to do anything.
I could just take in a sales agent that would sell well.
And the reality is that's pretty hard to do.
You need to actually train it up on your specific knowledge corpus, your information.
And so the way we would think about it is you take the LLM or you take an agent that's been trained for sales.
And then you fine-tune it off of your specific company information, your products, the way you sell, your way of speaking.
And then you have to build an eval or a benchmark against that to say, is this performing well or not in that frame?
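A minimal sketch of that workflow, purely illustrative: assemble company-specific examples, tailor a general model with them via whichever provider you use, and then gate the rollout on a custom benchmark. The file name, example content, and pass threshold below are assumptions, not Invisible's actual pipeline.

```python
# Minimal sketch: company-specific training examples, then gate deployment on an eval.
# File name, example content, and the 0.9 threshold are hypothetical.

import json

# 1. Assemble company-specific examples (your products, your way of selling and speaking).
examples = [
    {"prompt": "Customer asks about bulk pricing for Model X",
     "completion": "Offer the 50+ unit tier and route to a regional rep."},
]
with open("company_finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 2. Fine-tune or attach the knowledge corpus via your model provider (not shown here).

# 3. Evaluate the tuned agent against the custom benchmark before rollout.
def passes_benchmark(score: float, threshold: float = 0.9) -> bool:
    return score >= threshold

tuned_agent_score = 0.87  # placeholder: output of your eval harness
print("deploy" if passes_benchmark(tuned_agent_score) else "keep iterating")
```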
A quick follow-up question, if I may,
because there was the sort of infamous Bloomberg GPT moment
where Bloomberg was sort of in quasi-competition
with the frontier labs.
They had a wide variety of internal proprietary data sets.
Their original plan, this is now sort of an infamous episode
from one to two years ago.
Their plan was to offer their own proprietary frontier model,
basically, but trained critically,
pre-trained and or post-trained off of their internal data sets.
And the plan was to achieve superb performance in the financial domain because they had all the data, or a lot of data, that were not broadly available to the general public.
But what actually happened is the generalist models offered by the frontier labs that were training basically off of the internet and more or less publicly available data sets within a few months leapfrogged Bloomberg's GPT project.
And so I guess the moral of that parable in my mind is how far do you think we can really get with proprietary data sets, proprietary benchmarks, before the generalist models completely wipe the floor with them?
I'm sorry, to clarify, I'm saying you use an LLM.
The process I'm describing of actually fine-tuning a model, a large language model for your specific context is basically adding more context.
You're seeing most of the LLMs offer a paradigm where you can do this, where you can add your knowledge corpus
and train it to be more specific to your individual context.
I don't think you'll see individual institutions
building their own LLMs.
I think that's a very compute-intensive,
very difficult thing to do.
I think you'll see them tailoring the large language
models to their context.
Sure, to be clear, I...
One more question, if I may.
To be clear, I wasn't asking whether you think
every institution's going to get into the business
of pre-training their models.
I was rather asking whether you think post-training,
which is inclusive of supervised fine-tuning,
reinforcement fine-tuning,
a variety of other post-training,
whether you think that has a long-term future,
or will, maybe in one to two years,
we just use a pre-trained-plus-post-trained generalist model
off the shelf and not need any internal benchmarks
and any internal data sets for post-training.
Well, I think there are clearly going to be use cases
where you are going to need the context in the individual company, right?
Like, if you just take the law firm example,
I just take the, I mean, just, I'll answer it this way.
There are documents that company has on how they want their future state documents for
an agreement to look, right?
And the LLMs are not going to have that information.
So at some point, you are going to have to see the post-processing layer happening at the
enterprise.
And what we're seeing more and more is there's ways to design that layer so that you can,
as new models evolve, kind of drop those in, and we are seeing more and more folks experiment with that. They're using all the new tech that's being rolled out.
I think, in fact, what's going to happen is over time that edge in data is going to be
the most valuable part of any company, is that trade secret type of how do we do things.
Now, at some point, it may leak into the public models.
Like if you use OpenAI, right?
Yeah, if you use any of the frontier models connecting.
I remember we're talking to, you know, Replit, et cetera, people are using it and then the data
is going straight into the cloud, right?
And that's kind of dangerous.
They're going to have to solve that layer in a very powerful way.
That's one of my predictions to forecast, et cetera,
is we're going to need to see a layer of protection
between company data and the broader AI world.
Matt, I want to make this a little more tangible.
Now, I know you can't talk about the work you've done with the hyperscalers,
but you've identified, I think, five or six cases
where you can speak publicly about it.
So if you don't mind, maybe we can toss a few of those in and then talk about them as concrete
examples. And since Alex made his no financial involvement statement, I will say I'm a proud
advisor and am conflicted in a positive fashion supporting what Matt and Francis are doing.
But do you want to pick one of those? I loved the example on the basketball court. Can you speak to that one?
Yeah, sure.
So we worked with the Charlotte Hornets on fine-tuning custom computer vision models for
draft prep for them.
So in their case, they wanted to look at the spatial movement patterns of players
on a very broad scale across single-point cameras, across a whole host of different
college universities and international locations.
And so we fine-tuned a custom computer vision model to specifically look at the movement patterns they were interested in before the draft.
And so that was a big part of their draft evaluation.
Let me put this in English.
You basically took the video and you were able to use models to evaluate every player
based upon the video to see how well they performed at every different.
I'm not like a sports guy.
So it's the big round orange thing?
That's becoming clear here, actually.
Yeah, yeah.
So think of, yeah.
Yeah, sure.
So think of if you take typical NBA stats, there are things like points, rebounds, what's called plus minus is one ratio that's often used, which is like the amount you score versus give up when you're in the game.
But they're mostly stats that are kind of transactional stats.
What they don't look at is the movement patterns of the players who create space, where people are positioned at any point in time.
And that's actually a lot of the most interesting data.
if you go back to like, you know, some of the original baseball analytics that Billy Beane did for the A's,
it's the movement patterns of players and who is in the best spacing, right?
And there are companies that do this on very consistent formats, like on the same court.
But what we've been able to do is do that over many different camera angles, many different stadiums, very, very quickly.
And that is using custom computer vision models.
So we effectively are able to take a single point camera and understand the movement patterns of players.
in many different environments.
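To give a rough sense of the kind of metric this enables, here is a minimal sketch that computes spacing and distance covered from per-frame player positions that a tracking model would emit. The detection and tracking step itself is assumed and not shown, and the coordinates are made up; this is not the Hornets' or Invisible's actual model.

```python
# Minimal sketch: movement metrics from tracked player positions (tracker not shown).
# Coordinates and player IDs are made-up illustrative data.

from itertools import combinations
from math import dist
from statistics import mean

# positions[frame] = {player_id: (x_feet, y_feet)} as emitted by a fine-tuned CV tracker
positions = [
    {"p1": (10.0, 12.0), "p2": (18.0, 30.0), "p3": (40.0, 22.0)},
    {"p1": (12.0, 14.0), "p2": (20.0, 31.0), "p3": (38.0, 20.0)},
]

def avg_spacing(frame: dict) -> float:
    """Mean pairwise distance between players in one frame."""
    return mean(dist(a, b) for a, b in combinations(frame.values(), 2))

def distance_covered(frames, player_id: str) -> float:
    """Total distance a player moves across consecutive frames."""
    pts = [f[player_id] for f in frames if player_id in f]
    return sum(dist(a, b) for a, b in zip(pts, pts[1:]))

print(f"avg spacing, frame 0: {avg_spacing(positions[0]):.1f} ft")
print(f"p1 distance covered: {distance_covered(positions, 'p1'):.1f} ft")
```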
And so the Hornets use this, how, for team selection, player selection?
Draft selection.
So to understand which players fit certain characteristics they were looking at.
Fascinating.
Yeah, it's a complicated problem, too, because chemistry between, like, it's not just about
finding the best player, the chemistry between players matters too.
It gets infinitely complex, and it's a cool little case study.
But, you know, Gavin Baker was saying recently that in fantasy football leagues all over the country,
which I used to love before I ran out of time.
Now I have to spend all my time keeping up with Alex.
Now you have an agent doing it for you and having fun.
But that's exactly, yeah, that's the point.
Now we're obsoleting human sports leagues,
replacing them with robot sports leagues and e-sports.
Very 21st century, not 20th.
I'm betting on T-800 again.
Yes.
That's right.
Yeah, but people are losing their leagues.
You know, great, great fantasy football people are losing all over the place
because the AI agent is tracking a huge amount of more detailed data.
And, you know, if you look at the video footage, you know,
if somebody's like making it up and down court very slowly,
nobody's going to notice that, but the AI will notice it in a heartbeat.
And then that just goes into the great model.
It's really a cool little case study.
Since we've asked a little bit about kind of,
if I have a traditional business we're thinking about how to do this,
I'll give a slightly different one, which is Lifespan MD, which actually, Peter, I think this one will resonate with you in particular,
I know Chris who runs it, yeah.
Yeah, so Lifespan MD is a concierge medicine business, and you can think of it as they have a network of practices, both internationally and in the United States, which all have very different sets of data on their patients, kind of practice information.
And so the thing I always start with with any AI use cases, you have to get the data right.
Before you can even start with AI, you have to make sure that you have the structured and unstructured data together that you want.
And so the first thing that we're doing for them is on our data platform neuron, we're
creating a HIPAA-compliant, multi-tenant cloud instance where we bring together all the patient
and provider data that's of interest.
And we start to build a 360-degree view of both the patient and the practice.
And so you can start to think of things like if you wanted to understand what longevity-focused
tests, male patients 35 to 50 are using most frequently, you can start to think about
things like that on patient outcomes that are really interesting.
If you want to understand practice performance, if you want to understand where you have
certain patients that are not compliant or not as interactive, it's effectively just a control
tower to understand everything that's going on across that footprint of practices.
And then I think the area where generative AI has become more important for that is actually
kind of chat agents where people can ask questions, knowledge management systems, and really interrogating and asking questions of all the key data from all of those practices.
One of the key things that's challenging about that is, obviously, in healthcare, you have to be extremely careful about which data is stored locally at the practice versus how that's brought centrally.
And so the HIPAA-compliant cloud is one of the key components of that, actually making sure that no patient data leaves the premises of the individual practices.
And doctors are able to access certain things and then certain practice metrics are organized centrally.
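A minimal sketch of that data-flow constraint, illustrative only and not a HIPAA compliance recipe: patient-level records stay inside each practice, and only de-identified aggregates are sent to the central control tower. The field names and the cohort query are hypothetical.

```python
# Minimal sketch: patient-level data stays local; only aggregate counts go central.
# Field names and the cohort definition are assumptions for illustration.

local_practice_records = [  # never leaves the practice
    {"patient_id": "a1", "sex": "M", "age": 42, "test": "ApoB"},
    {"patient_id": "b7", "sex": "M", "age": 38, "test": "VO2max"},
    {"patient_id": "c3", "sex": "F", "age": 51, "test": "ApoB"},
]

def aggregate_for_central(records, sex: str, age_lo: int, age_hi: int) -> dict:
    """Return only counts per test for a cohort, with no identifiers attached."""
    counts: dict = {}
    for r in records:
        if r["sex"] == sex and age_lo <= r["age"] <= age_hi:
            counts[r["test"]] = counts.get(r["test"], 0) + 1
    return {"cohort": f"{sex} {age_lo}-{age_hi}", "test_counts": counts}

print(aggregate_for_central(local_practice_records, "M", 35, 50))
# -> {'cohort': 'M 35-50', 'test_counts': {'ApoB': 1, 'VO2max': 1}}
```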
I heard the coolest thing this week. It's a QA company that has invented talk to your defect. It's just the coolest concept. The defect actually has a personality, and you can ask it questions about itself, like, where did you originate? I can totally imagine what you just said in healthcare: talk to your illness. Like, you have a conversation with it. Where did you come from? How do I treat you? Are you getting better or worse if I do this thing? And it's talking back to you with a personality. It's just the coolest idea ever, isn't it?
It's one thing with the defect, but it's a little awkward when you say, here's this bacteria you're talking to, and you're like, I'm now going to kill you. The defect is real. But talk to your illness, maybe it gets a little weird. I don't know what voice you'd give it. Voldemort's voice or something. How do I kill you? How do I dispatch you?
Why dispatch you?
Dave, one thing I'd note there too is I think there's a question, and I get asked this often, like how you asked earlier, how do sectors evolve?
I think actually the question of does decisioning of actual patient care change with GenAI is a much murkier question.
I think the easier place to start, and I think where, you know, in many ways it would be very interesting, is the U.S., as an example, spends about $13,000 to $14,000 per capita on health care, right? Compared to $2,500 to $3,000 per capita in, say, Germany or Canada. Something like 30 to 40 percent of that is admin cost.
And that is not admin costs that anyone wants to bear.
And so this is something where I actually think the idea that Lifespan MD is pursuing is not to change the standard, actually, but to make the physician even more empowered, to take all of the really painful admin and scheduling and make that the part that they don't have to deal with anymore.
And AI should do a huge amount of damage in those areas.
Exactly.
What are you seeing most companies get wrong on their mission to implement AI?
Yeah, I think it's a couple different things.
I think the first one is a lack of focus on data as the starting point.
So I do think the challenge, if you just tried to build an AI agent on fragmented customer
and product data, it's going to break by definition, right?
And so I think you do have to be in a place where the data you're going to feed into
the models is clear and working.
So I think that's been one major challenge.
I think, do you think, I mean, if you had to look at companies as a whole in the medium and large size, do they have clean data? How long does it take a company to sort of get its data into a format and a level of fidelity that's useful? I mean, is this a hard lift or an easy lift?
It depends. If you take the paradigm of I'm going to put everything in a data lake and get everything right, that can take five years. The reality is most big companies have spent a half decade trying to get all their major data schemas in order.
But I think if you start with the question of like, what data do I need for this specific
use case?
Like, you know, if you take, let's take credit underwriting.
Like to do that well, you need one, you need a set of data around the credit itself, the market.
You kind of need five to six core data variables: the core financials of the business, the security of the credit, all those kinds of core information.
But you don't need every piece of data across the entire commercial bank to be right.
You need the core elements for that use case.
And so I think companies that are focused on the exact data they need to get right,
I think they've done pretty well.
But I do think trying to get all data right, like, I mean, you've also seen the enterprise for a long time,
Peter, if you asked any Fortune 1000 company to look at their full data repository and how much of it is
accurate and working and clear and accessible right now, very few companies have that.
So I do think being very tactical about what data you need.
The other thing I think for Gen.A.I. in particular is that a lot of the most important data is
non-system of record, non-structured data. So it's things like images, videos, text files.
It's just not things that people have tried to master historically. And so I think the first
step in this is saying, what is the thing I'm trying to solve? And how do I make sure I have that
data ready? Yeah, one thing I see a lot of, you know, I had a long board meeting this morning with a company that's very AI forward, a portfolio accounting company called Vestmark. And the data,
you know, for account reconciliation, for example, the data is abundant, but it doesn't tell you
what the person actually does, you know, it just tells you how it was reconciled. So now the
path to success is first the AI assistant, which helps accelerate you through the day, but it also
knows what you're actually doing. Then that accumulates. Then that becomes the RLHF or the training
or tuning data. Because what you're trying to do is like, what are you doing, guys? And that's not
really represented in the data. But a lot of times you go talk to a bank or an insurance company,
and they're like, our data is our advantage. Go ahead, bomb it into the neural net and train it.
I don't even know what that means. You know, like, I'm just going to throw whole terabytes of spreadsheet data in and see what happens? Like, that's going to go Klarna on you.
Well, you have all sorts of other issues as well.
I was talking to the CIO of one of the biggest banks in the world,
and they have 300 different customer databases.
Okay, 300, one for mortgages, one for loans, one for this.
Because the mortgage people don't want to tell the loans people
about their customer data, so they guard it jealously.
It's a total disaster for the poor CIO.
Fascinating.
Alex.
Yeah.
I think these are all very interesting points. I'd like to, if I may be so bold, jump up several levels and maybe speak a little bit more about the business model of Invisible. My understanding, correct me if I'm wrong, Matt, is there's an element of the business, I think it's called Meridial, that is sort of a marketplace for ML freelancers, if I understand correctly. And I'm curious. I think in my mind, one of the many elephants in the room in this conversation is
that we're arguably on the edge of recursive self-improvement.
All of the frontier labs more or less, I think, would agree with the assertion
that we're nearing the point where you could have an AI researcher,
where you just turn over compute resources to the AI researcher,
and the AI researcher does as good, if not a better job
than the human AI researchers who work for the frontier labs.
If that is indeed the case,
surely one of the several elephants in this room,
but given limited time, focus on this one,
is that the need for a marketplace of ML freelance researchers to train models,
doesn't that evaporate entirely as we start to reach the point where AI researchers
can build custom models off of custom data sets and custom benchmarks for each client?
Yeah, so, as you said, we have two sides of our business,
one side, Meridial, where we train all the large language models. And then on the enterprise side, we build basically custom applications for enterprises.
Look, I think there has been a five-year evolution where I think consistently folks have said at some point you will not need reinforcement learning from human feedback to validate and test models.
And I think the challenge of that logic is a couple different things.
One, the spectrum of expertise that if you take language, multimodality, extreme expertise on things like computational biology, and then the fact that,
a lot of these are reasoning tasks, you do need, and there's a whole host of studies on this
that actually pairing synthetic and human data together is stronger, but you do need human
feedback on almost every different sort of agent you want to roll out. And so I think the nature
of RLHF is changing. So I think you're moving more towards things like RL gyms, controlled
environments, simulations. I think you're starting to see things like much more of the expert
work now is PhDs, masters. So it's less what I'd call commodity, the cat-dog, cat-dog labeling. But if you say tomorrow you're going to train a model to figure out
different evolutions in 17th century French architecture in French, you are going to want
RLHF to do that to validate it. And I think that you're seeing that over and over is actually
as the models move more and more into very specific areas, there is more and more RLHF needed
for them. That's interesting. Maybe I'll share my intuition.
and then would be curious to hear what you're seeing
in your version of the ground truth.
My intuition, my impression is that we're seeing
greater and greater data efficiency in part,
I mean, our LHF was obviously sort of very fashionable
over the past three years.
Maybe it went through sort of peak fashion, if you will,
and then we saw the rise of reinforcement fine-tuning,
alternative mechanisms that maybe are far more data efficient
and maybe even more human-time efficient. If you just have to build RL environments, arguably, that's, per human hour
involved, probably a lot more time efficient than staffing out to folks in some so-called developing country to, as you say, cat-dog, cat-dog, do supervised fine-tuning or some other RLHF-type mechanism. Surely, I'm projecting, my intuition is that you'd see more data efficiency,
not less, and therefore the amount of time, effort, money expended on RLHF or any sort of, even if we buy your
assertion that we're seeing sort of hyper-specialization of lots of different tasks, and each of them
is going to need artisanal annotation. Surely there is a competing force, which is increasing
data efficiency from algorithmic efficiencies like reinforcement fine-tuning. What are you seeing?
Yeah, I mean, people have been arguing that for five years, but I think at least what I've seen on the ground is the accuracy that you want in the, if you think about a reasoning task that involves a several-step leap and you think about the risk of hallucinations, it is more useful to have human feedback involved in that in some form.
And so I don't think that that means, if you think in some ways RLHF happens after all the pre-training compute cost, it's a pretty small percentage of the total cost in training. And it is some of the most valuable feedback. And as you see more and more specific
agents being trained for specific tasks, like take legal services as an example. If you
train a new legal services data set, which is interesting, and you want to train a model off
of that, you are going to want to see some sort of comparable equivalence, whether it's
an associate or an M&A lawyer equivalent, where you actually test if it works. Now, is it possible
that at some point 10, 15 years from now you run out of things to train on? Possibly.
But actually, I mean, if you take the number of languages, modalities, robotics is probably the next frontier of this in some ways, RL gyms, contact centers.
There's a lot.
We are, as a company, fully a believer, and I could talk about it on the enterprise side too, that human in the loop is going to be a feature, not a bug, for a long, long time.
And I think the entire red herring of the enterprise, for example, is that autonomous agents will do all of this with no humans in the loop.
I actually think you're going to need more humans at every step.
Alex, you're saying that the level of intelligence
of these agents as we pass through AGI and get to ASI
is such that they'll figure it all out
as good as any human and be and replace that human in the loop.
What's your timing on that?
That was exactly my question, Peter.
So my timeline, if I had to spitball,
of course this is not the predictions episode,
so don't hold me to it.
Hold me to my predictions in the next episode.
My timelines are approximately two to three years
for, as a conservative outer bound
for some element of recursive self-improvement where we get an AI researcher that's as good, if not stronger than the human
researchers for building ML models as a conservative outer bound, not 10 to 15 years, two to three
years max. Yeah, that's the outer, outer edge. But I also believe that's totally right, that
2026 is going to be the year of recursive self-improvement and capabilities growing crazy exponentially
and corporations moving at a snail's pace compared to what they could be doing. And it's all going to be stuck
and bottlenecked and log jammed, and it's going to frustrate the hell out of Google and open
AI. And companies like Invisible are the lubricant that's going to actually get it from point A to
point B. But that Klarna use case is a really good example. Like, in our tests for contact centers,
80% of the people massively prefer the AI. But the 20% that don't like it more than torture
the whole thing to death and make it better to repeal the entire thing. There are probably eight
ways to fix that quickly.
Yeah.
But it's not going to come from Google, and it's not going to come from OpenAI, and it's
going to involve data that isn't in the natural data set, you know, and it's right now,
if you told me two years ago, that everyone in the world will know what RLHF stands for,
and there will be three people who are multi-billionaires who build RLHF companies walking
around, you'd be like, that's not even a thing.
Oh, wait, now it's not only a thing.
It's massive in scale.
There will be new terminology in 2026 for many, many of these other bottlenecks that, yeah, the AI can do it, but for whatever reason,
the bank is not doing it, the contact center is not doing it.
And those bottlenecks are going to be so lucrative for companies like Invisible to just plow them down.
I can't answer the specific question of whether your workforce is going to involve the distributed
workforce that you just described.
What was it called, Alex, or Matt?
It's called Meridial, Meridial.
Meridial, yeah.
So there is a really healthy debate on whether Meridial is a key part of this, or a network of even more agents is a key part of this.
Or is it, you know, is 2026 the transition year between the two?
It's going to be a really interesting foot race between those two different approaches.
And that's really what I'm – that really is – I think you put your finger on it, Dave.
That really is what I'm asking, which I think is a distinct question from, is there value in supervised fine-tuning or reinforcement learning with human feedback going forward?
Of course there is.
What I'm really asking is how much of that can come from AI sort of bootstrapping it in the near-term future versus needing human inputs.
And what I'm saying is think about a balance between generalizability and hyper-specificity.
And I agree with you on generalizability.
I actually don't think RLHF is important even now for that.
But where it gets more complicated is when you want to train off a specific task.
So let's take the insurance claim example that I mentioned earlier, right?
You're going to generate a 10-page insurance claim.
And you could apply this to any enterprise use case, many consumer use cases,
but in that world, you build an LLM, it's producing an outcome,
it's fine-tuned off a specific company's data,
but you need a way to actually say at that point,
does this produce a comparable output to what that claim did,
to what a human doing this task before was doing?
And so when I mentioned earlier custom benchmarks,
that is the process by which you do that, is you actually do need human equivalence testing.
You need a human to provide a comparable data set and to say, this looks good or it doesn't.
And you just don't have precedent data to train that off of in any LLM because the human input is not there.
Now, again, that's going to keep going down more and more specific tasks.
If you take legal services, take it by language, take it by topic, take it by document type.
There's human feedback required for all of that.
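A minimal sketch of what that human equivalence testing can look like in practice: an expert sees the model's output next to a human-produced reference for the same claim, blind to which is which, and records a verdict. The case IDs, output text, and storage format are hypothetical.

```python
# Minimal sketch: collect blind human judgments comparing model output vs. human output.
# Case content, rater ID, and file format are illustrative assumptions.

import json, random

def collect_judgment(case_id: str, model_output: str, human_output: str, rater: str) -> dict:
    """Show both outputs in random order and record the rater's verdict."""
    pair = [("model", model_output), ("human", human_output)]
    random.shuffle(pair)  # blind the rater to which output came from the model
    for label, text in pair:
        tag = "A" if label == pair[0][0] else "B"
        print(f"--- option {tag} ---\n{text}\n")
    verdict = input("Which is acceptable? [A/B/both/neither]: ").strip().lower()
    return {"case_id": case_id, "rater": rater, "order": [p[0] for p in pair], "verdict": verdict}

if __name__ == "__main__":
    record = collect_judgment("claim-0042",
                              "Approve claim, payout $1,250 under section 4b.",
                              "Approve claim for $1,250 per policy section 4(b).",
                              rater="senior_adjuster_01")
    with open("equivalence_judgments.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```

Aggregating these judgments over enough cases is what gives you the comparable data set the conversation keeps coming back to.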
I almost, I mean, not to put too fine a point on it, but I want to make sure that those in this episode who want to drink the bitter pill with the bitter glass of water for the bitter lesson are so drinking.
I'm curious, Matt, to understand how you see this.
Surely there's a wave of generalism coming over time, and maybe we can sort of finesse what the appropriate time scale is.
It sounds like maybe your so-called timelines are a little bit longer than perhaps mine.
But would you at least agree with the premise that over time, even the specialized skills
end up getting subsumed by generalist models?
Or do you think that's just never going to happen?
We'll always, or by always, I mean, like on time scales of 10 to 15 years, which is a pretty long time scale, we're just going to have generalist models that are always sort of like
specially fine-tuned?
I don't think all expertise, all specialized expertise, gets subsumed right now. I mean, again, if you think about a lot of that, a lot of the information
that specific experts have, there's no training data available for that. Like, it's stuff
that sits in people's head, it's experience. Like, take, I mean, again, I'm aware of many of the
narratives that human expertise becomes less important. Again, we are a company that actually
thinks the human touch elements become more, more important. But take, you know, take sales, for example,
many of the best selling patterns,
many of the people who've done that the best,
there is no information you can train off of what they do.
Human interaction, actually, in a world where,
there are 500 companies selling email-based SDRs.
I think human beings become more important in that world.
So I don't actually think specialization goes away. I actually think the shift is that expertise becomes more and more important
in many different areas.
I think human in the loop stays really important.
But I think the, I mean, if you take a,
contact center. And Alex, I understand the theory of what you're saying, but like we're
four or five years into this. And if you look at the number of U.S. contact centers
that have migrated to using agents, it's a pretty small percentage.
Can I ask you actually, the Jane Street question is really burning a hole in my pocket.
So it's really clear that stock picking is moving to AI at warp speed. And the reason is
because there are no barriers. Like you're just placing a trade that's already automated.
So it's like, and that's the other thing. And there's a great benchmark: more money. More money.
Yeah, and also almost all of the volume on public equities markets has long since been dominated by algos.
So this happened decades ago.
Yeah, well, it started with rapid trading.
So the quants were already there.
So now that it's moving to fundamental analysis, it's the same mindset.
So that's one of the reasons it's just taking off.
But, like Peter said, you know, you're making more money.
Okay, let's just keep going then.
There's nobody you're saying, but I'm going to lose my job.
It's like, no, we'll just pay you more.
Let's just go.
So it's a really interesting, you know, bellwether.
But, you know, within that world, they're struggling because the data is so proprietary.
And it's looking more and more likely that these self-improving massive foundation models are going to get to, you know, superhuman IQ this year, this year being 2026.
But the prompt window is getting massive and the recursive chain of thought reasoning is getting really, really good.
So you can actually feed it data without having to retrain it and have it achieve the job.
So if I take that mindset, you know, from Jane Street, and I move it over, now I'm a mechanic
and I'm trying to fix a car and trying to diagnose what's wrong with it, and I have audio
and I have, you know, sensor data. Great. Easy use case. But am I going to then put that data
into the LLM API and transmit it to OpenAI where they can accumulate it? And then if they decide
later they want to be a garage, they have all my data, or am I going to run some kind of a
walled-off model? And, you know, garage mechanics may be not the best example. That's why I picked Jane Street, because they're never going to take their proprietary data and give it to OpenAI.
But in the middle ground, you have banks, insurance companies, you know, hospitals,
like, how are they going to deal with this? Like, it's easy now. Like, sometime in 2026, it becomes
easy. But the data is, that's my only reason for having a competitive advantage. I don't want
to give it, like, over to the API. Yeah, look, I think you're seeing, there are definitely sectors, many of what you just named, banking, health care, where people are deciding to keep their data on premises, or they're using things like small language
models for those sorts of reasons. And I think you may continue to see that as a trend.
I think one mistake folks often make is not all data is proprietary. So you can have, you take
the Jane Street case, maybe their trading data is proprietary, but their back office kind of forecasting
data might not be. Or back office finance data might not be. And so I think one thing is being clear
about the data that you don't, that you need to keep proprietary and that you do want to put more security parameters around. And then what data you say, look, this is actually, I'm going
to be very careful as a company, but this is data that is not as proprietary. I think that's
sort of the balance. I think the whole, you know, similar to what we discussed with contact centers, the idea of I will not give anything to the LLM, I'll keep it all in-house,
I don't think that makes sense either, but I do think that's a paradigm you're seeing more and more.
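To make that data-segmentation idea concrete, here is a minimal sketch, purely illustrative rather than any vendor's actual tooling, of routing records to an on-premises model versus a hosted LLM API based on a sensitivity tag; the `call_on_prem_model` and `call_hosted_api` functions and the tag names are hypothetical placeholders.

```python
# Minimal sketch: route data by sensitivity before it ever reaches a hosted LLM API.
# The two model endpoints below are hypothetical placeholders, not real SDK calls.

SENSITIVE_TAGS = {"trading", "pii", "clinical"}  # example categories to keep in-house

def call_on_prem_model(prompt: str) -> str:
    # Placeholder for a locally hosted small language model.
    return f"[on-prem model] {prompt[:40]}..."

def call_hosted_api(prompt: str) -> str:
    # Placeholder for a hosted frontier-model API.
    return f"[hosted API] {prompt[:40]}..."

def route(record: dict) -> str:
    """Send proprietary records to the on-prem model, everything else to the hosted API."""
    if record["tag"] in SENSITIVE_TAGS:
        return call_on_prem_model(record["text"])
    return call_hosted_api(record["text"])

if __name__ == "__main__":
    records = [
        {"tag": "trading", "text": "Position-level P&L for desk 7..."},
        {"tag": "back_office", "text": "Quarterly headcount forecast by region..."},
    ]
    for r in records:
        print(route(r))
```

The design choice this illustrates is simply that the governance decision happens once, at the routing layer, rather than being left to each team calling a model.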
I want to kind of change the tack a bit, if that's okay.
I actually do agree that we'll automate,
but I think we'll automate in a way that's different from this discussion.
So let me give an example.
Let's say I'm Canon printers and I'm selling home printers, right?
Right now I have a bunch of people doing marketing and content development,
brand management, then salespeople to sell to the best buys and so on,
online folks. Then you have post-purchase, getting the customer to try and register the
dang printer. And then you've got all the repair support, technical staff. And then you've got
your accounting folks in the company. You get the idea, right? So you've got pockets of
people doing different functions across the board. If I was going to build an AI-native printer sales company, then I might think about having all of those things automated completely with AI. And then you're not human-centric, but you're function-centric across those.
The printer could report when it's, you know, running out of ink and you ship it a new thing.
It tells you when there's a problem with it or there's a problem coming up.
You alert your repair staff saying, hey, this guy, maybe we can upsell them on printer,
da-da-da-da-da. And you essentially automate all the functionality with AI,
and you leave the human 90% out of the loop almost completely because you've automated the core
functionality. And right now what I'm seeing is what I used to call radio over TV, right? When you first
had television, we took radio announcers, put them on TV to read radio scripts. We didn't adapt for
the medium. And I think what I'm seeing right now is we're automating right now what the human
being is doing at each of those functions. But surely over time, we're going to automate the functional
flow and then get rid of the human beings completely. AI-native, AI-first, right? Not to mention
getting rid of the printers. Well, that's a separate question.
I'm just using that as an example. Who's going to be doing any of the producing,
Let's leave that part aside just for the moment.
I think you're absolutely right, Salim.
I mean, this is where a young AI native company reimagines an entire field and has zero legacy
and zero friction in coming forward.
The question, as Matt said in the beginning, is do they have the distribution, right?
But this is where a large company, Canon in this case, should actually be investing in entrepreneurs. I mean, one of the things that you and I talk about a lot of times is
if I'm a large company and I don't know what to do, I would basically hold a competition,
ask young AI entrepreneurs around the world to come forward: how would you disrupt my company?
You know, give me a pitch. And then I would pick the best five of them and I would fund them.
And I would say, you know, we're going to fund you to disrupt us. And then, you know,
we're going to give you access to our data to everything we have. And then ultimately, we're
going to buy you or buy a majority stake in you, and we're going to make you our new company,
right? This is the innovation on the edge, the displacement of the core, et cetera, however you want to call it. This episode is brought to you by Blitzy, autonomous software development with
infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to
understand enterprise-scale code bases with millions of lines of code. Engineers kick off every development sprint with the Blitzy platform, bringing in their development requirements.
The Blitzy platform provides a plan, then generates and pre-compiles code for each task.
Blitzy delivers 80% or more of the development work autonomously,
while providing a guide for the final 20% of human development work required to complete the sprint.
Enterprises are achieving a 5x engineering velocity increase
when incorporating Blitzy as their pre-IDE development tool,
pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org.
Ready to 5X your engineering velocity? Visit blitzy.com to schedule a demo and start building with Blitzy today.
You're a medium-sized or a large-sized company. I'm not going to focus on the startup right now.
And what do you do in 2026? Because you're going to have to do something. You're going to have pressure from your board, from your shareholders, from, you know, just competition.
So you got to do something.
And what I've heard from you so far, Matt, is number one, you've got to get clean data.
You need to make sure you understand what your data situation is.
Number two, you should pick two or three, if you would, areas, call them benchmarks, that you're going to run experiments on. And it's not a proposal. It's not, you know, an idea. You actually run it, actually do it, you know, run an experiment to see how it works. What else? And then pour scale, pour money on the things
that do work and then have an expanding sort of increasing circumference around the company's
major revenue engines. How do you think about that? Walk us through a few more steps.
Yeah. So I think one of the things, which has been a lot of the topic of conversation here,
was given all the improvements in the models, given, you know, what Salim is walking through on the
potential to clean sheet and design a company from scratch, there was this MIT report that came out
that 5% of enterprise models right now make it into production, right? So I think there's a starting
question of given all this tech excitement, why has that been so much harder? And it's not the
technical challenges. We talked about it's the data, it's the focus on which priorities to look at. I think the other two big ones, though, are the organizational structure by which you pursue those initiatives. And particularly, the advice I give everyone is do not locate this in your technology
organization. Take your best operator, your best ops person, give them an operational KPI and track
it to that. And make sure it's a really clear operational KPI. So we talked a bunch about contact centers. You should have an operational person there lead it around, you know, CSAT score, time per call, whatever the core metrics are that you're looking at. And that should be your guide. If you want to take something like inventory forecasting, you should do it around inventory days, stockouts, all those
kind of metrics. But I think if you have a clear sense of which operational person is leading it
and how they're marshalling resources around it and you have a clear KPI, you're going to make
progress if you focus on a couple different things. I think the failure mode on that has been
you let a thousand flowers bloom, none of them have an operational metric and you kind of end up
with a science project dynamic. Yes, exactly. That's exactly right. If you walk in, a thousand flowers bloom, you walk in and you say, I am going to give you a million genius-level people for free, do something, it fails.
Yeah, I mean, it's like, here's a million people for free, and they're all geniuses.
And it fails for that same reason.
It's like, I didn't think of an idea, so I said a thousand flowers just go bloom.
I couldn't think of anything, so maybe you will.
Like, how's that going to work?
I've seen that, you're exactly right.
It's just so sad, you know.
We go even further.
We basically say not just take the operator, but put them outside the organization and let them build something from scratch at the edge, because otherwise you get encumbered by all the internal rules and bureaucracies and that slows things down a huge amount.
Then it fails for legacy reasons.
The Lockheed Skunkworks, it's the Apple MacBook team.
Yeah.
I mean, look, Apple is actually a master of this.
If you think of what Apple will do: it will form a small team that's very disruptive, they will put them at the edge of the company, they'll keep them
secret and stealth, and they'll say to them, go disrupt another industry, right? Whether there's
watches or retail or whatever. At last count, I think they have 18 teams looking at different industries, and when they think it's ready to disrupt, they go into it and they
patiently iterate, right? The Apple Watch, for example. So this is the model I think we're going to
see many other companies take on where you do this and you, because if you think of any operational
company, the insights they have on all sorts of adjacent industries is incredible. Very hard to
disrupt in their own industry because they're probably pretty optimized for it unless you come
with the AI startup. But they can really disrupt a lot of the edge cases, a lot of the industries
around them. So I expect them to launch AI-native startups that go into adjacent industries and go attack some of their neighbors. Nice. Matt, before we get to a few of your 2026
predictions, can you just share a couple more of the use cases here just because they're
fun. So we worked with SEIC, Vanter, and the U.S. Navy on building intelligence for underwater
drone swarms, for unmanned underwater vehicles. So think of that as: if you have a series of drones
and you have enormous numbers of sensors on each of those drones and you need to understand
the movement patterns of those different drones. And in each case, you see a different, you know,
you see an object underwater. What do you do? Do you engage? Do you step back? Do you move with other
drones? That whole movement pattern and decisioning for underwater unmanned vehicles, that's what we worked on: fine-tuning a model to do that, training it,
looking at all the movement pattern data. And again, this is one of those interesting
things about drones is they are autonomous. And so thinking about how those movement patterns
evolve in complex environments is very hard to do, but you also have lots and lots of interesting
sensor data to do that. I think one that maybe anchors more on the human decisioning side is
SwissGear, so like Swiss Army, the luggage brand. Similarly, and I actually think this is, Peter, one that a lot of folks in the audience
may relate to in some form, which is, you know, they had an enormous mix of different
data tables around products, customers, et cetera.
They couldn't really bring them together for inventory forecasting. And so we used our data platform, Neuron, to bring together 750 tables really quickly and then
optimize the forecasting to look at both minimizing stockouts and optimizing which
inventory to hold. If you get inventory forecasting right, and it's probably one of the major issues for most big and small businesses, you minimize lost revenue and you make sure that you don't hold lots of excess inventory. It's one of the hardest things to do, particularly if you've got a six-to-eight-month order cycle time. And so that was something we partnered with them on, and I think it was a great outcome. We ended up expanding their overall inventory coverage by about 30% and basically 2x'd the number of SKUs with reliable predictions. And again, that was done in about a couple of months.
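As a rough illustration of the stockout-versus-holding trade-off described here, and not the actual Neuron pipeline, a classic reorder-point calculation with safety stock might look something like this; the demand numbers, lead time, and service level below are made up.

```python
# Minimal sketch of a reorder-point / safety-stock calculation for one SKU.
# Illustrative only; real inventory forecasting joins many tables and richer models.
import statistics

def reorder_point(daily_demand: list, lead_time_days: float, z_score: float = 1.65) -> float:
    """Reorder point = expected demand over lead time + safety stock.

    A z_score of ~1.65 corresponds to roughly a 95% service level under a normal
    demand assumption; raising it trades more holding cost for fewer stockouts.
    """
    mean_d = statistics.mean(daily_demand)
    std_d = statistics.stdev(daily_demand)
    expected_lead_time_demand = mean_d * lead_time_days
    safety_stock = z_score * std_d * (lead_time_days ** 0.5)
    return expected_lead_time_demand + safety_stock

if __name__ == "__main__":
    demand_history = [12, 9, 15, 11, 14, 10, 13, 16, 8, 12]  # made-up units per day
    print(round(reorder_point(demand_history, lead_time_days=180), 1))  # ~6-month cycle
```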
All right.
So later this week, my Moonshot Mates and I are recording our 2026 predictions.
We'll have Emad back and we'll be talking. Each of us will provide two predictions for 2026. We'll have our top 10 from the Moonshots podcast.
It's going to be fun.
It's going to be a battle.
We're going to ask our listeners to vote on which predictions they like best.
I mean, of course, they're all going to vote for Alex's, but hey, Matt, talk to us about what you see coming in 2026.
Yeah, I think I'll call out a couple, and we've just done a bunch of research on our 2026 predictions. So I won't say all of them, but I'll call out a couple.
I think one of the first ones I would anchor on is multi-agent teams.
So I think one of the challenges, and it's inherent in a lot of what we discussed here, is if you're a large enterprise or a medium-sized company,
implementing a use case, you won't necessarily have one decisioning agent that does everything.
You'll train task-specific agents for individual tasks, usually orchestrated by an LLM.
And what that allows you to do is to pinpoint the accuracy on those specific tasks and then
use the broader logic set of the LLM to make sure they all work together properly.
And I think that's been an architecture that's been discussed pretty broadly for a while.
But I think that we're just starting to see the green shoots of more and more folks having
success with that, contact centers being a good example. So I think that's a big one that I would call out.
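As a toy sketch of that multi-agent architecture, with hypothetical task-specific agent functions and a simple keyword stand-in for the orchestrating LLM's routing decision, it might look something like this.

```python
# Toy sketch of an orchestrator dispatching to task-specific agents.
# All functions are hypothetical placeholders, not any specific vendor's framework.

def billing_agent(query: str) -> str:
    return f"Billing agent resolved: {query}"

def shipping_agent(query: str) -> str:
    return f"Shipping agent resolved: {query}"

def fallback_agent(query: str) -> str:
    return f"Escalated to a human: {query}"

AGENTS = {"billing": billing_agent, "shipping": shipping_agent}

def orchestrate(query: str) -> str:
    """Stand-in for an LLM router: pick the narrow agent whose task matches the query."""
    text = query.lower()
    intent = "billing" if "refund" in text else "shipping" if "package" in text else "other"
    handler = AGENTS.get(intent, fallback_agent)
    return handler(query)

if __name__ == "__main__":
    print(orchestrate("Where is my package?"))
    print(orchestrate("I need a refund on last month's invoice."))
    print(orchestrate("Something else entirely."))
```

The point of the pattern is that each narrow agent can be evaluated and tuned on its own task, while the router only has to decide which one to call.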
The second one I'll call out is the multimodal leap. I think more and more, video, images,
and audio are going to become a bigger and bigger part of how people engage with these models. I think audio is probably one of the most interesting. And so I do think the way you'll be able to speak to them, interact with them, visualize them is going to be a really interesting moment for 2026. And I don't think that will all be text-based like it has been, predominantly, historically.
And then maybe one other.
Yeah, please.
No, I was going to ask Alex for feedback.
Go ahead, but finish up, Matt.
Yeah, so the third one I'll call out, because we've talked about it a bit on this episode, is kind of what we call either the mirror world or RL gyms.
So I don't actually think that's a well-understood concept for many folks in the audience.
But think of that as actually creating simulated environments, or digital twins, for tasks you might want to test, right?
So maybe that's a coding environment.
Maybe that's a contact center as we've used that a couple times.
But it allows you to actually simulate a series of function calls, tasks, or environments, so that if you're going to train a model on a task, you can actually test how it's going to work, like in a manufacturing environment, before you roll it out to your actual physical world.
And I think that's, more and more, a very interesting topic we're seeing in both model builders and the enterprise.
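As a minimal sketch of that idea, assuming a made-up toy environment rather than any real RL gym product, a simulation loop for dry-running a policy before deployment might look like this.

```python
# Minimal gym-style loop: evaluate a policy against a simulated environment
# before it ever touches the real system. Everything here is an illustrative toy.
import random

class SimulatedQueueEnv:
    """Toy digital twin of a contact-center queue: state is the backlog size."""
    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)
        self.backlog = 10

    def step(self, action: str):
        arrivals = self.rng.randint(0, 3)
        resolved = 2 if action == "automate" else 1
        self.backlog = max(0, self.backlog + arrivals - resolved)
        reward = -self.backlog  # fewer waiting customers is better
        return self.backlog, reward

def policy(backlog: int) -> str:
    # Stand-in decision rule for the agent being tested.
    return "automate" if backlog > 5 else "human_review"

if __name__ == "__main__":
    env = SimulatedQueueEnv()
    total = 0.0
    for _ in range(50):  # dry-run the policy in simulation before any real rollout
        obs, reward = env.step(policy(env.backlog))
        total += reward
    print(f"Simulated cumulative reward: {total}")
```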
I want to go around to the mates one second,
maybe ask some final questions of Matt.
Alex, you want to kick us off?
Yeah, I think the most interesting crux of what we're discussing here
is what is the future of human expertise.
For that matter, does human expertise have a future?
And assuming it does, what's the half-life of the...
It's cooked.
What's the half-life of the value of human expertise?
And so to put that in a question for Matt: of all of the forms of human expertise, of all of the labor categories and job roles that exist in the economy today, what do you think will be the last three of those job roles or forms of expertise that will disappear or ultimately succumb to AI? What are the last three to survive, left standing? Okay, that's right. I mean, I'll go back to where I started the
episode I think a lot of the commentary on mass shifts ignores the actual
function of jobs in society today. So if I go, let's go, let's take sectors, for example,
oil and gas, a lot of the functional expertise, you know, geoseismic, if you look at seismic engineers, people on oil and gas sites drilling, that is a human function. Like, you do need it. So I think real estate is another example, like, you know, humans actually help select, and you can go down a whole list of different areas. I think there are sectors where you're going to see more disruption near term. I called out a couple of them: BPOs, legal services. I think
media is a fast-changing area. But I'm also not exactly sure that those
lead to negative, meaning have negative employment consequences. Like, if you take media,
it's a really interesting one. You know, five, six years ago, eight years ago, I think media
as a category really struggled in a lot of ways, for paid media as an example, right? And you've actually now seen in the last couple of years, post-LLM era, Substack, Medium, all
these blogs become much more interesting. You have way more media entrepreneurs. And so you've
changed the function of society and like where the money is coming from changes, but it has not
changed total employment. And, you know, look, I understand a lot of skepticism that says that, you know,
AI is going to radically change everything. But I think if you look at the American society for the last
hundred years, it's something like 25% of every high school class goes into a field that did not
exist when they were in high school. And the reason that persists is people go into the working
world, understanding the tools they have, thinking about what they can create from that.
And, you know, one of my favorite statistics the Wall Street Journal reported, like, a couple weeks ago is 20% of U.S. employment right now is digital ecosystem jobs.
And something like 9% of U.S. citizens are full-time social media influencers, which is mind-boggling to me.
But, you know, again, this is the changing nature of work.
And so I think that pattern will persist.
I think that the core of what will change is the process of looking up information across
multiple systems and documents, that is going to become less valuable.
But I think all the jobs that involve human interaction, physical work, physical presence, stay.
I actually think one of the most interesting things over the next couple years is the
job ecosystem around data centers, electricians, et cetera, is going to become way more in demand.
I was actually sitting at a panel.
Oh, yeah.
I was on a panel with someone who runs a recruiting company. They were saying that job profile, they think, will 2, 3, 4x over the next couple
years.
And so that will have pretty interesting implications for the education system of everything else.
But I don't, I think we will see an evolution.
I mean the humanoid robot electrician and plumber.
Alex, very quickly, what are your three last standing human roles here?
It's interesting.
I'll present multiple competing hypotheses.
So, hypothesis one.
Briefly.
Briefly.
One hypothesis is it's the politician because they help to make the laws.
Another hypothesis is that it's the greatest intellects, the physicists or mathematicians,
even though, as we talk about on the pod, math and the sciences are all getting solved, on the one hand. There's still, perhaps, to the extent that that represents the culmination of human intellectual accomplishment, maybe the greatest intellects will be the last to be automated.
There's another school of thought that says, no, it's the roles that involve the greatest
need for human authenticity, because even though it's not actually a capabilities question,
people nonetheless demand human contact or something to that effect.
And so it's going to be the highest-touch job roles, where people just want to know that there's a human counterparty on the other side of the interaction.
So that's a set of three hypotheses.
Tastemakers will dominate.
That's authenticity.
That's what Mike Saylor said, word for word, actually.
Yeah, on his boat, we had that enjoyable sunset conversation.
Thank you, Alex.
Salim, do you want to go next on a closing question for Matt?
I think you covered some of it on the industries that AI is kind of going after.
You guys have done some government work.
Where in government functionality do you see the biggest opportunity for AI automation, efficiency, et cetera?
Yeah.
Everywhere.
Look, I actually think this could be one of the, I think this could be one of the really positive trends for society.
So I saw a study recently that AI-assisted permitting could cut energy and data center project implementation timelines
by 50%. Think about housing, like one of the biggest challenges right now for housing development
in the U.S. is NIMBY regulations and how complex it is to build housing because of the myriad
of different regulations and zoning contracts by location, right? Or even take the OECD, which came out with this report that AI could shrink public sector process cycle timelines
by 70% on licensing, benefits, approvals, compliance,
and basically accelerating infrastructure deployment.
So to me, the simplest thing that AI can do, project management and timelines related to all spending and infrastructure deployments, is a really positive thing for society in my mind.
Amazing. Good question, Salim.
Dave, why don't you close us out in the questions here?
Oh, I got so many, but I'll pick the best.
First, Matt, how many hours of video footage
will there be of you one year from today compared to one year ago?
Because I know we saw each other in Riyadh a few weeks ago.
And I know that you are the thought leader in this whole bottleneck of AI getting into the enterprise.
It feels like what we're doing right now.
You know, the footage of you that's out there right now is all this bullshit CNBC, Bloomberg-type, you know, five-minute format.
But here we're getting your real thoughts.
It's just so much better.
But how many hours can we count on a year from today?
Well, look, I think as of 12 months ago, I had done almost no interviews of any kind, so this job has been fun in that front.
And look, what I enjoy about the podcast format is it does allow you to talk about some of the more complex topics.
And so, you know, particularly a podcast like this, that's really interesting.
So hopefully many more in the year to come.
Well, I'm hoping for at least a 10x on that.
And then my follow-up question to that is the avatar version of you that's also out there talking.
Is that a 2026 thing you think or when?
Yeah, that probably happens in 2026. I don't think it would be that hard to train an avatar off of my public statements. So, you know, I think that'll be an interesting one.
We are actually working in the sports space, actually, on the topic of avatar training.
And I think it is actually an interesting space where you could imagine a lot of different areas where rather than a chatbot interaction,
people want to speak to people they know via an avatar.
I actually think that will become a more natural part of society and a pretty interesting one, actually.
I totally agree.
I just, the timeline could be, you know, as soon as two months as far as I'm concerned.
What makes you think it's not an avatar we're speaking to right now, Dave?
That's a good question.
Matt seems very human, actually.
I don't know.
The best ones are.
But the green orbs behind you kind of give it away.
Yeah, they're pretty strange.
That's not real.
Matt, where do people find you?
Where do people find Invisible?
Who should go to Invisible to check out what you do and how you do it?
Sure.
So we have seven offices now, New York, San Francisco, Austin, Texas, D.C., London, Poland, and Paris.
I'm the easiest to find probably. We have an office right off of Union Square, which is where I'm at least half the time when I'm not on the road.
And look, I think in terms of who should come to us, from the listener base in particular, any midcap or enterprise company that knows there is potential in their business, that knows that AI can transform it in a positive way,
and is struggling to bring all the pieces together.
I think the main thing I would say is there is no doubt, you know, everything Alex is asking,
the technology has made an enormous step change over the last couple of years.
The hard thing is actually the change management,
the operationalization, the metric tracking, the evaluation.
It's kind of bringing it all together. Like, you know, I think it's the difference, our founder Francis has an idea of: you have all the components to build a cake, but you don't have a cake. Like, what we do is, again, we actually bake the cake.
We build you something that works.
We make AI work, and we use all the modern tools to do that. Amazing. And the website, uh, invisible.tech? All right, thank you, Matt. Salim, Dave, uh, AWG, I'm going to see you guys in a couple of days for our 2026 predictions. Make them brilliant. It's going to be fun. All right. I want a benchmark for tracking benchmarks. That's yours, all right?
All right, no, that's not the one I'm going to talk about. Okay. All right, guys, have a great day. Every week, my team and I study the top 10 technology metatrends that will transform industries over the decade ahead. I cover trends ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity, and more. There's no fluff, only the most important stuff that matters, that impacts our lives, our companies, and our careers. If you want me to share these metatrends with you, I write a newsletter twice a week, sending it out as a short two-minute read via email. And if you want to discover the most important metatrends 10 years before anyone else, this report is for you. Readers include founders and CEOs from the world's most disruptive companies and entrepreneurs building the world's most disruptive tech. It's not for you if you don't want to be informed about what's coming, why it matters, and how you can benefit from it. To subscribe for free, go to diamandis.com slash metatrends to gain access to the trends 10 years before anyone else. All right, now back to this episode.
