Orchestrate all the Things - This is where you sign up for an open-source AI stack for the future. Featuring AI Infrastructure Alliance Lead Dan Jeffries
Episode Date: April 27, 2021

Open-source stacks enabled software to eat the world. Some of the most innovative companies in the world are working on building an open-source stack for AI. Dan Jeffries was there when the LAMP stack enabled software to eat the world. Perhaps you don't know, or remember, what the LAMP stack is, but it's actually pretty important. LAMP is an acronym made out of the initials of key open-source technologies used in software development - Linux, Apache, MySQL, and PHP. These technologies were hotly debated back in the day. Today, they are so successful that the LAMP stack has become ubiquitous, invisible, and boring. AI, on the other hand, is a hot topic today. Just like the LAMP stack turned software development into a commodity and made it a bit boring (especially if you're not a professional software engineer), an AI stack should turn AI into a commodity - and make it a bit boring, except maybe for data engineers. This is what Dan Jeffries is out to do with the AI Infrastructure Alliance (AIIA). Article published on VentureBeat.
Transcript
Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together.
Open source stacks enable software to eat the world. Some of the most innovative companies in the world are working on building an open source stack for AI.
Dan Jeffries was there when the LAMP stack enabled software to eat the world. Perhaps you don't know or remember what the LAMP stack is but it's actually pretty important. LAMP is an acronym made
out of the initials of key open-source technologies used in software
development Linux, Apache, MySQL and PHP. These technologies were hotly debated
back in the day. Today they're so successful that the LAMP stack has
become ubiquitous, invisible and boring. AI, on the other hand, is a hot topic today. Just like the LAMP stack turned software development
into a commodity and made it a bit boring if you're not a professional software engineer,
an AI stack should turn AI into a commodity and make it a bit boring, except maybe for data
engineers. This is what Dan Jeffries is out to do with the AI Infrastructure Alliance.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
Yeah, I've been in technology for about two decades.
I had an IT consulting company that did a lot of big Linux web farms
and a lot of Microsoft back office kinds of things, big Exchange servers, AD, those types of things.
I went to Red Hat after selling my company to my partner, and I was there for a decade during the growth there.
It was about 1,500 people or so when I started, so I got to see it all the way up until the IBM acquisition. And that gave me a very strong sort of open source bent. And I got to see kind of the
sea change of proprietary software to open source software. And after that, I really
got super fascinated by machine learning. I kicked off a 50-page manifesto inside of Red Hat four or
five years ago. And I expected to find dozens of people
sort of beavering away on the infrastructure problem there. And it turns out nobody was
working on it. So I started all these working groups, and I socialized it up to the CTO.
But at some point in time, I felt it was necessary to kind of go back into the startup space,
because I felt like that's where all the energy was flowing. And that's where I ended up at
Pachyderm, which is a machine learning startup focused on data lineage and data versioning.
And I love being kind of back at the energy of a smaller company
because I feel like this space is very energetic at this point.
And you've got to be where the action is if you want to make a difference.
Yeah, it certainly feels that way to me at least. And I have to say that I think you went public with the AI Alliance announcement not very far back.
It feels like a few weeks or a couple of months tops, if I'm not mistaken.
And so when I went and checked the names of organizations that were involved,
I saw lots of names that are familiar to me.
And actually, coincidentally, I just covered the Series B funding round from ML Ops,
from, excuse me, OctoML today.
Okay, cool.
Yeah.
It seems like I kind of naturally gravitated towards that space as well.
So it's kind of a convergence of things I'm interested in as well. So open source and
machine learning and MLOps, all of those things. So, okay, let's see where to start with the
AI Alliance. So whose idea was it, basically? How did it come to be? How did all of you,
you know, get together and decide that this is a good idea? I mean, it was my idea, but it was,
the more I started talking about it with lots of people in the industry, the more I realized that
there was just a need for it. And actually I ended up being very surprised. I expected there would
already kind of be an organization that was thinking about how to capture all this activation energy, bring all these different companies together, get them talking to each other, get their integrations teams talking.
And really, nobody was doing it yet.
And that was very surprising to me as I started to dig into it.
And every founder that I ended up talking to, every engineer I ended up talking to is very
excited. I really didn't have to work very hard to kind of get people interested in the concept.
They understood it intuitively and they realized that all of the smaller, all the innovation is
coming from these small to mid-sized companies, right? That are getting funded now, right? And
they're up against these giant vertically
integrated players, like SageMaker from Amazon. But I don't think any of the innovation is coming
from that space, right? I think it used to be, and I saw this movement at Red Hat, that the
proprietary software companies would come up with all the ideas, and then open source would copy it
in kind of an okay sort of way. But then over time,
most of the innovation started to flow to the open source and it started to
flow to the smaller companies projects.
It started flowing to people working together,
even kind of frenemies working together to build something as a collective
that they could all benefit from.
And the proprietary vendors started to fall behind.
So I think that dynamic has held up to this point in time.
So I think SageMaker and those kinds of vertically integrated products,
they take the innovation from the smaller space and they copy it.
But it's not very innovative.
And the innovation is going to come from a bunch of these companies
working together as like little Lego pieces that we stack together.
I firmly
expect us to have like a LAMP stack or a MEAN stack of AI/ML in the next five to 10 years.
And it'll change over time, right? You went from LAMP to MEAN to whatever the framework is
nowadays; the same kind of flow will happen in the machine learning space. But I don't believe
the hype that anyone has the end-to-end machine learning system at this point.
We can't because it's just moving too fast.
The space itself and the problems we need to solve are in motion at the same time as the software is being created.
So it's going to take a number of years for this to shake out.
But the innovation is definitely going to come from small to mid-sized companies and projects.
Yeah, I mean, in principle, I totally share that notion.
I mean, about innovation coming mostly from the open source space
and from companies that are not, you know, behemoths, basically.
They serve a different purpose, in my opinion,
which is, you know, basically taking that innovation
and deploying that at scale and making it accessible and
deployable across different hardware and so on. But that's a different part of the ecosystem,
I would say. So seeing your kind of manifesto or mission statement, it became quite clear to me
that, well, actually it didn't become all that clear. And that's going to
be my question to you. So I kind of had the feeling that, okay, so maybe what you're aiming for
is what you alluded to earlier. So to actually build a kind of stack or a standard, because you
also mentioned things like interoperability and API bridges and this kind of thing.
So I'm wondering really if you're actually aiming for what you said, like a LAMP stack.
And if you are, what's the road to get there?
Because you mentioned earlier that there was a lot of enthusiasm and your idea seemed well
received by a lot of people,
but there's a distance between saying,
hey, okay, that sounds like a good idea, sign me up,
and actually coming up with a roadmap and a plan to make it happen.
So I'm wondering how are you going to walk the walk?
So there's a lot.
I tend to think of myself, in George R.R. Martin's words, as a gardener instead of a planner, right? So you're a writer, so you know that there are two types of writers, right? There's the ones that plan every single thing down to the finest detail before they commit a single word to paper, and then it's just like executing on that. And then there's the gardeners who plant a lot of things and see what comes up and start
to sculpt those different areas.
I'm very much a gardener.
And I see a lot of the folks who I work with in the organization as being gardeners as
well.
And I think it's necessary at this phase.
I see the Alliance evolving over the course of time, right?
And its mission will shift over time.
First of all, I don't think it's possible to just automatically come out with a LAMP stack for AI,
again, because it's still in motion.
So my goal at this point,
and I think a lot of the other folks' goals,
is to get everyone talking,
get the integrations team talking, engineers talking,
get them talking about the problems that are there.
I actually thought that we probably wouldn't end up
hosting any projects as we form into a foundation. We're in deep talks now with the Linux
Foundation to potentially roll under their organization, just like the cloud native folks did,
or to form our own 501(c)(6). But when I look at it, take SFL Scientific, which just joined,
and they have 35 machine learning engineers and some, you know, tremendous customers.
They said if you come up with a stack right now or an architecture right now,
it's going to change in six months. And I think that's legitimate.
So I think right now the idea is really to get the folks talking at a lot of
different levels rather than working in their own little silos. And to me,
this kind of mirrors the beauty of an open source development model,
right? I also agree that the proprietary big vertically integrated systems serve their own
purpose. And they're always going to make money. But the difference is Kubernetes and Docker don't
become Kubernetes and Docker if they only run on Google. And so what you see is all these
proprietary individual siloed versions come out over time. And then
somebody comes up with a standard that actually starts to transcend and work across these
different things. And they all end up retroactively adopting it, right? So I still remember when VMware
didn't love Kubernetes. But if you look at their marketing literature now, it's like the great,
you know, the great leader, we've always loved Kubernetes, right? And so my thinking is the key is just getting everyone talking.
And we just brought on, for instance, QuantumBlack, which is a great solutions integrator.
They were purchased by McKinsey.
They've done a ton of Formula One work.
They're working on a technical council in the same way.
And they're bringing together banks and all the infrastructure
folks and a bunch of people who are thinking about it because they've already seen the tools change
just in their own lifetime. They started in 2008 working everything in MATLAB and now they've seen
three iterations of tools. They understand that the same issue applies: it's just getting different
people to the table to start talking about these things and forming it. I agree with you, though, it's an incredibly challenging problem, right? And I don't expect it to be
solved overnight. But the key is getting everyone in a room and starting to think about how to
interoperate. And what I'm thinking more about now is micro-alliances within the group. Okay, not everyone
is going to integrate with everyone in there. If you look at 30 logos on the website, you're not
going to build a machine learning stack with all 30 of those things. There's five monitoring
solutions on there. There's six or seven pipelines. There are people who are doing
similar things. We allowed competition within the space, but you'll probably integrate with
three or four of those things. And I'm all for Darwinism in the space. Let the different
companies work on who they think they should partner with and build integration,
build joint examples, build these things.
We're already seeing that happen.
Algorithmia has already built one with five or 10 people.
Pachyderm did that with Seldon.
And Neuro just did that with us and Seldon and a few other places too.
So that's what I'm hoping to foster at this point.
And then that eventually will lead
to an overarching architecture and kind of ways of talking about it in general. And we just brought
Canonical on as well. They're already looking at how do I build kind of a deployment framework for
this thing that's general purpose. So it happens in fits and starts. Little pieces of the puzzle
get solved. And then eventually you get a glue project that kind of weaves it all together, I think, over time.
And that's, I think, the eventual place
that we want to get to.
But it's going to take a few years to get there.
It's not going to be solved overnight.
Yeah, it sounds reasonable, actually.
And yeah, I would be a bit skeptical if you said,
oh, you know, you have everything figured out.
And, you know, that's the plan.
Give us, I don't know, a year or two and you'll see the standard emerge.
What you're describing sounds much more real, actually, therefore credible, I would say.
My question then would be, like I said, I mean, I'm totally with you on the community building, basically,
because it sounds like this is what you're mostly doing
at this point.
But then again, it only kind of transposes the question
because, again, community building is a very,
as I'm sure you know, having been involved in open source
for as long as you have been,
community building is actually very, very time-consuming
and energy-consuming, and it needs to be, you know,
deliberate, and you need to have things like, I don't know, rules, basically, for how people interact, and, you know, what's
allowed and what's not, how do you grow the community in an equitable way, and so on and so
forth. And since you're all basically, you know, running your own startups and being very creative,
and we all know how startups are, how are
you even going to find the time to do that? Well, I happen to have a lot of free time, in terms of
my ability to dedicate to this at this point. I wouldn't call it, let me not call it free
time, because then, you know, Pachyderm might think, oh my God, am I doing anything? But the truth is
they've been very supportive of the concept from the very
beginning. And I've reached a point in my career now where I have a lot of kind of leeway to
run with my ideas. And the way I think about it a lot is that when I was
younger and in the internet phase,
I saw how important the internet was, but I was just a kid.
I could only really surf the wave.
So I went to work in the dot-com boom and in Linux when everybody thought that was crazy.
But really, I had to go where the innovators were and just kind of hope that I was right.
At this point, I've seen enough of the changes over time that I feel like I can influence it.
And I feel like I can bring together other folks who are smart and powerful and thinking clearly and visionaries
to work together on these things. So that gives me a good chunk of time to kind of
dedicate to taking over kind of the directorship of this stuff. And by turning it into a foundation
or going into the Linux Foundation, we get to the point where we can start to add a good chunk of
revenue into the equation. And then you get people who are just firmly focused on it and it becomes a balance of volunteer efforts, right? And of people actually paid to
work on different aspects of it. So I think it's feasible and you're right. It's just going to take
a ton of time and effort. Luckily, I'm in a Goldilocks position where I do have the time
to build the community, to talk to people, and I really enjoy doing it.
And I'm hopeful at this point in my life
that now I get to shape a little of the events
of what I think is probably the most important technology
that we've ever invented in the history of man.
When I look at artificial intelligence at this point,
I think very few people understand
just how important
it's going to be. And I think they have an inkling of it, but it's usually a fear-based kind of
thing, right? And they don't understand fully that in the future, there's two kinds of jobs,
one done by artificial intelligence and one assisted by artificial intelligence, right?
No doctor is going to just have a technician who doesn't have an AI
assistant looking at that cancer screening. You know, no material scientist is going to just dream up the next iteration of a
material. They're going to have 50 different versions sourced.
No artist is going to play two bars without the artificial intelligence
essentially creating 20 more bars
and going, hey, you know what? I like riff number three. Play me more iterations of that, right?
It's going to be a co-creative process. And from my standpoint, I just want to be a part of building
that infrastructure layer so that people can move up the stack and do the more interesting things.
It's that you don't get to WhatsApp, with 35 engineers reaching 400 million people and getting that kind of an audience, until all those protocols are built.
They didn't have to build a GUI and end-to-end encryption and a peer-to-peer messaging protocol.
They could build from those pieces and do something more interesting.
And that's where I think that we have to get to.
And that's what I hope to influence.
Okay. I was going to ask you about like,
okay, so what's the next milestone for AI Alliance? And I still want to ask you that, but you gave me a few lines that are just too good to pass up on. So you said two types of jobs,
like one job that is done by AI and another type of job that is assisted by AI. I would say maybe it's three types
of jobs, actually, because, well, who's going to build that AI? And to me that's a different type of job,
don't you think? Well, I think the people building the AI are going to be assisted by artificial
intelligence, so I think they fall into category two. We're already seeing that now, right? In other
words, with most of the artificial intelligence, yes, there's a creative aspect where the person has to dream up an algorithm that, you know,
approximates thinking or reasoning in some type of way, right? Or just a pattern matching algorithm.
But think about something like hyperparameter tuning. In the beginning,
it was all people just pushing and pulling the dials, if you will. But now you have whole search algorithms to go through and iterate on what the best hyperparameters or
the best architecture is going to be. So it's already gotten to the point where AI is assisting
in the creation of AI. And I think that's only going to continue. I actually
had this conversation with my, with my partner earlier today, where I said, every time we see
like a self-driving car crash in
the news, we go, oh my gosh, we can't, we can't allow this to happen. And I go, listen, people,
unfortunately, 1.2 million people die on the road every year through their own, you know,
lack of skills. It's very difficult to drive, right? It's very difficult for humans to
do this. 50 million people are injured. At some point, the algorithms get better than that, right?
If it cuts it by half or brings it down to a quarter, at some point in time, the entire
shift mentally is going to take place.
And people are going to say, whoa, you want to drive on your own with your own hands and
using your own wits?
No, whoa, whoa, whoa, whoa.
You're only allowed to do that in specific circumstances.
I think Elon Musk said
it would probably be illegal for humans to drive in 50 years, except in very specific circumstances.
And so in that case, that's where I really see the algorithms getting better and better.
I don't see humans coming out of the loop, though. Again, I hate this story where AI destroys all the
jobs. We've already destroyed all the jobs multiple times in human history. You didn't
hunt the water buffalo to make your clothing before you came to work today, right? You didn't build
the microphone on your head. We destroyed all the jobs. And it's easy to see the destruction,
but it's very hard, very hard to see all the things we create with it. You can't explain
a web designer to an 18th century farmer because it's built on the back of 15 other inventions,
electricity, wires, the web, the browser, Photoshop, hypertext, all these things, right,
that leads to that. So humans are really good at seeing the disaster, but not good at seeing the shift. In my opinion, intelligence, there's not a single industry on earth that's not going
to benefit from having more intelligence.
Drug design, supply chain management,
material science, defense,
any of these types of things are gonna benefit
from having more intelligence.
And so we're going to have all of these jobs
essentially be assisted with a helper along the way.
And I think that's wonderful.
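To make the hyperparameter search idea from a couple of answers back concrete, here is a minimal sketch in Python. It is an illustration only, assuming nothing but the standard library and a made-up objective function standing in for a real training run; dedicated tuners are far more sophisticated, but the basic loop is the same: propose a configuration, evaluate it, keep the best.

import random

# Stand-in for "train a model with these settings and return validation error".
# In a real setup this would launch a training run; the shape here is invented
# purely so the loop has something to optimize.
def validation_error(learning_rate, num_layers, dropout):
    return (abs(learning_rate - 0.01) * 100
            + abs(num_layers - 3) * 0.5
            + abs(dropout - 0.2))

def random_search(num_trials=50, seed=42):
    rng = random.Random(seed)
    best = None
    for _ in range(num_trials):
        # Propose a candidate configuration.
        candidate = {
            "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform 1e-4..1e-1
            "num_layers": rng.randint(1, 6),
            "dropout": rng.uniform(0.0, 0.5),
        }
        score = validation_error(**candidate)
        # Keep the best configuration seen so far.
        if best is None or score < best[0]:
            best = (score, candidate)
    return best

if __name__ == "__main__":
    score, config = random_search()
    print(f"best error {score:.3f} with config {config}")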
Well, the counter argument to that,
and it's deeply philosophical actually,
but the counter argument to that is like,
okay, so this time around is different
because the pace of innovation is so rapid
that you just don't have enough time
to replace the jobs that are going to be displaced
and you don't have enough time to reskill the people
that will have to be reskilled and so on.
So you have this, you know, impedance mismatch between the pace of destruction and the creation of new
jobs, basically.
You know, I'd say the history of the phrase "this time is different" has
about a zero percent win record in history, right? This time is never different.
It's just an iteration on an old pattern. And if you really look at the history of life,
it's iterations on old patterns again and again. It doesn't mean that there can't ever be sort of
economic disasters, but we don't need AI to do that. We've done that a few times in
history, right? And when you made the switch from hunter-gatherer, you know, to agrarian revolution
that changed the nature of jobs, and then you made it to the industrial revolution, and that
was difficult to integrate. Probably, you know, we had wars and things in that time, probably due to the shift to that type of society. So we have had difficulty integrating, you know, giant changes in the past. But I don't
think that this is just, I don't think that we're moving so fast now that it's impossible to
integrate any of these changes, right? I think that we'll be able to, I think that this time around, we're a lot more
adaptable. I mean, take a look at something like the pandemic right now, okay? It had the potential
to be a disaster on an unprecedented scale, right? Going back to the plague time, right?
Think about how quickly we were able to iterate on a brand new type of vaccine, right? Accelerated by machine learning, accelerated by brand new concepts.
They had the vaccine designed in five days, and there was
information sharing across the sciences.
We were able to get the DNA of the virus out there and people were able to study
it all over the world. And within a year,
we've been able to take this brand new type of vaccine and get it out into the
world and hopefully get
the world right back on track much faster than we ever would have otherwise. That kind of adaptability,
I think, is built into the system now. So things do move faster, but people are able to integrate
those changes faster. When I see kids today, they pick up an app and they toss it away two years
later and they're fine with it. So I think maybe folks like you and me have more trouble adapting because we're a little bit older.
Let's be honest. I'm older than I look. I'm sorry. I'm just going to call it out there.
Right. But when I look at the kids today,
I mean, they'll throw away an app and never even think about it and use a brand new app as if they used it their whole life.
So I think humans are actually becoming as adaptable as the speed of technology changes.
And I think we worry about the integration of that a bit too much. Humans always adapt in the
long run, even if there's a short-term blip in the flow of it all. Okay. You're obviously
much more, you're definitely an optimist, and I think you're
definitely much more optimistic than I am.
So I'm going to hit you with another counter argument then, which actually ties back to
the original discussion about AI Alliance and what it is that you're doing with AI Alliance.
So you referred to how innovation doesn't really come out of the big players, but more like companies like Pachyderm and OctoML and the other companies that have formed the AI Alliance.
But one thing that the big players do have is data, basically. And you know much better than I do that you can have the best machine learning algorithm that ever existed, but it's pretty much
useless without data. So there's a quite big, and actually growing, I would say, imbalance there.
Big guys get richer in terms of data and everyone else gets poorer in comparison. So do you think
this is viable and do you see that as a kind of obstacle going forward as to having an equitable
type of AI?
I mean, now you're making a strong argument, right? And I
think it's a legitimate one, right? And I would say,
actually, a decent amount of the original kind of innovation did
actually come out of the big companies. So I'm going to reverse
a little bit here, right?
In other words, it came out of the research labs at the big companies and such.
But what you end up seeing is those folks are able to solve all the problems simultaneously.
In other words, they're able to build the infrastructure and the algorithms and the
innovation at the same time because they have these huge general purpose operating systems.
They have all this data, as you've noted, right? So I equate that to them building the car, the wheels, and the
street simultaneously. And for the rest of us to really use this, you have to coalesce and bring
together some of those pieces. You have to have infrastructure you can build on top of, or you
have to have data sets that are more public, or you have to build your own data sets or crowdsource them together.
I think that what you do see, though, is a lot of the engineers that came out of those companies solve a general purpose problem, something like Tecton or WhyLabs.
WhyLabs, all those engineers came out of Amazon working on SageMaker, and they realized they
needed a complete redesign of how you think about monitoring, because you have to monitor
the whole length of time when
you're doing inference. And if there's drift, you need to keep the whole history. Whereas in a
traditional monitoring system, you don't have to keep that. You just have to know if the web server is
up or down. If one web server is bouncing, you kind of want to know that history, but the rest
of it, you can kind of throw away. And I think Tecton, same thing, those folks came out of Uber.
They came up with that concept of a feature store. And then they thought, oh, we've abstracted this concept. Now let's leave and start our own company. So I think
there is actually a co-creative process between the bigger companies and the smaller ones.
I like to give obviously credit to the smaller ones because I think that's where it gets
accelerated and that's where it starts to really change. And I think on their own, the big companies
can't do it. But quite frankly, you're right, the smaller companies can't do it without some of the bigger companies' initial innovation.
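To make the monitoring point above concrete, here is a minimal sketch of drift detection that keeps the full history of every inference batch, which is the "keep everything" requirement Dan contrasts with up-or-down infrastructure checks. This is my own toy illustration in plain Python, not WhyLabs' actual API.

from dataclasses import dataclass, field
from statistics import mean, stdev
from typing import List

@dataclass
class FeatureMonitor:
    # Keeps a full history of per-batch summaries so drift can be traced back
    # in time, unlike up/down-style infrastructure checks that discard history.
    history: List[float] = field(default_factory=list)  # batch means over time

    def observe_batch(self, values: List[float], threshold: float = 3.0) -> bool:
        batch_mean = mean(values)
        drifted = False
        # Need a few batches of history before a drift comparison is meaningful.
        if len(self.history) >= 5:
            mu = mean(self.history)
            sigma = stdev(self.history) or 1e-9
            # Flag drift if this batch's mean is far from the historical norm.
            drifted = abs(batch_mean - mu) > threshold * sigma
        self.history.append(batch_mean)  # keep everything, never throw it away
        return drifted

# Usage: feed each inference batch's feature values in as they arrive.
monitor = FeatureMonitor()
for batch in ([0.1, 0.2, 0.15], [0.12, 0.18, 0.14], [0.11, 0.2, 0.16],
              [0.13, 0.17, 0.15], [0.1, 0.19, 0.14], [0.9, 1.1, 1.0]):
    if monitor.observe_batch(batch):
        print("possible drift detected; inspect the stored history")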
When it comes to data, I do think that it is a moat in the interim.
The question is, does machine learning change over time? And I think we're getting better at dealing with smaller data sets.
I think we're getting better at, you know, few-shot learning. I think we're getting better at transfer learning and iterating off
of those things. So I could see large numbers of algorithms trained and then kind of brought out
as kind of pre-trained models that other folks can just start to build on top of, right? Or I can see
each kind of company building its own private data set and transferring it to, like, a quick training center that has
an economy of scale. But you are right, this is a legitimate argument to say that the data sets
are the advantage that the giant, you know, internet web companies have currently. But
the amount of data that the world is creating is not getting smaller, right? And I think that you are going to see a huge chunk of people get access to more kinds of
data or being able to build their own data sets over time.
And that will sort of naturally tip the balance away from the moat that they currently have.
But you are correct.
It is a legitimate moat at the current time.
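And to illustrate the pre-trained model point, here is a minimal transfer learning sketch. It assumes PyTorch and a reasonably recent torchvision (for the ResNet-18 weights API), and the five-class head and dummy batch are made up purely for illustration: freeze what was learned on the big public dataset, then train only a small new head on your own, much smaller data.

import torch
import torch.nn as nn
from torchvision import models

# Start from a model someone else trained on a large public dataset.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained weights so the small dataset only trains the new head.
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final classification layer with one sized for our own task.
num_classes = 5  # hypothetical: whatever your smaller, private dataset needs
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch; a real loop would iterate
# over a DataLoader built from your own (much smaller) labeled dataset.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = loss_fn(backbone(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")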
Okay.
All right.
And then I guess since we're kind of a bit over time,
then let's bring it back together to where we started from. And I'm going to ask you that
kind of moot question. Okay, so what's the next milestone for AI Alliance then?
What can we expect from you next? The next step really is a lot of logistical work
currently, right? So we have the bootstrap committee that's meeting,
that's got an events group, it's got a governance group and a revenue group.
We're building out all the governance structure.
I've been looking at FinOps and working with a team,
the FinOps group, which is under the Linux Foundation.
I've been looking at their stuff.
I'm doing due diligence on all these different groups,
building out just kind of the boring stuff, right?
Of like, how do you, it's almost like playing a strategic board game. You have to think about
everything that can go wrong in terms of governance. And you don't
want, you know, the board in there essentially voting every time you need to buy paperclips or
change the website, right? You want to have a degree of flexibility. So I'm working hard to
kind of come up with that concept now. And then at that point, as we get kind of the logistics out of the way in this phase, we can turn our attention back to bringing on different projects, getting people talking to each other, doing events, working on kind of joint architectures together, right?
So I think it's a crawl, walk, run approach.
I'm a patient person.
And I think technology takes time. I think, you know, Wired, ZDNet, back in the old days, they used to kind of sell on the concept
that technology was changing every five seconds.
But the truth is, it takes a little bit of time.
And so I'm patient.
We're going to do the boring work
and then we're going to get to the exciting stuff
over the coming year.
I hope you enjoyed the podcast.
If you like my work,
you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.