Orchestrate all the Things - This is where you sign up for an open-source AI stack for the future. Featuring AI Infrastructure Alliance Lead Dan Jeffries

Episode Date: April 27, 2021

Open-source stacks enabled software to eat the world. Some of the most innovative companies in the world are working on building an open-source stack for AI. Dan Jeffries was there when the LAMP stack enabled software to eat the world. Perhaps you don’t know, or remember, what the LAMP stack is, but it's actually pretty important. LAMP is an acronym made out of the initials of key open-source technologies used in software development - Linux, Apache, MySQL, and PHP. These technologies were hotly debated back in the day. Today, they are so successful that the LAMP stack has become ubiquitous, invisible, and boring. AI, on the other hand, is a hot topic today. Just like the LAMP stack turned software development into a commodity and made it a bit boring (especially if you're not a professional software engineer), an AI stack should turn AI into a commodity - and make it a bit boring, except maybe for data engineers. This is what Dan Jeffries is out to do with the AI Infrastructure Alliance (AIIA). Article published on VentureBeat.

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. Open source stacks enabled software to eat the world. Some of the most innovative companies in the world are working on building an open source stack for AI. Dan Jeffries was there when the LAMP stack enabled software to eat the world. Perhaps you don't know or remember what the LAMP stack is, but it's actually pretty important. LAMP is an acronym made out of the initials of key open-source technologies used in software development: Linux, Apache, MySQL and PHP. These technologies were hotly debated back in the day. Today they're so successful that the LAMP stack has become ubiquitous, invisible and boring. AI, on the other hand, is a hot topic today. Just like the LAMP stack turned software development into a commodity and made it a bit boring if you're not a professional software engineer,
Starting point is 00:00:55 an AI stack should turn AI into a commodity and make it a bit boring, except maybe for data engineers. This is what Dan Jeffries is out to do with the AI Infrastructure Alliance. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook. Yeah, I've been in technology for about two decades. I had an IT consulting company that did a lot of big Linux web farms and a lot of Microsoft back office kinds of things, big Exchange servers, AD, those types of things. I went to Red Hat after selling my company to my partner, and I was there for a decade during the growth there.
Starting point is 00:01:37 It was about 1,500 people or so when I started, so I got to see it all the way up until the IBM acquisition. And that gave me a very strong sort of open source bent. And I got to see kind of the sea change of proprietary software to open source software. And after that, I really got super fascinated by machine learning. I kicked off a 50-page manifesto inside of Red Hat four or five years ago. And I expected to find dozens of people sort of beavering away on the infrastructure problem there. And it turns out nobody was working on it. So I started all these working groups, and I socialized it up to the CTO. But at some point in time, I felt it was necessary to kind of go back into the startup space, because I felt like that's where all the energy was flowing. And that's where I ended up at
Starting point is 00:02:23 Pachyderm, which is a machine learning startup focused on data lineage and data versioning. And I love being kind of back at the energy of a smaller company because I feel like this space is very energetic at this point. And you've got to be where the action is if you want to make a difference. Yeah, it certainly feels that way to me at least. And I have to say that I think you went public with the AI Alliance announcement not very far back. It feels like a few weeks or a couple of months, tops, if I'm not mistaken. And so when I went and checked the names of organizations that were involved, I saw lots of names that are familiar to me.
Starting point is 00:03:11 And actually, coincidentally, I just covered the Series B funding round from ML Ops, from, excuse me, OctoML today. Okay, cool. Yeah. It seems like I kind of naturally gravitated towards that space as well. So it's kind of a convergence of things I'm interested in as well. So open source and machine learning and MLOps, all of those things. So, okay, let's see where to start with the AI Alliance. So whose idea was it, basically? How did it come to be? How did you, did all of you,
Starting point is 00:03:47 you know, get together and decide that this is a good idea? I mean, it was my idea, but the more I started talking about it with lots of people in the industry, the more I realized that there was just a need for it. And actually I ended up being very surprised. I expected there would already kind of be an organization that was thinking about how to capture all this activation energy, bring all these different companies together, get them talking to each other, get their integrations teams talking. And really, nobody was doing it yet. And that was very surprising to me as I started to dig into it. And every founder that I ended up talking to, every engineer I ended up talking to was very excited. I really didn't have to work very hard to kind of get people interested in the concept.
Starting point is 00:04:31 They understood it intuitively and they realized that all of the smaller, all the innovation is coming from these small to mid-sized companies, right? That are getting funded now, right? And they're up against these giant vertically integrated players, like SageMaker from Amazon. But I don't think any of the innovation is coming from that space, right? I think it used to be, and I saw this movement at Red Hat, that the proprietary software companies would come up with all the ideas, and then open source would copy it in kind of an okay sort of way. But then over time, most of the innovation started to flow to the open source and it started to
Starting point is 00:05:10 flow to the smaller companies' projects. It started flowing to people working together, even kind of frenemies working together to build something as a collective that they could all benefit from. And the proprietary vendors started to fall behind. So I think that has been maintained at this point in time. So I think SageMaker and those kinds of vertically integrated products, they take the innovation from the smaller space and they copy it.
Starting point is 00:05:36 But it's not very innovative. And the innovation is going to come from a bunch of these companies working together as like little Lego pieces that we stack together. I firmly expect us to have like a LAMP stack or a MEAN stack of AI/ML in the next five to 10 years. And it'll change over time, right? You went from LAMP to MEAN to whatever the framework is nowadays, and the same kind of flow will happen in the machine learning space. But I don't believe the hype that anyone has the end-to-end machine learning system at this point.
Starting point is 00:06:06 We can't because it's just moving too fast. The space itself and the problems we need to solve are in motion at the same time as the software is being created. So it's going to take a number of years for this to shake out. But the innovation is definitely going to come from small to mid-sized companies and projects. Yeah, I mean, in principle, I totally share that notion. I mean, about innovation coming mostly from the open source space and from companies that are not, you know, behemoths, basically. They serve a different purpose, in my opinion,
Starting point is 00:06:38 which is, you know, basically taking that innovation and deploying it at scale and making it accessible and deployable across different hardware and so on. But that's a different part of the ecosystem, I would say. So seeing your kind of manifesto or mission statement, it became quite clear to me that, well, actually it didn't become all that clear. And that's going to be my question to you. So I kind of had the feeling that, okay, so maybe what you're aiming for is what you alluded to earlier. So to actually build a kind of stack or a standard, because you also mentioned things like interoperability and API bridges and this kind of thing.
Starting point is 00:07:26 So I'm wondering really if you're actually aiming for what you said, like a LAMP stack. And if you are, what's the road to get there? Because you mentioned earlier that there was a lot of enthusiasm and your idea seemed well received by a lot of people, but there's a distance between saying, hey, okay, that sounds like a good idea, sign me up, and actually coming up with a roadmap and a plan to make it happen. So I'm wondering how are you going to walk the walk?
Starting point is 00:07:59 So there's a lot. I tend to think of myself, in George R.R. Martin's words, as a gardener instead of a planner, right? So you're a writer, so you know that there are two types of writers, right? There's the ones that plan every single thing down to the finest detail before they commit a single word to paper, and then it's just like executing on that. And then there's the gardeners who plant a lot of things and see what comes up and start to sculpt those different areas. I'm very much a gardener. And I see a lot of the folks who I work with in the organization as being gardeners as well. And I think it's necessary at this phase. I see the Alliance evolving over the course of time, right?
Starting point is 00:08:40 And its mission will shift over time. First of all, I don't think it's possible to just automatically come out with a LAMP stack for AI, again, because it's still in motion. So my goal at this point, and I think a lot of the other folks' goals, is to get everyone talking, get the integrations team talking, engineers talking, get them talking about the problems that are there.
Starting point is 00:09:01 I actually thought that we probably wouldn't end up hosting any projects as we form into a foundation. We're in talks with the Linux Foundation, deep talks now, to potentially roll under their organization, just like the cloud native folks had done, or form our own 501(c)(6). But when I look at it, take SFL Scientific, which just joined, and they have 35 machine learning engineers and some, you know, tremendous customers: they said if you come up with a stack right now or an architecture right now, it's going to change in six months. And I think that's legitimate.
Starting point is 00:09:34 So I think right now the idea is really to get the folks talking at a lot of different levels rather than working in their own little silos. And to me, this kind of mirrors the beauty of an open source development model, right? I also agree that the proprietary, big, vertically integrated systems serve their own purpose. And they're always going to make money. But the difference is Kubernetes and Docker don't become Kubernetes and Docker if they only run on Google. And so what you see is all these proprietary individual siloed versions come out over time. And then somebody comes up with a standard that actually starts to transcend and work across these
Starting point is 00:10:10 different things. And they all end up retroactively adopting it, right? So I still remember when VMware didn't love Kubernetes. But if you look at their marketing literature now, it's like the great, you know, the great leader, we've always loved Kubernetes, right? And so my thinking is the key is just getting everyone talking. And we just brought on, for instance, QuantumBlack, which is a great solutions integrator. They were purchased by McKinsey. They've done a ton of Formula One work. They're working on the technical council in the same way. And they're bringing together banks and all the infrastructure
Starting point is 00:10:45 folks and a bunch of people who are thinking about it, because they've already seen the tools change just in their own lifetime. They started in 2008 working everything in MATLAB and now they've seen three iterations of tools. They understand the same kind of issue: it's just getting different people to the table to start talking about these things and forming it. I agree with you, though, it's an incredibly challenging problem, right? And I don't expect it to be solved overnight. But the key is getting everyone in a room and starting to think about how to interoperate. And I worry more now about micro-alliances within the group. Okay, not everyone is going to integrate with everyone in there. If you look at 30 logos on the website, you're not going to build a machine learning stack with all 30 of those things. There's five monitoring
Starting point is 00:11:27 solutions on there. There's six or seven pipelines. There are people who are doing similar things. We allowed competition within the space, but you'll probably integrate with three or four of those things. And I'm all for Darwinism in the space. Let the different companies work out who they think they should partner with and build integrations, build joint examples, build these things. We're already seeing that happen. Algorithmia has already built one with five or 10 people. Pachyderm did that with Seldon.
Starting point is 00:11:56 And Neuro just did that with us and Seldon and a few other places too. So that's what I'm hoping to foster at this point. And then that eventually will lead to an overarching architecture and kind of ways of talking about it in general. And we just brought Canonical on as well. They're already looking at how do I build kind of a deployment framework for this thing that's general purpose. So it happens in fits and starts. Little pieces of the puzzle get solved. And then eventually you get a glue project that kind of weaves it all together, I think, over time. And that's, I think, the eventual place
Starting point is 00:12:29 that we want to get to. But it's going to take a few years to get there. It's not going to be solved overnight. Yeah, it sounds reasonable, actually. And yeah, I would be a bit skeptical if you said, oh, you know, you have everything figured out. And, you know, that's the plan. Give us, I don't know, a year or two and you'll see the standard emerge.
Starting point is 00:12:48 What you're describing sounds much more real, actually, therefore credible, I would say. My question then would be, like I said, I mean, I'm totally with you on the community building, basically, because it sounds like this is what you're mostly doing at this part. But then again, it only kind of transposes the question because, again, community building is a very, as I'm sure you know, having been involved in open source for as long as you have been,
Starting point is 00:13:17 community building is actually very, very time-consuming and energy-consuming, and it needs to be, you know, deliberate. And you need to have things like, I don't know, rules, basically, for how people interact, and, you know, what's allowed and what's not, how do you grow the community in an equitable way, and so on and so forth. And since you're all basically, you know, running your own startups and being very creative, and we all know how startups are, how are you even going to find the time to do that? Well, I happen to have a lot of free time, in terms of my ability to dedicate to this at this point. I wouldn't call it, let me not call it free
Starting point is 00:13:58 time, because then, you know, Pachyderm might think, oh my god, am I doing anything? But the truth is they've been very supportive of the concept from the very beginning. And I've reached a point in my career now where I have a lot of kind of leeway to run with my ideas. And the way I think about it a lot is that when I was younger, in the internet phase, I saw how important the internet was, but I was just a kid. I could only really surf the wave.
Starting point is 00:14:24 So I went to work in the dot-com boom and in Linux when everybody thought that was crazy. But really, I had to go where the innovators were and just kind of hope that I was right. At this point, I've seen enough of the changes over time that I feel like I can influence it. And I feel like I can bring together other folks who are smart and powerful and thinking clearly and visionaries to work together on these things. So that gives me a good chunk of time to kind of dedicate to taking over kind of the directorship of this stuff. And by turning it into a foundation or going into the Linux Foundation, we get to the point where we can start to add a good chunk of revenue into the equation. And then you get people who are just firmly focused on it and it becomes a balance of volunteer efforts, right? And of people actually paid to
Starting point is 00:15:10 work on different aspects of it. So I think it's feasible and you're right. It's just going to take a ton of time and effort. Luckily, I'm in a Goldilocks position where I do have the time to build the community, to talk to people, and I really enjoy doing it. And I'm hopeful at this point in my life that now I get to shape a little of the events of what I think is probably the most important technology that we've ever invented in the history of man. When I look at artificial intelligence at this point,
Starting point is 00:15:41 I think very few people understand just how important it's going to be. And I think they have an inkling of it, but it's usually a fear-based kind of thing, right? And they don't understand fully that in the future, there's two kinds of jobs, one done by artificial intelligence and one assisted by artificial intelligence, right? No doctor is going to just have a technician who doesn't have an AI assistant looking at that cancer screening. You know, no material scientist is going to just dream up the next iteration of a
Starting point is 00:16:16 material. They're going to have 50 different versions sourced. No artist is going to play two bars without the artificial intelligence essentially creating 20 more bars and going, hey, you know what? I like riff number three. Play me more iterations of that, right? It's going to be a co-creative process. And from my standpoint, I just want to be a part of building that infrastructure layer so that people can move up the stack and do the more interesting things. It's that you don't get to WhatsApp, with 35 engineers hitting 400 million people and getting that kind of an audience, until all those protocols are built. They didn't have to build a GUI and end-to-end encryption and a peer-to-peer messaging protocol.
Starting point is 00:16:59 They could build from those pieces and do something more interesting. And that's where I think that we have to get to. And that's what I hope to influence. Okay. I was going to ask you about, like, okay, so what's the next milestone for AI Alliance? And I still want to ask you that, but you gave me a few lines that are just too good to pass up. So you said two types of jobs, like one job that is done by AI and another type of job that is done assisted by AI. I would say maybe it's three types of jobs, actually, because, well, who's going to build that AI? And to me that's a different type of job, don't you think? Well, I think the people building the AI are going to be assisted by artificial
Starting point is 00:17:37 intelligence, so I think they fall into category two. We're already seeing that now, right? In other words, with most of the artificial intelligence, yes, there's a creative aspect where the person has to dream up an algorithm that, you know, approximates thinking or reasoning in some type of way, right? Or just a pattern matching algorithm. But think about something like hyperparameter tuning. In the beginning, it was all people just pushing and pulling the dials, if you will. But now you have whole searches of the parameter space to go through and try to iterate on what the best hyperparameters or what the best architecture is going to be. So it's already gotten to the point where AI is assisting in the creation of AI. And I think that's only going to continue. I actually had this conversation with my partner earlier today, where I said, every time we see
Starting point is 00:18:23 like a self-driving car crash in the news, we go, oh my gosh, we can't, we can't allow this to happen. And I go, listen, people, unfortunately, 1.2 million people die on the road every year through their own, you know, you know, lack of skills. It's very difficult to drive, right? It's very difficult for humans to do this. 50 million people are injured. At some point, the algorithms get better than that, right? If it cuts it by half or brings it down to a quarter, at some point in time, the entire shift mentally is going to take place. And people are going to say, whoa, you want to drive on your own with your own hands and
Starting point is 00:18:59 using your own wits? No, whoa, whoa, whoa, whoa. You're only allowed to do that in specific circumstances. I think Elon Musk said it would probably be illegal for humans to drive in 50 years, except in very specific circumstances. And so in that case, that's where I really see the algorithms getting better and better. I don't see humans coming out of the loop, though. Again, I hate this story where AI destroys all the jobs. We've already destroyed all the jobs multiple times in human history. You didn't
Starting point is 00:19:24 hunt the water buffalo to make your clothing before you came to work today, right? You didn't build the microphone on your head. We destroyed all the jobs. And it's easy to see the destruction, but it's very hard, very hard to see all the things we create with it. You can't explain a web designer to an 18th century farmer because it's built on the back of 15 other inventions, electricity, wires, the web, the browser, Photoshop, hypertext, all these things, right, that lead to that. So humans are really good at seeing the disaster, but not good at seeing the shift. In my opinion, there's not a single industry on earth that's not going to benefit from having more intelligence. Drug design, supply chain management,
Starting point is 00:20:08 material science, defense, any of these types of things are gonna benefit from having more intelligence. And so we're going to have all of these jobs essentially be assisted with a helper along the way. And I think that's wonderful. Well, the counter argument to that, and it's deeply philosophical actually,
Starting point is 00:20:28 but the counter argument to that is like, okay, so this time around is different because the pace of innovation is so rapid that you just don't have enough time to create new jobs to replace the ones that are going to be displaced, and you don't have enough time to reskill the people that will have to be reskilled and so on. So you have this, you know, impedance in the pace of destruction versus creation of new
Starting point is 00:20:52 jobs, basically. You know, I'd say the history of the phrase, this time is different, has about a zero percent win record in history, right? This time is never different. It's just an iteration on an old pattern. And if you really look at the history of life, it's iterations on old patterns again and again. It doesn't mean that there can't ever be sort of economic disasters, but we don't need AI to do that. We've done that a few times in history, right? And when you made the switch from hunter-gatherer, you know, to the agrarian revolution, that changed the nature of jobs, and then you made it to the industrial revolution, and that
Starting point is 00:21:37 was difficult to integrate. Probably, you know, we had wars and things in that time, probably due more to the shift to that type of society. So we have had difficulty integrating, you know, giant changes in the past. But I don't think that we're moving so fast now that it's impossible to integrate any of these changes, right? I think that this time around, we're a lot more adaptable. I mean, take a look at something like the pandemic right now, okay? It had the potential to be a disaster on an unprecedented scale, right? Going back to the plague times, right? Think about how quickly we were able to iterate on a brand new type of vaccine, right? Accelerated by machine learning, accelerated by brand new concepts. They had the vaccine designed in five days, and information sharing across the sciences.
Starting point is 00:22:33 We were able to get the DNA of the virus out there and people were able to study it all over the world. And within a year, we've been able to take this brand new type of vaccine and get it out into the world and hopefully get the world right back on track much faster than we would have ever had. That kind of adaptability, I think, is built into the system now. So things do move faster, but people are able to integrate those changes faster. When I see kids today, they pick up an app and they toss it away two years later and they're fine with it. So I think maybe folks like you and me have more trouble adapting because we're a little bit older.
Starting point is 00:23:11 Let's be honest. I'm older than I look. I'm sorry. I'm just going to call it out there. I'm just going to call it out there. Right. But when I look at the kids today, I mean, they'll throw away an app and never even think about it and use a brand new app as if they used it their whole life. So I think humans are actually becoming as adaptable as the speed of technology changes. And I think we worry about the integration of that a bit too much. Humans always adapt in the long run, even if there's a short-term blip in the flow of it all. Okay. You're obviously much more, you're definitely an optimist, and I think you're definitely much more optimistic than I am.
Starting point is 00:23:50 So I'm going to hit you with another counter argument then, which actually ties back to the original discussion about AI Alliance and what it is that you're doing with AI Alliance. So you referred to how innovation doesn't really come out of the big players, but more like companies like Pachyderm and OctoML and the other companies that have formed the AI Alliance. But one thing that the big players do have is data, basically. And you know much better than I do that you can have the best machine learning algorithm that ever existed, but it's pretty much useless without data. So there's a quite big and actually growing, I would say, imbalance there. Big guys get richer in terms of data and everyone else gets poorer in comparison. So do you think this is viable and do you see that as a kind of obstacle going forward as to having an equitable type of AI?
Starting point is 00:24:51 I mean, now you're making a strong argument, right? And I think it's a legitimate one, right? And I would say, actually, a decent amount of the original kind of innovation did actually come out of the big companies. So I'm going to reverse a little bit here, right? In other words, it came out of the research labs and the big companies and such. But what you end up seeing is those folks are able to solve all the problems simultaneously. In other words, they're able to build the infrastructure and the algorithms and the
Starting point is 00:25:19 innovation at the same time because they have these huge general purpose operating systems. They have all this data, as you've noted, right? So I equate that to them building the car, the wheels, and the street simultaneously. And for the rest of us to really use this, you have to coalesce and bring together some of those pieces. You have to have infrastructure you can build on top of, or you have to have data sets that are more public, or you have to build your own data sets or crowdsource them together. I think that what you do see, though, is a lot of the engineers that came out of those companies solve a general purpose problem, something like Tecton or WhyLabs. WhyLabs, all those engineers came out of Amazon working on SageMaker, and they realized they needed a complete redesign of how you think about monitoring, because you have to monitor
Starting point is 00:26:03 the whole length of time when you're doing inference. And if there's drift, you need to keep the whole history. Whereas in a traditional monitoring system, you don't have to keep that. You just have to know the web server's up or down. If one web server is bouncing, you kind of want to know that history, but the rest of it, you can kind of throw away. And I think Tecton, same thing, those folks came out of Uber. They came up with that concept of a feature store. And then they thought, oh, we've abstracted this concept. Now let's leave and start our own company. So I think there is actually a co-creative process between the bigger companies and the smaller ones. I like to give obviously credit to the smaller ones because I think that's where it gets
Starting point is 00:26:37 accelerated and that's where it starts to really change. And I think on their own, the big companies can't do it. But quite frankly, you're right, the smaller companies can't do it without some of the bigger companies' initial innovation. When it comes to data, I do think that it is a moat in the interim. The question is, does machine learning change over time? And I think we're getting better at dealing with smaller data sets. I think we're getting better at, you know, few-shot learning. I think we're getting better at transfer learning and iterating off of those things. So I could see large numbers of algorithms trained and then kind of brought out as kind of pre-trained models that other folks can just start to build on top of, right? Or I can see each kind of company building its own private data set and transferring it to like a quick training center that has a
Starting point is 00:27:26 an economy of scale but you are right this is a legitimate argument to say that the data sets is the advantage that the giant you know internet web you know web companies have currently um but the world is not the the amount of data that the world is creating is not getting smaller. Right. And I think that you are going to see a huge chunk of people get access to more kinds of data or being able to build their own data sets over time. And that will sort of naturally tip the balance away from the mode that they currently have. But you are correct. It is a legitimate mode at the current time. Okay.
Starting point is 00:28:03 All right. And then I guess since we're kind of a bit over time, then let's bring it back together to where we started from. And I'm going to ask you that kind of moot question. Okay, so what's the next milestone for AI Alliance then? What can we expect from you next? The next step really is a lot of logistical work currently, right? So we have the bootstrap committee that's meeting, that's got an events group, it's got a governance revenue. We're building out all the governance structure.
Starting point is 00:28:31 I've been looking at FinOps and working with a team, the FinOps group, which is under Linux Foundation. I've been looking at their stuff. I'm doing due diligence on all these different groups, building out just kind of the boring stuff, right? Of like, how do you, it's almost like playing a strategic board game. You have to think about everything that can go wrong in terms of governance. And you don't want every, you don't want to, you know, the board in there essentially voting every time you need to do paperclips to
Starting point is 00:28:57 change the website, right? You want to have a degree of flexibility. So I'm working hard to kind of come up with that concept now. And then at that point, we can turn our, like, as we get kind of the logistics out of the way in this phase, we get back to bringing on different projects, getting people talking to each other, doing events, working on kind of joint architectures together, right? So I think it's a crawl, walk, run approach. I'm a patient person. And I think technology takes time. I think, you know, Wired, it's EDN. Back in the old days, they used to kind of sell on the concept that, right, that we could, that technology was changing every five seconds. But the truth is, it takes a little bit of time.
Starting point is 00:29:33 And so I'm patient. We're going to do the boring work and then we're going to get to the exciting stuff over the coming year. I hope you enjoyed the podcast. If you like my work, you can follow Link Data Orchestration on Twitter, LinkedIn, and Facebook.
