Orchestrate all the Things - Stargate, a GraphQL for databases from DataStax. First stop - Cassandra. Featuring Ed Anuff, DataStax CPO

Episode Date: December 9, 2020

A flexible API is key to database accessibility and developer friendliness today. Apache Cassandra was lacking in that department, and DataStax is trying to address this with the release of a new... API layer called Stargate. A discussion with Ed Anuff, formerly of Apigee and Google Cloud, and currently DataStax Chief Product Officer, on the rationale behind Stargate, its architecture and operation, how it compares to GraphQL, and a roadmap for the future. Article published on ZDNet

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. A flexible API is key to database accessibility and developer friendliness today. Apache Cassandra was lacking in that department, and DataStax is trying to address this with the release of a new API layer called Stargate. A discussion with Ed Anuff, formerly of Apigee and Google Cloud, and currently DataStax Chief Product Officer, on the rationale behind Stargate, its architecture and operation,
Starting point is 00:00:35 how it compares to GraphQL, and a roadmap for the future. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook. Okay, so finally on the record, I guess we can start, as mentioned, with you saying a few words about the occasion today, which is, I guess, general availability of Stargate and what led to this announcement and, well, actually a few words about yourself, if you will. So how you are involved in this, what you do at DataStax and this kind of thing. Sure.
Starting point is 00:01:19 Yeah. So my name is Ed Anuff. I am the Chief Product Officer of DataStax. I joined DataStax at the beginning of the year from Google, where I was for about three years. And then prior to that I was at Apigee, the API management company that Google acquired back in 2016. So just in terms of giving you a little bit of background of why we're doing Stargate and what brought us up to this. You know, at the beginning of the year, we went and talked to a lot of people who were using Cassandra. We wanted to know why weren't more people using Cassandra. It's one of the most powerful and scalable of the NoSQL databases. It's very well used. It's proven by, you know,
Starting point is 00:02:26 a lot of companies and sites are powered by it. It's well adopted within the enterprise. But people don't talk about Cassandra as much as some of the other databases that you might read about on Hacker News or that, you know, new developers look at. And so what we found was that, you know, there were really two things. Cassandra was challenging to run, even though it was very powerful. Deploying it was challenging. Running
Starting point is 00:02:57 and operating it was challenging. And then once you did that, it wasn't very easy to develop for. And so we looked at solving the first part of that, how to make Cassandra easier to run, by bringing Cassandra to the cloud. And we launched our Astra cloud service earlier this year so that anybody who wanted to use Cassandra could do so. We also brought Cassandra to the Kubernetes world. And in fact, we did both of these things together. Running Cassandra on top of Kubernetes is how we make our Astra cloud service possible. But we also made it possible for any Kubernetes user
Starting point is 00:03:42 to easily run and scale Cassandra. And we talked about that at KubeCon a month or so back with something called K8ssandra that is all about having Cassandra within Kubernetes. But the thing that we knew was really important was how do we make it really easy for developers who are building new applications to find Cassandra to be the easiest place for them to develop for? And we went and looked at the types of developers building new applications, the full stack developers, the people using JavaScript, the Jamstack developers, people using Node.js. And, you know, there's 12 million of these
Starting point is 00:04:27 developers, which is more than half of all the developers in the world are these full stack developers. And they weren't having a very easy time with Cassandra. And so we said, okay, if we want Cassandra to be the choice of developers out there, we need to make this a lot easier. And so we started this project called Stargate that is a data gateway on top of Cassandra that provides everything that Cassandra needs or everything that a developer needs to succeed with Cassandra.
Starting point is 00:04:59 So it gives you really easy APIs, it gives you REST APIs, it gives you GraphQL APIs. And most importantly, it takes your data, your JSON data, which is the type of data format that virtually every developer these days knows how to work with, and automatically maps it into the database without you having to do any of what is called data modeling. So what this means is that if you're a developer out there that knows how to use JavaScript, knows how to use Node.js, or any of the similar languages, PHP, Python, anything that deals with this type of data, and you want to use really easy APIs to store the data using the frameworks that you already work with, it's super easy to do. And so that's what Stargate is all about.
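To make the "store JSON through a REST API" idea concrete, here is a minimal Python sketch that builds, but does not send, an HTTP request carrying a schema-less JSON document. Everything specific in it is an assumption rather than something stated in the conversation: the host and port, the `/v2/namespaces/.../collections/...` path layout, the `X-Cassandra-Token` header name, and the `demo` and `users` names are illustrative and should be checked against the Stargate documentation before use.

```python
import json
import urllib.request

# Assumptions, not confirmed by the transcript: host, port, path layout,
# header name, namespace and collection names are all illustrative.
BASE_URL = "http://localhost:8082"
NAMESPACE = "demo"
COLLECTION = "users"


def build_store_request(document: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that stores one JSON document."""
    url = f"{BASE_URL}/v2/namespaces/{NAMESPACE}/collections/{COLLECTION}"
    body = json.dumps(document).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Cassandra-Token": token,  # assumed name of the auth header
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_store_request(
    {"name": "Ada", "languages": ["JavaScript", "Python"]},
    token="example-token",
)
print(request.get_method(), request.full_url)
```

The point of the sketch is that the developer only handles a JSON object and an HTTP call; no table definition or data model appears on the client side.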
Starting point is 00:05:57 We started it as an open source project. We've been working with developers since the beginning of the year. We opened it up to the public during the summer. And now it's generally available as 1.0, both as open source and as the official API of the Astra Cassandra cloud service. And so that's what it's all about. It's our goal of making it easy for every developer to make Cassandra their first choice. Okay, thanks. That's a very good and concise
Starting point is 00:06:36 summary of what it's all about. And that was my idea as well. And I have to say that Mark was kind enough to flag this for me early. I think it was back in September and just having a look at it back then, you know, it kind of raised a few questions. Some of them have been answered by today and some of them have not. So it's great that you're here so that I can address those questions. But before I do, actually, before we get to the specifics, one question that is specific to you, if you don't mind. You mentioned that one of your previous roles was at Google, and you also have experience with Apigee.
Starting point is 00:07:21 I guess that makes you kind of an expert on APIs. So actually, it's a question in two parts. One, I wonder if Sam Ramji had anything to do with recruiting you, because if I'm not mistaken, he also has some Google background. And B, I wonder how much involved you were in Stargate, because it sounds like the kind of thing that someone with your background would probably come up with. Well, let's see.
Starting point is 00:07:51 There's your two-part question. So first, you mentioned Sam Ramji, who's a good friend of mine. I've been working with Sam for over 10 years now. He is the person who brought me to Apigee back in the day. And we both were at Google as well, although it was interesting that that happened almost by coincidence. He had moved on from Apigee and was doing some other things in the cloud native space. And meanwhile, myself and Chet Kapoor and a bunch of other folks were at Apigee and, you know, building it up in the API management space. And then when Google was expanding the Google Cloud platform, it was looking to pull in a bunch of people.
Starting point is 00:08:51 And so, you know, obviously they wanted Sam, who's an expert on open source and an expert on cloud native technologies. At the same time they were doing that, they said, let's bring in, you know, Apigee with all the people. We call them API geeks. Apigeeks was the term we used for ourselves. And so a bunch of us, well, they brought the whole Apigee company into Google Cloud.
Starting point is 00:09:19 So that was a really exciting time at Google Cloud. And Google continues to do great things over there. But, yeah. And then for your second question, yeah, I think, you know, the insight that I think we collectively had, by the way, Sam Ramji is here at DataStax with us as well. He's our chief strategist. Yeah, this is why I'm asking, actually.
Starting point is 00:09:46 Yeah, we brought the band back together. I think the key insight that we had was that, if you think about it, a developer doesn't use, and I'm talking about a software developer, doesn't use a database. They don't use a cloud or they don't use infrastructure. What they use is the APIs to those. And I think we've seen time and again, if you build the great API, that's the most important thing because you can have the most powerful technology in the world, you could have the most powerful database, but if the API is not rich and expressive and flexible and able to empower the developer to work in a fast, agile, intuitive way with that API, then the technology is not unlocked. You're not able to leverage it. And so, yes, when we looked at Cassandra
Starting point is 00:10:51 and we said, why aren't more developers using it? APIs weren't the whole answer, but you can imagine that certainly with what we've seen and with spending this much time with developers and seeing how they actually build applications, that we knew that APIs had to be a big part of the answer. And so Stargate does represent that. There were a lot of folks at DataStax and a lot of folks in the community who recognize that. So I was really excited when I came in and started working with the teams here and started asking them, you know, what are you thinking about in terms of how to make things easier for developers? A lot of these ideas have
Starting point is 00:11:37 already been floating around. The idea of a data gateway that is designed for, you know, bridging between the database and the developer, people have been exploring this idea. And in fact, when we went out to the companies that were using Cassandra, and you see within our announcement, we talk about Yelp. Yelp was a great example of a company that's using open source Cassandra, but they had had to go and build a data gateway. And you look at many of the companies successful with Cassandra, and they've had to follow this pattern. And so what we said was, you shouldn't have to build that yourself.
Starting point is 00:12:19 It should just be part of the database distribution. And so that's what we were trying to do with Stargate was go and say, look, can we give you this out-of-the-box, very easy-to-use way that gives you these great APIs on top of your database? So, yeah, that's, you know, good observation on your part. Thank you. And yes, actually, you know, to give you my own personal view, let's say, I think both of the pain points that you mentioned initially are, you know, are spot on, basically. And I think that the steps you are taking to address them are in the right direction. So to get to the specifics, yeah, I totally agree. And, you know, having been a developer,
Starting point is 00:13:08 and actually I still do some development on the side myself, I think you're absolutely right. This is precisely how developers think. And yes, the API is a key part. And this is something that I've been seeing playing out with a number of databases in the last couple of years. And so when I initially became aware of Stargate,
Starting point is 00:13:31 and as I mentioned, that was back in September, if I'm not mistaken, so it had just barely been released, my initial thought was like, okay, so this looks a lot like GraphQL. Why would they want to reinvent GraphQL? And then checking back again as background for this conversation today, I realized, okay, so it actually kind of wraps GraphQL because in the meanwhile, you have also added support for GraphQL. And at the same time, it kind of duplicates some of the key notions, let's say.
Starting point is 00:14:07 So, you know, as GraphQL has these GraphQL servers that kind of sit in a layer between whatever it is they're serving and the client, you also have that notion in Stargate. So I guess where I'm going with all this is like, okay, so why not just adopt GraphQL? What is it that Stargate adds to GraphQL? Well, Stargate very much does adopt GraphQL. So I think per your question, there's a couple of different things. GraphQL, as I'm sure you know, is an API, but it's an entire ecosystem. And there are different pieces: you've got your GraphQL servers, you've got GraphQL middleware.
Starting point is 00:15:09 You've got your GraphQL clients. You've got tooling, things like GraphiQL and GraphQL Playground. And, you know, so in terms of what we do, we solve a piece of the equation. If you're a GraphQL developer, you're using all these things. And in fact, that's part of the richness of GraphQL and why developers like it is that this whole ecosystem has sprung up. We're solving the piece of getting the Cassandra data mapped into GraphQL. And so if you've got your Cassandra database and you go and connect to it with GraphiQL or GraphQL Playground, you'll suddenly see it. You'll see all of the data within Cassandra
Starting point is 00:16:08 exposed as a GraphQL schema, and you can actually navigate and build queries and autocomplete works in the tooling and all of that because we expose it in that way. We have users who are using it in conjunction with things like Apollo, you know, Apollo GraphQL, where they go and they're combining the data coming from Stargate with other data so that the developer can go and do a single query and get the data that's in Cassandra as well as data from other sources. And they get that and it comes from Apollo GraphQL that is connecting to Stargate.
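The "single query, several sources" pattern described here can be illustrated with a small Python sketch. This is not Apollo and not Stargate code: the two source functions are stubs standing in for real backends, and the field names and the `FIELD_OWNERS` table are hypothetical. It only shows the general idea of a gateway that knows which backend owns which field and merges the answers into one response.

```python
from typing import Any, Callable, Dict, List

Source = Callable[[str], Dict[str, Any]]


def cassandra_via_stargate(user_id: str) -> Dict[str, Any]:
    # Stub: a real gateway would send a GraphQL request to Stargate here.
    return {"name": "Ada", "email": "ada@example.com"}


def other_service(user_id: str) -> Dict[str, Any]:
    # Stub: stands in for any non-Cassandra data source.
    return {"orders": [101, 102]}


# Hypothetical mapping from field name to the backend that owns it.
FIELD_OWNERS: Dict[str, Source] = {
    "name": cassandra_via_stargate,
    "email": cassandra_via_stargate,
    "orders": other_service,
}


def resolve_user(user_id: str, fields: List[str]) -> Dict[str, Any]:
    """Answer one request by calling each owning backend at most once."""
    fetched: Dict[Source, Dict[str, Any]] = {}
    result: Dict[str, Any] = {}
    for field in fields:
        source = FIELD_OWNERS[field]
        if source not in fetched:
            fetched[source] = source(user_id)
        result[field] = fetched[source][field]
    return result


print(resolve_user("u1", ["name", "orders"]))
```

From the client's point of view there is one request and one merged result; which backend served which field stays hidden behind the gateway.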
Starting point is 00:16:50 So what you have is now this ecosystem of software that's able to go and expose and interact with data via GraphQL. So it's not a case of us going and saying that we reinvented GraphQL. One of the API mechanisms within Stargate, and, as I said, a big part of what we've done here, is the GraphQL mechanism. So does that make sense?
Starting point is 00:17:38 I'm not sure I see this as either/or. It's more of just, you know, making it possible for these to work together. Yeah, yeah, that was the purpose of my question. I'm trying to understand, you know, the philosophy, let's say, behind it and where you want to go with that, basically. And I also read a couple of the blog posts that you have on the Stargate website, and some of them were quite interesting from a technical point of view. They explained the architecture behind it and so on. And obviously, you know, I guess not all of it, not all of the vision, let's say, is there at this point because the diagrams I saw also had things like support for SQL. Yes. Yeah. So let me talk.
Starting point is 00:18:25 So I do want to make, it's a really good point. So when we look at Stargate, and we are talking a lot about GraphQL today, our first milestone for Stargate was to get to the full stack developers. And the full stack developers, again, there's, you know, I remember when that term first came out, but now, you know, per the latest developer surveys, it's something like 12 million developers identify as full stack developers today. What we heard was two things.
Starting point is 00:19:04 We heard you need really great REST JSON APIs. Because, by the way, I love talking about GraphQL because as an API geek, that's my favorite thing is when a new API comes out. But REST APIs, and it's funny to say this because I think we all remember, many of us remember when REST APIs first emerged on the scene and became a big deal. And now we're talking about REST APIs as if they were the whole thing. But REST APIs and JSON, by and large, were what the majority of developers are using right now. And so what we knew was that the first release, the 1.0 release of Stargate needed to have really great REST JSON APIs and really great GraphQL. But Stargate is also meant to go and be the mechanism by which other APIs
Starting point is 00:20:09 can be brought to the Cassandra world. And so, you know, when you look at some of those diagrams, one of the things we talk about, that you'll see very soon in the first half of the coming year, hopefully very early on in the year, is gRPC. Because we have a lot of developers that are building microservices that want to be able to access data in a high-performance way. And, you know, GraphQL is very expressive, but it's typically meant for front-end clients. But you also have people that are doing very high-performance reads and writes to the database who also want a fast API to do that. And, you know, if you look at how we do things like that at Google, we do use protocols like gRPC to do that. And so part of what Stargate is designed for is to make it possible for us to go and address each one of these API types that people want to be able to use within their architecture. So that's why, you know, Stargate should be viewed as, it's not just GraphQL, even though today we're talking a lot about that. It's the way that we're going to make it possible for any developer, you know, in whatever stack that they're using to build their apps, whether it's for front-end or back-end, that they'll be able to build those.
Starting point is 00:21:50 Yeah, and actually that's a good point that you made about GraphQL, you know, being actually primarily designed, I guess, to serve front-end requirements. And even though many databases have adopted it, you know, through mutations, basically, it's not always a natural fit. Sometimes both the API makers and designers and the users have to, well, bend or tweak things around a little bit to make it work for their needs. So actually what you have come up with, I think it does make sense, you know, if you think about it and if you have actually used GraphQL to interact with databases.
Starting point is 00:22:33 The question I have, however, is like, okay, I totally get how, you know, primarily it's intended for Cassandra because this is your number one use case. However, on my part at least, I think that this is a broader need, let's say, to be addressed. Is it anywhere in your roadmap? Would you like to see that being adopted by other databases and kind of creating an ecosystem of its own, kind of like GraphQL for servers, for the database of sorts? Yeah, it's a really good question. And it's one of the things that we look at and really talk about every day. So first of all, with Stargate, you know, we're an open source company. And so when we go and build something, even something that we're putting in the cloud, you know, we start with, are we developing this in the open?
Starting point is 00:23:42 Can people run it themselves? Is it licensed in such a way? So we use the Apache license for everything we do so that anybody can go and, you know, contribute to it and issue pull requests and so on. And so I put that out there because we have looked at this question of, you know, could Stargate go and talk to multiple databases? The technical answer is, of course, it could. It's written in a very modular way. We, you know, the thing that we bring to the table is, you know, we're experts in Cassandra. And what we tried to do within Stargate, it's a very modular architecture. When you call that API, there's an extension
Starting point is 00:24:31 mechanism behind it that loads the, you know, appropriate data access logic that goes and does things like take that JSON object, schema-less JSON object, and turn it into the specific set of Cassandra CQL commands. So that to the developer, it just looks like this super simple REST API that you might get from something like Firebase. But behind the scenes, we're turning it into very high-performing Cassandra CQL, Cassandra query language. So that's the expertise we're able to bring to the table. If somebody that was, you know, a deep expert on a different database came in and said, hey, we've gone and created this, you know, pull request that now lets you go and send, you know, send this to another database.
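The step of taking a schema-less JSON object and turning it into database commands can be sketched in Python as follows. This is an illustration of the general technique only, under stated assumptions: the table name `demo.docs`, its three columns, and the dotted path encoding are hypothetical and are not taken from Stargate's actual implementation, and the statements are generated as parameterized strings rather than executed.

```python
import json
from typing import Any, Iterator, List, Tuple


def shred(value: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Any]]:
    """Yield (path, leaf) pairs for every scalar leaf in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from shred(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from shred(child, path + (f"[{index}]",))
    else:
        yield path, value


def to_cql(doc_id: str, doc: dict, table: str = "demo.docs") -> List[Tuple[str, tuple]]:
    """Return one parameterized INSERT per leaf; the table layout is hypothetical."""
    statement = f"INSERT INTO {table} (doc_id, path, leaf) VALUES (?, ?, ?)"
    return [
        (statement, (doc_id, ".".join(path), json.dumps(leaf)))
        for path, leaf in shred(doc)
    ]


document = {"name": "Ada", "tags": ["a", "b"], "address": {"city": "Athens"}}
for statement, params in to_cql("u1", document):
    print(statement, params)
```

Storing one row per leaf means the table schema never changes when documents gain or lose fields, which is one way a gateway can spare the developer from explicit data modeling.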
Starting point is 00:25:25 Absolutely. We would be overjoyed. Again, we are big believers in open source, big believers in the community, and in the community being able to take things in the direction that people, you know, feel that they want to solve these problems with. But right now, you know, we are the Cassandra people, and we want to solve these things for Cassandra first. So what you'll see within the roadmap is that we've created the room within it, both within the architecture, it's all pluggable. We've documented it. We've created an open source environment for people to help collaborate with us.
Starting point is 00:26:11 We'll do a little bit in that area because we're naturally, we're curious and interested. We do want to, and particularly want to be able to integrate some of these other technologies in there. We'll tackle a few of those ourselves, but as I said, we're creating the room
Starting point is 00:26:27 for other folks who want to come in and tackle some of these other databases to come in and handle that piece. Okay, yeah, it makes sense, certainly. So yeah, I guess since we're already over time, to wrap up, I guess for me the main takeaway from this introduction to Stargate is that, yes, it looks a lot,
Starting point is 00:26:56 in principle, philosophically, or even in its architecture, like GraphQL. And actually you can use it in combination with GraphQL. And if you're already using GraphQL, it will basically look pretty much the same, I guess, to you. So you can, as you mentioned, you can aggregate Stargate and GraphQL and just have your queries that span multiple endpoints. And the other thing that I wanted to ask to wrap up this conversation, and actually to tie it into the last question. So obviously, you know, it's just the beginning, and I see that you have some traction and some of the early adopters, let's say, were also mentioned in the press release that's going to be issued. You mentioned names like Yelp,
Starting point is 00:27:46 and there's a few other big ones in there. So, yeah, basically I wanted to ask if you could give me like a brief overview of which organizations are adopting it so far. And you also mentioned that it can be used both on the open source Cassandra and on Astra, the cloud version of Cassandra run by DataStax. And what are the next steps in that journey? And just a closing comment on my part, on the forward-looking question that I asked you previously. So I guess if there's any chance, let's say, of Stargate becoming somewhat of a server-side GraphQL, to call it that way, then I guess this will probably go through your users.
Starting point is 00:28:38 So as we know, polyglot persistence is a fact of life. So typically people use more than one database for their systems. And if they start liking Stargate, then it's quite possible that some of them may want to adopt it for accessing more of the databases. So that could possibly be the way. Absolutely. I mean, I think that what I would say is that a couple of different things. So first, I'd say that people want to access their data. The big idea is here, people want to access their data via APIs. The old way where I would go and have a driver and
Starting point is 00:29:23 try to find that driver for my language and all of that sort of thing. And then each driver is different. And when I go and I try to switch my app from, you know, Postgres to some other database, then the drivers look different. You know, most developers, they want to use services. They want to use APIs. They want to use microservices. They want to use APIs that they know. They want to use REST APIs, or they want to use GraphQL APIs, or they want to use gRPC. And, you know, there's a few others. And so I think what you'll see is, you know, that Stargate,
Starting point is 00:30:09 definitely for people who want to use Cassandra in the mix, that Stargate is going to be the best way to do that. I think just in general, from a trend standpoint, I think the big idea is look at, you know, where people are going with these services. And so that's the part that's really important to us. I think, like you pointed out, there's a lot of folks. We're really excited that we've been able to get a lot of the people that are doing interesting things with Cassandra to, you know, look at this and jump on board. And there's a bunch of folks that we're talking about, and actually we are in the process of adding to that list. There's actually, as I'm sure you're aware of,
Starting point is 00:31:00 you know, there's always a process where you have a lot of people using it, but then you've got to get them to approve that they want to talk about it publicly. So we're adding to that list. There's a few more coming in that have been doing it. But our goal has been to go to people, big internet companies, and Yelp was a great example of that. People that are serving, they've got a large amount of data, but they also have to have their front end developers and their mobile developers able to get at that data.
Starting point is 00:31:39 We have some retailers, big retailers. We have Burberry that is a really innovative retailer in e-commerce. They create Facebook apps. They create mobile apps. They have that same issue, which is that they want to use Cassandra. It's massively powerful. They can use it around the world in a globally distributed way. But they also need their front-end developers and their full-stack developers to be, you know, able to do these things super simply. We have a couple of enterprises on that list as well, financial services companies and, you know, that are using it, you know, traditional enterprises. So this is one
Starting point is 00:32:25 of these things that when I talk about developer productivity, every company has developers, uh, that are building these, these, you know, apps, um, and they're building a lot of these apps and they need to be very productive when they, when they do it. Um, it isn't just, you know, the world is not like it used to be where you had, you know, the enterprise companies and then you had the startups. These days, everybody is using the latest technology. Everyone reads Hacker News. Everybody wants to make sure that they're building these things, you know, in the best way possible. And so we've tried to work with as many of these companies,
Starting point is 00:33:05 you know, as possible. And now, you know, many, many people go and say, that's great, I'll wait until it's at 1.0 before I get started with it. And that's part of what this announcement is about, to tell all the rest of the developers that are waiting on this that now's the time to start. And so I think you'll see a lot more of this from us. We're doing a bunch of great events. We have a React with Cassandra event this week, an online event where a bunch of people in the React and full stack and Jamstack community are going to be talking about how to use this to
Starting point is 00:33:55 you know, to build these types of apps. Yeah, that makes sense. And I guess on the broader side of things, even though my takeaway would be, okay, so this kind of wraps GraphQL, it's like, well, okay, version 1.0 is here, so you can actually go ahead and use it. Yes, yeah, absolutely.
Starting point is 00:34:23 And I think GraphQL, you know, the thing I love about GraphQL is it is, well, the two things I love about GraphQL, but first thing I love about GraphQL is I've always been a strong believer that the purpose of APIs is to make the application developers life easier. And GraphQL is completely designed for the idea of that, of having the front end application developer gets to say,
Starting point is 00:34:55 this is the data I want and GraphQL presents it to you. But the second thing I love about GraphQL is we're still in the early stages of GraphQL and there's so much innovation happening and so many cool startups. And so, you know, as we got in it and we started working with these developers, you know, it really is changing very quickly. So, you know, you're going to see us being very active within that community and making sure that we're providing the best way for Cassandra developers to participate in it.
Starting point is 00:35:32 Thank you. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
