Orchestrate all the Things - Stargate, a GraphQL for databases from DataStax. First stop - Cassandra. Featuring Ed Anuff, DataStax CPO
Episode Date: December 9, 2020
A flexible API is key to database accessibility and developer friendliness today. Apache Cassandra was lacking in that department, and DataStax is trying to address this with the release of a new API layer called Stargate. A discussion with Ed Anuff, formerly of Apigee and Google Cloud, and currently DataStax Chief Product Officer, on the rationale behind Stargate, its architecture and operation, how it compares to GraphQL, and a roadmap for the future. Article published on ZDNet
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
A flexible API is key to database accessibility and developer friendliness today.
Apache Cassandra was lacking in that department,
and DataStax is trying to address this with the release of a new API layer called Stargate.
A discussion with Ed Anuff, formerly of Apigee and Google Cloud,
and currently DataStax Chief Product Officer,
on the rationale behind Stargate, its architecture and operation,
how it compares to GraphQL, and a roadmap for the future.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
Okay, so finally on the record, I guess we can start, as mentioned, with you saying a
few words about the occasion today, which is, I guess, general availability of Stargate and what led to this
announcement and, well, actually a few words about yourself, if you will.
So how you are involved in this, what you do at DataStax, and this kind of thing.
Sure.
Yeah.
So my name is Ed Anuff.
I am the Chief Product Officer of DataStax. I joined DataStax at the beginning of the year from Google, where I was for about three years.
And then prior to that I was at Apigee, the API management company that Google acquired back in
2016. So just in terms of giving you a little bit of background of why we're doing Stargate and what brought us up to this.
You know, at the beginning of the year, we went and talked to a lot of people who were using
Cassandra. We wanted to know why weren't more people using Cassandra. It's one of the most
powerful, scalable of the NoSQL databases. It's very well used. It's proven by, you know,
a lot of companies and sites are powered by it.
It's well adopted within the enterprise.
But people don't talk about Cassandra
as much as some of the other databases
that you might read about on Hacker News
or, you know, different people,
you know, new developers look at. And so what we found was that, you know, there were really two things. Cassandra was
challenging to run, even though it was very powerful. Deploying it was challenging. Running
and operating it was challenging. And then once you did that, it wasn't very easy to develop for.
And so we looked at solving the first part of that, how to make Cassandra easier to run with bringing Cassandra to the cloud.
And we launched our Astra cloud service earlier this year so that anybody who wanted to use Cassandra could do so.
We also brought Cassandra to the Kubernetes world.
And in fact, we did both of these things together.
Running Cassandra on top of Kubernetes
is how we make our Astra cloud service possible.
But we also made it possible for any Kubernetes user
to easily run and scale Cassandra.
And we talked about that at KubeCon a month or so back
with something called K8ssandra
that is all about having Cassandra within Kubernetes.
But the thing that we knew was really important
was how do we make it really easy for developers
who are building new applications to find Cassandra to be the easiest place for them to develop for?
And we went and looked at the types of developers building new applications, the full stack developers, the people using JavaScript, the Jamstack developers, people using Node.js. And, you know, there's 12 million of these
developers, which is more than half of all the developers in the world are these full stack
developers. And they weren't having a very easy time with Cassandra. And so we said, okay, if we
want Cassandra to be the choice of developers out there, we need to make this a lot easier.
And so we started this project called Stargate
that is a data gateway on top of Cassandra
that provides everything that Cassandra needs
or everything that a developer needs
to succeed with Cassandra.
So it gives you really easy APIs:
it gives you REST APIs, it gives you GraphQL APIs.
And most importantly, it takes your data, your JSON data, which is the type of data
format that virtually every developer these days knows how to work with, and automatically
maps it into the database without you having to do any of what is called data modeling. So what this means is
that if you're a developer out there that knows how to use JavaScript, knows how to use Node.js,
or any of the similar languages, PHP, Python, anything that deals with this type of data, and you want to use really easy APIs to store the data using the frameworks that you already work with, it's super easy to do.
And so that's what Stargate is all about.
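To make the "store your JSON through a REST API" idea concrete, here is a minimal sketch in Python. It only builds the HTTP request and does not send it. The base URL, namespace, collection, URL path layout, and token header name are assumptions for illustration and are not taken from the episode; check the Stargate documentation for the actual endpoints.

```python
import json
import urllib.request

# Hypothetical values for illustration only; none of these come from the episode.
BASE_URL = "http://localhost:8082"  # assumed address of the data gateway
NAMESPACE = "demo"                  # assumed namespace name
COLLECTION = "users"                # assumed collection name


def build_store_request(document: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that stores one JSON document."""
    # The path layout below is an assumption, not a documented contract.
    url = f"{BASE_URL}/v2/namespaces/{NAMESPACE}/collections/{COLLECTION}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Cassandra-Token": token,  # assumed auth header name
        },
    )


request = build_store_request(
    {"name": "Ada", "languages": ["JavaScript", "Python"]},
    token="example-token",
)
print(request.get_method(), request.full_url)
```

The point of the sketch is that the caller hands over a plain JSON object and never writes a table definition; sending the request would be a single `urllib.request.urlopen(request)` call against a running gateway.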
We started it as an open source project.
We made it.
We've been working with developers since the beginning of the year.
We opened it up to the public during the summer.
And now it's generally available as 1.0,
both as open source and as the official API of the Astra Cassandra cloud service.
And so that's what it's all about. It's our goal of making it easy for every developer
to make Cassandra their first choice. Okay, thanks. That's a very good and concise
summary of what it's all about. And that was my idea as well. And I have to say that
Mark was kind enough to flag this for me early. I think it was
back in September and just having a look at it back then, you know, it kind of raised a few
questions. Some of them have been answered by today and some of them have not. So it's great
that you're here so that I can address those questions. But before I do, before we get to the specifics,
one question that is specific to you, if you don't mind.
You mentioned that one of your previous roles was at Google,
and you also have experience with Apigee.
I think that makes you kind of an expert on APIs.
So actually, it's a question in two parts.
One, I wonder if Sam Ramji had anything to do with recruiting you,
because if I'm not mistaken, he also has some Google background.
And two, I wonder how involved you were in Stargate, because it sounds like the
kind of thing that someone with your background would probably come up with.
Well, let's see.
There's your two part question.
So first, you mentioned Sam Ramji, who's a good friend of mine.
I've been working with Sam for over 10 years now.
He is the person who brought me to Apigee back in the day. And
we both were at Google as well, although it was interesting that that happened almost by
coincidence. He had moved on from Apigee and was doing some other things in the cloud native space.
And meanwhile, myself and Chet Kapoor and a bunch of other folks were at Apigee and, you know, building it up in the API management space. And then when Google was expanding the Google Cloud platform,
it was looking to pull in a bunch of people.
And so, you know, obviously they wanted Sam,
who's an expert on open source
and an expert on cloud native technologies.
At the same time they were doing that,
they said, let's bring in, you know, Apigee with all the people.
We call them API geeks.
Apigeeks was the term we used for ourselves.
And so a bunch of us, well, they brought the whole Apigee company into Google Cloud.
So that was a really exciting time at Google Cloud.
And Google continues to do great things over there.
But, yeah, yeah.
And then for your second question, yeah, I think, you know,
the insight that I think we collectively had, by the way,
Sam Ramji is here at DataStax with us as well.
He's our chief strategist.
Yeah, this is why I'm asking, actually.
Yeah, we brought the band back together. I think the key insight that we had was that,
if you think about it, a developer doesn't use, and I'm talking about a software developer,
doesn't use a database. They don't use a cloud
or they don't use infrastructure. What they use is the APIs to those. And I think we've seen time
and again, if you build the great API, that's the most important thing because you can have
the most powerful technology in the world. you could have the most powerful database,
but if the API is not rich and expressive and flexible and able to empower the developer to
work in a fast, agile, intuitive way with that API, then the technology is not unlocked. You're not able to leverage it. And so, yes, when we looked at Cassandra
and we said, why aren't more developers using it?
APIs weren't the whole answer,
but you can imagine that certainly with what we've seen
and with spending this much time with developers and seeing how they actually build applications, that we knew that APIs had to be a big part of the answer.
And so Stargate does represent that.
There were a lot of folks at DataStax and a lot of folks in the community who recognize that. So I was really excited when
I came in and started working with the teams here and started asking them, you know, what are you
thinking about in terms of how to make things easier for developers? A lot of these ideas have
already been floating around. The idea of a data gateway that is designed for, you know, bridging between the database and the developer,
people have been exploring this idea. And in fact, when we went out to the companies
that were using Cassandra, and you see within our announcement, we talk about Yelp.
Yelp was a great example of a company that's using open source Cassandra, but they had
had to go and build a data gateway.
And you look at many of the companies successful with Cassandra, and they've had to follow
this pattern.
And so what we said was, you shouldn't have to build that yourself.
It should just be part of the database distribution.
And so that's what we were trying to do with Stargate was go and say, look, can we give you this out-of-the-box, very easy-to-use way that gives you these great APIs on top of your database?
So, yeah, that's, you know, good observation on your part.
Thank you. And yes, actually, you know, to give you my own personal view, let's say,
I think both of the pain points that you mentioned initially are, you know, are spot on, basically.
And I think that the steps you are taking to address them are in the right direction. So to
get to the specifics, yeah, I totally agree.
And, you know, having been a developer,
and actually I still do some development on the side myself,
I think you're absolutely right.
This is precisely how developers think.
And yes, the API is a key part.
And this is something that I've been seeing playing out
with a number of databases
in the last couple of years.
And so when I initially became aware of Stargate,
and as I mentioned, that was back in September,
if I'm not mistaken, so it had just barely been released,
my initial thought was like, okay, so this looks a lot like GraphQL.
Why would they want to reinvent GraphQL?
And then checking back again as background for this conversation today,
I realized, okay, so it actually kind of wraps GraphQL
because in the meanwhile, you have also added support for GraphQL.
And at the same time, it kind of duplicates some of the key notions, let's say.
So, you know, as GraphQL has these GraphQL servers that kind of sit in a layer between whatever it is they're serving and the client, you also have that notion in Stargate.
So I guess where I'm going with all this is like, okay, so why not just adopt GraphQL?
What is it that Stargate adds to GraphQL?
Well, Stargate very much does adopt GraphQL.
So I think per your question,
what we're doing is... So there's a couple of different things. GraphQL, as I'm sure you know,
is an API, but it's also an entire ecosystem. And there are different pieces:
you've got your GraphQL servers, you've got GraphQL middleware.
You've got your GraphQL clients.
You've got tooling, things like GraphiQL and GraphQL Playground.
And, you know, so in terms of what we do, we solve a piece of the equation.
If you're a GraphQL developer, you're using all these things.
And in fact, that's part of the richness of GraphQL and why developers like it is that this whole ecosystem has sprung up. We're solving the piece of getting the Cassandra data
mapped into GraphQL. And so if you've got your Cassandra database and you go and connect to it
with GraphiQL or GraphQL Playground, you'll suddenly see it.
You'll see all of the data within Cassandra
exposed as a GraphQL schema,
and you can actually navigate and build queries
and autocomplete works in the tooling and all of that
because we expose it in that way.
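As a rough illustration of what "exposing Cassandra data as a GraphQL schema" can mean, the sketch below derives a GraphQL type definition from a table description. The type-mapping table, the naming rule, and the example table are assumptions made for this example; they are not Stargate's actual mapping rules.

```python
from typing import Dict

# Assumed CQL-to-GraphQL scalar mapping, chosen for illustration only.
CQL_TO_GRAPHQL = {
    "text": "String",
    "int": "Int",
    "double": "Float",
    "boolean": "Boolean",
    "uuid": "ID",
}


def table_to_graphql_type(table: str, columns: Dict[str, str]) -> str:
    """Render a GraphQL object type for a table given {column_name: cql_type}."""
    # snake_case table name -> PascalCase type name (a hypothetical convention).
    type_name = "".join(part.capitalize() for part in table.split("_"))
    # Unknown CQL types fall back to String in this sketch.
    fields = [
        f"  {name}: {CQL_TO_GRAPHQL.get(cql_type, 'String')}"
        for name, cql_type in columns.items()
    ]
    return "type " + type_name + " {\n" + "\n".join(fields) + "\n}"


sdl = table_to_graphql_type(
    "user_reviews",
    {"id": "uuid", "business": "text", "stars": "int"},
)
print(sdl)
```

Because the schema is generated from the tables, GraphQL tooling can introspect it, which is what makes navigation and autocomplete work without extra effort from the developer.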
We have users who are using it in conjunction with things like Apollo,
you know, Apollo GraphQL, where they go and they're combining the data coming from
Stargate with other data so that the developer can go and do a single query
and get the data that's in Cassandra as well as data from other sources. And they get that and it comes from Apollo GraphQL that is connecting to Stargate.
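The "single query, several sources" pattern described here can be sketched as a resolver that fans out to two upstream sources and merges the results. This is plain Python, not Apollo's API; both fetch functions are hypothetical stand-ins for calls to a Stargate GraphQL endpoint and to some other service.

```python
from typing import List


def fetch_reviews_from_database(business_id: str) -> List[dict]:
    """Stand-in for a query against the database-backed GraphQL endpoint."""
    return [{"business_id": business_id, "stars": 5}]


def fetch_profile_from_other_service(business_id: str) -> dict:
    """Stand-in for a call to a second, unrelated data source."""
    return {"business_id": business_id, "name": "Example Cafe"}


def resolve_business(business_id: str) -> dict:
    """Answer one client query by combining both upstream sources."""
    profile = fetch_profile_from_other_service(business_id)
    reviews = fetch_reviews_from_database(business_id)
    return {
        "id": business_id,
        "name": profile["name"],
        "reviews": [{"stars": review["stars"]} for review in reviews],
    }


print(resolve_business("b1"))
```

The client sees one response shape and does not need to know which field came from which backend.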
So what you have is now this ecosystem of software that's able to go and expose and interact with data via GraphQL. So it's not a case of us going and saying,
you know, that we reinvented GraphQL.
One of the API mechanisms within Stargate,
and one that, as I said, is a big part of what we've done here,
is the GraphQL mechanism.
So does that make sense?
I'm not sure I see this as an either-or.
It's more of just, you know,
making it possible for these to work together.
Yeah, yeah, that was the purpose of my question. I'm trying to understand the philosophy, let's say, behind it and where you want to go with that, basically. And
I also read a couple of the blog posts that you have on the Stargate website, and
some of them were quite interesting from a technical point of view. They explained the architecture behind it and so on. And obviously, you know, I guess not all of it,
not all of the vision, let's say, is there at this point because the diagrams I saw also had
things like support for SQL. Yes. Yeah. So let me talk.
So I do want to make, it's a really good point.
So when we look at Stargate, and we are talking a lot about GraphQL today, our first milestone
for Stargate was to get to the full stack developers. And the full stack developers, again, there's, you know,
I remember when that term first came out,
but now, you know, per the latest developer surveys,
it's something like 12 million developers
identify as full stack developers today.
What we heard was two things.
We heard you need really great REST JSON APIs.
Because, by the way, I love talking about GraphQL because as an API geek, that's my favorite thing is when a new API comes out.
But REST APIs, and it's funny to say this because I think we all remember, many of us remember when REST APIs first emerged on the scene and became a big deal.
And now we're talking about REST APIs as if they were the whole thing.
But REST APIs and JSON, by and large, are what the majority of developers are using right now. And so what we knew was that the first release,
the 1.0 release of Stargate needed to have really great REST JSON APIs
and really great GraphQL.
But Stargate is also meant to go and be the mechanism by which other APIs
can be brought to the Cassandra world. And so, you know, when you look at some of those diagrams,
one of the things we talk about, you know, that you'll see very soon in the first half of the coming year, hopefully very early on in the year, is gRPC.
Because we have a lot of developers that are building microservices that want to be able to access data in a high performance way.
And, you know, GraphQL is very expressive, but it's typically meant for front end clients.
But you also have people that are doing very high-performance reads and writes to the database who also want a fast API to do that. And, you know, if you look at how we do things like that at Google, we do use protocols like gRPC to do that. And so part of what Stargate is designed for is to make it possible for us to
go and address each one of these API types that people want to be able to go and use within,
you know, within their architecture. So that's why, you know, Stargate should be viewed as, it's not just, you know, it's not just GraphQL, even though today we're talking a lot about that.
It's the way that we're going to make it possible for any developer, you know, in whatever stack they're using to build their apps, whether it's for front-end or back-end, to be able to build those.
Yeah, and actually that's a good point that you made about GraphQL,
you know, being actually primarily designed, I guess,
to serve front-end requirements. And even though many databases have adopted it,
and, you know, through mutations, basically,
it's not always
a natural fit. Sometimes both the API makers and designers and the users have to, well,
bend or tweak things around a little bit to make it work for their needs. So
actually what you have come up with, I think it does make sense, you know, if you think about it and if you have actually used GraphQL to interact with databases.
The question I have, however, is like, okay, I totally get how, you know, primarily it's intended for Cassandra because this is your number one use case.
Do you see, however, on my part at least, I think that this is a broader need, let's say,
to be addressed. Is it anywhere in your roadmap? Would you like to see that being adopted by other databases and kind of creating an ecosystem of its own,
kind of like GraphQL for servers, for the database of sorts?
Yeah, it's a really good question.
And it's one of the things that we look at and really talk about every day. So first of all, with Stargate, you know,
we're an open source company. And so when we go and build something, even something that
we're putting in the cloud, you know, we start with, are we developing this in the open?
Can people run it themselves? Is it licensed in such
a way? So we use the Apache license for everything we do so that anybody can go and, you know,
contribute to it and issue pull requests and so on. And so I put that out there because we have looked at this question of, you know, could Stargate go and talk to multiple databases?
The technical answer is, of course, it could.
It's written in a very modular way.
We, you know, the thing that we bring to the table is, you know, we're experts in Cassandra.
And what we tried to do
within Stargate, it's a very modular architecture. When you call that API, there's an extension
mechanism behind it that loads the, you know, appropriate data access logic that goes and does
things like take that JSON object, schema-less JSON object, and turn it into the specific set of Cassandra CQL
commands. So that to the developer, it just looks like this super simple REST API that you might
get from something like Firebase. But behind the scenes, we're turning it into very high-performing
Cassandra CQL, Cassandra query language. So that's the expertise we're able to bring to the table.
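One way to picture turning a schema-less JSON object into CQL is to flatten the document into one row per leaf value and emit a parameterized INSERT for each. The sketch below does that under loud assumptions: the `docs` table, its three columns, and the path encoding are invented for illustration, and this is not a description of how Stargate actually lays out documents.

```python
import json
from typing import Any, Iterator, List, Tuple


def flatten(value: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Any]]:
    """Yield (path, leaf_value) pairs for every leaf in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, path + (f"[{index}]",))
    else:
        yield path, value


def to_cql_inserts(doc_id: str, document: dict) -> List[Tuple[str, tuple]]:
    """Turn one document into parameterized INSERT statements, one per leaf."""
    # Hypothetical table: docs(doc_id text, path text, value text).
    statement = "INSERT INTO docs (doc_id, path, value) VALUES (?, ?, ?)"
    return [
        (statement, (doc_id, ".".join(path), json.dumps(leaf)))
        for path, leaf in flatten(document)
    ]


for cql, params in to_cql_inserts("u1", {"name": "Ada", "tags": ["a", "b"]}):
    print(cql, params)
```

The developer only ever supplies the JSON document; the translation layer decides the rows and statements, which is the division of labor being described.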
If somebody that was, you know, a deep expert on a different database came in and said,
hey, we've gone and created this, you know, pull request that now lets you go and send,
you know, send this to another database.
Absolutely. We would be overjoyed. Again, we are big believers in open
source, big believers in the community, and the community being able to take
things in the direction that people feel they want
to go to solve these problems.
But right now, you know, we are the Cassandra people, and we want to solve these things for Cassandra first. So what you'll see within the
roadmap is that we've created the room within it, both within the architecture, it's all pluggable. We've documented it.
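A pluggable architecture of this kind can be sketched as a registry in which each database contributes its own data access function and the API layer looks up the right one at request time. The registry, decorator, and function names below are hypothetical and are not Stargate's extension interface.

```python
from typing import Callable, Dict, Tuple

# Maps a backend name to a function that translates a generic "get by id" request.
BACKENDS: Dict[str, Callable[[str, str], Tuple[str, tuple]]] = {}


def register_backend(name: str):
    """Decorator that adds a data access function to the registry."""
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


@register_backend("cassandra")
def cassandra_get(table: str, key: str) -> Tuple[str, tuple]:
    # Illustrative CQL only; assumes the table has a primary key column named id.
    return f"SELECT * FROM {table} WHERE id = ?", (key,)


def translate_get(backend: str, table: str, key: str) -> Tuple[str, tuple]:
    """Dispatch a generic request to whichever backend module is registered."""
    if backend not in BACKENDS:
        raise ValueError(f"no data access module registered for {backend!r}")
    return BACKENDS[backend](table, key)


print(translate_get("cassandra", "users", "42"))
```

Supporting another database would then mean registering one more function, without touching the API-facing code.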
We've created an open source environment
for people to help collaborate with us.
We'll do a little bit in that area
because we're naturally,
we're curious and interested.
We do want to,
and particularly want to be able to integrate
some of these other technologies in there.
We'll tackle a few of those ourselves,
but as I said, we're creating the room
for other folks who want to come in
and tackle some of these other databases
to come in and handle that piece.
Okay, yeah, it makes sense, certainly.
So yeah, I guess since we're actually already over time,
to wrap up, I guess for me the takeaway from this
presentation, let's say, of Stargate is that, yes, it looks a lot,
at least philosophically or even architecturally, like GraphQL.
And actually you can use it in combination with GraphQL. And if you're already using
GraphQL, it will basically look pretty much the same, I guess, to you. So you can, as
you mentioned, you can aggregate Stargate and GraphQL and just have your queries that
span multiple endpoints. And the other thing that I wanted to ask to wrap up this conversation,
and actually to tie it into the last question. So obviously, you know, it's just the beginning,
and I see that you have some traction and some of the early adopters, let's say,
were also mentioned in the press release that's going to be issued. You mentioned names like Yelp,
and there's a few other big ones in there. So, yeah, basically I wanted to ask if you could
give me like a brief overview of which organizations are adopting it so far. And you
also mentioned that it can be used both on the open source Cassandra and on Astra,
the cloud version of Cassandra run by DataStax. And what are the next steps in that journey?
And just a closing comment on my part, on the forward-looking question that I asked
you previously. So I guess if it's going, you know, if there's any chance,
let's say, of Stargate becoming somewhat of a server-side GraphQL,
to call it that way, then I guess this will probably go through your users.
So as we know, polyglot persistence is a fact of life.
So typically people use more than one database for their systems.
And if they start liking Stargate, then it's quite possible that some of them may want to adopt it for accessing more of the databases.
So that could possibly be the way.
Absolutely. I mean, I think there are a couple of different things.
So first, I'd say that people want to access their data.
The big idea here is, people want to access their data via APIs.
The old way where I would go and have a driver and
try to find that driver for my language and all of that sort of thing.
And then each driver is different.
And when I go and I try to switch my app from, you know, Postgres to some other, you know, database, then the drivers look different.
You know, most developers, they want to use services.
They want to use APIs. They want
to use microservices. They want to use APIs that they know. They want to use REST APIs,
or they want to use GraphQL APIs, or they want to use gRPC. And, you know, there's a few others.
And so I think what you'll see is that Stargate,
definitely for people who want to use Cassandra in the mix, is going to be the best
way to do that. I think just in general, from a trend standpoint, I think the big idea is
look at where people are going with these services. And so that's the part that's really important to us.
I think, like you pointed out, there's a lot of folks. We're really excited
that we've been able to get a lot of the people that are doing interesting things with
Cassandra to look at this and jump on board. And there's
a bunch of folks that we're talking about, and actually
we are in the process of adding to that list. As I'm sure you're aware,
there's always a process where you have a lot of people using it, but
then you've got to get them to approve that they want to talk about it
publicly. So we're adding to that list. There's a few more that are coming in that
have been doing it. Our goal has been to go to people, big internet companies, and Yelp was a great example of that.
People that are serving, they've got a large amount of data,
but they also have to have their front end developers
and their mobile developers able to get at that data.
We have some retailers, big retailers.
We have Burberry that is a really innovative retailer in e-commerce.
They create Facebook apps. They create mobile apps. They have that same issue, which is that
they want to use Cassandra. It's massively powerful. They can use it around the world
in a globally distributed way. But they also need their
front-end developers and their full-stack developers to be, you know, able to do these
things super simply. We have a couple of enterprises on that list as well, financial
services companies that are using it, traditional enterprises. So this is one
of these things that when I talk about developer productivity, every company has developers
that are building these apps, and they're building a lot of these
apps and they need to be very productive when they do it.
The world is not like it used to be, where you had the enterprise companies and then you had the startups.
These days, everybody is using the latest technology.
Everyone reads Hacker News.
Everybody wants to make sure that they're building these things, you know, in the best way possible.
And so we've tried to work with as many of these companies,
as possible. And, you know, many
people go and say, that's great, I'll wait until it's at 1.0 before I get started
with it. And that's part of what this announcement
is about: to tell all the rest of the developers that are
waiting on this that now's the time to start. And so I think you'll
see a lot more of this from us. We're doing a bunch of great events. We have a React
with Cassandra event this week, an online event where a bunch of people in the React
and full stack and Jamstack community are going to be talking about how to use this
to build these types of apps.
Yeah, yeah, that makes sense. And I guess on the broader side of things,
even though my takeaway would be,
okay, so this kind of wraps GraphQL
on the broader side of things.
It's like, well, okay, version 1.0 is here,
so you can actually go ahead and use it.
Yes, yeah, absolutely.
And I think GraphQL,
you know, the thing I love about GraphQL is it is,
well, the two things I love about GraphQL,
but first thing I love about GraphQL is
I've always been a strong believer that the purpose of APIs
is to make the application developer's life easier.
And GraphQL is completely designed for the idea of that,
of having the front end application developer gets to say,
this is the data I want and GraphQL presents it to you.
But the second thing I love about GraphQL is we're still in the early stages
of GraphQL, and
there's so much innovation happening and so many cool startups. And so, as
we got into it and started working with these developers, it really is
changing very quickly. So you're going to see us being
very active within that community and making sure that we're providing the best way
for Cassandra developers to participate in it.
Thank you.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.