Orchestrate all the Things - ScyllaDB’s incremental changes: Just the tip of the iceberg. Featuring ScyllaDB CEO & Co-founder Dor Laor
Episode Date: March 18, 2023. Is incremental change a bad thing? The answer, as with most things in life, is "it depends". In the world of technology specifically, the balance between innovation and tried and true concepts and solutions seems to have tipped in favor of the former. Or at least, that's the impression reading the headlines gives. Good thing there's more to life than headlines. The ScyllaDB team is one of those who work with their garage doors up and are not necessarily after making headlines. They believe that incremental change is nothing to shun if it leads to steady progress. Compared to the release of ScyllaDB 5.0 at ScyllaDB Summit 2022, what ScyllaDB Summit 2023 brought could be labeled "incremental change." But this is just the tip of the iceberg, as there's more than meets the eye here. We caught up with ScyllaDB CEO and co-founder Dor Laor to discuss what kept the team busy in 2022, how people are using ScyllaDB, as well as trends and tradeoffs in the world of high performance compute and storage. Article published on The New Stack
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis, and we'll be connecting the dots together.
Is incremental change a bad thing? The answer, as with most things in life, is it depends.
In the world of technology specifically, the balance between innovation and tried and true
concepts and solutions seems to have tipped in favor of the former. Or at least that's the
impression reading the headlines gives.
Good thing there's more to life than headlines.
The ScyllaDB team is one of those who work with their garage doors up and are not necessarily after making headlines.
They believe that incremental change is nothing to shun if it leads to steady progress.
Compared to the release of ScyllaDB 5.0 at ScyllaDB Summit 2022, what ScyllaDB Summit 2023 brought could be labeled incremental change. But this is just the tip of the iceberg, as there's more than meets the eye here. We caught up with ScyllaDB CEO and co-founder Dor Laor to discuss what kept the team busy in 2022, how people are using ScyllaDB, as well as trends and trade-offs in the world of high-performance compute and storage.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
So hello, everybody.
I'm Dor, the CEO of ScyllaDB, also a co-founder.
I'm an engineer in my roots.
I was involved in lots of different projects in the past, from a terabit router to the creation of the KVM hypervisor together with my other co-founder, Avi, from zero to the point where it was an enterprise solution and got adopted by the different cloud providers.
And also with Scylla, we initially created an operating system from scratch in open source that exists today.
But we pivoted to the database space.
And in the past eight years, we've been building the Scylla DB database.
Great, thank you. So with that said, I should also add a little bit of context from my side: it's not the first time we connect, actually. We go back a few years; I believe the first time was in 2017, if I'm not mistaken. But I'm pretty certain about when the last time was: it was precisely one year ago. At the time, there were a few things going on. First, same as now, you had just held your main event, the ScyllaDB Summit. You had also released a new version back then, version 5.0 of ScyllaDB. I thought a good way to start the conversation today, and see what's going on around not just ScyllaDB but in general, would be to pick up on the main themes that emerged from last year's conversation. Those main themes were cloud and real-time data. And one of the things that you mentioned last year was how much growth you were seeing around ScyllaDB's cloud offering. So I thought I would start by asking you: has that growth been sustained from 2022 to today?
So the answer is simple. It's absolutely yes.
We've been growing over 100% year over year in Scylla Cloud.
That's our database as a service.
So we see strong growth, and it will continue. That service now exceeds 50% of our revenue, so today the majority of our paid subscriptions come from the database-as-a-service component.
And we'll see this theme continue.
In general, everybody needs to have the core database, but the service is the easiest way to consume it, also the safest. And this theme is very strong with Scylla,
but it's also very strong with the entire industry
across the board with other databases
and also sometimes beyond databases
with other type of infrastructure.
It makes lots of sense.
Yeah, indeed.
And the one thing that sort of stood out for me about your cloud offering
from last year was the fact that, well, obviously you support AWS and Google Cloud, but for some
reason, at least last year, there was no support for Azure, and I'm wondering if that has changed.
That's correct, and that hasn't changed.
We do support Azure in our core database, both the open source offering and the enterprise offering. We are actually now developing a relatively simple mechanism to support Azure through the Azure Marketplace, primarily to allow gaming companies that use the Epic Games stack, which Scylla is part of, to consume Scylla simply on Azure. But we haven't had the time to add the Scylla Cloud DBaaS to Azure. It's not for lack of will or demand; we just can't do it all. And we've made many, many improvements to Scylla Cloud. We'll probably talk about several of them, but the primary one is the switch to serverless. That's our main focus, and once it's done it will probably be easier to add Azure as a target.
Okay, I see. Indeed, you did mention serverless, and that's also something I picked up from your talk at the recent event that you just had.
It was very prominently featured there.
So indeed, we will revisit it and talk about it in more detail.
But since we're still in the introduction, let's move on to the other main theme from last year.
So, real-time data. I know that ScyllaDB has, fairly recently, probably a little before last year, added support for CDC, change data capture. And a big part of what we talked about last year was use cases that were using that. So I'm wondering if you're seeing that momentum being maintained, or maybe even picking up speed?
Yep, users and customers like our CDC solution.
I really take lots of pride in it,
although I didn't invent the way it's implemented,
but it's really nice that all of the CDC events,
all of the changes events go to another table in there.
Users can consume the other table as a regular table.
It's pretty much a very neat interface.
And obviously it's connected to other connectors
as Kafka each there is a need.
So it's trivial to support it, to consume it
either through Kafka or through the library
that we allow to just to access it as a regular table.
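To make the idea concrete, here is a minimal sketch of consuming the CDC log as a regular table, assuming ScyllaDB's pre-1.0 Rust driver (the scylla crate); the ks.events keyspace and table are hypothetical, while the _scylla_cdc_log suffix follows ScyllaDB's documented naming for CDC log tables.

```rust
// Minimal sketch, assuming the pre-1.0 `scylla` crate; ks.events is hypothetical.
use scylla::{Session, SessionBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let session: Session = SessionBuilder::new()
        .known_node("127.0.0.1:9042") // assumed local node
        .build()
        .await?;

    // Enable CDC on an existing table; changes are then mirrored into
    // the ks.events_scylla_cdc_log table.
    session
        .query("ALTER TABLE ks.events WITH cdc = {'enabled': true}", &[])
        .await?;

    // The CDC log is just another table, so plain CQL reads it.
    let result = session
        .query("SELECT * FROM ks.events_scylla_cdc_log", &[])
        .await?;
    println!("{} change records", result.rows.map_or(0, |r| r.len()));
    Ok(())
}
```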
And the usage is picking up.
We also see that there are cases, usually still exceptions these days but a growing number of use cases, where Scylla isn't just used as a database, but sometimes also as an alternative to Kafka.
So it's not that it's a replacement for Kafka, but if you have a database plus Kafka stack,
there are cases that instead of queuing stuff in Kafka and pushing them to the database,
you can just also do the queuing within the database itself.
A year ago, Palo Alto presented their case where they eliminated Kafka, not because Kafka is bad, it's great, but just to reduce the infrastructure and have fewer moving parts.
And at the summit just now, we had Numberly, one of our customers, who wanted to eliminate Redis, which they had, and to eliminate Kafka altogether. They've done it, and now they have a much simpler stack.
Yeah, indeed. You're right, modern data stacks are getting more and more complicated, with more and more layers and more and more vendors in the mix.
So I'm pretty certain that if users have the chance to simplify, they will choose to go for it.
I do remember the Palo Alto use case from last year. I also saw Numberly from this year's event.
And that's actually a good segue to talk about what was featured in this year's event, besides
Numberly, which you just talked about. Just browsing through all the presentations,
there were again a couple of main themes that stood out for me.
And the first would be migration. You had a number of use cases in which people from
different organizations, such as Discord or ShareChat or even Confluent, presented use cases in which they migrated from other solutions to ScyllaDB.
And I'm guessing that this is something that happens a lot among your users.
So I was wondering if you'd like to just share a few words about, well,
how do people approach it in general, and then maybe even if you wanted to comment specifically
on those use cases that were presented this year.
Yeah. As you said, and cheers for following the summit and all of the content, since we're a very scalable solution, users sometimes start with another database because they don't yet have the scale. Over time they hit a wall, and they need to switch to a database that can scale like ours. Then it's time to migrate, so we see a lot of migrations. We support two compatible APIs, the Cassandra API and the DynamoDB API, and by the way, sometimes CDC, or DynamoDB Streams, can come into play and help migrations. We have several migration tools, and one of them is a project called the Spark Migrator, which scans the source database and writes to the target database. If the source database supports change data capture, we also capture the changes. Otherwise, you need to do double writes, which is another possibility.
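A minimal sketch of that double-writes pattern, assuming the scylla Rust crate's pre-1.0 API; the table and columns are hypothetical. While a bulk migrator backfills history, the application mirrors every new write to both clusters:

```rust
// Sketch of the "double writes" migration pattern: while a bulk migrator
// backfills historical data, the application writes every new mutation to
// both the old and the new cluster. Table/column names are hypothetical;
// assumes the pre-1.0 `scylla` crate.
use scylla::Session;

async fn dual_write(
    old_cluster: &Session,
    new_cluster: &Session,
    user_id: i64,
    score: i32,
) -> Result<(), Box<dyn std::error::Error>> {
    let cql = "INSERT INTO app.scores (user_id, score) VALUES (?, ?)";
    // Write to the current source of truth first, then mirror the same
    // mutation to the migration target.
    old_cluster.query(cql, (user_id, score)).await?;
    new_cluster.query(cql, (user_id, score)).await?;
    Ok(())
}
```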
Discord specifically rewrote the migrator in Rust and got better performance, although there isn't necessarily much of a performance concern with the Spark Migrator, since it's scalable and you can just run several engines. ShareChat, meanwhile, runs dozens of services, so it's not a migration of one or two but a migration of twenty, and that can sometimes get complicated. And sometimes there is an existing data structure called counters, which is more challenging to migrate because counter updates are not idempotent. But we're doing it; we're used to it.
Many other customers, by the way,
choose to rewrite their stack
without keeping the API compatibility.
We see that too.
Many times it's an opportunity
to change a programming language.
So if someone used to do JavaScript or PHP in the past, now is the time to move to a newer language like Go or Rust.
And the other theme that emerged, let's say, just by browsing through the presentations at ScyllaDB Summit 2023, was high performance. Of course, that's not really new, since this has been the epicenter of ScyllaDB's value offering from the beginning. It just occurred to me that I saw a number of presentations focused specifically on that: getting the most out of ScyllaDB. For example, there was Epic Games, there was Optimizely, and Strava.
Those were the ones that stood out for me.
And you also mentioned Epic Games specifically earlier, when you were talking about how you are part of their stack. That was something I had missed, so I wonder if you could clarify: how exactly do you work with Epic Games?
I'm really excited about the Epic Games case, because it's part of the Unreal Engine, and many, many of us grew up on the Unreal Engine and its graphics. So it's super gratifying to see something like this.
Also Strava, since I'm
a mountain bike rider.
So it's really nice to see
this usage
translate to day-to-day life.
The Epic Games usage is such that they basically use Scylla as a blob store, as fast distributed storage. They already have a storage layer in S3, and S3 is fantastic, but slow; the latency, primarily, is the problem. So they use Scylla as a cache over S3, and they store both metadata and smaller objects, not gigantic ones, but objects in the size of megabytes. And they've built that stack not just for some of the games they develop themselves; it's a game engine and game library, so hundreds of organizations can use that stack, whether with open source, enterprise, or as a service.
Okay, I see. And you also mentioned earlier when again you were talking about Azure
that this is the infrastructure that Epic Games is using, and that therefore they're using ScyllaDB on Azure, right?
Correct.
So Epic Games and Azure have partnerships around providing the entire Epic Games Unreal stack,
which includes plenty of other components.
And they wanted to make it available
and easy to consume on Azure.
And Scylla is one of the components.
So we're adding Scylla to the marketplace.
So it will be pretty trivial to consume Scylla on Azure
together with the rest of the Epic Games components.
You also mentioned something else a little bit earlier: that some of these clients, some of these use cases, use the database-as-a-service version, and others use the open source version. So out of those we mentioned so far, Discord, ShareChat, and Epic Games, which version do they use?
Discord and ShareChat are using the Enterprise version, and the different usages that Epic Games provides use a mix. Epic Games in general make a choice of technology, but it's more of a reference architecture, a recommended stack, that everybody else then gets to implement, so they try to use the least common denominator. There are hundreds and thousands of gaming vendors, and each of them needs to select their own preferred way of consumption. We see the same pattern elsewhere: there are cases where Scylla is selected, for cryptocurrency and digital currency products for instance, by a committee of technologists who use open source, and then everybody who wants to implement it may choose to use the enterprise version.
Okay, I see. Right, so I didn't actually realize that that was the way ScyllaDB was used in the Epic Games context.
And actually, I didn't understand how exactly their stack works.
I had the impression that, well, it was a stack that was used internally by them.
But actually, based on what you just said, it sounds like that's not actually the case. I mean, yes, they do have an internal stack that they use, but they also recommend it, let's say, and the stack is adopted by their network, the companies that they work with that implement their games.
So how does that work exactly for them?
It's not that I know all the details, but in the past, Unreal was basically a library engine. Back then, in the games of my childhood, you just downloaded the game, you got all of the binaries, and you ran it on a single computer, and that was that. Other games could just buy or license the Unreal Engine, and this was the way of consumption. Today, in our era, everything is more complex: there is cloud, and everything is distributed. So the story continues. Now, the way Scylla is part of the stack, it's not for the actual game; it's more for the game manufacturers who create the game. There are lots of graphical objects that are really heavyweight, and game manufacturers are large teams distributed across multiple geos, and they need to collaborate. It's expensive to always download all of these big, heavyweight assets and to enumerate all of these graphical resources. This is why they need to store them on S3 and also to have a distributed cache, which is the functionality Scylla provides there, to accelerate that. So everybody who wants to develop a game using the overall distributed Unreal infrastructure, including the tools to develop the game, will use it.
Okay, okay. Actually, yeah, another thing that stood out for me in that description you shared was the fact that ScyllaDB is used as a cache.
I mean, traditionally,
historically, let's say, I think
it was mostly solutions like Redis
for example. So very, very
lightweight, just
key value stores, basically.
It's the first time, at least for me, that I hear about such a use case, in which ScyllaDB is used as a cache: not as, you know, the main database, but as a cache. Is that something that you see happening elsewhere as well?
The quick answer is yes, but there are different ways you can think about a source of truth and a cache.
Traditionally, Redis is a cache in front of another database. That use case is less relevant here, because Scylla itself doesn't need a cache, so most of the time it's not the right use case for Scylla. But there is a variety of ways you can think about data and caching.
In this particular case, it's caching in front of S3, and it's basically fast distributed storage. It happens to be a cache because S3 is obviously larger and cheaper, so you only need to keep the most commonly used objects in Scylla. It's quite a common pattern. There was another presentation at the summit with a really interesting S3 caching layer of this type: it caches tens of thousands of Parquet files stored on S3, and ScyllaDB caches the metadata for those tens of thousands of files. And sometimes we have customers who keep the data in a variety of sources, not necessarily one place, and they store data on Scylla for very fast access. It's possible to regenerate that data, but it's expensive, and the access needs to be super fast.
Okay, I see. Well, speaking of events, by the way, there's something else I learned about ScyllaDB fairly recently. Besides the main event that you have, the summit, it seems you're organizing another one, called P99 CONF, and I was wondering if you'd like to share a few words about it. What's the motivation behind doing it, what is this conference about, and, since you obviously participate in it as well, who is it for and what's the main idea?
Absolutely. It's something that
I'm very excited about. In general, we're a database vendor, but we were also involved in other development before: the Linux kernel, our own OS that we created, OSv, the KVM hypervisor. And at the Scylla Summit, many people want to listen to how the database should be used and what the features are, but it's not necessarily the best venue for the details of how we build our technology, because that's too down in the weeds. So we thought of coming up with a separate event, one that is not unique to Scylla and is more intended for developers. It's an event by developers, for developers, meant to expand knowledge and to focus on performance and real-time systems. It's not necessarily a Scylla event or even a database event, so we encourage people to present databases, including competitors, operating systems development, programming languages, special libraries. We have a mixture of all of these things. And the idea is not about products; the emphasis is on technology. It's about code, about optimizing infrastructure.
Okay. I should probably
also mention here that, well, it's probably very obvious to you, but maybe not equally obvious to everyone, that the P99 refers to the 99th percentile. So it's a direct reference to availability, basically, to having high availability and all that. And I think one of the presentations that caught my attention at that event was basically making the point that maybe, you know, 99.9999 availability is not always what you need, because it's a lot of effort to reach that, and sometimes the trade-off is not worth it.
Yes. Actually, originally the P99 is supposed to be P99 latency, but you are 100%
correct that it's also applicable to availability, and I actually covered this in my recent talk there. So it's about availability, it's about predictability of performance, it's about multiple things. And while it's all exciting and important, it's not necessarily important to everyone. It's expensive to have a near-perfect system, and not everybody needs a near-perfect system all the time. There is also an interesting talk there by Gil Tene, who talks a lot about latency. He mainly says: don't necessarily look at the P99 alone. You need to think about the service level that your application needs; that is the number one OKR you need to track. If, in order to reach your targets, you need a great P99, then go ahead and measure it. But it's important not to confuse the means with the actual goal.
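For readers who want the definition spelled out, here is a small self-contained sketch of computing a P99 with the nearest-rank method, using made-up sample data:

```rust
// Computing a P99 latency with the nearest-rank method. Sample data is made up.
fn percentile(samples: &mut [f64], p: f64) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Nearest rank: the smallest value such that p percent of samples are <= it.
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1)]
}

fn main() {
    // 1,000 synthetic latencies: 0.1 ms, 0.2 ms, ..., 100.0 ms.
    let mut latencies_ms: Vec<f64> = (1..=1000).map(|i| i as f64 / 10.0).collect();
    println!("p99 = {} ms", percentile(&mut latencies_ms, 99.0)); // prints 99 ms
}
```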
Yeah, indeed.
Another presenter you had, who I think was of note, was Bryan Cantrill. Again, for people who may not know who he is, let's say he's somewhat of a celebrity in software engineering: he used to work at Sun Microsystems and then Oracle, and was then VP of Engineering at Joyent. He's a person that many people in this trade know and are keen to listen to. One of the points he made in his talk was about using Rust.
And, well, he was very enthusiastic about it.
He basically said something along the lines of,
well, I've been using C throughout my career
because it's the most performant one
and the one I'm most comfortable using.
But now that I've discovered Rust, it seems to be everything that C had and then actually a little better, because it's basically easier to use. And I noticed that ScyllaDB is also embracing Rust, as well as Go. It seems they're used, as far as I could tell at least, mostly for creating drivers for ScyllaDB at the moment. I'm wondering if you'd like to share a little bit of the motivation, what's happening there. It seems there's a little bit of internal R&D going on, and using those languages is part of that.
Yeah, we're fortunate to have fantastic people present at P99 CONF, and it's great for the audience too: it's free, it's online, and it's recorded, so you can just go and watch all of the videos that were presented for free. It's never too late, including for Bryan's talk. And yeah, Rust is taking over the world, and it's a fantastic language. We at Scylla started the ScyllaDB core in C++, before Rust was invented.
For us, C++ was a big improvement over C. The OS that we created was also based on C++, which is a big improvement if you were used to coding in C in the Linux kernel. So C++ is way better. But Rust has advantages: primarily, it takes care of object lifetime and memory allocation on your behalf, and that's a significant advantage. Nowadays there is a really lively community around Rust, so things are moving faster. Still, there was no point in rewriting Scylla in Rust, because our C++ core is extremely sophisticated and it's really good today. But in the driver case, we developed a Rust driver, which has fantastic performance, and it supports all of the fast use cases that people use Rust for.
At the summit, you could observe that Discord moved from Go to Rust, and Numberly moved from Python to Rust. So many of our users are moving to Rust. Some of them take our shard-per-core architecture and implement it on the client side too, like Numberly. And now that we have a Rust driver, we'd like not to maintain the Scylla C++ driver separately; instead, we created a C++ wrapper over the Rust driver. That allows us to invest in a single driver for the majority of the time, and then just wrap some layers for other programming languages. Not for all of them, though: for the JVM, we'll continue to develop the JVM driver.
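For a sense of what using that Rust driver looks like, here is a minimal sketch assuming the scylla crate's pre-1.0 API; the keyspace and table are hypothetical. Prepared statements let the driver route each request to the replicas (and shards) that own the partition:

```rust
// Basic usage sketch of the ScyllaDB Rust driver (pre-1.0 `scylla` crate).
// app.kv is a hypothetical keyspace/table.
use scylla::{Session, SessionBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let session: Session = SessionBuilder::new()
        .known_node("127.0.0.1:9042") // assumed local node
        .build()
        .await?;

    // Preparing once lets the driver do token- and shard-aware routing
    // for every subsequent execution.
    let insert = session
        .prepare("INSERT INTO app.kv (key, value) VALUES (?, ?)")
        .await?;
    session.execute(&insert, ("hello", "world")).await?;
    Ok(())
}
```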
Okay. Okay.
Actually, yeah, indeed. I was wondering, because a key part of your value proposition was originally reimplementing Cassandra's API in C++, as opposed to Java, the language it was originally written in. That, together with your Seastar library, gave you a boost in performance. Just out of curiosity, if you were to start today, do you think you would do that in Rust instead of C++? I know you just said it doesn't make sense to port it now, because you've already invested a lot and have fine-tuned your code base over time, but theoretically speaking, if you were to start today, would you maybe do it in Rust?
Most likely yes. Avi, our CTO, is the main figure who would lead such a choice, but probably yes. Now, we won't port it. C++-wise, we're using cutting-edge C++: we're using C++20, and now we're excited about C++23.
It's very different from the C++ that I learned in university, but still, it's not exactly like Rust. I believe C++ will continue to evolve, and maybe provide tracking of memory allocation itself. But one thing that we have talked about internally is adding Rust bindings to the Scylla core itself, in order to write some components in Rust and not C++. Some of the control path, for example, is easier and safer to write in Rust, and it may not require lots of re-engineering, so it can provide value. It's something we'd like to do, similar to the work that was contributed to the Linux kernel to add Rust there as well.
Okay, so now we've gotten into talking about R&D, I guess. I noticed that you happen to have a new VP of R&D, and you also mentioned serverless earlier in the conversation, which is an area you're actively developing. So let's bring that all together: what are your main R&D directions for the coming period?
Nowadays we have several big directions, large efforts, and the company continues to scale, so we have a larger and larger team. In general, the team is divided between the core database and the as-a-service component. Plus we have a large team to handle Kubernetes, which is part of our serverless offering. There is also a lot of cross-team project work, because we're working on several projects. Serverless is one such cross-team project, because it contains lots of different sub-projects.
That's one major activity.
Another project is our strong consistency based on Raft.
It's a project that began two years ago.
It's already part of the ScyllaDB offering, but now the release that we are about to ship, 5.2, contains strong consistency for some of the metadata operations by default.
So we'll offer strong consistency for Scylla metadata,
which will make it easier to maintain and also make it more elastic,
so we'll be able to scale faster.
This is an area that's pretty important for database as a service, and also for serverless.
That's based on Raft. We already have an experimental mode where we're not eventually consistent but strongly consistent, based on Raft, and that will replace our previous lightweight transactions. Now, beyond this, we have another big activity around the object store. Earlier we talked about users who put Scylla in front of S3 to accelerate it. We'd also like to use S3 ourselves, because on one hand it's a sea of extremely cheap storage, but on the other hand it's also slow. If you can marry the two, you manage to break the relationship between compute and storage, and that gives you lots of benefits, from extreme flexibility to lower TCO. This is a main project we're working on these days.
Okay, yeah indeed that sounds quite major, I would say.
So if you manage indeed to leverage S3 as your storage, I'm guessing that it won't be
your exclusive storage because as you very correctly pointed out, S3 is abundant, but
it's also quite slow.
So I'm imagining that you will probably want to combine it
with some other form of storage, I don't know, NVMe or whatever else.
But if you manage to pull it off, then I think it will be quite well received.
Exactly, that's the idea.
We'll store the data on S3, but we'll cache and hold multiple objects. Some of them will be in a distributed, consistent ScyllaDB table, just to access our own metadata, exactly like some of our users do. And some will be in the form of caches, not caches placed in RAM, but caches placed on local NVMe.
I see.
By the way, you also mentioned earlier when some of these new features are going to be released, and you said they're going to be included in the upcoming ScyllaDB version 5.2. So just to clarify, there are two versions of ScyllaDB, right?
So there's the open source version,
I'm guessing that's the one that you're referring to,
and there's the enterprise version as well.
What may be a bit disorienting is that the numbering is not exactly the same. You just released a major version, 5.0, last year; the current one is, I believe, 5.1, and you're going to be releasing 5.2 soon. For the Enterprise version, the numbering follows a different pattern: it's characterized by the year of the release, and then you have a .1 or .2 or something. What exactly is the relationship between the two, and how do features move from one to the other?
So the enterprise version is based on the open source version; it's basically a fork of it. Every time we do another enterprise release, we fork what is typically the latest open source release and carry over all of the non-open-source components, and there aren't many of those: it's primarily security features and a couple of performance or TCO-oriented features. Other than that, almost the entire code base is open source. As for naming, we handle the naming for open source, while the naming for enterprise is just functional: we base it on the year of the release, so enterprise users will have an easy time figuring out the lifetime of an enterprise release.
Okay, I see. I'm guessing that you also have specific support periods for the enterprise version? For example, for the current one, which I think is 2022.2, what's the standard period of support?
It's usually something like two and a half years of support: beyond the year of the release, there are an additional two years, so on average two and a half years.
Okay, I see. So what's on your agenda? When are the next releases, both for the open source and for the enterprise version?
In the 5.2 release, we're going to have consistent, transactional schema operations based on Raft.
That's a big improvement, because originally in Cassandra there were no consistent schema operations, and it was a big gap. In the next release after this one, 5.3, we'll also have transactional topology changes. For those who follow us, I can give an example: we passed the Jepsen reliability and consistency test two years ago, I think it was 2020 or so, and we passed it with flying colors, so to speak, if you're familiar with his style.
Yes, he's a very characteristic figure in this space.
Now, when passing it, we had to accept the limitation that you can pass it as long as you don't change the topology and you don't change the schema.
Those were the constraints. Basically, you're allowed to do one topology change at a time: you can't add two nodes at once, nor can you add and remove nodes in the same step. So that was a major limitation.
The same goes for schema changes: it's something more sophisticated organizations may want to do programmatically, not just on the rare occasions where you change a table. Raft now allows us to serialize all of these changes: the schema changes in 5.2, and afterwards, in 5.3, the topology changes, which we already have working on the master branch but haven't yet released. So it will be very agile, and we'll be able to pass Jepsen together with many topology and schema changes. And this is just a preparation towards bigger changes, which will change the way we shard data.
Today, the sharding is automatic, but it's done at the time you add the node to the cluster; that's when we allocate key ranges to that node. We want to make it way more dynamic, so we'll be able to load balance. Even if the hardware doesn't change, we may want to load balance things across shards and work with smaller units, which we call tablets. And if you add hardware, you'll be able to separate the process of adding hardware from the process of load balancing it. Practically, you'll be able to use new hardware that you add to the cluster way faster, because tablets are units of work on the order of 10 gigabytes. You can quickly send one 10-gigabyte tablet and immediately serve data from the new node for that range, as opposed to what we have today, where, if you have 10 terabytes per node, you need to complete the transfer of all 10 terabytes and only afterwards start serving from the new node. (At 1 GB/s, for example, 10 terabytes takes close to three hours to stream, while a single 10-gigabyte tablet moves in about ten seconds.)
So this is a big change and we'll probably deliver it towards the end of the year. Could
be beginning of next year too. Because we're trying to do lots of tests before we release
something to general availability. All right.
OK, so I think we're close to wrapping up.
And I want to pick up on something
you mentioned earlier about people
who follow the project.
I noticed there's also something else new, which is maybe otherwise minor, but somehow connected to what you said.
So you have a new community forum.
So I was wondering if that means anything with respect to your contributor base, because based on previous conversations, it used to be that your contributors were mainly ScyllaDB employees, for a number of reasons: it's a complicated project, it's in C++, and it uses your specific Seastar library, which not many people are comfortable with. Have you seen that change? Are there more contributors as of late, and is that maybe the reason why you have a new forum to serve them?
Contributions to the core database remain limited, because it's quite complicated. But we do have more contributions to Seastar, which is also complicated, but people pick it up and use it for their own purposes, and they build other things with it.
We also get lots of contributions to peripherals, whether it's drivers like Go, and sometimes Go toolsets on top of the Go driver. There are also open source graph tools that use Scylla under the hood: Quine was at the summit, and just recently JanusGraph added back official support for Scylla. So there's a variety of options.
And we added the community forum in order to have another place for people to ask questions and assist other community members, because we have a pretty vibrant Slack community, with more than 5,000 people there, but Slack is kind of ephemeral: you ask something and it vanishes. The community forum stays, and people can search the forum before they ask questions.
So it's a longer-term type of resource.
Indeed, yeah, you just touched on a chronic pain point with Slack, actually. It's pretty good for ephemeral things, but if you want to refer to some conversation or some answer that took place a while ago, it's pretty hard to do that. So, yeah, I understand why you introduced the community forum.
All right, great.
I think we've reached the end, more or less.
So at least I covered everything that I had in my list.
I don't know if there's anything else that we didn't touch upon that you'd like to mention.
No, George, I think that we covered a lot.
And I'm impressed with the depth that you have
and how you track the project.
And I hope that people enjoy it
and will be excited to follow the project, contribute or join the P99 conference as well.
I hope you enjoyed the podcast. If you like my work,
you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.