Orchestrate all the Things - Data is going to the cloud in real-time, and ScyllaDB 5.0 is going with the data. Featuring ScyllaDB CEO / Co-founder Dor Laor

Episode Date: February 18, 2022

ScyllaDB started out with the aim of becoming a drop-in replacement for Cassandra. It's growing to become more than that. Article published on ZDNet ...

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All The Things podcast. I'm George Anadiotis and we'll be connecting the dots together. ScyllaDB started out with the aim of becoming a drop-in replacement for Cassandra. It's growing to become more than that. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook. Initially, before we go into the specifics, since you know it's a recording and everything, for the benefit of people who may
Starting point is 00:00:30 be listening and are not familiar with who you are and what you do and ScyllaDB and so on, if you could just say a few words about yourself and your background and ScyllaDB in general? Sure. Thanks, George. So I'm Dor. I'm the CEO and the co-founder at ScyllaDB. It's a NoSQL database startup. Just about myself a bit: so now I'm a CEO, but I'm still involved with the technology. And I'm an engineer in my roots.
Starting point is 00:01:07 My first job in the industry was to develop a terabit router, which was a super exciting project at the time. Later on, I worked in different places, and I met my co-founder in another startup called Qumranet. And that startup pivoted several times. Eventually the second pivot was around the open source Xen hypervisor, and then we switched to another solution, where Avi invented the KVM hypervisor. And I believe that many are familiar with KVM
Starting point is 00:01:45 and how it powers Google Cloud and AWS Cloud. And it was really a fantastic project. And later on, we founded this startup. And in this startup, there are a lot of parallels. We also started late, after many other existing solutions, with the hypervisor back then and with the database today. And we like open source a lot, so we have an open source first approach. As for the database itself, we stumbled across Cassandra many years ago, in 2014, and we saw many things that we liked in the design.
Starting point is 00:02:32 Cassandra is also modeled after DynamoDB and Google Bigtable. But what bothered us, and where we figured out that there is a bigger potential for improvement, is that the database is written in Java, and it's not utilizing many modern optimizations that you can apply in operating systems. So we wanted to combine our experience into the database domain, and we decided to rewrite Cassandra from scratch. That was in late 2014, and this is what we've been doing since. And over time, we completed the Cassandra API. We also added the DynamoDB API, which we called Alternator.
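To make the Alternator idea concrete, here is a minimal sketch of pointing a standard DynamoDB client at a ScyllaDB cluster. The endpoint, port, table name, and credentials are illustrative assumptions, not details from the conversation.

```python
# Hedged sketch: using a standard DynamoDB client against Scylla's Alternator API.
# The endpoint/port, table name, and credentials are assumptions for illustration;
# Alternator's listening port is set in the server configuration.
import boto3

dynamodb = boto3.resource(
    "dynamodb",
    endpoint_url="http://scylla-node.example.com:8000",  # hypothetical Scylla node
    region_name="us-east-1",              # required by boto3, value is arbitrary here
    aws_access_key_id="alternator",       # placeholder credentials
    aws_secret_access_key="alternator",
)

# Create a table and read/write items exactly as you would against DynamoDB itself.
table = dynamodb.create_table(
    TableName="user_events",
    KeySchema=[{"AttributeName": "user_id", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "user_id", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
table.wait_until_exists()
table.put_item(Item={"user_id": "42", "last_login": "2022-02-18"})
print(table.get_item(Key={"user_id": "42"})["Item"])
```

The point of the API is that existing DynamoDB application code can be retargeted by changing the endpoint rather than rewriting queries.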
Starting point is 00:03:23 And we also added additional features on top of that. And there are many changes that we're going to talk about, which we're developing at present. Great, thanks for the recap and introduction. And just as a kind of side comment from me, I think your effort was the first one that I, at least personally, became aware of in which someone set out to reimplement an existing product, maintaining API compatibility and just trying to optimize the implementation. And after that, and lately actually, I've seen others walking down this route as well.
Starting point is 00:04:06 So apparently it's becoming somewhat of a trend, let's say, and I totally see the reasoning behind that. So as I was saying earlier, last time we spoke was about a couple of years back, and it was around the time that you had released your previous major version, version 4.0. And now I think you're about to release, you probably have a release candidate or a beta for, your upcoming version, which is version 5.0. So I thought a good place to start would be if you could just describe those two years, both on the technical side, so the kind of progress that you have been making and what features you're working on, and also on the business side.
Starting point is 00:05:01 You can start with whichever you prefer. Okay, let's start with the technical side, and we can also mix it up, because many times the technical side is a response to a business problem, and the opposite. So two years ago, as you say, we launched Scylla 4.0, where we celebrated achieving full API compatibility with Cassandra, including transactions using the lightweight transaction API, which has Paxos under the hood. We also completed, at the time, the Jepsen test that certified that API. I hope I'm not confusing all of the virtual Scylla summits and the periods, but I think it was together, at the same time. Yeah, I'm absolutely positive. At the time we showed Scylla to be a better solution across the board, both with API compatibility and with performance improvements
Starting point is 00:06:29 and operational improvements. And also at the time, I think that we did have that unique change data capture, which is an addition on top of the Cassandra capabilities. And we also released the first version of Alternator, the DynamoDB API. That was then. Ever since, we have continued to develop in multiple aspects. So there are two main improvements, well, it's more than two, but there are two main improvements that we've done over the years. One thing is just to continue to improve all aspects of the database. And the database is
Starting point is 00:07:16 such a complex product. It's just amazing. After years and years of work, we're still amazed by the complexity of things, the demands of people, and what happens in production. And so we improved different areas of the product. In terms of performance, we improved things like IO scheduling. We have a unique IO scheduler in the system. We guarantee low latency for queries on one hand, but on the other hand, we have a gigantic workload that is generated from streaming, by scaling the cluster and shrinking it back. So lots of data needs to be shuffled around, and it's very intense, and that can hurt the query latency. That's why we have an IO scheduler in the system, and we've been working on this IO scheduler for the past six years.
Starting point is 00:08:28 And it's not over even. Over time, we make it more and more complicated, and it also matches the real hardware more and more. And that scheduler, for example, has a control for every shard, every CPU core. We have a unique design of shard per core. So every shard, every CPU core, is independent and is doing its own IO for network and for storage. That's why every shard needs to be isolated from the other shards. And if there is pressure in one shard,
Starting point is 00:09:05 the other shards shouldn't feel that. And we realized over time that disks are built in a way that if you do only writes, then you'll get one level of performance. If you do only reads, you'll get a similar level of performance, not identical. And several years back, we added a special cap for writes and for reads in the IO scheduler. But if you do mixed IO,
Starting point is 00:09:34 then it's not a simple function of mixing those two in the same proportion; it can be a greater hit with mixed IO. So right now we released a new IO scheduler with better control of mixed IO. And most workloads have a certain amount of mix. This greatly improved performance, improved the latency of our workloads, and reduced the effect of compactions, which is an effect of the log-structured merge tree.
Starting point is 00:10:11 And it's also better with repair and better with streaming operations. So that's a big performance improvement, and I'll get back to it. Another major performance improvement was that we improved large partitions. In Scylla and Cassandra and column stores, a partition is divided into many cells, basically many columns that can be indexed, but it can reach millions of cells in a single partition, and even more.
Starting point is 00:11:12 And it can be tricky for the database and for the end user to control these big partitions. So we improved the indexing around these partitions and we cached those indexes. We had indexes; now we added caching of those indexes, and we basically solved the problem of large partitions. Cassandra has this problem; it's not solved in Cassandra, and it can be a challenge. In Scylla, it was half-solved as well. We had users with a 100-gigabyte partition, a single partition. But we knew about those users because they reported problems, and then we could figure out: hey, you have a 100-gigabyte partition. Now with the new solution, even a 100-gigabyte partition will just work.
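As a rough illustration of what such a wide partition looks like in practice, here is a hedged sketch in CQL via the Python driver for Cassandra/Scylla; the keyspace, table, and contact point are hypothetical, chosen only to show how one hot partition key can accumulate millions of clustered rows.

```python
# Hedged sketch of a "wide partition" data model (hypothetical schema), using the
# Python driver for Cassandra/Scylla. All rows for one sensor_id live in a single
# partition, so a busy sensor can grow that partition to millions of cells.
from cassandra.cluster import Cluster

session = Cluster(["scylla-node.example.com"]).connect()  # contact point is an assumption

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.sensor_readings (
        sensor_id text,       -- partition key: all readings for one sensor share a partition
        ts        timestamp,  -- clustering key: rows are ordered inside the partition
        value     double,
        PRIMARY KEY (sensor_id, ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")

# A sensor logging every few milliseconds for months accumulates a huge partition;
# the index caching described above is what keeps reads into it efficient.
session.execute(
    "INSERT INTO demo.sensor_readings (sensor_id, ts, value) "
    "VALUES (%s, toTimestamp(now()), %s)",
    ("sensor-17", 21.5),
)
```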
Starting point is 00:11:53 So all of the operations will be smoother. With DynamoDB, for example, they limit the partition size, I think it's around two megabytes, so you just cannot use large partitions, and the complexity moves to the developer; they need to change the data modeling. So that's another performance improvement. We also had plenty of operational improvements. One of the major ones is what we call repair-based node operations. Node operations are when you add nodes, you decommission nodes, you replace nodes, it's an in-place replacement. All of these operations need to stream data back and forth from the other replicas, so it's really heavyweight. And after those operations, you also need to run repair, which basically compares the hashes on the source and the other replicas to see that the data matches. The simplification that we added is called repair-based node operations: the streaming is based on repair. So we're doing repair, and repair fixes all of
Starting point is 00:13:14 the differences and takes care of the streaming too. And the benefit is, A, there's only one operation: not streaming plus repair, just one repair. So it's a simplification and an elimination of extra actions. And the second advantage, which is even bigger, is that repair is stateful and can be restarted. If you're working with really large nodes, with 30-terabyte nodes, and something happened to a node in the middle, or you just need to reboot or whatever, then you can just continue from the previous state and not restart the whole operation and spend two hours again for nothing. So that's a big improvement.
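To illustrate why a repair-based operation is naturally resumable, here is a toy sketch of the general idea (compare a digest per token range between replicas and copy only the mismatches). It is a simplified illustration under assumed data structures, not ScyllaDB's actual repair algorithm.

```python
# Toy illustration (not Scylla's actual algorithm) of the idea behind repair-based
# streaming: compare a digest per token range between replicas and copy only the
# ranges that differ. Because progress is tracked per range, the operation can be
# resumed after an interruption instead of restarting from scratch.
import hashlib

def range_digest(rows: dict, lo: int, hi: int) -> str:
    """Hash every row whose token falls in [lo, hi)."""
    h = hashlib.sha256()
    for token in sorted(t for t in rows if lo <= t < hi):
        h.update(f"{token}:{rows[token]}".encode())
    return h.hexdigest()

def repair(source: dict, target: dict, ranges, done: set) -> None:
    for lo, hi in ranges:
        if (lo, hi) in done:                       # resumability: skip finished ranges
            continue
        if range_digest(source, lo, hi) != range_digest(target, lo, hi):
            for token, value in source.items():    # stream only the differing range
                if lo <= token < hi:
                    target[token] = value
        done.add((lo, hi))

# Replica B is missing one row; only the range containing token 42 is streamed.
replica_a = {5: "x", 42: "y", 77: "z"}
replica_b = {5: "x", 77: "z"}
progress = set()
repair(replica_a, replica_b, ranges=[(0, 50), (50, 100)], done=progress)
print(replica_b)  # {5: 'x', 42: 'y', 77: 'z'}
```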
Starting point is 00:14:08 There are plenty of other improvements, like reverse queries and improvements across the board that we have done, but this is the first major part of the Scylla 5 improvements. The second major improvement of Scylla 5 is around consistency, in our shift from being an eventually consistent database to an immediately consistent database. It all started when we finished
Starting point is 00:14:54 the Paxos implementation with lightweight transactions. We started to implement the DynamoDB API and we did the Jepsen test. And it wasn't new to us, but we realized, with Jepsen for example, that those transactions are limited. While the transactions really work, there are cases that are not transactional. So schema changes in Scylla and Cassandra are not
Starting point is 00:15:27 transactional. That's why you need to make sure that you only do one schema change at a time. You need to rely on the user, but it's tricky. And also, if you have a data center that got disconnected from the mothership and at some point returns to the mothership, there may be conflicts. Scylla has conflict resolution, but sometimes, if people made changes that cannot really be converged automatically, there will be an issue. So that was one major issue. The other major issue was topology changes. In Scylla and Cassandra, you can only add one node at a time,
Starting point is 00:16:18 and that's not efficient. You want to have faster elasticity and more control. Basically, in the future, we'd also like to move to a state where we load balance dynamically by breaking sections, like ranges of keys, and splitting them across the existing cluster. So not just a topology change, but more of a logical topology change. We wanted to do many more of these topology changes in parallel, so we needed to have transactions for it. And another thing is that DynamoDB has this leader-follower mode. With Paxos, we implemented transactions with three round trips per transaction, unlike an eventually consistent operation, which just needs one round trip but is only eventually consistent,
Starting point is 00:17:27 and DynamoDB doesn't need to have these three round trips. Cassandra needs four round trips for Paxos; DynamoDB just needs one or two, depending. They do have an eventual consistency option, but they need between one and two for reads, and the writes are really heavyweight. So we decided to adopt the Raft consensus protocol and implement it in order to benefit on the operational side. So schema changes will be consistent, will be transactional, and you'll be able to apply multiple changes at once, and they will be ordered by the Raft implementation. And the same with topology changes,
Starting point is 00:18:21 and also to offer the end user the ability to have transactions with zero performance penalty, just with a single round trip. So we've been working on this for quite some time, over the course of the last one and a half years, to finish the Raft implementation, which is unique.
Starting point is 00:18:45 We talked about it at great length at our summit last week, and people can just go and listen to those presentations, with much better speakers than myself. Also, we implemented the Raft mechanism with schema changes, so that's done and that's part of 5.0. There is also the beginning of topology changes, and that will come in subsequent 5.x versions. And later on we're going to have tablets, which will allow us to load balance partitions or loaded shards dynamically. And it will also offer to the end user the benefit that we offer today with schema changes: the end user will be able to get immediate consistency with zero performance penalty.
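For context, this is roughly what the lightweight transaction (conditional write) API looks like from a client today, the path that goes through consensus and that the Raft work aims to make cheaper. The keyspace, table, and contact point below are assumptions for illustration.

```python
# Sketch of the lightweight transaction (conditional write) API mentioned above,
# which goes through the consensus path. Keyspace, table, and contact point are
# assumptions for illustration.
from cassandra.cluster import Cluster

session = Cluster(["scylla-node.example.com"]).connect("demo")

# Insert only if the row does not already exist (compare-and-set on existence).
result = session.execute(
    "INSERT INTO accounts (user_id, balance) VALUES (%s, %s) IF NOT EXISTS",
    ("alice", 100),
)
print(result.was_applied)  # True only if this write won the consensus round

# Conditional update: apply only if the current value matches what we expect.
result = session.execute(
    "UPDATE accounts SET balance = %s WHERE user_id = %s IF balance = %s",
    (90, "alice", 100),
)
print(result.was_applied)
```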
Starting point is 00:19:47 And so that will come over this year, over the next releases. Okay, well, that basically means that you've been keeping yourselves more than happily busy, because, well, if you were to do a bullet list of the features that you just mentioned, they probably wouldn't be more than three, four, five. But the thing is that to implement those under the hood, I'm sure it has taken a lot of work. So if there was any doubt about what you mentioned initially, that you still like to be fairly technical and hands-on, well, I think that goes to
Starting point is 00:20:33 prove it, the fact that you started with those. So let me be the one to try and take you a little bit more to, I don't know, headlines, let's say, or things that may catch the attention of not so technically inclined people. And one of those things that I think is important, because I think it also plays into some of the use cases that you have and the presentations that were given at your event last week, is the change data capture feature, which I believe, and you can correct me if I'm wrong, was already at least partially implemented in your previous version, but has probably also evolved throughout the last couple of years. And so, for example, I saw that, first of all, you seem to have
Starting point is 00:21:26 many partners that are actively involved in streaming. So you work with Kafka, you work with Redpanda, and Pulsar, and I think a number of consultancies as well. And I specifically saw one use case, from Palo Alto Networks, that seemed very interesting to me, because they basically said, well, we're using Scylla for our streaming needs, for our messaging needs, and we don't actually have a streaming platform. So I was wondering if you could say a few words on the CDC feature and how it has evolved and how it's being used in use cases. So indeed CDC is a wonderful feature that gives people functionality. For those who may not know, it's change data capture, so it allows you as a user to collect changes to your data in a relatively simple manner. You may
Starting point is 00:22:28 not know what data was written to these partitions, and you'd like to figure out what was written recently. For example, what is the highest score in the recent past, let's say the last hour, or other things that relate in a timely fashion to the changes that were applied to the database. Or sometimes you just need to consume those changes in a report or something like that, without traversing the entire data set. So change data capture is implemented in a really nice and novel way: you can enable change data capture on a table, and all of the changes within a certain period that you define
Starting point is 00:23:23 will be written to a new table that will just have these changes, and the table will TTL itself, so it erases itself after the period expires. We have libraries to consume this in several languages, which we developed over time. And this is a really simple way to connect to streaming: the streaming implementers, whether it's Kafka, Redpanda and others, can stream it into other consumers. So we have a Kafka connector and other connectors that can work with it. So it's an extremely usable feature for database developers.
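As a rough sketch of how that looks from the client side, the following assumes a hypothetical table and follows ScyllaDB's documented CDC conventions (the `cdc` table option and the generated `<table>_scylla_cdc_log` table); the exact option names and details may differ by version, and real consumers would normally use the CDC client libraries rather than ad-hoc queries.

```python
# Sketch of enabling CDC on a table and peeking at the generated log table. The
# 'cdc' table option and the "<table>_scylla_cdc_log" name follow Scylla's documented
# convention; the schema and contact point are assumptions for illustration.
from cassandra.cluster import Cluster

session = Cluster(["scylla-node.example.com"]).connect("demo")

# Enable change data capture on an existing table.
session.execute("ALTER TABLE scores WITH cdc = {'enabled': true}")

# Writes to demo.scores are now also recorded in demo.scores_scylla_cdc_log,
# which expires its rows via TTL so old change records age out on their own.
session.execute("UPDATE scores SET score = 9001 WHERE player = %s", ("george",))

for change in session.execute("SELECT * FROM scores_scylla_cdc_log LIMIT 10"):
    print(change)
```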
Starting point is 00:24:21 And I really encourage it, and the implementation is quite neat. Our DynamoDB API also implements DynamoDB streaming in a similar way, on top of our streaming solution, so it's also quite nice. And the solution that you mentioned with Palo Alto, I'm not that much into their details, but from a really shallow check, I think that they're not even using change data capture, because they know more about the pattern of the data that they use.
Starting point is 00:25:24 So they wanted to eliminate streaming due to cost and complexity. Maybe initially due to cost; they manage their solution on their own, and they have an extreme scale, one of the most scalable deployments in the world, not just with Scylla, altogether. They have thousands of clusters, and each cluster has its own database and needs its own streaming, so it's expensive. You can figure out why they want to eliminate so many additional streaming clusters. And they decided to use Scylla directly. There was a presentation at the summit on exactly what they do. I believe that their pattern is such that, instead of having Scylla automatically create
Starting point is 00:26:03 a change data capture table of all of the changes, they know and expect changes within a certain time window, quite recent, and they decided that they can just query those recent changes on their own. I'm not into the details enough to figure out what's better, whether to just use off-the-shelf CDC or to implement this on their own. It was their own decision, but it's probably possible in their case to do it either way.
Starting point is 00:26:40 Maybe they made the best decision to do it directly. If you know your data pattern, that's the best. Usually CDC will be used by users who just have no idea what was written to the database, and that's why they need change data capture to figure out what happened lately across an entire half-a-petabyte data set.
Starting point is 00:27:16 Or potentially in cases where there is actually no pattern in terms of how the data is coming in, it's irregular, let's say, so CDC can help trigger updates in that scenario. Exactly. And otherwise, it's really impossible other than to do a full scan. And even if you do a full scan, CDC allows you to know what was the previous value of the data, so it's even more helpful than a regular full scan. All right. Another topic that caught my eye from the program of your event last week was around benchmarking. There was one presentation in which the person giving it was detailing the type of benchmarking that you do, how you did it, and the results from that. And I think that may be important, and a bit of foretelling: I'm going to try to come back to the business aspect, which we haven't touched at all, because that in my mind could be an item you would like to highlight when
Starting point is 00:28:20 making a pitch, for example, or when trying to land new accounts and so on. So, if you could just summarize how the progress that you have outlined on the technical front has translated to performance gains, and how was the process of verifying that? So, benchmarking is hard, and we're sometimes doing more than one flavor of benchmarking. We have a nice open source project that automates things. It takes care of things like the fact that we need to use multiple clients when we hit the database, and it's important to compute the latency correctly. So that tool automatically builds a histogram
Starting point is 00:29:15 for every client and combines them in the right way. If you don't do it right, then some of the information, the latency information, wouldn't be accurate. So we have that tool, and we mostly use it for complicated benchmarks. And we did several benchmarks over the past two years. At the summit, we presented benchmarks that compare the i3 Intel x86 instances with the ARM instances by AWS, to figure out what's more cost effective. And AWS also has another instance family based on newer x86 machines, which they call I4 with an additional suffix
Starting point is 00:30:17 that I apologize for, there are so many instances that I don't remember exactly which suffix. Nobody does. Nobody remembers all of them. Don't worry. And so we compared all three instance families, and the bottom line, the biggest headline, is that, first, all of these families are just outstanding. The local NVMe is extremely good. And with the new ARM instances and the I4, they have better disks.
Starting point is 00:30:59 And the NVMe before was fast too. As I talked about, our new IO scheduler detects that a mixed workload of reads and writes becomes an issue on some disks; it's an issue on the former instances, like the i3s. But with the newer disks, the ARM instances and the I4s, they're just amazing. The IO scheduler doesn't need to work hard on mixed workloads in these cases. They're highly performing, and end users will get amazing results with them. The IO scheduler will improve the i3 results by a lot, especially for maintenance operations that are heavyweight and may hurt latency. So the I4s, the new I4s, which are not released to the public yet but by invite only, and
Starting point is 00:32:02 I believe that soon AWS will release them, are more than twice as performant as the i3s, so that's a huge boost. The ARM instances are actually slower than the i3s in terms of CPU consumption, but if you compare price-performance, then on some workloads they're cheaper; on some, the old Intel i3s are better; but the I4s are better than both. So the I4s are the best option. And altogether, all of these instance families
Starting point is 00:32:44 are really, really good and recommended, far better than network storage, for example. Okay, thank you, interesting findings. And since you say that the I4s have not been made officially available yet, I guess many people will be interested in, well, this sneak peek about their performance. So let's go back then to the business side of things.
Starting point is 00:33:28 Well, two years is a long time on that front as well. And I was wondering if there are any metrics, let's say, that you can share on how Scylla has been faring as a business: year-on-year growth, retention, new use cases landed, or this sort of thing. So, we have lots of progress all the time. We're extremely busy and we prioritize many times; we choose to invest where our customers are interested. The most growth we have right now is around our database as a service. Our database as a service grew 200% last year, and grew 200% the year before, and in the year before that, we had just introduced it. So we had amazing growth of that service; basically, it's 9x over the first year that it was introduced. And it will continue to grow fast. This year, we predict 140% growth for the database as a service.
Starting point is 00:34:31 And it will become half of our business, and later on continue to expand. In terms of use cases, people just like to consume a service in general. It's hard to find talent to run a distributed database, and such talent is also very expensive. In general, the vendor who maintains its own automation around these cases will also bring you better results, because our implementation follows the most recommended practices. Most users who run a database on their own will be too busy to implement backup
Starting point is 00:35:17 and also to make sure that they can restore. It's not the case with us; we must automate it. So that's why the entire industry, not just us, sees growth around as-a-service, and we see tremendous growth around it. We've been expanding our AWS solutions; we have options to run the service in our account and in the customer's account. We also expanded to GCP, and went GA officially several months back, and there's a lot of progress with GCP.
Starting point is 00:35:55 We're also getting into the marketplace. Eventually we'll also get to Azure; the only reason for not doing Azure yet is that we're extremely busy automating and completing various aspects of the service, from user management onward. We have great log collection, but now we're adding automatic security inspection of those logs. And there are so many aspects of network access, with public and private access and proxies. So we're really busy with all of those. And if I go back to use cases, we have always been a use-case-agnostic implementation. We have a specific edge with time series, but most users, whether they are telco or streaming or e-commerce, pretty much look the same from the database perspective. And that's not just ScyllaDB, but probably many other standard databases too.
Starting point is 00:37:13 But we do see the explosion of data that pushes people and pushes databases to their limits. And this is really good for us, because this is where we shine, and we see people come to us from various databases. It can be relational, or it can be other NoSQL, not necessarily Cassandra or DynamoDB. And there are really cool use cases that are super exciting for us, like the delivery services, like Instacart. I'm an Instacart user, and it's really nice to serve the brand that serves you. And there are many other delivery services around the world, from food delivery to regular deliveries. It's also nice to see different crypto use cases; sometimes I'm not even aware of what exactly they do with the blockchain and crypto, but sometimes they use Scylla because, with
Starting point is 00:38:15 crypto, there is the blockchain solution where a database is not needed, but a database is needed to see and to evaluate fraud. And there is a ton of fraud that happens. So that's a really common use case for Scylla in various industries: in blockchain, in crypto wallets, in deliveries, and in any other company that offers any type of service. It's a negative spread of technology that there is so much fraud, but it's also good for us that we can offer a service there. Also, we see NFT users use ScyllaDB today. We also see lots of mobile social services in different countries. Sometimes I read about them in the news after they're already a Scylla customer. So some of them are new and I'd rather not specify,
Starting point is 00:39:25 but there are some that everybody knows, including my son, like Discord, and some other services that maybe are more known in countries like India and other places where we have really good adoption. Yeah, actually, two comments. First, on the contribution of the managed cloud service to your overall revenue, you said you estimate it's going to be about 50% soon. If anything, I would actually imagine it could potentially be even larger than that.
Starting point is 00:40:06 And as you said, it's not just yourselves; it's a general trend going on for, I guess, pretty much all database vendors. And the second one has to do with the mix of the use cases you are landing. You kind of touched upon that: you said that people are coming to ScyllaDB from all directions, basically. Some, obviously, from Cassandra, some from Dynamo, and then some from relational or other NoSQL databases and so on. Do you have any idea of any specific quantification around that?
Starting point is 00:40:48 Like how many of your use cases are, let's say, greenfield versus brownfield? And as brownfield I would classify basically Cassandra and Dynamo, to which you have direct compatibility, and as greenfield, all the rest. So today, the majority of use cases still come from brownfield. But over time, and it's a really good question, we see that the greenfield portion grows as Scylla becomes more known. Then people choose from the very beginning,
Starting point is 00:41:21 if they predict having a big data implementation, to choose Scylla from day one, and not start with something else and switch to Scylla at some point when they hit a wall and it's hurting their business. So we see more and more greenfield, but still the majority of cases are brownfield. And since there are so many databases in the world and so many companies hit the wall with scale, we continue to see a good amount of conversions. Actually, one of our best white papers is SQL to NoSQL, and it's not necessarily Cassandra; we see all types of databases. Another example from last week, from the summit, was Amdocs. It's an Israeli vendor that provides software for many, many telcos, a really leading industry solution. And they decided to adopt Scylla, and they
Starting point is 00:42:36 considered three other databases. One of them was Cassandra, and the two other ones they haven't disclosed in their presentation. One of them at least was relational, and there was another solution. They had metrics, which they shared in the presentation, for the different features, from availability to performance and community size, etc.
Starting point is 00:43:07 Just speaking of the dry comparison, there was no clear cut on which database is better; they had advantages here and there. But only one database managed to meet their performance requirements and managed to finish their test on time. Two databases couldn't perform at all, and Cassandra could perform but was extremely far from their performance and TCO requirements; only we managed to do that. And that's our promise: affordable scale. Okay, great. All right. So then to wrap up, I guess you kind of hinted at it already, but to wrap up, what's in your roadmap for the coming period, and how do you see the overall database, and streaming, I guess, since you work with that a lot, landscape evolve?
Starting point is 00:44:15 So from a roadmap perspective, we're doing multiple things. We'll continue to insert the Raft consensus protocol into all aspects of our core database, and it will improve operational aspects, elasticity, and end-user convenience, with immediate consistency at no cost. And we're making a big effort now around our database as a service. Today, our database as a service is automated in a way that every tenant is a single tenant. So basically as a tenant, you come, you choose, you decide what your workload is.
Starting point is 00:45:08 We map it into some type of cluster size, let's say six servers of a given instance type and size, and then you start from there. Maybe you grow the cluster or shrink it over time, but it's a single-tenant workload. And it has lots of advantages. You're on your own; it's very clear what you pay for, the performance and the cost aspects. And you're also not influenced by security issues or by noisy neighbors. However, there are a lot of advantages in a serverless type of consumption, where we use
Starting point is 00:45:57 multi-tenancy and we converge multiple cases together, which allows us to separate compute and storage. And that gives more flexibility, better pricing, and also faster elasticity, because the servers are larger; there are servers under the serverless implementation, not a big surprise, and those larger servers are already pre-provisioned in the system, so elasticity can be really immediate. So there can be lots of benefits for the end user, and we're moving towards that direction. And we chose to do it with Kubernetes.
Starting point is 00:46:47 So we already have a Kubernetes operator, and we're going to reuse it and eat our own dog food within our Scylla Cloud implementation. So this is quite a gigantic change for us, lots of work and lots of interesting details in it, and eventually, as a benefit, another really good option for end users. So these are the two main roadmap changes.
Starting point is 00:47:15 We also have a really great driver project with Rust, and we managed to wrap the Rust driver for other languages, even C++. The C++ implementation will be faster with Rust under the hood, just because that Rust implementation is good, not necessarily because of the language of choice. So we're doing more things too, but these are the main two big projects: the consensus protocol and consistency, and the multi-tenant serverless implementation. Okay.
Starting point is 00:47:58 And about the overall, your projection, let's say, let's frame it that way in a kind of popular framing: where do you see the database landscape in a couple of years? So I don't necessarily have a big title for you. Currently the as-a-service thing is extremely strong, and of course it will become the majority of our business, and probably for other vendors as well.
Starting point is 00:48:34 Cloud also provides more options. It's not just having the same compute environment on demand; there is other infrastructure that a vendor like us can use. So things can become more sophisticated, and a database vendor can provide more services, not just the database, for example. I'm not saying that we will, because there's so
Starting point is 00:49:01 much depth in a database that you can keep doing just that all day long. And it does allow more options, like tiered storage, where some of the data doesn't necessarily need to be there all the time with the same SLA as the hotter data. And people do want to keep more and more data over time. That, for example, will be an important trend in the future, as will more options to integrate real-time with analytical projects. So it's possible to go and offer more and more analytical capabilities
Starting point is 00:49:48 and even have the ability to integrate with other analytical vendors, not necessarily do it all within the same vendor. So there are many possibilities in the market. It is exciting: the amount of data grows, and the hardware offerings by the cloud providers also improve dramatically. And all of the usages, it's just crazy. Everything that you do in your physical life, and there is even less physical life now that we live in the metaverse, everything will be digitized. So databases will continue to evolve. It's a good time to be related to these workloads. Yeah, indeed. I guess more things going digital means more data to be managed, which means more demand for database vendors. So, as you said, a good time to be in
Starting point is 00:50:56 that business. So good luck with all your plans going forward, and thanks for the update and the conversation. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook.
