Grey Beards on Systems - 121: GreyBeards talk Cloud NAS with Peter Thompson, CEO & George Dochev, CTO LucidLink

Episode Date: July 15, 2021

GreyBeards had an amazing discussion with Peter Thompson (@Lucid_Link), CEO & co-founder, and George Dochev (@GDochev), CTO & co-founder of LucidLink. Both Peter and George were very knowledgeable and easy to talk with. LucidLink’s Cloud NAS creates a NAS storage system out of cloud (any S3-compatible, plus Azure Blob) object storage.

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchese here with Matt Lieb. Welcome to the next episode of the Greybeards on Storage podcast, a show where we get Greybeards Storage bloggers to talk with system vendors and other experts to discuss upcoming products and technologies. Today we have Peter Thompson, CEO, and George Dochev, CTO, of LucidLink. So Peter and George, why don't you tell us a little bit about yourself and what LucidLink is all about? Thanks, Ray. I'm Peter. I'm the co-founder and CEO of LucidLink. And I'm George Dochev, and I'm the co-founder and CTO of LucidLink. George and I previously worked together at a storage software company that was focused on software-defined storage, or storage virtualization as we called it back then. And what we created was a
Starting point is 00:01:14 solution called LucidLink Filespaces, which is a high-performance, cloud-native file service that's targeted at distributed collaborative workloads. So Filespaces is built on top of object storage as the back end. We deliver it 100% in software and charge for it as SaaS. We support file-based production workloads, providing on-demand streaming remote access for globally dispersed or distributed teams. So think about teams who are collaborating and require some kind of shared storage to get access to shared assets to do their work. Are you talking like media entertainment,
Starting point is 00:02:07 filming, film, video editing, and that sort of stuff? Or is it more like Slack developers co-developing some solution across the world? Or both of those? Yeah, so it's a general usage storage technology. But what we realized early on was we were targeting teams who were remote, working with especially large files. Because the larger the file is, the harder it is to get access to it and collaborate on it. And so media and entertainment is an excellent use case that came up early on, as is things like the AEC,
Starting point is 00:02:46 CAD CAM design, oil and gas, geospatial engineering, medical imaging. All of these things share those common characteristics. Which are big files that need to be accessed by different locations or different collaborators? Different collaborators in different locations, different parts of a workflow. It can be any number of things, but the commonality there is that they're accessing the same set of files and it's just hard to constantly move those around. Yeah. So, I mean, there's been a couple of solutions out there in the past, cloud gateways, NAS appliances that use cloud backing storage and stuff like that. How would you suggest, how does LucidLink differ from those sorts of solutions? So, we're fundamentally different in the following sense.
Starting point is 00:03:45 Instead of synchronizing files back and forth, solve to address the same problem which is the problem of accessing data sets over distance over the internet environment i would say one of them falls in the file sync and share category which are software only solutions but fundamentally what they do is they um they replicate files locally from the source of truth, which is in the cloud, down to your laptop or your device. Yeah, I've got this box Dropbox stuff that does that for my laptop, my workstation, my Palm Pilot, not Palm Pilot, but my iPad. I am talking too old here. I'm sorry. Go ahead.
Starting point is 00:04:46 Exactly. And they all, they started as full synchronization. They would synchronize the entire set. Then they started doing selective synchronization because people had too much stuff that couldn't fit locally. Then they started doing on-demand synchronization. So you can see that evolution going on. What we do instead is we say we're not going to synchronize any data. We are actually going to stream it on demand. So to make an analogy, a local file system behaves the following, the same way, except that the storage
Starting point is 00:05:21 is on your local disk instead of being in the cloud. And when the application reads a file, that file doesn't get replicated in its entirety in memory. It gets fetched on demand, those bits and pieces, those blocks that comprise the file. So we do a similar thing, but we do it in a distributed fashion, and we do it very efficiently over the internet environment, which poses its own challenges. Yeah, yeah. You would think the latency would be a real challenge for files. If you're never keeping a copy local to where it's being accessed, then all that data sitting on the cloud has to be accessed and streamed across the WAN or the internet, I guess.
Starting point is 00:06:03 And that's exactly what we're addressing, the latency that's incurred by accessing files over distance. So are you streaming the entire file down or just what's needed at the time, like portions of video files that are being edited or that kind of thing? Yeah, so that's exactly right. We're streaming only those portions of the file
Starting point is 00:06:31 that the application needs at that moment in time. I'll give you an example. A typical example would be, let's say you use Adobe Premiere video editing software and you open up a 100 gig file or several hundreds of gigabytes of video you can start editing that video immediately because we don't need to download the file beforehand and and and same goes for writes so we provide a true we offer a true read
Starting point is 00:07:00 write random file system that's real hard to do over the internet and with object storage specifically. Talk to me about metadata and where that lies. Sure. So I would agree with that statement that it's really hard to do. And in fact, there's been a lot of attempts over the years. A lot of research goes back to Android file system. We haven't
Starting point is 00:07:27 seen a whole lot of successful commercial implementations of a truly distributed internet file system, partly due to the maybe mature infrastructure at the time but also to to the intent and desire to replicate the the the local file system what we did was we we somewhat relaxed the file system semantics and said okay how do we do this over the internet without sacrificing the user experience but still relaxing some of the semantics. So for instance, we also function in an eventually consistent manner in certain cases. This allows us to mitigate to an extent the issues of latency, et cetera.
Starting point is 00:08:17 When it comes to metadata, it's interesting that you brought this up. We're evolving also as a product and as a technology. We started out actually synchronizing the entire metadata across all devices, but streaming the data. The content of the files is streamed on demand. The metadata was synchronized. As we are getting larger and larger customers and their needs increase over time, we've come to the realization that even synchronizing metadata may be too much and involves too much traffic. And so we're actually streaming metadata as well as data in this new and upcoming LucidLink 2.0 that we're going to be rolling out this year. Wow.
Starting point is 00:09:10 You know, file locks and stuff like that. If the metadata is not here, if the metadata is here. I mean, metadata has to be present, I would say, in the host that's accessing the files, right? I mean, I guess you could partition it or, yeah, somehow split it up some to some extent. But so, yeah, Ray, you hit on the issue that that really sticks in my mind, and that's file locking.
Starting point is 00:09:37 So so how does file locking take place and what kind of versioning is able to be done if, for example, two people are trying to edit the same area within the same save video in your use case at the same time? How does that work? Well, it works and it works really, really well. We've done a lot of work in that area and it's a non-trivial problem. I agree with you. This is actually the crux of all the work that we've done. When it comes to file locking, we do support distributed file locking in a very efficient way. And we would actually switch between eventually and strongly consistent mode of operation on the fly, depending on what the application is currently doing.
Starting point is 00:10:26 So if the application tries to lock a file, we'll temporarily, for that particular file, we'll switch to a strongly consistent mode where we would, say, for instance, synchronize those, ensure that all the metadata and the bits and pieces of the data that the application needs are the most up-to-date. And as they are modified, we then upload them to the cloud.
Starting point is 00:10:53 And we switch freely between that and doing this lazily in order to improve the user experience. But there is a lot of, this is a classical distributed systems problem, and there's a lot of work we've done in that area to give you that near local user experience without sacrificing the performance, etc. Talk to me a little bit about the protocols you support. You mentioned it's like a cloud NAS. Does it support NFS and SMB access? We provide a solution that behaves like a cloud NAS. The protocols that you mentioned is what is typically used by other vendors. Unfortunately, those protocols were designed in the 80s for low latency, high throughput, local area networks.
Starting point is 00:11:54 They don't do well over the Internet and in some cases fail completely. And so the first thing that we had to do was to reimagine what an efficient file protocol would look like over the internet. So we did away with these old legacy protocols and we invented and designed our own protocol. Having said that, I want to point out the fact that the storage that we use is off-the-shelf object storage. We utilize any object cloud vendor and their object storage offerings and solutions, as well as on-premise object storage technologies as well. So the back end is objects that I guess can be in any cloud that supports object storage. Does it have to be S3 compatible or do you support, I don't know, blob native Azure or whatever the Google cloud equivalent would be?
Starting point is 00:12:57 Well, it's interesting that the industry has consolidated around Amazon's S3, except for Microsoft. So we support any and every S3 compatible object storage vendor, as well as Microsoft Azure. Google Cloud that you mentioned, for instance, they have an S3 compatibility layer that they actually prefer to use. So we work with them. Yeah. Yeah. And that's on-prem NES3 compatible object storage on-prem as well, I guess, right? That is correct. So talk to me about the components of the system. Obviously, you have the object storage, you have some software running on the remote hosts that are accessing the data, and you've got this metadata thing someplace?
Starting point is 00:13:52 Right. So there are actually three parts to the system. The first part is the LucidLink client, or it is installed as an application on the endpoints. This is a lightweight piece of software that installs as a parallel file system, as a true file system. And it supports Mac, Windows, Linux. You can install it on a virtual machine in a container. So it's really quite flexible where you would deploy that. And that's what handles the streaming, the prefetching, the writeback caching. That's where it presents itself as a mount point, or you can configure it as a drive letter. But it is the file system. The second part is the object storage
Starting point is 00:14:39 that we were just talking about. We've got two models for that. Customers can either bring their own storage account, configure their own bucket, and associate that with our service, or they can, we'll provide an end-to-end service that includes storage in a preferred vendor.
Starting point is 00:15:02 So we can do that either way. One thing that's important to note is the way that we use object storage also kind of comes back to the streaming and the prefetching and providing portions of files on demand. And that is that rather than utilizing the semantics of the typical object storage semantics of one file equals one object, we take these larger files, break them into smallerpoints, we have all of the information about the file that the application needs. And as it is requesting portions of that file, we're able to deliver these chunks as objects and cache those locally so that the access is very fast. So essentially, the longer you're working on a project, you're warming up that cache, more of the file that you're using is local, so you don't have to constantly stream it out on demand. And then it uses lower utilization times to stream that back up to sort of the centralized object store?
Starting point is 00:16:27 Well, so rather than in typical object storage, if you make a change to file, you rewrite entire objects all the time. We're just rewriting chunks of those files, which is a lot more efficient. So you mentioned a couple of things, this prefetch and write back. So prefetch would be, let's say I'm going to open up a video file. It's obviously that you're going to have to hurry up and get the first chunk, whatever that first chunk is, from wherever it's located. You have to make sure the metadata for the file is sitting on the client, I guess. So that's got to be fetched if it's not there. And the first portion of the file has to be fetched. But while I'm working on that,
Starting point is 00:17:17 you're going to prefetch a number of portions after that? Is that how this works? That's how it works. You correctly stated earlier that latency is the big issue here, and that's what I call enemy number one for us. And that's what we're trying to address, how to mitigate the effects of latency. Because ultimately, when you have to go out to the cloud and fetch data, incurs latency and if you're doing this say across continental us that could go up to 50 milliseconds plus and so so you you're doing everything you can to reduce the back and forth right and pre-fetching of course is a is a is an important part of that. By the way, local file systems do the same thing. They also prefetch from the local disk
Starting point is 00:18:09 into the main memory, and then they use the host buffer, the main memory, as local non-persistent cache. The difference is that we utilize a persistent cache, but in our prefetching is a lot more sophisticated because it's so much more important in our world. So we monitor the file access patterns and we try to predict and prefetch based on the application I.O. pattern, but we don't do this only within a file, we also do this within directories. We monitor which't do this only within a file. We also do this within directories.
Starting point is 00:18:47 We monitor which files you're accessing within a directory and try to prefetch those files. And this occurs not only for data, but also for metadata. Because like I said earlier, in this new Lucid 2.0, we don't have the liberty to synchronize the entire metadata. And so we will always try to predict where you're going and prefetch the metadata so that the hot working set is always locally stored on your local disk. Yeah, yeah. So you mentioned persistent cache. So that's the local disk or local SSD that you would define and assign to the client, I guess, is how this works? Right.
Starting point is 00:19:32 So when you install our agent or a piece of software that runs on the client device, let's say your laptop, during initialization, we will take a portion of your local disk to use as a persistent cache. And that's fully configurable, dynamic. It extends and shrinks on demand. So you have full control over that aspect. And whatever amount of local storage you give us, we will utilize as local cache to keep, like I said, the hot working set of the most frequently used data. Yeah, yeah. That hot working set is the real crux of the solution to the latency problem, right? I mean, if you can prefetch the metadata and prefetch the data that's going to be requested, then you're well off.
Starting point is 00:20:27 That's absolutely right. And this is very important. By the way, it's very important for a number of other technologies. Your modern CPU won't be able to work as fast and as efficiently without pre-fetching. Right. Instructions our system. Instructions and data. Absolutely. I got you. These L1, L2, and L3 caches, they achieve very, very high accuracy rates, and that's why the CPU is able to perform so well. Without caching, those CPUs won't work nearly as fast.
Starting point is 00:21:01 Maybe they'll be at a one-h 100th of their speed with caching. Same goes for a system like ours. Caching is crucial. You mentioned writeback, and you mentioned your system dynamically switches from eventual consistency to strongly consistent based on client access patterns, file locking. I'm not quite sure when the switch occurs. Right. Well, I didn't want to get into the technical details, but right. But let's try to keep it high level.
Starting point is 00:21:34 In the presence of file locking, when the application utilizes file locking, which in our world would transform into distributed file locking, obviously, because you might have multiple people across the world collaborating in the same files or datasets. In the presence of those file locks, we will do a full POSIX compliant strongly consistent file system semantics. But if the application doesn't utilize file locks, then we will switch, again, this is completely transparent to the end user, but we fall back to eventually consistent mode of operation
Starting point is 00:22:16 where we will write the data locally on the local disk and then lazily push it out to the cloud. And that's a very key point also, because when it comes to performance and user experience, that plays a crucial role as well. So a thought occurs to me, Ray, and this is about things like GDPR and regionally zoned data sets. Do you handle that or is that up to the end user company to make sure that their data doesn't cross international boundaries where inappropriate? Well, so that's a little bit of both. You know, part of the model that we had of bring your own storage was, and the fact that we support both hyperscalers as well as S3 variants that can be in a, you know, a regional cloud service provider, or even in some companies data center,
Starting point is 00:23:25 allows them to specify exactly where the storage is going to be. Now we have a third part of the system that, as we were kind of talking about that system breakdown previously, we talked about the agent, we talked about the object storage we also have the lucid link service and this is the sas component of the platform that we run on behalf of the customer and we we spin up a service on behalf of each and every file space which is the the nomenclature that we have for our product, customers create a file space. And when they create a file space, we spin up a discrete service on behalf of each and every one of those.
Starting point is 00:24:16 We can run that anywhere. And so we run that in a location that is appropriate, both in terms of distance and latency, as well as taking into account GDPR and data sovereignty requirements. And let me add to that. One important piece of our technology is the security model that we utilize. And unlike all the other solutions that we've seen out there, we actually use a full client-side encryption model. What that means is that all the metadata, or should I say the user-generated metadata, as well as all the content of all these files is encrypted locally
Starting point is 00:25:06 with keys that are only accessible locally and stays encrypted in transit and at rest so what that means is that we as a provider as a service provider don't have access to your data and neither does the object storage vendor let's say Amazon AWS or Google Cloud or Microsoft Azure and this gives you complete control over your data this is especially pertinent and useful to say our media customers. In some cases, they're working on movies that, you know, you've read in the news what happens when, you know, some of that leaks to the public. And so it's a very important consideration for our customers.
Starting point is 00:26:02 But you touched on GDPR. That also, to an extent, addresses some of the GDPR requirements because the data is encrypted. And we as a vendor don't have access to that. So, George, you mentioned that the metadata, user-generated metadata was also encrypted. I mean, how does that work in your system? If I'm going to go out and decide I want to lock file A and now that is my metadata, it's encrypted. So the file name is no longer, I would say, visible to the service? That is correct. It's no longer visible to the service, but it's visible to all the other users who have been given access to that file we obviously have a very rich user model etc right
Starting point is 00:26:55 that's right we don't have access to um to user generated metadata which means file names, directory names, extended attributes, all these things. That was a really important aspect to our business growing the way it has over the last year. During the pandemic, we saw more sacrificial cows sacrifice or sacred cows sacrificed um you know companies who would would absolutely not consider uh cloud suddenly and i'll give you an example uh you know one of the first ones that that uh led us into the media and entertainment space is a large broadcaster. And they found their way to us. They called us up and the media tech told us, look, I just sent home 80 editors. I gave them a laptop, a VPN connection, and a hard drive. this will get us over the next two weeks, but that's not going to continue. That's not going to allow us to continue our operations. So they
Starting point is 00:28:12 talked to us, they hadn't heard about us before. We're a startup where it was an unplanned, unbudgeted project. And within six weeks they had 300 people on the system and about 200 terabytes. You know, fast forward over about a year, and we have that same customer that is, you know, got about a petabyte and a thousand users. So it really has allowed innovation to happen in the cloud space that I don't think we would have seen previously. You mentioned, do you guys support things like snapshots and backups and things like that? I mean, the objects being sort of immutable would mean as I write these things, you know, old objects exist, but the new objects are created automatically. So I guess the question is snapshots. Let's start there. Sure, sure.
Starting point is 00:29:15 Why don't I give you the super high level and then George can give you some additional fidelity around what we're doing. You know, the simple answer is yes, we absolutely do snapshots. In fact, for every file space, we configure a snapshot schedule by default on behalf of the customer, so they don't even have to think about that. We do this because, and this is where I'll have George provide some more detail on this, but we're using a log-structured file system. That's the way we've developed this, so that every write is written as a new object, which means we have the entire stream of the file history, which essentially gives us free snapshots
Starting point is 00:30:02 to be able to configure. George, I think you probably need to give a bit more detail. Well, what happens to the metadata? You're talking about the object storage is great, but the metadata is all important as well. So is that stored on object? No, metadata is stored separately. This is provided by our Loose Link service. The metadata, just at a very high level, the metadata is stored and distributed based on a distributed key value store. So when we set out to build the system, we said we need to build a general for the metadata as well. So not only we save each and every write for the file content, but we also create snapshots for the metadata so that we can reconstitute the file system in its entirety as it existed at a prior point in time.
Starting point is 00:31:34 Very curious. It seems like you've got, from what I can see, all the ends covered. I'm curious about a couple of things. One of them is active directory integration. Sure. So it seems that the industry is moving towards SSO, OpenID Connect and those technologies. And we support currently Microsoft Active Directory Service,
Starting point is 00:32:11 their SSO service, as well as Okta. And I would venture to say that this probably covers 95% of our user base, right? Yeah. And it also increases to an extent the degree of security that we offer as a service because the way our security model works is the all the encryption keys are themselves wrapped technically speak or encrypted through user passwords and
Starting point is 00:32:45 so the the strength of the system is is basically hinges on the strength of your password and people don't like to type long passwords right so that's so in order to address this if you have if you had a third-party service that can store your quote-unquote secret that's then used to unlock all the encryption keys, and then you log into that third-party service, we've increased the overall security. And that third-party service is exactly the SSO providers that I mentioned earlier. So the net-net is that this actually brings the security even higher. Yeah, absolutely. And that was going to be my next question. Where do the encryption keys get stored?
Starting point is 00:33:37 So that's based on a best practice recommendation to any company implementing you? Yeah. Most of our enterprise customers today probably use some form of SSO. And there is a lot of additional requests because we have customers that have, well, upward to 100,000 users. I'm not suggesting all of them are on our system,
Starting point is 00:34:06 but they are managing these huge user bases. And so, I mean, this is deep enterprise territory that we're working in. And as a result, they're setting very high requirements for SSO and just user management. And we're working on that to, to satisfy those high end enterprise requirements.
Starting point is 00:34:28 It brings up a couple of other questions. So what would be a typical size of the data that you would support? I mean, obviously it appears you can support anything from terabytes to petabytes, but I mean, what, you know, what's an average size and what's a maximum size that you currently have installed, if that's the right word? Sure. So it's less about the size and the capacity. It's more about the numbers of files in terms of the scaling out our system. But that said, the way that people and customers specifically think about are in terms of capacity scaling out our system. But that said, the way that people and customers specifically
Starting point is 00:35:06 think about are in terms of capacity of their data. So I would say that, you know, we, you know, our business ranges from individual professional YouTubers and photographers all the way up to enterprise broadcasters. We, by default, have a minimum file space size of one terabyte. We provide five users when you set that up. And we scale up to where we've got customers in the petabyte plus range with thousands of users. If we were talking more about averages, I'd say that probably the sweet spot of our business where we just have daily people coming in, signing up and just starting to run with it would probably be in the 20 to 30 users, 50 terabyte range. And it generally grows within that because what often happens is we'll get a marketing department from a company that comes in to address a specific problem.
Starting point is 00:36:18 And after using it for a while, they say, you know what, why don't we put our user directories on here? Or why don't we roll this out to our finance department as well? They've got these great big spreadsheets they have to deal with. Yeah. So, you know, and that culminated into, you know, a really interesting example of one of our customers in the AEC space who found us, deployed it, tried it. They were replacing one of the kind of cloud gateway technologies out there that just didn't work with people trying to log in from home over a VPN. It just didn't satisfy the requirements and started using it. And gradually, this guy called us up one day and he says, guess where I am? I said, I have no idea. Where are you? He said, well, I'm out on the back
Starting point is 00:37:09 loading dock. What do you think I'm doing here? Again, I've got no idea. What are you doing there? And he says, well, I'm waiting for a truck to come in and pick up our NASs. I've got three NASs. They're going to come and pick them up. And we've moved everything on to LucidLink. We're all in. I was scared. Sweeping the floor kind of thing. files as if it were local, most companies don't want to have to deal with that vicious cycle of buying new, over-provisioning storage, migrating it all, managing it, and then doing that over and over again. So I guess that was great validation for us. But like I said, it kind of scared the crap out of us too.
Starting point is 00:38:05 Yeah, it's nice when a plan comes together. I'm wondering about licensing though. How does that work? Right. So we, as I mentioned, we have two models for the storage, bring your own, or we'll provide it. And the two components that we charge on are the amount of capacity under LucidLink control. So within your file space, how much capacity you have. And that is metered as SAS. So it is at the gigabyte per day level charged over the period of a month. And then the second component is the number of users that you have accessing that capacity. Sure. So if the utilization does shrink in terms of numbers of files, you're actually going to step that down over that period of time? You bet. Yep.
Starting point is 00:39:03 That's really amenable and no egress charges. Well, not by us. Okay. Let's go there. Yeah. Well, yeah. And this is an area that we are absolutely focused on. Now, remember that we do help mitigate egress because we're not constantly delivering entire files, right? And we allow you to configure your local cache on an individual machine anywhere between the default 5 gigs up to 10 terabytes. So the more you use the system and the more you have cached locally, the less egress you're going to incur. However, egress is an unpredictable charge
Starting point is 00:39:59 that is one of the highest components of cloud storage when you look at the actual storage cost components. So it's a real issue. We also kind of find that, you know, there is a cost of bandwidth. You know, we wouldn't dispute that. Now, whether the cost of bandwidth is the equivalent to the cost of egress being charged, well, there might be a little room for a lively debate in that area. So one of the other things we've done, and for example,
Starting point is 00:40:31 we've partnered with IBM Cause Cloud Object Storage. And we have put together special pricing that allows us to offer our customers an egress rate at about one third what the normal rack rates that you'd pay to the other hyperscalers would be. Yep, exactly. Well, that's nice. So, cause is an interesting solution in and of itself. So, in that case, this is where you're providing the complete solution, the storage, as well as the service. That's correct. Yep. And, and we're, we've got discussions with all the other vendors as well. I think that, you know, the, this is, this is the, the address,
Starting point is 00:41:23 what we're addressing is put all your storage in the cloud, but use your devices on the edge. And that cloud to edge is not something that most people are doing. Usually you've got the choice of, I've got to bring my data to the application. So I can either put my applications and run them in the cloud next to the data, or I can figure out ways and methodologies to bring the data down to my application and do it there. And usually that, as we've talked about, is synchronization or Cloud Gateway technologies.
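Coming back to the egress point a moment earlier: since only cache misses have to travel back down from the object store, the savings from a bigger local cache follow directly. A rough model of that arithmetic, with an assumed cache-hit rate and a placeholder per-gigabyte rate rather than any provider's actual price:

```python
# Rough sketch of how a larger local cache reduces cloud egress.
# Simple model: any read not served from the local cache is billed as
# egress. Hit rates and the $/GB rate are illustrative assumptions.

def monthly_egress_gb(reads_gb, cache_hit_rate):
    """reads_gb: total data read in a month; cache hits stay local."""
    return reads_gb * (1.0 - cache_hit_rate)

def egress_cost(reads_gb, cache_hit_rate, rate_per_gb=0.09):
    # rate_per_gb is a placeholder list price, not a quoted figure
    return monthly_egress_gb(reads_gb, cache_hit_rate) * rate_per_gb

small_cache = egress_cost(2000, cache_hit_rate=0.2)   # near-default cache
big_cache = egress_cost(2000, cache_hit_rate=0.9)     # multi-terabyte cache
assert big_cache < small_cache
```

This is why the configurable cache (from the 5 GB default up to 10 TB) matters economically, not just for latency: the more of the working set that stays local, the less unpredictable egress shows up on the bill.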
Starting point is 00:41:59 What we're trying to do is separate those decisions: consolidate your data in the cloud, but access it with your local applications without having to worry about where that data is. Would you consider your solution a high-availability solution? I mean, you know, with storage and stuff like that, you'd have multiple controllers and, you know, multiple paths to the data, that sort of thing. I guess because the data is all in the cloud, multiple paths fairly clearly exist. The metadata, I'm still kind of confused where the metadata resides and, you know, its fault tolerance or high-availability kinds of things. Well, it depends on how you define high availability,
Starting point is 00:42:47 but I would say the answer is yes, we are a highly available solution for the following reasons. So the file system is comprised of metadata and data. The metadata is, as we mentioned, provided by our own service, which is highly available in itself, and it lives in a hyperscaler in the cloud itself. And the data could be anywhere. Typically, that would be another hyperscaler object storage. And those are extremely durable and extremely highly available.
Starting point is 00:43:33 And you could take a client and run it effectively on any PC, laptop kind of environment you wanted, I guess, right? Absolutely. Let's say you've lost your laptop or something like that. You could take it to another, buy another laptop and install the Lucid client, I guess, and then you'd be up and running? Absolutely. Sure. This would be a different characteristic of a storage solution. That's fault tolerance that you're referring to.
Starting point is 00:43:53 And we're absolutely fault tolerant, because all the data lives in the cloud. And the beauty of our approach is that everything is encrypted, even locally in your local cache. If somebody were to steal your laptop, you actually haven't leaked any information, because all the data is encrypted the
Starting point is 00:44:11 same way it's encrypted in the cloud. But it's absolutely fault tolerant. It's also highly available, because we sit on top of a very highly available object store. Let me just also mention that, similar to the high availability, we can talk about scalability. The object storage, say, to give the typical example of Amazon S3, is a wonder of the world, essentially, what Amazon has built. It's extremely available, elastic, and durable. And it's also extremely scalable. What that means is that you could have a thousand editors editing video files simultaneously from their respective homes, and the aggregate throughput, I can guarantee you,
Starting point is 00:45:01 will beat any NAS out there, because there is no single point through which the information flows. So those thousands of video editors will probably be utilizing thousands upon thousands of servers in the cloud. So the system scales horizontally, virtually without limit. And that's the beauty of the cloud solution. And not only that, but they can be working on their own Macs, or the laptop or machine of their choice. Some of them might be on VDI editing stations. And then after they finish setting up the job, they may do the render in the cloud. And all of that is done using the same set of shared data
Starting point is 00:45:53 without having to keep pushing it around. So I could have an EC2 instance running the LucidLink client? You bet. Yeah. You could do that. You could use Teradici. You could use Bebop. You could use anybody who's providing an editing workstation.
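George's throughput claim rests on parallelism: each client pulls its own byte ranges from the object store, so there is no shared controller to saturate. A self-contained sketch of that access pattern, simulating ranged GETs against an in-memory blob rather than a real bucket (a real client would issue HTTP GETs with `Range: bytes=start-end` headers against S3-compatible storage):

```python
# Illustration of the "no single point" claim: many readers fetching
# disjoint byte ranges in parallel, the way a file-system client might
# issue ranged GETs against object storage. Simulated against an
# in-memory blob so the sketch stays self-contained and runnable.

from concurrent.futures import ThreadPoolExecutor

BLOB = bytes(range(256)) * 4096   # stand-in for a 1 MiB object in the cloud

def ranged_get(start, end):
    """Mimics an HTTP GET with a Range: bytes=start-end header (inclusive)."""
    return BLOB[start:end + 1]

def parallel_read(chunk_size=4096, workers=16):
    # Split the object into disjoint ranges and fetch them concurrently.
    ranges = [(off, min(off + chunk_size, len(BLOB)) - 1)
              for off in range(0, len(BLOB), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda r: ranged_get(*r), ranges)
    return b"".join(chunks)

assert parallel_read() == BLOB   # reassembled object matches the original
```

Against a real object store, each of those ranged requests can land on a different backend server, which is what lets aggregate throughput scale with the number of clients instead of bottlenecking on one filer head.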
Starting point is 00:46:11 A typical use case for media and entertainment would be proxy generation and rendering. So let's say you ingest raw footage, and you don't want to, or you cannot, use the 8K raw footage that you shot. You could have an instance running in the cloud that creates those proxies on the fly. They are stored on LucidLink, of course, and all the video editors gain immediate access. The beauty of this is that you could be doing this on the fly. In other words, those proxies could be generated on the fly, what we call growing files, and you could be editing that growing proxy while it's being produced. Both these guys would be editing the same file, per se, right?
Starting point is 00:47:02 One would be creating the file and the other one would be editing it somewhere behind it, I guess. That's exactly what we do. And broadcasters love it because they're always in time crunch. They need to produce material instantly. As soon as it's shot, they have to start working on the final footage.
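The "growing file" pattern just described, one process appending while another reads along behind it, can be shown in miniature with plain files and threads. LucidLink does this over its cloud file system; this is only a local illustration of the reader/writer mechanics, not their implementation:

```python
# Sketch of the "growing file" pattern: a writer extends a proxy file
# while a reader consumes it from behind. Both sides run as threads
# over a local temp file purely for illustration.

import os
import tempfile
import threading
import time

def writer(path, chunks):
    """Appends chunks gradually, like proxy generation producing output."""
    with open(path, "ab") as f:
        for chunk in chunks:
            f.write(chunk)
            f.flush()
            time.sleep(0.01)   # generating the next piece takes time

def tail_reader(path, expected_len):
    """Reads new bytes as they appear, until expected_len is reached."""
    data = b""
    with open(path, "rb") as f:
        while len(data) < expected_len:
            piece = f.read()
            if piece:
                data += piece          # consume whatever arrived so far
            else:
                time.sleep(0.005)      # nothing new yet; wait for the writer

    return data

chunks = [b"frame-%03d " % i for i in range(20)]
fd, path = tempfile.mkstemp()
os.close(fd)
t = threading.Thread(target=writer, args=(path, chunks))
t.start()
result = tail_reader(path, sum(len(c) for c in chunks))
t.join()
os.unlink(path)
assert result == b"".join(chunks)   # reader saw everything the writer produced
```

As the conversation notes, no locking is needed for this particular case: one side only extends the file, the other only reads behind the current end.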
Starting point is 00:47:22 When I think file lock, I always think that the whole file is locked, not the various portions. You're locking the file at a sub-file level? In this particular example that I gave you of the growing proxies, there is no need for actual file locking, by the way, because
Starting point is 00:47:44 the proxy generation software is just extending the file, while the video editor is just reading that file to create a project around it. But to answer your specific question, yes, we do support byte-range locking. There's certain software that utilizes byte-range locking, and we support byte-range locking, not entire-file locking. Well, this has been amazing. Matt, any last questions for Peter or George? No, it's been a really interesting conversation. Peter or George, anything you'd like to say to our listening audience before we close? Well, we've been told that usually the customer finds us and says, why didn't I know about you? And so we're very happy to be part of this
Starting point is 00:48:34 podcast and hope that the word gets out. We've tried to make it really easy. You can hit our website. You can spin up a completely free, fully functional two-week trial, just to see if it works for your particular use case. And we'd love to talk to anyone out there who's got these kinds of needs. I mean, we learn more about our product, and where we should take that product, from our customers than from anywhere else. So, you know, please hit us up; we'd like to continue the conversation with you. All right. That's it for now. Bye, Matt.
Starting point is 00:49:13 Bye Ray. Bye Peter and George. Bye. Bye guys. Until next time. Next time we will talk to the system storage technology person. Any questions you want us to ask, please let us know. And if you enjoy our podcast, tell your friends about it.
Starting point is 00:49:30 Please review us on Apple Podcasts, Google Play, and Spotify as this will help get the word out. Thank you.
