Grey Beards on Systems - GreyBeards talk global storage with Ellen Rubin CEO & Laz Vekiarides CTO, ClearSky Data
Episode Date: October 16, 2015
In this edition we discuss ClearSky Data's global storage service with Ellen Rubin (@ellen_rubin), CEO & Co-Founder, and Laz Vekiarides (@lazvek), CTO & Co-Founder of ClearSky Data. Both Ellen and Laz have been around the IT industry for decades, and Laz in particular was deeply involved in the development of EqualLogic storage systems both at Dell and at …
Transcript
Hey everybody, Ray Lucchesi here and Howard Marks here.
Welcome to the next episode of Greybeards on Storage, a monthly podcast show where
we get Greybeards storage and system bloggers to talk with storage and system vendors to
discuss upcoming products, technologies, and trends affecting the data center today.
Welcome to the 25th episode of Greybeards on Storage, which was recorded on October 8, 2015.
We have with us here today Ellen Rubin, CEO and co-founder, and Laz Vekiarides, CTO and co-founder of ClearSky Data.
Why don't one of you tell us a little bit about yourselves and ClearSky Data?
Good morning. How are you doing? We're good. That was pretty good, Ray. I have to say that was
pretty close on the last name. You know, it's always a challenge. Thank you for having us
join you guys today. So I'll just give a quick overview and then let's just do it, you know,
as interactive and questions as you guys want to go to. Laz and I are co-founders and we started
the company about two years ago.
Both of you know because I think we definitely were hanging out with you at VMworld.
We launched the company back at the end of August,
and then we're at VMworld exhibiting and speaking,
and it's just a very exciting time for the company.
So what we have is a global storage network
that is a service for enterprises
with large amounts of primary storage.
So traditionally, they would be EMC, NetApp, Dell types of customers.
And what we're doing, we believe, is offering kind of a radically different approach
to how enterprise storage can be consumed and delivered.
So the idea of a global storage network is that we will always be able to deliver
the customer's storage to them wherever they are and very,
very close within a very, very small amount of latency at the edge next to where they're located.
And the philosophy that we have is that we really want to deliver the performance and availability
and security and latency of a traditional storage array that would be sitting in a customer's data
center, but we're doing it as a fully managed service, SLA-based,
with the scalability and the economics of the cloud.
And I'm sure we'll talk a lot about how we do it,
but we think it's kind of a unique and a differentiated approach
where essentially the goal is to always be within two milliseconds of our customers
and to handle the full lifecycle of the data from primary to backup to disaster recovery,
where we can guarantee five nines of
availability, hundreds of thousands of IOPS, and very low latency that the customers need,
so that they could run traditional enterprise workloads or could run workloads in the cloud,
whatever they need to do. And we do it at a third the cost of what they would be spending today.
Gosh, it looks like you're trying to take on the whole industry in one shot.
Well, I'll let Laz talk about his background because that's not new for him. Yeah, I guess. I had early days at Netezza where we challenged a bunch
of incumbents as well. But Laz, you want to chat a little bit about taking on the storage industry?
Well, he's certainly not lacking for ambition, but if you're going to do something, go big, right?
Oh yeah, yeah, I agree. So you're working under the Pinky and the Brain theory? What are we going to do tonight to take over the world?
Perhaps you have to start thinking like a Bond villain. I have a white cat
in case you were wondering, Howard. That explains much.
I guess. The fact is that
one of the problems that most of the customers
we talk to have, especially at the size that we're talking about, is really just the ongoing lifecycle management of all this gear.
And if you look at what's new in storage, you really just have more of the same thing.
And I keep harping about that in casual conversations.
Anyone who will listen to me, people just want to build new boxes that do something slightly incrementally better.
So you constantly have that.
The real fundamental problem is that your data footprint continues to grow.
Storage, it really should be a service if you think about it, because all this gear, it just more or less expires after a couple of years anyway.
So you're constantly in this treadmill of replacing it. And so if you think about it that way, the storage industry has not changed ever, really. And a lot
of vendors are enjoying basically selling the same thing over and over and over again for the
same capacity for only incremental new benefits to the customer. And I think what we're proposing
here is a completely different way to consume storage, basically acknowledging the fact that you have these lifecycle issues with your data.
Your data is immortal.
Your gear is not.
Data is immortal.
Gear is not.
I like that.
I wish it weren't true because the solution to so many data problems is really deciding to throw it away.
But it is true.
It's true. If you think about it that way, we offer something that's very, very different from
an economic and operational standpoint. So you don't really deal with all this gear.
It's all hidden from you. It's inside of our network and it's in the cloud. And after all,
you know, the big new innovation of the last five years that everyone is excited about.
I'm here at Amazon re:Invent, and we have almost 20,000 people here.
You never would have guessed that five years ago.
$7 billion business for Amazon alone. Jesus, yeah.
That's right. And people want cloud models for IT infrastructure.
That is what the main theme of everything that's going on here is.
And so this is...
But if that's entirely true, then why do I want storage on-premises? You know,
I want to just move everything to AWS, don't I?
Not true. You know, so there's this notion of latency that you have to deal with. And
as you know, Howard, we've been working very hard at increasing the speed of light to no avail.
186,000 miles per second.
It's not just a good idea.
It's the law.
It's exactly.
We're going to live with it forever.
Right.
And so you can, for example, be in Boston, where we are,
and access remote storage over the internet.
It's just not feasible for any type of primary
workload where you are expecting 10 milliseconds or less of latency or, you know, if you have a
flash array, you want, you know, sub millisecond latencies. You can't do that with remote storage.
And what we do here at ClearSky is to apply our networking and caching technologies to make that possible.
So you don't have to have storage on-premises in order to have high-performance workloads
running locally. And that's what the global storage network really does.
So Howard, I wanted to comment as well, because I think, especially this week,
the week of reInvent, it's a fair thing to ask, well, isn't everything going to the cloud? Isn't
everything going to be in the cloud? So, you know, who cares anymore about customers having data centers or,
you know, connectivity and stuff? And I'm a cloudy from 2008. You know, my last company
was totally focused on hybrid cloud and, you know, very much, you know, a believer. But
I think what's true for the class of customers that we're interested in that are these, you know,
sort of high end of medium up into large enterprise is they got data centers and they're trying to get out of their data centers and they're trying to
embrace cloud. But if you really look at what's going on, a very tiny percentage of what they
have has made its way to the cloud. You know, so they've got a little bit of SaaS. They've got,
you know, some footprints, you know, some test devs, some stuff that's just getting thrown out
there because people decided to do it. But the core IT infrastructure and applications that are running, which could be, you know, hundreds,
it could be dozens, really the majority is still some combination of, you know, VMware or, you know,
private cloud and traditional models. And what we're really doing is sort of honing in on customers
who really are going to live in that type of a world for the foreseeable future and helping them as they more and more put new things into the cloud. That's where we're
targeted. So how does this all work, Laz? I mean, you know, if you're going to have data that's
sitting in the cloud and a data center that's sitting at, let's say, Boston and the cloud is
in Virginia, for instance, let's say Amazon, how does all this hang together with sub two
millisecond response time?
Well, you have to avoid talking to the cloud as much as possible. And if you do talk to the cloud,
you have to talk to the cloud over a dedicated connectivity that is very low latency. So we do
both. We have built a caching network where we have lots of very small footprint points of
presence in various metros. And in those points of presence, we keep caching infrastructure and a relatively small amount of high endurance flash durable storage.
So what we've done in our architecture is to minimize the amount of durable storage that we need to keep.
Remember, the really vexing thing about primary storage
is that you have to keep all that data durable.
You can't lose a disk and lose data.
I keep trying to explain that to the VMware guys.
All the VX VMware guys are here this week,
so I can relay that message to them if you want.
So the big trick that we've applied to bend the economics is that the amount
of durable storage in our network is very, very minimal. It's a write-back cache that sits in
that point of presence, and it's made of... A write-back cache, not read. That's interesting.
That's right. The read is sitting in the edge, which is separate from the pop.
When you say edge, you're talking about the data center itself. Yes, the data center itself. We also do have in that POP a large warm cache, which is basically an overflow for the edge.
And it's also multi-tenant. So we're trying to bend the economics there as well by not having
to provision storage for any one particular customer. And so you have this bucket of cache and you have this
small bit of durable storage, which is the write-back. Everything else, all the rest of
the data that's sitting in our network is cache data and it's backed by the cloud. So eventually
all your data goes to the cloud and the cloud, if nothing else, has amazing durability. So
six or seven copies, 11 nines of durability, you're not going to get that in a physical array that you have on premise unless you really want to have six or seven copies or some really elaborate erasure coding that might be able to approximate that for you.
So Laz, you have devices at the edge as well as the pop, and then the storage is backed by the cloud itself. Is that how I understand this?
Yes. And then one last key component of the architecture is backed by the cloud itself. Is that how I understand this? Yes.
And then one last key component of the architecture is the networking.
We build and run our own network. It's a private network with private connectivity down to, in the east, it's Ashburn, Virginia for Amazon East.
And then we're going to have a similar connection arrangement when we open our Las Vegas pop momentarily, I guess, with Amazon West.
And so in those cases, you basically can get to Amazon within a couple of milliseconds, 10 milliseconds in Boston.
And then from the pop to the edge, this is where the sub-two millisecond thing comes in.
So because we're in metros, we can reach out and touch pretty much
any data center with a private line in Boston, Philadelphia. And we have certainly in Boston,
we're actually at sub millisecond latencies. And these are all private lines, metro Ethernet,
which is very abundant and surprisingly cheap. And it's all included in the service. So the
connectivity is something that we provide for the customers without them
really having to manage it. We provision it, we deal with it, we deal with the carriers,
we make sure that there's diversity, et cetera, et cetera. That's it. That's the architecture.
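For readers following along, here is a minimal sketch in Python of the tiered read path just described: check the edge cache, then the POP's warm cache, then fall back to the cloud over the private link. The class, the dict-backed tiers, and the latency figures in the comments are illustrative assumptions, not ClearSky's implementation.

    # Illustrative sketch of a tiered read path: edge cache -> POP warm cache -> cloud.
    # All names and structures are hypothetical; they only mirror the flow described above.

    class TieredReader:
        def __init__(self, edge_cache, pop_cache, cloud_store):
            self.edge_cache = edge_cache    # small flash cache in the customer data center
            self.pop_cache = pop_cache      # larger warm cache in the metro point of presence
            self.cloud_store = cloud_store  # durable object store holding all the data

        def read_block(self, lun_id, lba):
            key = (lun_id, lba)
            # 1. Edge hit: served locally, roughly flash-class latency.
            if key in self.edge_cache:
                return self.edge_cache[key]
            # 2. POP hit: one metro round trip, on the order of 1-2 ms per the discussion above.
            if key in self.pop_cache:
                data = self.pop_cache[key]
                self.edge_cache[key] = data          # promote into the edge cache
                return data
            # 3. Cloud miss path: private link to the region, ~10 ms in the Boston example.
            data = self.cloud_store[key]
            self.pop_cache[key] = data               # warm the POP on the way back
            self.edge_cache[key] = data
            return data

    # Tiny usage example with plain dicts standing in for the real tiers.
    cloud = {("lun1", 0): b"block-0"}
    reader = TieredReader(edge_cache={}, pop_cache={}, cloud_store=cloud)
    print(reader.read_block("lun1", 0))   # first read falls through to the "cloud"
    print(reader.read_block("lun1", 0))   # second read is an edge hit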
So how much cache is at the edge versus the POP? In historical days, there would be some sort of
a percentage that would be cache versus back-end storage. Do you have that sort of relationship or ratio in your system?
Yeah, we do, actually.
So we have these rules of thumb, but they're going to evolve over time
as we learn more and more about how data behaves over extended periods.
So we've been in beta for a while.
We sized everything according to these guidelines.
So the edge we wanted to be roughly 10% of the overall active
footprint, so the online LUNs that a customer is using. And then the middle tier shouldn't be more
than 30% or so. And that is sort of the sizing guidelines. Now, it turns out that we tend to over-provision a little bit just because
we can, and it doesn't hurt. Flash is cheap. Reputation is expensive.
Exactly. And so, when you're looking at our edge devices that we provide our customers,
it's six to eight terabytes in general to start. All our beta customers are in that range,
and they're doing really, really well
with that. So they're getting excellent, very, very close to flash performance.
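As a back-of-the-envelope check on those rules of thumb, here is a tiny sizing sketch; the 10% and 30% ratios come from Laz's guidelines above, while the example footprint and the helper function itself are made up for illustration.

    # Rough cache-sizing sketch using the rules of thumb quoted above:
    # edge cache ~10% of the active footprint, POP warm cache no more than ~30%.

    def size_caches(active_footprint_tb, edge_ratio=0.10, pop_ratio=0.30):
        return {
            "edge_cache_tb": active_footprint_tb * edge_ratio,
            "pop_cache_tb": active_footprint_tb * pop_ratio,
        }

    # Hypothetical customer with 70 TB of active data:
    # roughly 7 TB at the edge (in line with the 6-8 TB appliances mentioned)
    # and about 21 TB of warm cache in the POP.
    print(size_caches(70))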
When you say six to eight terabytes, you're talking DRAM cache or flash?
It's flash.
Flash, flash. And then there's some sort of DRAM cache in there as well, I assume.
Oh, yes. So the Edge appliance is actually a very interesting box. It is a 2U.
It's a storage array chassis, but it's just a cache.
So we don't have to deal with RAID.
And because of that, we can optimize for capacity, which is really what you want with a cache.
With 24 slots, we have the potential to get pretty large, obviously.
We keep all the metadata in RAM, so there's a ton of RAM on that box, and there's a ton of compute. The other thing that we do in order to optimize the network is to compress and deduplicate the data before we send it out to the network.
We also encrypt it, of course, obviously, if it's going to go out of premise.
Our customers expect sort of rock-solid security and encryption.
But that box actually does all of that and does all the coordination. So it's really a
compute-heavy workload at the edge. So when you say encrypt, is it SSL kinds of encryption across
the network or data at rest encryption or both? Well, I keep calling it belt and suspenders
because it really is both. So we have self-encrypting SSDs, first of all, so we encrypt at rest.
The minute we ingest a piece of data after we hash it and match it suitably,
before it goes out the back into the network, it gets encrypted with AES-256.
Using keys that are actually physically present,
we actually are using the TPM technology that Intel motherboards have, so we have
TPM modules on both ends of this wire, so both entities can identify each other using that
technology. And just to be even more paranoid, even though it's a private line, we're using, you know,
TLS to encrypt the entire communication channel between the edge and the POP. And in fact, even beyond
that, the networking itself, once you get into the POP, every customer is isolated. And so there's
network-level, Layer 2 isolation as well. So it's a very, very sort of paranoid,
isolated, very secure environment where each customer only gets to
see very, very specific things. And we make sure that there's no crosstalk between customers.
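A minimal sketch of the kind of edge-side ingest pipeline Laz describes: fingerprint the block for dedupe, compress it, then encrypt it before it leaves the premises. It assumes the third-party Python "cryptography" package for AES-256-GCM; the function names, the in-memory dedupe index, and the key handling are illustrative assumptions, not ClearSky's code (in practice the customer holds the key).

    # Illustrative edge-side ingest pipeline: dedupe hash -> compress -> encrypt -> ship.
    # Uses the third-party "cryptography" package for AES-256-GCM; everything else is stdlib.
    import hashlib
    import os
    import zlib
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    dedupe_index = {}                                     # fingerprint -> already-stored flag (per customer)
    customer_key = AESGCM.generate_key(bit_length=256)    # in reality the customer holds this key

    def ingest_block(block: bytes):
        """Return (fingerprint, payload); payload is None on a dedupe hit."""
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in dedupe_index:
            return fingerprint, None                      # duplicate: only a reference leaves the edge
        dedupe_index[fingerprint] = True
        compressed = zlib.compress(block)
        nonce = os.urandom(12)                            # unique per message, shipped with the ciphertext
        ciphertext = AESGCM(customer_key).encrypt(nonce, compressed, None)
        return fingerprint, nonce + ciphertext

    fp1, payload1 = ingest_block(b"hello world" * 100)
    fp2, payload2 = ingest_block(b"hello world" * 100)
    print(fp1 == fp2, payload2 is None)                   # True True: second copy is deduped away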
I got a couple of questions. So the data is encrypted via AES before you send it to the POP.
And I, as the customer, own that key and you don't know it?
That's exactly right.
We don't want to know it.
Yeah, I don't want you to and you don't want to.
That's right.
But that would mean that in the POP device, which is multi-tenant, you can't dedupe across
multiple customers because they're encrypted with different keys, right?
That's absolutely correct.
And that's always been a stated non-goal for this company.
Each company's dedupe domain is their own. And that's established at the edge then? Yes.
So can I jump in for one second? So there's such an extensive set of things that enterprise
customers need from us as a service provider, taking their data outside the firewall. And
a lot of them can be addressed
with the key management and encryption. That's like, you know, table stakes and critical.
And then there's a whole set of things, which I'm sure, you know, we won't have time to get
into a lot of detail on that have to do with just, you know, physical security and operational
security and personnel security and all that kind of stuff. That's why the guys at the
SuperNAP like to show off their assault weapons. Exactly. There you go.
It makes everybody feel safer.
That's how I like to describe it.
Oh, God.
Assault weapons don't make me feel safer.
Yeah, this is like our customers are already in beta.
They were already compliance and regulatory sensitive, right?
We have financial services and healthcare and biopharma already and just a lot of, you know, just that's it.
You don't get to work with customers on that type of data until you've proved that you've pretty much gone,
you know, through the, you know, the heavy lift and the belt and suspenders stuff that Laz was talking about.
So we just did it straight on from the beginning.
The elephant in the room is the eventual consistency.
How is that being dealt with in your system, Laz?
So the POP actually does that.
What we're doing is, you know, this is why we have a write-back cache.
We accumulate writes, and then we push out a whole bunch of large amounts of data, really, into the cloud at predefined points in time.
And we use... we actually... there are ways to do this. You never rewrite
an object, because rewriting is how
you get tripped up with
eventual consistency.
Instead, we've used other tricks like
versioning in order to make that possible.
And we've handled the problem, but we're
definitely not going to be, we're not going to try and do something
crazy like being synchronous to the cloud.
We're point in time consistent to the cloud,
which is one of the things that I expect.
We're here at reInvent.
I keep explaining that to customers.
You're synchronous to the POP and point-in-time to the cloud.
And that's really how we solve it with our architecture.
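One common way to sidestep eventual-consistency surprises in an object store, and the kind of trick Laz alludes to with versioning, is to never overwrite an object: each flush writes new, versioned keys and then publishes a new point-in-time manifest. The key scheme below is a hypothetical illustration, not ClearSky's actual layout.

    # Sketch: write-once, versioned object keys so the cloud copy is point-in-time consistent.
    # Nothing is ever rewritten; each flush epoch produces new keys plus a new manifest.

    cloud = {}            # stands in for the object store (bucket)

    def destage(customer, epoch, dirty_blocks):
        """Push a batch of accumulated writes as immutable, versioned objects."""
        manifest = {}
        for (lun_id, lba), data in dirty_blocks.items():
            key = f"{customer}/{lun_id}/{lba}/v{epoch}"   # never reused, never overwritten
            cloud[key] = data
            manifest[(lun_id, lba)] = key
        # The manifest is also versioned; readers always resolve through it,
        # so a half-visible flush can never be observed.
        cloud[f"{customer}/manifest/v{epoch}"] = manifest

    destage("acme", epoch=1, dirty_blocks={("lun1", 0): b"A"})
    destage("acme", epoch=2, dirty_blocks={("lun1", 0): b"B"})
    print(sorted(cloud))   # both versions exist; epoch 2's manifest points at the new block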
So when you blast out this periodic destage to the cloud effectively, you're intermingling
all the data from that POP,
I guess, and from all the edge systems out there. Is that how I read that?
Each customer has his own software workload that does that independently for them. And then we size
the infrastructure, including the networking, so that they could all be on at the same time,
pushing their images out at exactly the same time. But it's done on a customer-by-customer basis.
So like a bucket at Amazon would be associated with one customer effectively?
Yes, exactly.
That's the sort of thing.
In fact, Amazon doesn't support that many buckets, so we wish there was an architecture
like that.
But we've arranged the namespace so that it suitably separates everything, and then we added access controls,
just per-customer stuff.
So is it iSCSI or Fiber Channel at
the edge? Well,
today, in our first release,
we're iSCSI, but
in the first year, we definitely have
plans to extend this to
Fiber Channel and NFS, and
SMB as well is in there. So
we expect this to be a multi-protocol
edge.
Anything that talks SAN protocols or file protocols
will be included inside the box.
Okay, now you got me wanting sync and share from the pop.
Sync and share.
I've gotten to the point where I think of sync and share
as a protocol, not an application, but I'm weird, so.
It's a Dropbox infection here or something.
If you're going to support SMB at the edge,
and so everybody, when they're in the office,
can access all those files,
then it's already at the pop.
Yeah.
Encryption and all that stuff
needs to be carried through the sync and share.
It's non-trivial.
Absolutely.
Absolutely.
You know, it's great.
Every time I talk to Howard, I get a new requirement.
Yeah, this could be a problem.
That's why Ellen doesn't let me talk to Laz more than once a month.
Exactly.
You have to talk to me.
That way we can talk about all of the cool use cases that you think we could be tackling.
I think the real issue, of course, for us as a startup is, you know, we're dealing with large enterprise, like they want Fibre Channel, right?
That's something that's pretty urgent versus maybe when we were starting, we thought, oh,
we can hold off on that one, for example, for a little while. We're just, you know,
we're just kind of taking it as it comes from the customer input.
Yeah, no, I think that's very clear. I think Nimble demonstrated that,
you know, very dramatically. As soon
as they introduced Fibre Channel, their large customer sales went up dramatically.
Exactly.
So you mentioned Philly and Boston and Vegas. Are there other pop locations currently available?
Or I know you probably have a roadmap.
Yeah, not yet. The launch was for the three initial pops, but the plan is to be in every major city in the United States.
And, you know, the good news for us, of course, is that things tend to cluster, right?
You know, there are sort of the obvious places you'd want to be, you know, New York and San Francisco and Seattle and Denver and Texas and a couple of different locations.
Like, those are all obvious and definitely on plan.
But the other thing is, you know, it's a global storage network. So even in 2016, I think we'd like to, you know, sort of be out with at least
an initial location in Europe. And, you know, the nature of our customers is that many of them are
at least multi-site, if not multinational. And, you know, it just, we need to be where the customers
are. Okay. Let's talk for a minute about how the system works multi-site.
Because it seems relatively clear that, you know, I've got an edge device that I connect my servers to iSCSI and it talks to the POP.
What happens when I have, well, what happens when I fill that edge device up?
Let's start with the easy one.
You know, knowing where Laz came from at EqualLogic, I suspect I know the answer to this.
Well, essentially we start expunging data and destaging more. Yeah, yeah, we basically move all
of your data, you know, it flows, it's sort of like a spill and fill, it flows out
into the POP. And so what ends up happening in the POP is you have an even bigger cache where you can...
I actually meant more, you know, what happens when I need more cache at the edge.
Oh, well, when you need more cache at the edge, we have a couple of different strategies for you.
Obviously, we have a lot of extra slots in these boxes if you have only 6 to 8 terabytes.
So we can simply just walk up to one of them and put in more devices.
Of course, we wanted to be able to
scale past that. So we also have scale out at the edge, which means bringing another device and
having it sort of transparently cluster with the first device so you can manage them together as a
single logical entity. So we have scale out. And when you say scale-out, Laz, you mean the LUNs can actually span both edge devices, let's say.
Exactly.
So the edge devices, all it does is there's a private network, obviously, that goes to the POP.
But on the back end, outside of the SAN network, these devices talk to one another.
So effectively what you're doing is you're creating an even bigger tier of cache on the edge,
and the devices talk to each other. So the miss path changes to be more interesting, whereas one
device, if it is a cache miss, it'll go ask its friend before it goes across the wire.
Right. And then if I have multiple offices in the same metro, I understand it does interesting
things. Yes. So it's very easy to exploit the fact that the POP is sort of, as you say, RPO0.
Because there's only one metadata master in a metro area for every LUN, that means that all these caches are effectively synchronously replicated to one another.
This means you can do things like synchronous replication, geoclustering.
Even with something like vVols, which is really interesting, effectively we create this logically
consolidated metro-wide storage array where vVols can surface in any edge location. So you can
basically shut a VM off in one place and bring it up in another place without losing any data.
But in this case, it's going through the POP.
So it's the edge to the POP and the POP back to the other edge, I guess.
That's right. That's right.
The trick there is that you have all of the metadata coordinated in the POP.
And so everyone is seeing the same image of the LUN.
So hence, you know, it's all synchronized.
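To make the scale-out miss path concrete, here is a small sketch: a miss first asks the clustered peer edge device, and only then goes across the wire to the POP (and, beyond that, the cloud). The structures are hypothetical and only mirror the lookup order described above.

    # Sketch of the scale-out miss path: local edge -> peer edge -> POP.
    # Hypothetical structures; they only mirror the lookup order described above.

    class EdgeNode:
        def __init__(self, name, pop):
            self.name = name
            self.cache = {}
            self.peers = []       # other edge nodes clustered into the same logical edge
            self.pop = pop        # metro point of presence (warm cache and metadata master)

        def read(self, key):
            if key in self.cache:                     # local hit
                return self.cache[key], self.name
            for peer in self.peers:                   # ask a clustered peer before the wire
                if key in peer.cache:
                    data = peer.cache[key]
                    self.cache[key] = data
                    return data, peer.name
            data = self.pop[key]                      # finally, go to the POP (or beyond)
            self.cache[key] = data
            return data, "pop"

    pop = {("lun1", 7): b"block-7"}
    a, b = EdgeNode("edge-a", pop), EdgeNode("edge-b", pop)
    a.peers, b.peers = [b], [a]
    b.cache[("lun1", 7)] = b"block-7"                 # warm only on edge-b
    print(a.read(("lun1", 7))[1])                     # served by the peer, not the POP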
And you mentioned vVols.
Do you guys support vVols?
That's right.
We do.
I like to say that I've managed to implement vVols twice in my career, which is twice as many times as most people.
Yeah, good thing.
I think.
And then we came here and we did it. And in fact,
you know, we're in the process of getting certified right now, before we GA. So
we're quite proud of the fact that we managed to pull that off. Our team did that once
at EqualLogic, and we're doing it here again, and in this case we're doing it in a much
more scalable, cloud-like fashion, which really fits the original architecture of
vVols.
Oh, good, because the EqualLogic implementation is not one of my favorites.
Sorry.
Hey, Howard, you know what I like to say?
I like to say that this company is so Laz can fix anything he didn't like about the
things that he did at EqualLogic.
Laz's company, yeah.
He always can improve, right?
Yeah.
It's all fun.
So if I have, you know, my development office in Brooklyn where rents are cheap and my main office on Wall Street, then when an application is ready for dev, I can just vMotion it, not even storage vMotion it from one site to another.
That's right.
All right.
So on the request list that you're building, I want a pre-warm function.
Yes, yes, I know.
Yes, so do our customers.
I want to say, I'm going to move this on Wednesday and have the system be smart enough to figure that out.
Right.
There are some interesting effects of caching, especially in a dedupe cache, that as long as you have active workloads in both locations,
there's a large amount of similarity between what's being cached in the two edges
just because there's a similarity in the workloads
if everything's a Windows virtual machine or a Linux virtual machine.
But yes, pre-warmed caches are certainly on the roadmap.
We were thinking about that as another feature that I think can be very, very interesting for customers.
Even if we try to do a cross-metro, not just intra-metro, but also inter-metro,
pre-warming is a very, very useful thing to be able to do.
Whoa, whoa, whoa. All right. So intra and inter metro, so across metropolitan areas
between Philly and Boston,
something like that?
It's, you know, remember the origin of all your data.
It's all cache.
It's all cache.
And your canonical copy of your metadata
is actually sitting in the cloud
with point-in-time consistency.
So there are use cases where that's very possible. So you could just move things
across the country. Say
it's not Brooklyn, Howard, because Brooklyn and Manhattan
have the same weather, and the developers might like Miami.
So it's almost a disaster
recovery service as well, to some extent.
Exactly.
So when we think of the lifecycle, we're really thinking about not just the lifecycle of the equipment, but the lifecycle and all the care and feeding of all your data.
So disaster recovery is built into this architecture.
In fact, that's one of
the things you don't have to worry about. You get synchronous replication to our metro pop,
and then you get point-in-time consistency to the cloud. And so you have a whole set of
disaster recovery scenarios that are covered by that.
What's the point-in-time granularity?
Obviously, in the POP, the point-in-time granularity, well, there is no granularity. That's every transaction. Yeah. In
the cloud, you set up snapshot policies for your LUNs and VMs
and your vVols as you ordinarily would. The thing that is interesting is that we push
all that stuff out every 10 minutes. So you have all of your snapshots up to about 10 minutes ago,
or perhaps less, depending on when the last push completed.
Okay. But that would enforce a 30-minute RTO SLA.
Right.
Which for a lot of applications is fine. That's right.
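As a quick worked example of what a periodic push implies for the cloud-side recovery point: only the 10-minute cadence comes from the discussion above, while the push duration and the resulting arithmetic are our own illustrative assumptions.

    # Rough worked example: worst-case age of the cloud copy under a periodic push model.
    # Only the 10-minute cadence comes from the discussion above; the rest is assumed.

    push_interval_min = 10   # a new point-in-time image is pushed every 10 minutes
    push_duration_min = 5    # assumed time for a push to complete (hypothetical)

    # A failure just before the in-flight push completes falls back to the previous
    # completed push, whose data is one interval plus one push duration old.
    worst_case_cloud_rpo_min = push_interval_min + push_duration_min
    print(f"worst-case cloud recovery point: ~{worst_case_cloud_rpo_min} minutes")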
Yeah. If I can make a quick comment just before you guys move on, which is, you know, I think when
we were starting this and we said, okay, well, primary data obviously is the hardest thing, has,
you know, the most, you know, heavy requirements. But if we don't really tackle some of the
additional parts of the lifecycle with backup and DR, in a way we've left the customer still
handling infrastructure. And our goal is for them really not to have to handle infrastructure
at all. That requires us to really think pretty holistically about this. And, you know, over time,
the hope is that these customers will say, yeah, we don't, you know, we don't need a separate set
of software or gear or whatever it is to do the other pieces of this because it's just automatically
in the service. And the backup solution is a snapshot versioning kind of thing?
Exactly, exactly. So we have a VMware vCenter plugin that does VM consistent snapshots,
and we expect to be delving further and further into the application stacks that are living
inside those VMs in order to manage their state when that snapshot is taken.
Yeah, once we start talking about backups,
there's the state information problem,
and then there's the catalog problem.
That's right, that's right.
And both those things are on the table for ClearSky.
We're going to start picking away at that.
The problem, from my perspective, with backups to the cloud
and primary data in the cloud, and all this stuff is flowing to the cloud is that it's all in the same medium.
If Amazon goes down like they just did, you know, your data is sort of not there anymore.
Well, so there's more than one way to get to Amazon.
And Amazon, one of the things that it does, I think, really well is that it scatters
your data across multiple facilities as well. So we certainly have options for customers that
want to deal with that problem with making copies in different regions. So we can actually move
copies of your data cross-country to West, if that's what you want.
And usually, you know, I haven't seen an outage where all of Amazon across the entire
country goes down. So you have this east-west thing. And again, a lot of this can be solved
with networking. What we're designing is a system where, you know, things like data centers and buildings are also part of the redundancy schemes
that we have to think about in order to keep the availability up.
Well, it's certainly simpler than StorageNetworks, who would buy any array you wanted
and put it in the cloud data center next to yours.
That's right. Actually, that was the cloud in 1999. So when you say your POP devices are obviously high availability, high capacity.
You mentioned the Edge devices, 6 to 8 terabytes, typical.
What's a Boston POP look like device or set of devices look like today?
So in the POP, it's really not a lot of gear.
So it's a half rack to start, maybe actually
not even a half rack. In fact, I don't think we're yet at a full rack of gear at the
POP, and we have quite a few customers in beta trials right now. And so, you know, it's
basically commodity servers, two of them, you know, some JBODs. And then we have the caching infrastructure, which is also increments of 2U.
So you can get quite a bit of cache that represents a huge amount of data in that footprint.
But it's all commodity stuff, and it's also highly available.
So we have two of everything there.
We have a shared storage architecture very similar to what's happening in the edge.
It's happening in the pop, except it's denser because it's multi-tenant.
And we're taking advantage of commodity economics a little bit more there
for a number of reasons, operational most of them.
But basically, ClearSky is building software; it's a software company.
And if you actually just looked at the rack, you'd say, oh, that's servers and JBODs.
Shouldn't be surprising, but, you know, it's all...
All that storage gear looks alike nowadays.
Yes.
How is it priced?
I mean, does the customer have to pay for your service and then Amazon services separately,
or is it all kind of buried within your bill, I guess? Yeah, we feel like it's really important
that it be one integrated price for the customer and that they never have to be thinking about the
pieces and parts that we've put together for this. You know, in the end, we're an SLA to the
customer. So, you know, it's capacity-based, you know, per gig per month. We have minimum buy-in in terms of, you know, kind of 20 terabytes and, you know, a year commit.
But we feel like the customer should just see that as, you know, any piece of, you know, the architecture that's involved in terms of the edge appliance or the networking or the use of the POP or the cloud or, you know, any of that stuff.
That's just all in as well as the operations and support.
You know, a lot of the goal here is the simplification of the model for consumption of storage.
So from our perspective, like if we have to, you know, whatever it is,
you swap out, you know, drives or, you know, expand or scale or patch or all that stuff,
that just automatically happens for the customer versus them ever having to be involved with that.
Interesting.
So, and behind the S3, I mean, it seems like Amazon has multiple tiers now of storage capabilities.
Well, right.
And so we're using S3.
We're using the basic S3 service.
And, you know And we didn't want
the reduced durability stuff.
Obviously, durability is the reason
we use the cloud in the first place.
There may be some use cases
for some of the newer things.
They have this infrequent access
form of S3,
which may be out there,
but we're still looking at it.
All these things,
because we're basically a single price per gig per month.
You know, all these things are sort of included in our pricing model.
So it's kind of, it's opaque to the customer.
But as our costs go down, this is one of the great things about being a service.
As our costs go down, the price to the customer just automatically ratchets down over time anyway.
And so I could see Ellen shudder from here at that thought.
Well, we live in a storage reality, which is that everyone assumes that costs will continue
to go down.
And we're right, you know, look, we're right in the commodity curve, just like everybody
else is.
So, you know, that's a given.
You know, that's one of the bad things that comes along with buying huge amounts of physical storage and putting it in a data center is you're paying today's prices for tomorrow's capacity when you know tomorrow it's going to be cheaper.
You just know that because the history is self-evident, right?
Yeah, that's one of the things that always annoys me when I see somebody say, yeah, we bought that VMAX fully populated so that we don't have to touch it for three years.
Exactly, exactly. And so in three years, you basically depreciated a lot more money than you had to in order to accomplish that goal.
Yeah, the sad part is that I've been in cases where at the end of the three years, they still hadn't put any data on it.
Wonder about those guys.
So you're obviously tied to Amazon pretty tight.
You know, the way we think about the cloud is that just like, you know, Flash and just like Metro Ethernet and just like other things that we're using, it's a component.
And, you know, that's something that, you know, Laz has really stressed very highly in terms of the way we've architected stuff. And so, you know, my feeling about this is there are
reasons why a different backend cloud will make sense for different, you know, customers, regions,
times in the, you know, evolution of the company and stuff like that. And so we never want to be
in a situation where we've, you know, sort of tightly bound ourselves to anybody because I am sure over time we'll be using multiple clouds as backend, you know, even just
in terms of, hey, there are parts of the country or the world where the best and closest, you know,
public storage cloud may not be Amazon, and we can't worry about that. So...
As a truly paranoid customer, I want my data sent to two clouds.
For sure. Or we've had a couple of retailers tell us that
they would not like their data to go to the Amazon cloud because Amazon is a competitor.
So how about that? Well, that's what the AES-256 is for. Among other things.
I actually just have to make the disclosure that I spent some time up in the ClearSky offices
in July and tested the system in one of its beta states.
And how did that run?
It ran as expected, which, you know, for early beta is always a good sign.
Yeah, I would have to say
that, you know, read performance felt like an all-flash array because I was hitting the
edge cache, and write performance felt like it was arrays doing synchronous replication,
so I, you know, had that additional latency, but it was consistent additional latency. You know,
I'm not saying anything about the particular performance, because it was an early beta version,
and as I was walking out the door Laz gave me 10 minutes on the next beta version that was a lot
faster, so quoting numbers just doesn't make any sense. But, you know, if you think of the system as,
you know, a virtual array that synchronously mirrors back to the POP, that's how it behaves.
And, you know, I can see a lot of use cases. You know, kind of the whole city of Greenwich and Stamford with all those little hedge funds seems like a great place for this kind of thing.
And, Howard, it's time for you to come back and visit us again because, you know, obviously a lot has evolved since you were here in July.
And, you know, we're coming to the end of our beta testing completely and ready to head this thing out the door.
So, you know, we look forward to you doing some more checking in on how performance looks.
Yeah, a whole bunch of questions I would have in the networking, but I'm not a networking
expert, so I'm not even sure I'd understand what your answers are.
But it's a private network at the metro level, and then it's also a private network to the
cloud?
Right.
That's pretty impressive.
Yeah, so it's AWS Direct Connect from the POP to AWS East.
And then it's just amazing how many vendors many cities have for Metro Ethernet. I mean,
when I was ending my consulting career in New York, you had your choice of Time Warner Telecom and Con Ed Telecom and Verizon and three or four other vendors in a lot of buildings
in Midtown and downtown, you know, in the basement, there were multiple sets of fiber you could
connect to. And this is a data center customer where you could have hundreds of carriers, right?
You know, we're not doing office buildings. So that's really available to us, and it's a great component. Yeah. Okay, Howard, any final questions then? No, I think we
got it. Ellen, Laz, is there anything you'd like to add to the discussion? No, I always feel with you
guys we really, you know, get right into the heart of stuff, so I think we're pretty
well covered, and just really happy that we could spend time today. We have to admire how Howard doesn't hold back any.
We know Howard's going to tell it like it is, so that's always good.
I got a reputation to uphold.
Well, this has been great.
It has been a pleasure to have Ellen and Laz here with us on our podcast.
Yes, it has.
Next month, we'll talk to another startup storage technology person.
Any questions you want us to ask, please let us know.
That's it for now.
Bye, Howard.
Bye, Ray.
Until next time, again, thanks, Ellen and Laz.
Thanks.
Thanks for having us.