Software Huddle - NoSQL Transactions in DynamoDB with Akshat Vig & Somu Perianayagam from AWS
Episode Date: September 5, 2023. Amazon's DynamoDB serves some of the highest workloads on the planet with predictable, single-digit millisecond latency regardless of data size or concurrent operations. Like many NoSQL databases, DynamoDB did not offer support for transactions at first but added support for ACID transactions in 2018. Akshat Vig and Somu Perianayagam are two Senior Principal Engineers on the DynamoDB team and are here to talk about the team's USENIX research paper describing how they implemented support for transactions while maintaining the core performance characteristics of DynamoDB. In this show, we talk about DynamoDB transaction internals, performing user research to focus on core user needs, and staying on top of cutting-edge research as a Principal Engineer.
Transcript
One thing you need to learn, and this will go throughout your career, never trust anyone in the distributed system.
That's the default rule.
But I think a key point which Dynamo was emphasizing on and we wanted to do is that we want to build a protocol which is scalable and predictable.
And what is the interface you want to provide to customers?
Generally, transactions are considered at odds with scalability.
One of the things we really actually considered and debated
a lot was multi-version concurrency control. But supporting multi-version concurrency control in
Dynamo would actually mean we have to change the storage engine. Hey folks, this is Alex DeBrie and
I just love today's episode. You know, I'm a huge DynamoDB fan and today we have Akshat and Somu.
They are two of the senior principal engineers on the DynamoDB team.
I have huge respect for them.
They've both been there, you know, before DynamoDB was released.
So it was a great conversation with them.
The DynamoDB team has written some really great papers, one each in the last two years, just talking about some of the infrastructure behind Dynamo.
And the one this year was about distributed transactions at scale in DynamoDB.
So we talk about that paper here.
We talk about database internals.
If you like to nerd out about this stuff,
I think this is a really good talk.
One thing I always love about these Amazon papers,
especially the DynamoDB team,
is just how well they talk about
thinking about user needs and what users actually want.
How can we simplify this down
and what's the technical implementation
to make that happen for them?
One thing after we got off the call,
Akshat and Somu, they wanted to say,
hey, make sure you shout out
the other people on those papers.
So thanks to all the other
authors on that paper.
They especially called out
Doug Terry, who helped with
the paper, with their talks
and presentations.
And I think just with the ideation
and implementation of
transactions in DynamoDB.
So if you like the show,
you know, make sure you like,
subscribe, give us a review, whatever.
Also feel free to reach out
with suggestions, guests, anything like that.
And with that, let's get to the show.
Akshat, Somu, welcome to the show.
Thanks, Alex.
Thanks, Alex.
Thanks for having me.
Thanks for that.
Yeah, absolutely.
So yeah, you two are both senior principal engineers on the DynamoDB team at AWS, which
is a pretty high position.
Can you give everyone a little background on what it is you do on the Dynamo team,
how long you've been there, things like that?
Yeah, I can go first.
So I joined Amazon, I think 2010.
And from there, I first was working in Amazon India.
And then when I saw AWS getting built, I was like, hey, I want to work here
because, you know, the problems are super fun.
So I joined first SimpleDB team.
And at the same time, DynamoDB was incepted.
So I've been with DynamoDB right from its inception
and have been able to contribute a lot of bugs and a lot of features
to DynamoDB over the years, like DynamoDB streams,
point-in-time backup restore, transactions, global databases.
And we're going to talk about transactions today.
So like Akshat, I've been with Amazon for about 12 years now.
I started in Dynamo, and I've been working in Dynamo.
I've worked in all components of Dynamo, front and back and control plane.
But my areas of focus right now are replication services,
transactions.
So replication services is global secondary indexes,
global tables,
what we're doing for regional table replication
and how we make it highly available.
So much of my focus has been around this stuff,
but around all the multi-region
services we have as well at this point in time. Awesome. Great. Well, thanks for coming on because
like, you know, I'm obviously a huge DynamoDB fan and big fans of you two. I'm excited to talk about
your new paper. You know, there's a really good history of papers sort of in this area, right? Like the original Amazon Dynamo paper, not DynamoDB, you know, in '06 or so, really kicked off a lot in the NoSQL world. Last year, the Amazon DynamoDB paper basically said, hey, here's how we took some of those learnings, made it into this cloud service, and what we learned and what we built with DynamoDB. And now this year, this new
transactions paper that came out,
which is distributed transactions at scale on DynamoDB,
if people want to go look that up,
just showing how you added that on
and how transactions can work at scale.
So I'm excited to go deep on that today.
Maybe just to get started, Akshat, do you want to tell us,
like, what are transactions?
And, you know, especially, what's unique about transactions in NoSQL databases?
Yeah, so I think if you look at NoSQL databases,
a lot of NoSQL databases either do not support transactions
because NoSQL databases, they are, you know,
generally the key characteristics that are considered good
or that the reason people
choose them is high availability, high scalability, and single digit millisecond performance.
DynamoDB provides all three. So specifically, generally transactions are considered at odds
with scalability. And by scalability here I refer to two things: one is predictable performance, and second is unbounded growth, like your table can be really tiny in the beginning, and as you do more traffic it can scale, it can partition. So mostly, I think, previously we have seen a lot of NoSQL databases shy away from implementing transactions, or some do implement them, but they implement them in a constrained form where you can do a transaction on a single partition, all the items that reside, you know, at a single machine. So when we started hearing from our customers that, hey, we would like
to have transactions in DynamoDB.
So you're like, okay, first, let's just understand why do you actually need it?
Because we have seen a lot of workloads
that are running on DynamoDB without actual transactions.
So what exactly are you looking for in transactions?
So I think we went through that journey
and took the challenge that,
hey, we really want to add transactions which provide
the ACID properties,
atomicity, consistency, durability,
and isolation
for multi-item
and multi-table writes
that you want to do,
reads or writes that you want to do
on your database table
in DynamoDB
or across tables in DynamoDB. And that's how we started. Absolutely. And so transactions were released
at reInvent 2018. So this is six and a half years after Dynamo's been out. I guess how soon after
Dynamo being out, were you starting to get requests for transactions? How long did that
sort of user research period last?
Like you're saying, like, what do you need these for?
What sort of constraints do you have here?
Yeah, so before we actually added transactions,
I think there was a transactions library that was built by Amazon,
like one of the developers in our team, David Yanacek.
He built a transactions library that was essentially trying to provide the same experience of, like, ACID properties
on your database
so this was I don't remember
exactly but this was I think
2016-ish time
2014-ish
2015-ish I think
something around those times
but I think the pattern that we were seeing
was a lot of, for example,
control planes that are getting built
or a lot of teams in Amazon
who are using DynamoDB.
And at that time, there was also a push
that, hey, we want to move
all the workloads to DynamoDB
and get away from the relational databases
that we have seen have like scaling
limitations. So transactions became really important for making that transfer from SQL databases to NoSQL databases. And at that point, the transactions library was one thing where we saw that, okay, the adoption of the transactions library is increasing.
So that was one signal.
And second is people started telling us about, hey, the transactions library is great,
but there are certain limitations that we are seeing with that, which is for every write,
we have like 7x cost we have to pay, because the transactions library essentially was trying to maintain a ledger and the whole state machine of where the transaction is, how far it has gone forward, and in case the transaction is not going to finish, it has to do the rollbacks and things of that nature. So all the complexity was actually encapsulated in this library as an abstraction given to the customers. So overall, I would say the signal of people
adopting that library a lot more
and direct conversations with the customers
hearing about it,
this is a specific use case we're building.
And it would really simplify
if there was like acid properties,
like full atomic transactions across multiple tables
and multiple items in DynamoDB.
Yeah.
It's interesting to see that trade-off
between the client-side solutions,
like the transactions library
or a few other ones that Dynamo provides,
and then the actual service solutions.
Given that Dynamo sort of gives you
all the low-level access to most of the stuff,
you can perform that
or be kind of like a query planner
or a transaction coordinator client-side if you want. But then it's nice when that can move up into that server layer.
I guess once you decided, hey, we're building transactions, how long does it take? You know,
you already had tens of thousands of users, you know, probably millions of requests per second,
things like that. How long did that take to build and deliver that feature
where it's available, you know, at reInvent in that November?
I think once we kind of decided
that we wanted to build transactions,
we had a bunch of people go and figure out like,
hey, what are the algorithms which is doable?
So going back to your initial question of like,
NoSQL databases usually shy away from
transactions, it's the scale and complexity of it, right? Like there are different algorithms
you can do or implement. And then you have, once your transaction fails, how do you kind of recover
transactions? And a lot that has to go into like, what is the algorithm you're going to choose and
build? But I think a key point which Dynamo was emphasizing on and we wanted to do is that we
want to build a protocol which
is scalable and
predictable and what is the interface we want
to provide to customers because traditional transactions
have been like, hey, begin transaction, end transaction.
And a lot of customers are used to that.
Right? But that
would take away a key
tenet of Dynamo, which is like predictable
performance because you now don't know how
long your transaction is going to be, right?
So how do we kind of balance
that trade-off?
How do we kind of expose this
to customers?
What are the protocols
we're going to choose?
I think we spent a lot of time
on that.
And then when we were closer
to knowing what the protocol was
and what the APIs were,
I think it was roughly
about a year,
I would say,
that it took us to kind of...
Yeah, and I think a lot of time, I would say, goes into, as Somu was saying, understanding the state of the art, what already exists, and then doing trade-offs and POCs to actually, you know, figure out and decide this is the right one. Because, you know, in Dynamo, as I was saying, ACID, like atomicity for a single item, was already there. Consistency, like you have consistent reads and eventually consistent reads, and, you know, when you do a write you preserve the correct state, so consistency you already get. Isolation, I think, was the main thing, and atomicity across multiple items was the thing that we wanted to add.
So I think a lot of time,
I would say,
goes into two phases.
One is just figuring out what to do.
And once you figure out,
building, I think, is the fastest.
That last part is actually proving
what we have built is correct.
So, yeah.
You talked about different constraints
that different ones have on it.
You talked about, you know,
some of these only implemented
on a single shard or node
or partition, whatever that is.
I assume that wasn't really feasible for Dynamo
just because that's sort of invisible to you
and because those partitions are so small.
But that other constraint of,
hey, it has to come in as a single request
and all get executed together,
as you mentioned, like, was that something you narrowed in on pretty early, of like, hey, this is what we're going to do, and where you checked with users, like, is that going to be okay, will that still give you what you want? Or is that something that, you know, took a while to hash out and figure out?
Yeah, so I think for that specific journey, if I recall, I think we did a lot of, I would say, experiments and research on that. And it involved trying out some of the workloads.
So we actually went and talked to customers to understand,
hey, why do they use this concept of begin and end transaction?
And specifically, I think one of the biggest reasons we chose it is that, you know, if you let someone do, like, begin transaction and then send a bunch of writes and reads and also other operations, maybe someone puts a sleep there, so the resources are tied up for that long for that particular transaction. And then when the resources are tied up, you also don't get predictable performance. So I think a lot of these decisions went into defining the tenets
for what transactions should look like.
So we essentially defined goals for it
that we want to execute a set of operations
atomically and serializably
for any items in any tables
with predictable performance
and also no impact to non-transactional workloads.
So a lot of like techniques, standard techniques,
like two-phase locking and, you know, the begin and end transaction approach,
like a lot of those just like did not make sense for us. And even, I think, for example,
one of the things we really actually considered and debated a lot was multi-version concurrency control.
If we could build something on that, you know, you get like read isolation.
So your reads could be isolated from writes.
But supporting multi-version concurrency control in Dynamo would actually mean we have to change the storage engine.
And if you build MVCC, you need to track multiple versions,
which means the additional cost that comes with it
of storing multiple items,
then you have to pass that cost to the customers.
So, you know, that particular also,
we had to, all these basically standard approaches,
we had to reject.
And then we nailed it down to,
okay, we want to do like a single request transactions
based on these goals or tenets that we have defined.
So then we went to some teams in amazon.com and said that, hey, if we provided these two APIs, would you be able to convert your existing transactional workloads into, like, a DynamoDB transaction? And we did a similar exercise with some external customers as well, to validate that what we are building does not have, you know, obvious adoption blockers and things of that nature. And it turns out that all the use cases we actually discussed with the customers, we were able to convert them into the two operations we added to the DynamoDB API:
one is transact write items and second is transact get items.
And just to explain transact write items and transact get items a little bit,
essentially with transact write items, you can do a bunch of writes, which could be update,
delete, or put request. And you can also specify conditions. The conditions could be on these items
which you're trying to update, the DynamoDB standard of like OCC, right, that you do. Or you can also do a check item, which is not an item
that you're updating on a transaction. And similarly for transact gets a separate API,
where you can do multiple gets in the same call, which you want to read in a serializable manner.
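For readers who want to see the shape of those two calls, here is a minimal sketch using boto3. The table names, keys, and attributes are made up for illustration; only the documented request structure is assumed.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# TransactWriteItems: a mix of Put / Update / Delete / ConditionCheck actions
# across tables, executed atomically.
dynamodb.transact_write_items(
    TransactItems=[
        {
            "ConditionCheck": {  # check an item without writing it
                "TableName": "Accounts",
                "Key": {"AccountId": {"S": "acct-1"}},
                "ConditionExpression": "AccountStatus = :active",
                "ExpressionAttributeValues": {":active": {"S": "ACTIVE"}},
            }
        },
        {
            "Update": {  # conditional write (the OCC-style check mentioned above)
                "TableName": "Balances",
                "Key": {"AccountId": {"S": "acct-1"}},
                "UpdateExpression": "SET Balance = Balance - :amt",
                "ConditionExpression": "Balance >= :amt",
                "ExpressionAttributeValues": {":amt": {"N": "100"}},
            }
        },
        {
            "Put": {
                "TableName": "Transfers",
                "Item": {"TransferId": {"S": "transfer-123"}, "Amount": {"N": "100"}},
            }
        },
    ],
)

# TransactGetItems: multiple reads returned as one serializable unit.
resp = dynamodb.transact_get_items(
    TransactItems=[
        {"Get": {"TableName": "Balances", "Key": {"AccountId": {"S": "acct-1"}}}},
        {"Get": {"TableName": "Transfers", "Key": {"TransferId": {"S": "transfer-123"}}}},
    ]
)
```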
Yep, absolutely.
Yeah, I love that single request model.
And I think you're right that like almost anything
that can be modeled into it
and the ones that can't are probably the ones
your DBA is going to advise you against doing anyway
on your relational database,
like where you're holding that transaction open for a while
and maybe calling some other API or something like
those can really get you into trouble.
You mentioned sort of like,
hey, what's the state of the art
in terms of protocols and patterns and things like that?
Like, where do you go look for research on transaction protocols
or just different things that's happening?
Is that academia? Is that industry papers?
Or where are you finding that stuff?
Oh, right. I think there's been a lot of good work in academia
starting from the 60s about transactions.
It is very interesting because the inspiration we took was from one of the papers published by Phil Bernstein.
And this was in the 1970s when most of us were not even born.
Right.
So I think academia has a lot of the good research.
And then there's been a lot of good
research in the industry as well now, like
there's been, industry's been doing a lot of research
and we've been publishing recently as well, so industry's
also been doing a lot of research. So
we look back at a lot of the
papers which are published
in standard computer science conferences like
USENIX, SIGMOD,
OSDI,
and then learn from what has worked in the past
and what has not worked in the past
and what will work for us technically, right?
Like in case of transactions, the timestamp ordering,
why does it work for us?
We will definitely go into details.
And there's an element of that as well here
as like what makes sense for us.
Yeah.
What does that look like at Amazon? Like, is it mostly just informal, like, hey, did you see this new paper? Or are there, like, you know, scheduled reading groups or different things like that to make sure everyone's up on the latest stuff?
We have scheduled reading groups, because we have people of varied interests and we want to learn a lot about what's happening and what's not happening, and we may not get to do that on a day-to-day basis in the job, right? So we have people
who have focused reading groups
who read papers all the time
and talk about like,
hey, pros and cons.
What did we understand?
What did we not understand?
What did the paper do well?
What did the paper not do well?
Like we had them.
And we talk a lot about
how to use the different things.
Like, for example,
a big thing within Amazon
is like how do we use
formal modeling tools
like TLA Plus or P modeling, right?
And we have scheduled groups
which kind of go dive deep
into that stuff.
So there are scheduled groups
for everything like data structures,
algorithms, distributed systems.
And I know like I've seen a lot
on TLA Plus at Amazon
is that something that
you know
both of you are doing
or is that something like
hey there's a group
that's really good at that
or a few people
that are really good at that
and they'll come help you through
like how often are you
actually using those
those sort of methods
so there
there are very few people
who use TLA Plus
partly because
it's more complex
but it's very helpful
like for example
with PlusCal, it's made life a lot easier for you and me to go write something. Back in the day, the TLA+ specification was harder to write, but with PlusCal it's very easy; when they convert it to TLA+, it's easy to write.
the P modeling is
something which we kind of
have all developers now
kind of use
because it's closer to
the code you would write
and it is easier to kind of prototype with a P model, and take a model in P and then run with that stuff. I think that's something we have asked all developers to write. TLA+ has usually been limited to a smaller set of developers; we use this stuff for a really very critical set of problems, like Dynamo. When we did Dynamo first,
we had a TLA+ model for all of Dynamo operations
to ensure that everything is correct.
And that's still the foundation for Dynamo in some ways.
And same for transactions.
We did a similar thing for transactions as well
to prove the correctness of the algorithm.
And similar to that, we actually also have like a verifier,
ACID verifier, which runs in production. You know, since whatever time transactions have been launched, we still run the ACID verifier, just to, you know, make sure that we don't have any gaps or any blind spots, things like that, to ensure the protocol is correct.
Yeah, absolutely.
Okay, one more thing
before we get into
internals of transactions.
Like, you're both
senior principal engineers.
You've been at Dynamo
for 12 years.
Like, obviously doing
a lot of higher level stuff.
I'm sure writing documents,
writing these papers,
giving talks.
But Amazon is also known
for being very, like,
practical hands-on
for their advanced people.
Like, how much
during,
how much time during the week do you
still sit down and write code?
So I think it varies on the different phases of the project. Like, overall, I would say, if I look at the full year, a lot of time I think is spent in figuring out what we are doing and how we are doing it and whether it is, you know, correct or not. And then the second phase is, I think, where you write the P modeling stuff that Somu was talking about; I think a lot of time gets spent in that. And third is, I think, POCs, where you come up with an idea and write a POC to prove that, hey, this actually makes sense, or that whatever we are claiming is actually going to be achieved.
So that's one.
And then third, I would say the last part is, you know, reviewing and ensuring that
operationally we are ready and ensuring that the testing that we are doing, we have like good coverage.
So I would say like writing code,
testing, remodeling, writing docs,
it's like equal split
in terms of like the time spent.
And if I am working on a project,
I would usually take something
no other developer wants to take
or non-critical
because I'm not blocking them
in any way or fashion because
I'm doing a bunch of
other things as well
simultaneously.
So I think,
like Akshat said,
it depends on the phase
of the project.
If it's something
which is an ideation
at this point in time,
we would write a bunch
of code to kind of
prove it works,
it doesn't work.
Or we're doing some
modeling stuff
at this point in time,
right?
So that's how we can
ensure that we are
up to date and hands-on
on the stuff as well.
And the other part is also code reviews,
which still keep you very close connected.
So that because operationally,
I think if you're not connected operationally,
it's very hard to debug things
when you get paged at night at 2 a.m.
Yeah, yeah, exactly.
Cool. Okay, let's get into transaction internals.
First thing, two-phase commit, which is the pattern you use here on the transaction coordinator.
Do you want to explain how two-phase commit works?
Yeah, so before that, let's just talk through a high-level DynamoDB normal put request that comes and flows through.
And then I'll add the two-phase, how we implemented that.
So first, any request that like a developer or an application
sends to DynamoDB, it first hits like load balancer.
From there, it goes to a request router, which is like stateless fleet.
The request router has to figure out where to send this request.
Like if it's a put request, it sends it to a leader replica
of a partition.
Now DynamoDB table is partitioned
for like scale
and that number of partitions
are identified based on the size
or the read and write capacity units
that you want for your table.
So you might have a table
which has like 10 partitions
and this item that you're trying to put
will reside in a specific partition, and that partition has three replicas, and one of the replicas is the leader replica. So the write request goes to that leader, and it replicates it to the two other replicas, the two followers. Once it gets acknowledgement from at least one more, so two copies are durably written, we acknowledge it back to the client.
To find out which storage node to route the request,
there is a metadata system which we use.
Now for transactions,
we introduced transaction coordinator,
which has the responsibility
of ensuring that a particular transaction
that is accepted has to go through completely.
And so a request that customer makes,
like a transact write item request,
it goes to the request router,
goes to the transaction coordinator.
First thing the transaction coordinator does,
it stores it in a ledger
and ledger is like a DynamoDB table
and we can come back to it.
But the main point of the ledger is to ensure that whatever request we accept, we execute it atomically,
like either the full request succeeds or it does not succeed.
And second part is fault tolerance,
that if a transaction coordinator,
which is processing a request crashes,
since the request is stored in the Ledger,
any other transaction coordinator can pick it up and run with it, right?
So transaction coordinator, once it stores it in the ledger,
it is kind of doing like checkpointing and state management
of where the transaction is.
So once it is stored in the ledger,
it sends prepare messages to all the storage nodes involved.
So let's say you are doing
a 10 item transaction, which are for 10
different tables, and there could be
10 completely different partitions,
all in the
same account.
Now, once that
request is sent for
prepares, at that point
all the check conditions, like if you're doing
an OCC write with a put item
or you're purely doing just a check item.
And just to interrupt you, what's OCC?
Yeah, so optimistic concurrency control. So if you want to do a write saying that, hey, I want this write to succeed only if certain conditions evaluate to true. If that happens, then only accept this write; otherwise, you know, reject this particular write request that we are sending to you. So the prepare messages are evaluating that, and it also evaluates any of the validations, like the 400 KB item size limit, things like that. If any of those will not be met, then you should just reply back saying, I cannot accept the transaction.
But assuming that every storage node,
all the 10 storage nodes in the 10 item transaction case,
reply back saying that, yeah, this particular transaction prepare,
we can accept.
The transaction moves on to the commit phase.
And once it has passed the prepare phase, i.e. the transaction coordinator got acknowledgement from every storage node
and it is also durably written in the ledger that the transaction has finished the prepare state,
it moves to the commit state, which is making sure the actual write happening at that particular point.
So the item is taken from the ledger and then sent to the specific storage node to finish the transaction.
And once the commits are done, your full transaction actually is finished. So at a high level, that's the two-phase protocol.
Gotcha. Okay. So we have prepare and commit. Prepare is
just basically checking with every node saying, hey, is this good or not? If they all come back
with that accept, thumbs up, then it comes back and says, okay, go ahead and commit.
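To make that flow concrete, here is a rough, simplified sketch of the coordinator logic described above. It is not DynamoDB's actual implementation, just an illustration of the ledger-backed prepare/commit sequence; the ledger, storage-node client, and function names are all hypothetical.

```python
def execute_transaction(ledger, storage_nodes, txn):
    """Simplified two-phase flow: ledger write, prepare fan-out, commit fan-out."""
    # 1. Durably record the transaction so any coordinator can resume it.
    ledger.put(txn.id, state="STARTED", items=txn.items)

    # 2. Prepare phase: every storage node checks conditions and validations
    #    (condition expressions, item size limits, ...) and records a marker.
    accepted = all(
        storage_nodes[item.partition].prepare(txn.id, item, txn.timestamp)
        for item in txn.items
    )
    if not accepted:
        # No rollbacks needed: nothing was written, just cancel the markers.
        ledger.update(txn.id, state="CANCELLED")
        for item in txn.items:
            storage_nodes[item.partition].cancel(txn.id, item)
        return "TransactionCanceledException"

    # 3. Commit phase: once recorded as COMMITTING, the transaction must run
    #    to completion; failures only delay it, they never undo it.
    ledger.update(txn.id, state="COMMITTING")
    for item in txn.items:
        storage_nodes[item.partition].commit(txn.id, item)

    ledger.update(txn.id, state="COMPLETED")
    return "OK"
```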
And once it's in that commit phase and then tells them all to execute,
is there basically like no going back?
Even like say one of those no's failed
originally or something happens,
like we're just going to keep trying until that,
like we've already decided this transaction
is going through at this point.
Yes.
Once the transaction has reached the commit phase,
then it's executed to completion.
Failure of transaction coordinator
or failure of a node
which is hosting the partition
was not going to stop it.
It's going to kind of finish it
complete to completion.
If a transaction coordinator fails,
another one is going to pick it up,
pick it up and say,
hey, the transaction is in commit phase.
I'm just going to send commit messages
to all the items
which are involved in the transaction,
no matter whether it knows whether a single item is sent,
commit has been sent or not.
If a storage node fails, it's the same thing.
When nodes fail all the time,
a new leader is elected
and the new leader can complete the commit.
It doesn't need any prior knowledge
of the transaction at this point in time.
Okay.
So tell me about that transaction coordinator failing.
How does a new one pick up that stall transaction and make sure it gets executed?
So all transaction coordinators run a small component of the recovery.
So they keep scanning the ledger to say, are all transactions getting executed?
And if they find a transaction which is not executed for a long period of time, then they would say, this transaction is not
executed at this point in time. So either we kind of take it forward. So let's say there's a
transaction in prepare state. So transaction coordinator may say, you know, this transaction
has not been executed for a long time. It's in prepare state. So I don't know what happened to
all the prepares. What I'm going to just do is cancel this transaction, I'm not going to execute this
transaction. So I'm going to move this into a canceled phase and then send cancel notifications
to all the members involved in the transaction. Or it can decide, oh, the transaction is in commit
phase, let me just take it to completion and send everybody a commit message at this point in time,
right. So this is a small recovery component. There's a
small piece
we missed, which is like when we do
the prepares for an item,
every storage node has a marker saying, well, this item
has been prepared for this particular transaction.
And let's say that
for some reason that
the transaction
has not been acted upon for some period
of time and the storage node looks at the item
and says,
hey, this item is still
in prepared state
for quite some time.
It can also kick off
a recovery and say,
hey, can you please
somebody recover this transaction,
recover this item for me
because it's been like
a long time
since the transaction started.
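A hypothetical sketch of that recovery sweep, just to illustrate the idea; the threshold and method names are invented, not taken from DynamoDB.

```python
import time

RECOVERY_THRESHOLD_SECONDS = 10  # hypothetical; "a long time" is on the order of seconds


def recover_stalled_transactions(ledger, storage_nodes, segment):
    """Each coordinator scans a segment of the ledger looking for stuck transactions."""
    for txn in ledger.scan(segment):
        if time.time() - txn.last_update < RECOVERY_THRESHOLD_SECONDS:
            continue  # still making progress, leave it alone
        if txn.state == "COMMITTING":
            # Already decided: drive it to completion; resending commits is safe.
            for item in txn.items:
                storage_nodes[item.partition].commit(txn.id, item)
            ledger.update(txn.id, state="COMPLETED")
        elif txn.state == "STARTED":
            # Never reached commit: cancel it and release the prepared markers.
            ledger.update(txn.id, state="CANCELLED")
            for item in txn.items:
                storage_nodes[item.partition].cancel(txn.id, item)
```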
And when you say a long time,
how are we talking here?
Are we talking like seconds
or like a minute
or what does that look like?
We're talking like seconds at this point in time, right?
Yeah, seconds, seconds, yeah.
And I think
the most interesting part
out of this also is
there is no rollbacks.
That's why there are
no rollbacks here, right?
Like because
the prepare phase
is actually not writing anything.
It's just storing that marker
that Somu pointed out.
And hence,
if any of the prepare fails
or we identify that this transaction cannot be completed,
we just send cancellation,
which is basically, yeah, aborting the transaction.
Gotcha.
And if anything is in the, I guess, the prepare phase
where a node has accepted it and then sent back an accept,
but maybe the transaction is stalled for whatever reason.
Are writes to that item
effectively blocked at that point
until it's recovered?
Yes.
So the writes to that particular item cannot be serialized right now. So you would have to have the transaction complete to have the writes serialized. So any other singleton write would be kind of rejected, saying, hey, there's a transaction conflict at this point in time. We have to reject it.
But we can talk a little bit more about this because
we did talk in the paper about some optimizations we
can do there and we know
that we can do this optimization. But in reality,
we have not seen this happen. Customers
mixing traffic
of transact writes with singleton writes. So we
kind of don't see this thing
much in practice to kind of go and say,
we have to go and implement this optimization where
we can serialize these writes.
Oh, that's interesting.
So most items you see
are sort of either involved
in transaction writes or singleton writes,
but not both.
That's interesting.
Which is kind of like
a recommendation, I think,
from Cassandra.
They're like lightweight transactions
because I think you can get
some bad issues there
with that.
But it's interesting that customer patterns sort of work out that way anyway. Yeah, and I think you can get some bad issues there with that. But it's interesting
that like customer
patterns sort of work
out that way anyway.
Yeah.
And I think the part of, like, if there is a transaction stuck, as Somu pointed out, if there is a write request that comes to it and the transaction has been stuck for a while, that also will kick off, you know, recovery automatically.
Yeah.
Plus, I think when
we devised these
algorithms, we
actually thought about,
you know, we want to
support for like contention as well.
So that's why we chose timestamp ordering and where we can do some interesting tricks, which we talked about.
And we actually, you know, also tried some of those implementations before we went ahead with this approach.
Yeah. Okay. And for a transaction that's stuck, like, what happens to the client there?
Is that just hanging until, you know, it times out at like 30 seconds, whatever the client timeout is?
Or if something picks it up, is it going to be able to respond back to that client?
Or is that basically just like, hey, we'll clean it up, but the client, you know, they're sort of on their own at that point?
So the transact write item requests, they're actually idempotent. So let's say a request took longer than the client timeout: clients can just retry using the same client token, which is the idempotency token. That token uniquely identifies the transaction, and based on that we can tell you that, hey, this transaction actually succeeded if you come back, or this transaction failed.
But again, most of these transactions
are we are still talking millisecond.
We're not talking seconds to finish, right?
Most of the transactions are still finishing in milliseconds
and getting clients are getting an acknowledgement back.
Yep.
Should I, you mentioned the idempotency in the client request token on a transact write. Should I always include a client request token? Like, there's no, I mean, not cost, but even like, there's no latency cost on that, or any sort of cost, of just including that?
That's a recommendation from DynamoDB: if you're using a transact write items request, use the client token so that you can recover really easily and retry as many times as you need.
There is a time limit
for which this client idempotency token will work
because you might be trying to do a different transaction.
So there is a time limit after which it won't.
And so, yeah, it is recommended to use it.
So the nice thing about client request token, Alex,
is that let's say your client for some reason timed out, but the
request was executed successfully on Dynamo side.
You can come back with the same thing in Dynamo and say, hey, this transaction was successful.
You don't have to kind of execute this stuff.
I think that's a super nice thing about the client request token.
And also the fact that, let's say that if for some reason you come back and the idempotency token is expired, I think that window was 10 minutes at this point in time, we would try to re-execute the transaction, right?
But most of the transactions
usually have conditions in them
and the conditions will fail
and then we will say
okay, you know
this transaction has a condition failure
so we won't be able
to execute this stuff.
Yeah, and this client token
actually was not something we initially
planned to add. This was, again, when we
built it, we gave it to a few customers,
they tried it out, and they were like,
hey, this particular use case, you know, we don't know
if this transaction succeeded or failed because
we timed out. So this was, like, I would say
in the later part of the project, we designed it,
implemented it, and
launched it. So quite a
flexible and iterative process.
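As a rough illustration of that retry pattern with boto3: the token value, table, and items below are made up, but within the idempotency window, retrying with the same ClientRequestToken will not apply the writes twice.

```python
import boto3
from botocore.exceptions import ReadTimeoutError

dynamodb = boto3.client("dynamodb")

# One token per logical transaction; reusing it on retry makes the request
# idempotent for a window of time, so a timed-out call can be resubmitted safely.
token = "transfer-123-attempt"  # hypothetical; a UUID is a reasonable choice

transact_items = [
    {
        "Update": {
            "TableName": "Balances",
            "Key": {"AccountId": {"S": "acct-1"}},
            "UpdateExpression": "SET Balance = Balance - :amt",
            "ExpressionAttributeValues": {":amt": {"N": "100"}},
        }
    }
]

for _ in range(3):
    try:
        dynamodb.transact_write_items(
            TransactItems=transact_items,
            ClientRequestToken=token,
        )
        break  # success (or an earlier attempt already succeeded server-side)
    except ReadTimeoutError:
        continue  # retry with the same token; the write is not applied twice
```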
Yeah, cool. And, okay, so you mentioned that there's like the 10 minute window where that
request is sort of guaranteed to be identifiable if you're including that token. So are you
just keeping records in that transaction ledger for 10 minutes, like expiring at some point,
but at least they're hanging around for 10 minutes is the point there. Okay. Okay. And then you mentioned like looking for stalled transactions. Is that
just like, you're just like sort of brute force scanning the table, like taking all the transaction
coordinators, each one's taken a segment and just continually running scans against it?
It's a parallel scan.
So the ledger is a DynamoDB table. I think we talked about this before. And I think it's
very heavily
sharded, to put it
nicely. So you can do a lot
of scans on this table. And it's
a
pay-per-request table, right?
So it's, and we have
all the transaction coordinators. They can pick
a small segment of it and say, I need to scan
a thousand items.
So,
and they all can scan it quite quickly
and figure out any
transactions
that are stalled.
Yeah.
Okay.
Tell me about
that DynamoDB table
that's used
for the ledger.
Like,
is that,
is there like a
different Dynamo
instance somewhere
that's used for
these internal type
things like the ledger
or is it just like a loop of writing back to itself? Because, you know, Dynamo as a service is a multi-tenant service, so all these customers within a region are using the exact same Dynamo service. So I guess, like, how does that sort of foundational Dynamo instance work? Is that a separate instance that's sort of different and special or anything like that?
No, this is a normal user-level table.
Like, the transaction coordinators are just another user, and it's a normal user-level table at this point in time.
As you mentioned, there is a circular dependency here, so you can't use transactions on this table,
but we don't have a need to use transactions on this table, right? So
this is a normal user-level table, so we get all the other features
of Dynamo, which we can
use.
Wow, that's pretty amazing. Okay.
Alright, you mentioned
timestamp ordering a couple
times. What, I guess, what is
timestamp ordering? How does it, how do you
use it in transactions?
Yeah. So timestamp ordering.
So we talked a lot about atomicity till now, the two phase protocol, right?
Like for serializability, we decided to like borrow timestamp ordering technique,
which so-
And hold on serializability, this is like a confusing topic, but just like high
level, we'll spend hours on that.
If you could do like what no one else has managed to do and describe that in like one
or two sentences, like what's the high level idea of serializability?
So I think it's mainly around concurrent access.
If you have like concurrent access of like data in a database, right, you need to define
an order in which these transactions are executed.
So timestamp ordering has this like very nice property that if you assign a timestamp to each transaction, the timestamp basically is the
clock that is being used from the transaction coordinator. The assigned timestamp
defines the serial order of all the transactions that are going to execute on a set of tables that you're doing.
So that basically defines the serial order
of the transaction,
even if you have like concurrent access
from multiple users trying to do
like transactions on the same set of items,
timestamp ordering give this nice property
where we can serialize
or define a serial order of these transactions.
It's like kids are coming and asking us something, right?
And then you say, hey, hold on,
your brother asked me something first.
I'm going to kind of execute his request first because we need one parent at this point.
Right.
So that's exactly what timestamp ordering allows us to do is to have concurrency control to say, hey, which transactions get how, what is the order in which transactions will get executed.
Awesome.
Awesome, awesome. And then I love that example because that helps bring it up.
What, again, sort of like two-phase, like what other options were there in terms of ordering and serialization that were considered?
So, two-phase locking is one
where you
like lock the
items on which
you're executing
the transaction
and then you
finish the
transaction
then move on
to the next one
but locks means deadlocks.
Locks means like a lot of things
that you have to take care.
So we didn't want that.
That's why timestamp ordering,
which gives you this nice property of,
like if you assign timestamps, as I said, and the transactions execute or appear to execute at their assigned time,
serializability is achieved.
And if you have like,
the nice property is,
if you have the timestamp assigned,
you can accept like multiple transactions.
Even if let's say one transaction is prepared,
I accept it on a particular storage node.
If you send another transaction
with a timestamp,
you can like put it in the specific order
and execute them because there is a
timestamp associated with it.
There are certain rules which you have to evaluate whether this particular second transaction
you should accept when there is already a prepared transaction or not.
But yeah, that's the key thing with timestamp ordering.
It's also simple in the sense that, let's say I accepted a transaction with timestamp 10 and I get a transaction with timestamp 9. I'm going to say, you know what, I already accepted something with 10, I'm not going to execute 9 anymore, please go away and come back with a new timestamp. It's like anywhere else, like a DMV, where maybe they kind of accept 9, but, you know, they don't accept something which is very old.
Yep, yep. Yeah, I thought that was one of the most interesting parts of the paper, just talking about the different interactions and sort of optimizations on top of that, you know, interacting with singleton operations, writes or reads, and how those can interact with a conflicting transaction, or conflicts among transactions and things like that. I thought that was really interesting.
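A tiny sketch of that acceptance rule on the storage node side, purely illustrative: real DynamoDB tracks more state than this, the paper describes additional rules for stacking prepared transactions, and all the names here are invented.

```python
class ItemState:
    """Per-item bookkeeping a storage node might keep for timestamp ordering."""

    def __init__(self):
        self.last_write_timestamp = 0
        self.pending_prepare = None  # (txn_id, timestamp) of an accepted prepare

    def try_prepare(self, txn_id, timestamp):
        # Reject transactions that arrive "in the past": something with a
        # newer timestamp was already applied to this item.
        if timestamp < self.last_write_timestamp:
            return False  # "come back with a new timestamp"
        # Simplest policy: one prepared transaction per item at a time;
        # the paper discusses when a second one can be accepted.
        if self.pending_prepare is not None:
            return False  # transaction conflict
        self.pending_prepare = (txn_id, timestamp)
        return True

    def commit(self, txn_id, timestamp):
        assert self.pending_prepare and self.pending_prepare[0] == txn_id
        self.last_write_timestamp = timestamp
        self.pending_prepare = None
```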
I guess like one question I had on there,
it mentions that the transaction coordinator,
that's what assigns the timestamp, I believe, right?
The coordinator node.
Okay, so that's using the AWS TimeSync service.
So that should be within like a couple microseconds
or something like that.
But it also says like,
hey, synchronized clocks are actually not necessary
across these,
and there's going to be a little bit of discrepancy.
So I guess, why aren't they necessary?
And why is it useful?
You know, why aren't they necessary?
And then why is it helpful, I guess,
to have them as synchronized as possible?
I think, so for the correctness of the protocol,
synchronized clocks are not necessary, right?
Because the clocks just act as like a number
at this point in time, just a token number.
And if there's two transaction coordinators
which pick different numbers,
then it gets automatically resolved
and who comes out first, right?
So clocks don't have to be synchronized for correctness.
From an availability perspective,
you want to have clocks
as closely as possible,
synchronized as closely as possible.
So the same example
I just gave you a couple of minutes ago
is that let's say
there's a transaction coordinator
whose clock is
off by a couple of seconds, right?
It's behind by a couple of seconds.
Then always its transactions
are going to get rejected
for the same items,
which another transaction coordinator
assigns timestamp
because its time is behind.
So it's always going to get a,
I already executed a transaction of timestamp X
and your timestamp is less.
So I'm not going to execute your transaction.
So from an availability perspective,
it's nice to have clocks closely in sync.
And that's exactly why we have,
we use timestamp because we have some guarantees
around how much clock drift is going to be there, and we can control the precision of the clocks.
Yeah, so it's to avoid unnecessary cancellations because of these variable timestamps. And for load, we have different transaction coordinators, so timestamps could vary. But we also have guardrails in the system where, if we identify that a particular coordinator has its time drifting, we just excommunicate that node out from the fleet. Or, you know, the storage node also has checks in place where, if a transaction coordinator sends a request which is way out in the future, it will say, hey, dude, what are you doing? I'm not going to accept this transaction.
So we have guardrails across like, you know, different levels of guardrails in place to
ensure that we keep high availability for these transactions.
I was just going to ask that because it seems like everywhere in Dynamo, it's sort of like
everyone's checking on each other all the time.
And it's just like, hey, if I get something goofy, I'm going to like send that back and
I'll have to tell them to get rid of that one
this was like
when I joined
the SimpleDB team
I was working with
like a guy David Lutz
and he was like
I asked him
I had not built
distributed system
he's like you know
one thing you need to learn
and this will go
throughout your career
never trust anyone
in the distributed system
that's the default rule
that's amazing
yeah
I want to see it.
Okay,
we talked
about
serializability
and I know
like one
thing that
comes up
a lot
around
this is
like
isolation
levels
which again
is like
a whole
other
level
of depth
in terms
of that
but
tell me
a little
bit about
I guess
like the
the
isolation
levels
we'll get
especially
across like different levels of operations in Dynamo.
Yeah.
So I think like if you think about it, like transact get item, transact write item.
And there is actually a documented page as well on this.
But transact get item, transact write items, they are like serialized.
For get items, if you do like a consistent get request, you are essentially getting read committed data. So you always get read committed data; there is nothing you're getting which is not committed, right? And if you are doing, let's say, a non-transactional read on an item which already has transactions going on,
as Somu pointed out,
those requests will be serialized with that transaction.
So if you have a transactional workload
and you do like a normal get item,
those will also be serialized.
But they also are giving you like a read committed data.
So your get request won't actually be rejected; you will get the answer back with whatever is the committed state of that item at that particular time. And then I think with batch writes, I would say for batch writes and transact write items, you have, at the item level, the same serializability.
I think that's a key part is that it's very hard to define these in some ways, because there are certain Dynamo APIs like batch writes that can span different items, which are provided
just as a convenience for customers, right?
Like customers don't have to come back and go back.
Then how do you define serializability
of a single batch write across a transactional write?
And it's hard to do that
because each of these individual writes
are serializable by itself,
but the entire batch write operation
is probably not serializable with the transact write item.
And helping customers understand that nuance
is very, very tricky.
And it's where we kind of have
this whole documentation,
lengthy documentation based thing that,
yes, each individual write within the batch write
is serializable,
but the entire operation is not serializable
against a single transact write item.
So I think the nuance is there for batch write
and likewise, even for scan, right?
Like when you're doing a scan or a query, you're always going to get the read committed data. So if a transaction is executing across the same items in the scan, then you're going to get the latest committed data, always.
Yep, yep, absolutely. So yeah, and just so I understand it, and maybe to put it into practical terms: if I do a batch get item in Dynamo, let's just say I'm retrieving two items. And at the same time, there's a transaction that's acting on those two items.
Each one of those get operations within the batch get will be serializable with respect to that transaction. But it's possible that my batch get result has, you know, one item before the transaction and one item after the transaction.
Yes.
Okay. Yep.
And then there's the issue, I guess,
potentially of, I guess, read committed.
Okay, so read commit.
I always get tied up on this stuff.
I think some people see read committed,
especially like in the query respect or also the batch get respect.
Like, hey, I'm getting read committed.
It's not serializable here.
And then I think of like, okay,
what are the isolation levels
and what sort of anomalies can I get
if I think of like the sort of database literature?
And the thing that comes out to me is like,
yes, it's sort of, that is true,
but like you don't see the anomalies that you might,
from my point of view in a relational
database where you have a long-running transaction like if you talk if you look at like the read
committed isolation level now you can have what like phantom reads and and non-repeatable reads
but that's within the context of a transaction but that's not going to happen in dynamo because
you have like that single shot single request transaction you don't have like to begin run a
bunch of stuff and then yes whatever um type of thing so you don't have like the begin, run a bunch of stuff and then whatever type of thing.
So you don't see those type of anonymity just because you can't
do that type of operation.
Am I saying that right?
Yeah.
Yeah.
Okay.
And I think as you, as you pointed out, like just to reiterate, I think
the, between any write operations, serializable isolation is there.
Between like a standard read operation,
you also have serialized transact write item
and transact get items.
Like if you care about what you were saying,
where, you know, I did a transactional write
and then I want to get a fully serializable,
like the transaction should not give me an answer back
on a bunch of items because I read them as a unit.
Transact get item is what you should use to ensure that you're getting
like isolation as a unit as well.
But if you do batch write and batch get, you get an individual item level
serializable isolation, but not as a unit.
Yeah, gotcha.
Okay, on that same note, transact get items.
What like, I almost never tell people to use it.
What do you see people using it for?
Like, what's the core needs there around?
I think it's a use, I'm not saying it's not a useful thing,
but like, I think it's one of those things
like sort of the strongly consistent read
on a Dynamo leader that maybe you think you need it
less than you actually do need it.
Is that, I guess, where are you seeing
like the Transact get items use cases?
I think a lot of them I've seen
and like where you,
I agree with you
that most of the cases
you can actually model
with just like consistent reads
or eventually consistent reads.
But there are certain use cases
where you really want,
as I said, as a unit,
like you did an operation as a unit,
let's say you are moving
state in a state machine in control plane that you're building where you have like three items
which like together define the final uh thing that you want to show to the customer right and
you don't want to read any of those items in an individual uh individual manner and show something
something to the user so that's where I think it makes sense
to use Transact Get Item
where even if, for any one of the items that you read, you cannot accept that to be just read committed on its own.
That's when you use Transact Get Item.
But the space is very narrow.
I agree with you.
The classic example would be, Alex, like this happened to me a couple of days ago, is like you're transferring money between your two accounts, right? And then you want to view both the balances together. If you land up doing a batch get, you may be in a temporary state of euphoria or, like, surprise. So you want to use Transact Get Items to say, okay, I did the transfer, I need to know what happened, so use the Transact Get Items there, right? Control planes have such use cases, banks have such use cases, where you kind of finally want to display this stuff.
So those are cases
where Transact Get Items
is super useful.
It's almost like
preventing end user
or like user-facing confusion
rather than, you know,
your application
and some of the business.
Like if it's like
a background process,
you almost don't need
to use Transact Get Items.
Yes, but if you're
depending on one of the,
if you're depending on both of them
to be consistent in the database, right?
This is a key word, right?
Like let's say that I see that
order status is gone from in warehouse to shipped,
then I expect something else to have been done.
Then that consistency,
you will not get with a batch get.
And if you want a consistent read,
then you want to do the transact get
to kind of read both items together.
Yep.
Okay.
All right, cool.
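To put that bank example in code, here is a small boto3 sketch; the account IDs and table name are made up. A BatchGetItem of the two accounts could return one balance from before the transfer and one from after, while TransactGetItems returns both as one serializable unit.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Read both account balances as a single serializable unit, so the two values
# reflect the same point relative to any in-flight transfer between them.
resp = dynamodb.transact_get_items(
    TransactItems=[
        {"Get": {"TableName": "Accounts", "Key": {"AccountId": {"S": "checking-1"}}}},
        {"Get": {"TableName": "Accounts", "Key": {"AccountId": {"S": "savings-1"}}}},
    ]
)

# Responses come back in the same order as the requested items.
balances = [r.get("Item") for r in resp["Responses"]]
```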
I want to stir some stuff up a little bit
because there was some like consternation on Twitter.
So at the end of this DynamoDB transactions paper
and also the DynamoDB paper last year,
there are some just like charts showing
different benchmarks and things like that, that I think are really useful, and, you know, showing, I guess, how does latency change as the number of operations you have against your table increases, the number of transactions you're running against your table increases, or more items in your transactions, or more contention on it, all those things.
And some of those charts,
all those charts
don't have labels
on the Y axis
showing, you know,
how many milliseconds
it takes at all
these different levels.
Why, like, why no labels?
We just forgot it.
No, I'm kidding.
But I think...
Akshat forgot to put them in.
Akshat forgot to put them in, even in the last check.
No, I think partly,
I think we could have done a little bit better job there.
The point was not to show the numbers as such, right?
I mean, the numbers, I think anybody can grok.
It's a very simple test to go run
and everybody can run the test and grok the numbers.
The point was to show the relative difference between,
like, for example,
singleton write versus a transactional write,
what's the cost, right?
Latency cost and it's X amount or more.
I think that was the whole point.
And we didn't want to kind of give absolute numbers,
which doesn't make sense, right?
One of the lessons was we could have done
a little bit better job of normalizing the numbers
and presenting the normalized numbers on the Y-axis.
But I think that's a lesson for us
to kind of take away next time.
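As an aside, the normalization idea they mention is simple enough to sketch. The snippet below is purely illustrative; the baseline and measured numbers are made up.

```python
# A small sketch of the normalization idea: instead of plotting absolute
# milliseconds, divide each measured latency by a baseline (for example, the
# singleton-write latency) so the chart shows relative cost.
baseline_ms = 4.2  # hypothetical singleton-write latency
measured_ms = {"singleton_write": 4.2, "transact_write_3_items": 9.1}

normalized = {op: ms / baseline_ms for op, ms in measured_ms.items()}
print(normalized)  # {'singleton_write': 1.0, 'transact_write_3_items': ~2.17}
```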
Yeah, I like it.
I agree.
Like last year when I first read that Dynamo paper, I was like, where are the numbers on
it?
Like, why wouldn't they show that?
But then the more you think about it, like Dynamo's whole point is consistent performance
no matter what, right?
It doesn't matter how many items you have in your table, how many concurrent requests
you're making,
you know,
all those different things.
And I think
these benchmarks
are trying to show that
at different levels.
Like, hey,
it's still the same
whether you're doing one item,
whether you're doing
a million transactions
per second.
It's still...
And we keep making
all these, like,
optimizations in the stack
to improve performance
across the board as well.
So I think, again, as someone pointed out, these numbers would be more of a distraction than an actual help, because you might run the experiment 10 years later and the performance will be even better.
Right.
So what's the point?
And the key point is that you get consistent performance as you are scaling your operations. That's the key message we wanted people to take away from that, not that, hey, this transaction operated at five milliseconds
or 10 milliseconds or 20 milliseconds or whatever that is.
Yep, exactly. Yeah, because a lot of those
benchmarks can be gamed, or who knows what's going on and whether they're representative.
But I think, yeah, showing, like you're saying, that it doesn't really matter, that those other factors are mostly unimportant to the scale you're
going to get there. I guess, you know, consistent performance with Dynamo is just
so interesting and such a key tenet for everything in terms of the APIs
and features that are being developed and all that stuff.
I guess like how far does that go?
And if you had like, I don't know if this is even easy to think about,
but if you had some sort of change that would reduce latency for, you know,
your P50, your P90, something like that,
but would maybe increase your P99 by 10, 20%, like something like that.
Is that something where it's like, no, hey, we don't want to increase
that spread,
we don't want to regress our P99 at any cost?
Maybe that sort of thing just never comes up, but I guess
how front of mind is that consistent performance for Dynamo?
I think, as I said, it has been one of the core tenets from the beginning.
That's one of the core tenets.
Whenever we do a new thing in Dynamo, we have to ensure that.
So whenever we look through the lens of improving latencies,
I think we start from entitlements.
Like, if we have to do this operation,
for each hop in the overall stack,
how much latency is attributed, or allowed, for that hop to take out of the
full request. And we go from there. So if there is network distance between two hops, that's
one of the entitlements, right? So it varies. When you're looking at a problem,
if you find an opportunity to improve the latency at P50, I think the goal is to make sure the variance between P50 and P99
is also not too high
because consistent performance
is about giving you,
at any time when you make a call,
you get the same performance
on the read and write operations
that you're doing.
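To illustrate the entitlement idea in the loosest possible way, here is a toy sketch. This is not DynamoDB's internal tooling; every hop name and number below is invented, only to show the shape of a per-hop latency budget.

```python
# A toy illustration of latency "entitlements": start from an end-to-end
# latency target for a request, attribute a budget to each hop in the stack,
# then flag hops that exceed their share. All names and numbers are made up.
REQUEST_BUDGET_MS = 10.0  # hypothetical end-to-end target for the operation

ENTITLEMENTS_MS = {
    "request_router": 1.0,
    "network": 2.0,
    "transaction_coordinator": 3.0,
    "storage_node": 4.0,
}
assert sum(ENTITLEMENTS_MS.values()) <= REQUEST_BUDGET_MS

def over_budget(measured_ms: dict) -> dict:
    """Return hops whose measured latency exceeds their entitlement."""
    return {
        hop: (measured_ms.get(hop, 0.0), budget)
        for hop, budget in ENTITLEMENTS_MS.items()
        if measured_ms.get(hop, 0.0) > budget
    }

print(over_budget({"request_router": 0.8, "network": 2.6,
                   "transaction_coordinator": 2.9, "storage_node": 3.7}))
# -> {'network': (2.6, 2.0)}
```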
Very cool.
Okay, one thing on the latency
I wanted to look at:
on one of the charts, especially the one showing how latency changes as you increase the
number of transactions you're running, there was a spike at the tail,
at P99, for very high request rates. So if you're doing lots of transactions per second, there
was a little bit of a spike at P99 compared to even
slightly lower request rates, and you mentioned it was a Java garbage collection issue.
I guess, is that something where, when you see that, you're like, hey, we need to
change that? If it's a GC issue, I know you're doing some stuff in Rust; is
that something where you say that tail latency is so unacceptable we need to change it? Or, since it only shows
up at, I think, a million ops per second, with three ops per
transaction, so roughly 333,000 transactions per second, do you not have that many users doing that to where
it's a big issue, and that is okay at that point? Or is that something you're actively thinking
about? I think so. That one was a very interesting one
because I know we went back and forth on those numbers
on what are the issues with that stuff.
And that was specifically with the 100 item transactions.
So when you're doing a 100 item transaction,
a transaction coordinator is holding onto those objects
for a longer period of time,
ensuring that they're kind of talking
to a hundred different nodes.
And so the P99 there has been higher.
We do kind of want to address the P99 issue there.
But the number of customers
using 100-item transactions,
or really the number of applications
using 100-item transactions,
is also fairly low, right?
So we would address it, but
those applications that are using 100-item transactions
are already paying a latency penalty
at this point in time,
because they have 100 items in a transaction.
So as long as it's consistent,
we are okay.
We will address it,
but maybe not as soon as,
but we will kind of
definitely address it.
But we don't want that
to regress, right?
We want to keep it
where it is at this point in time
and measure it
and see what happens.
And we actually run canaries across all the different AZs, all the different endpoints
that we expose to actually find issues in latency before our customers do. So we have like canaries
running all the time, acting like customers doing these variable size transactions to identify if
there is any issue in a particular stack,
in a particular region
or anywhere in the stack,
we get paged,
figure out what the issue is
and resolve it as well.
So yeah, we have like,
we don't take this lightly.
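For readers who want a feel for what such a canary might look like, here is a hedged sketch in Python with boto3. The region, table, key schema, transaction sizes, and reporting are all assumptions for illustration, not the team's actual canary code.

```python
# A sketch of the canary idea: a loop that acts like a customer, issues
# variable-size transactions against a regional endpoint, and records latency
# so alarms can fire before customers notice a regression.
import random
import time

import boto3

client = boto3.client("dynamodb", region_name="us-east-1")

def run_canary_once(table: str = "CanaryTable") -> float:
    """Issue one transactional write of random size and return its latency in ms."""
    size = random.randint(1, 10)
    items = [
        {
            "Put": {
                "TableName": table,
                "Item": {
                    "PK": {"S": f"CANARY#{i}"},
                    "SK": {"S": str(time.time())},
                },
            }
        }
        for i in range(size)
    ]
    start = time.monotonic()
    client.transact_write_items(TransactItems=items)
    return (time.monotonic() - start) * 1000.0

while True:
    latency_ms = run_canary_once()
    # In a real system this would be emitted to a metrics service with
    # per-size and per-region dimensions, with alarms on the high percentiles.
    print(f"canary transact_write latency: {latency_ms:.1f} ms")
    time.sleep(60)
```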
Yeah, very cool.
I remember that from last year's paper
about how you do monitoring
and sort of those performance
segregation tests
and you have like all those canaries,
like you were saying,
but also I think
some of your, like, high-traffic amazon.com tables, right?
You should get direct access to their monitoring and be able to pick up some
of the latency degradation there, if any.
Yeah, pretty cool to see.
So that's it.
Cool.
Okay.
Transactions, that's great stuff.
I want to just sort of close here. You know, you've both been working on Dynamo since before it was released. What does that look like to, I guess, not do new feature development, but maintain or update the foundations of Dynamo? And how much has some of that stuff changed? And I mean, you know, you would know this better than me, but just like, I don't know,
as we've seen changes
from like hard disk drives
to SSDs to like NVMe,
like is that something
that is like a regular change
or even like the storage engines
you're using or like
how much of that foundational work?
How often does that,
is that something that gets updated
every couple of years
or is constant maintenance
or what does that look like?
So our architecture
is constantly evolving.
We're finding new things, right?
And the best part about Dynamo
is customers don't have to worry about the stuff.
Like that's the best thing.
There's a lot of things in the back changing all the time.
And our key tenet is like customer availability
or latency should not regress
because we're doing something behind the scenes.
And we do a lot of things.
A classic example would be like
when I worked on encryption at rest back in 2018,
I would say.
I keep forgetting these numbers,
but anyway, it's 2018, right?
There was a whole thing
where we kind of
totally integrated everybody
under the covers with KMS
and this was a whole sweep
and customers never saw a blip.
So yes,
there are things constantly
changing in the background.
We're trying to improve latencies.
We're trying to kind of
make things more efficient. And all of this the customers don't get to see, and that's
the best part of being a fully managed service. And to answer your question, it's constantly
happening, but nobody gets to know about this stuff.
Yeah, and I think a lot of developers who are interviewing at our team also ask this question to me, like, hey, you have been here
for that long.
Like, are you not bored?
I'm like, no, every year
there is like some fun problem
that we have to launch.
And the best part is
as soon as you launch,
you don't get one customer.
You get like so many customers
who want to use your feature
and traffic also,
you don't get one request
or two requests.
You get like, you know,
millions of requests.
So you have this like fun challenge
that you have to solve,
Like Dynamo has
so many fun problems
that still keep us excited.
Yep.
Yep.
Do you get the same thrill
of releasing a
like public feature,
a very visible feature
like transactions
as when you're releasing
something like,
you know,
adaptive capacity,
which,
you know,
for those listening,
it was more like just
how Dynamo is splitting your provisioned throughput across the different partitions
in your table. And it was something that was mostly under the hood. You didn't even know
about it until you all published a really good blog post on it, and then further improvements
including on-demand mode and stuff like that. But like, do you still get the same thrill when
like those sorts of releases come out and you're like, man,
we just solved a huge problem
for a lot of people
and they might not even know
for a little while.
Like, what's that like?
That one specifically, yes.
Because a lot of the customers
were complaining about it as well.
Like, you know,
like they noticed it right away.
And I think
I was super excited about it.
Yeah,
I think everything we do
in Dynamo kind of is
very exciting
at the end of the day, right?
Because you have
direct customer impact
one way or the other.
It's just boils down to what the impact is.
I remember once,
I don't know which year it was,
but I think me and Somu
actually worked on a problem
which reduced the number of
operational tickets we used to get,
a really big dent,
like a 10x improvement on that.
So yeah, I think we get the same thrill.
It's where you want to put your mind and solve the problem.
And as I said, Dynamo has so much fun problems to solve.
Yeah.
Okay, cool.
Okay.
So last two years, you've written some really great papers, the DynamoDB paper last year,
transactions this year.
What are we getting next year?
What's the next paper coming down the pike?
Do you have one?
We have to think about it.
First, you have to build something and then we, you know.
Yeah, there is definitely a lot more we are thinking
and evaluating what we should do.
We have also started doing like a lot of talks
and at different venues and different conferences
and like, yeah, getting like feedback from customers.
Like transactions paper, actually, the way we decided
was also, I would say, customer driven.
We wrote the paper on DynamoDB
and I was just looking at like
how the response has been on different blogs.
And a lot of blogs had this theme
where people were asking like,
oh, I wish there was like details
about how transactions were implemented in DynamoDB.
It was like a bunch of people had left that comment.
So that's when we picked it up
and we wrote this paper.
So we'll see how the response
to this paper is,
figure out what customers want
and write back.
I think it's, yeah, that's,
like Akshat said,
it's mostly what's going to be
the next takeaway message here, right?
Like, for example,
with Dynamo, we said,
these are our learnings
from the past 10 years.
With transactions, we said,
you know what, you don't always need
a long-running transaction on a NoSQL database. You can build fast, scalable transactions with single-request
transactions. So the next one is going to be: what's the next takeaway message from us to the
community in general? And that's what it will be focusing on, hopefully soon.
Yep, I agree. Hopefully soon. And that point you're making about, you know,
what the takeaway is, like with long-running transactions: I think both papers are very good at really thinking about user needs from first principles and saying, okay, other things might have all these features, but if you cut off this 5% of features, you actually eliminate a whole host of problems. And as long as you're fine with that constraint, you can get a lot of other benefits as well.
So I think just like the framing
of user needs upfront in both papers
is so good and helpful
in understanding like how this is working.
So I love that.
Hey, Akshat, Somu, thank you for coming on.
I respect you both so much.
I love Dynamo
and I'm really grateful for you
coming on to talk today.
Alex, super thanks for having us, by the way.
And you're one of the biggest DynamoDB proponents.
Your book is probably referenced a lot.
So super thanks for having us.
And it's like a privilege to talk to you
about the transactions paper.
Yeah, same here.
I think you have been doing amazing work
and I've been following you for like a long time.
Thanks for all the great work that you do.
Cool.
Thank you.
I'll link to the paper,
but everyone be sure to check out the paper
because there's a lot of great stuff
we didn't even get into here.
So make sure you check that out.