Microsoft Research Podcast - 053 - Chasing convex bodies and other random topics with Dr. Sebastien Bubeck
Episode Date: December 5, 2018Dr. Sebastien Bubeck is a mathematician and a senior researcher in the Machine Learning and Optimization group at Microsoft Research. He’s also a self-proclaimed “bandit” who claims that, despit...e all the buzz around AI, it’s still a science in its infancy. That’s why he’s devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI. Today, Dr. Bubeck explains the difficulty of the multi-armed bandit problem in the context of a parameter- and data-rich online world. He also discusses a host of topics from randomness and convex optimization to metrical task systems and log n competitiveness to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics.
Transcript
Discussion (0)
You want science to be based on scientific facts.
That's what I like about
the theoretical computer science community is that they have
well-established long-standing open problems.
If you do make progress on those problems,
then people will listen to you rightfully because it
means that you have some new techniques,
some things that people before couldn't do.
You're listening to the Microsoft Research Podcast,
a show that brings you closer to the cutting edge of technology research
and the scientists behind it.
I'm your host, Gretchen Huizinga.
Dr. Sebastian Bubeck is a mathematician and a senior researcher
in the Machine Learning and Optimization Group at Microsoft Research.
He's also a self-proclaimed bandit who claims that despite all the buzz around AI, it's still a science in its infancy. That's why he's devoted his career to advancing
the mathematical foundations behind the machine learning algorithms behind AI. Today, Dr. Bubek
explains the difficulty of the multi-armed bandit problem in the context
of a parameter and data-rich online world.
He also discusses a host of topics from randomness and convex optimization, to metrical task
systems and log-in competitiveness, to the surprising connection between Gaussian kernels
and what he calls some of the most beautiful objects in mathematics.
That and much more on this episode of the Microsoft Research Podcast.
Sebastian Bubeck, welcome to the podcast.
Thank you.
You're a senior researcher in the newly minted or relatively newly minted machine learning and optimization group at MSR, which used to be called the theory group.
And now there's a couple of subgroups under this new umbrella.
So give us a quick picture of the org chart and who does what.
And then tell us how we could best frame the work you do.
What gets you up in the morning? Sure.
So I will start with the new organization
that we have here at Microsoft Research in Redmond.
So I guess a little bit more than a year ago,
we started a new lab,
which is called Microsoft Research AI
for artificial intelligence.
So this is following, of course, all the hype
and the real scientific excitement
around machine learning.
So we created Microsoft Research AI, and some of the members all the hype and the real scientific excitement around machine learning.
So we created Microsoft Research AI, and some of the members of the former Siri group decided to join this new endeavor. And that's when we decided to create the Machine Learning and Optimization
group. And then the other half of the group went to do the algorithms group in just a basic
Microsoft Research Lab in Redmond. So is that actually a group, the algorithms group?
There is a group called the algorithms group. So of course, in machine learning and optimization,
we also do a lot of algorithms. In fact, it is the main focus of our group,
but specifically around machine learning.
So you're in the machine learning and optimization side of this group.
Yes.
And you also work on algorithms. But what gets you particularly up in the morning?
What questions are you asking?
What problems are you solving?
So I guess in general,
I am really passionate about online decision-making
or decision-making in general.
I think no matter what you think will happen in the future,
humans or robots or AI,
there will always be a need for making decisions.
And the questions, the mathematics that surround decision-making,
I find them really fascinating, and they are still in their infancy.
So there is really a lot to be done there.
And somehow, traditionally, pure mathematicians
have not looked at those questions so much.
So there is really this gap between beautiful mathematics
that has been developed,
say, for physics or slightly more recently for computer science. But I would argue that decision making, online decision making is just as important today as those more traditional topics.
How does your group doing very theoretical work on the math of machine learning interact with
other groups at Microsoft Research or even the discipline writ large.
Right.
So, you know, these days in the media,
we hear a lot about artificial intelligence.
As much as we would want to have it,
an artificial intelligence that, you know,
make recommendations,
that tells us what we should cook for the next day,
you know, help us sleep better, whatever,
it doesn't exist yet.
And what this means is that we are still very much working on the foundations.
So we have a couple of techniques that do work great, including gradient descent,
convolutional neural networks, things like that.
They really do work very well.
And naturally, big companies are investing a lot in those and trying to, you know, engineer around them as much as possible, get the full stack, get the hardware ready, you know, all of those things.
But there are many, many other things that we would want to do that are out of reach for the current techniques.
So what we really need to do is to develop fundamentally new tools.
And that's what our group is doing. Well, we couldn't talk about the mathematics of machine learning and sequential decision-making
without talking about the multi-armed bandit problem. It's known, as you say, to be simple,
I'm making air quotes here, and yet incredibly rich.
So explain this essential problem and tell us what's new in the research right now.
Right. So the multi-armed bandit problem is actually the first real research problem
that I looked at now a little bit more than a decade ago. So it's a mathematical question
that conceptualizes the problem of exploration versus exploitation. So, you know, in our
everyday life, we have to deal with this problem, you know, do we order from the same restaurants
that we know we love or do we try this new restaurant that we don't really know yet whether it's good or not? So
maybe I should explain where does the name come from? That's a good idea. Yeah. So in American
slang, a one-armed bandit is just a slot machine at a casino. And you can imagine that in the
multi-armed bandit, you're facing, you know, many of those one-armed bandits.
And for every coin that you have, you have to choose which bandit you're going to put your coin in and, you know, pull the arm.
So we refer to the different actions as different arms of the bandit.
OK, so that's where the name comes from.
And it really got popularized in the 1950s by Herbert Robbins.
And at the time, it's actually quite interesting.
The multi-armed bandit problem was viewed as a negative result
in the sense that in the 1930s,
people developed classical statistics.
So people like Fisher or Neiman and et cetera.
So they developed classical statistics
and then they wanted very naturally to move to online statistics.
So what happens if the data is not given to you all at once,
but it's coming to you sequentially?
And then they realized that the questions were really tricky,
really difficult.
And then Robbins came along and said,
look at this multi-armed bandit problem.
It's so simple, yet we have no idea how to solve it.
So it was more viewed as a negative problem.
But then what happened is that in the early 2000, a huge, a very important application
appeared on the scene, which is ad placement on the internet. So ad placement on the internet is
exactly a bandit problem. You can imagine each user coming to your website as, you know, one
time step of your problem. And then for that user, you have to select which ad to present. So each
ad is an arm of your bandit. And then when you present the ad,
you see whether the person clicks or not.
You see whether the treatment was efficient or not.
And then you move on to the next one.
So it's a new important application of multi-armed bandit.
So now what has happened since 100 years ago?
So what has happened is that now we really care about applications
where the number of actions,
the number of arms is huge.
It's very, very large.
Another application, in fact, is a website display. So you can imagine that you have many
different ways to display your website and, you know, users are going to prefer certain ways.
But this is a combinatorial problem. You know, whether you place the ad at the top or at the
bottom of the page, whether you use the color green or the color red. So you have many, many possible choices,
which leads to a combinatorial explosion of possibilities.
And in turn, this means that you have to really think very deeply
on how to leverage the experience that you had with a certain arm to other arms.
You have to understand what are the correlations between those different arms.
So this is what modern multi-armed bandits is about.
Let's talk about a project you're involved in on the foundations of optimization.
So optimization methods are, as you call them, the engine of machine learning algorithms.
Give us an overview of the research directions in this arena, particularly convex optimization,
and explain why they're important to the theory behind machine learning algorithms.
So today's optimization problem in machine learning are really characterized by two key features.
One of them is that they deal with optimization problem in very high dimension.
So by that, what I mean is the number of parameters of your model.
These are the number of variables that you're going to do optimization over.
We are not talking about optimizing a single parameter in dimension one or, you know, two or three parameters in dimension two or three that are easy to visualize for
humans. We're talking about millions, tens of millions, probably billions in the future.
That's the space we're dealing with in optimization today. So that's one of the big
obstacles. The other one is just the sheer amount of data. So the data is so huge that you have to be very careful about the operations that you do with them.
You cannot visit your whole data set a million times.
It's just not possible.
You have to do that more parsimoniously.
So this begs the question, in turn, when you go back to the optimization literature,
on how to make the most of the calculations that you have already done.
And what are the basic calculations that you can do to optimize a function?
Well, the basic thing you can do is to calculate its derivative, its gradient.
And now you're asking the question, given that the gradients that you have observed so far,
how to best combine them to calculate the next gradient. So this optimal usage of your information,
it has a very long history in optimization,
in particular in the Russian school.
And they developed optimal methods in the 70s and the 80s.
And we're only now starting to really understand
the fundamental concept behind it.
Really?
Yes.
So it's actually a quite interesting story.
At the end of the 70s, Arkady Nemirovsky, who is the leading figure in the Russian school
in optimization, he discovered that when the function that you're optimizing is smooth,
you can actually do better than the simple gradient descent technique.
You can leverage the past information in a better way. So he designed an accelerated technique,
which later Yuri Nesterov,
who is now a legendary figure in machine learning,
simplified dramatically
and he made a very practical algorithm out of it,
which is now called Nesterov's momentum technique.
So this momentum technique was first designed
in the context of convex optimization
as a way to obtain an optimal method.
And nowadays, it's one of the most practical techniques for training deep neural networks, which are non-convex functions.
And, you know, we don't really understand why momentum is working there, but that's the story of that technique.
So there are things you do that you don't totally understand, but they're working, so you keep doing them.
That's right. But what our group is actually doing is trying to understand why they are working.
So in particular for this momentum technique, we worked a lot on it.
And a couple of years ago with my good collaborator, Yin-Tat Lee, who is now an assistant professor at the University of Washington,
we found a way to understand the momentum from a geometric
perspective. And this is very useful because when you start to be able to draw a picture,
you get better intuition. So this was, at least for us, a very important step towards understanding
better momentum. You have two big areas of interest. Well, you have a lot of areas of
interest, let's be honest. But you've noted that two big areas of your interest are the interplay between convexity and randomness in optimization, and also inference
problems on random graphs. So I wonder if you could just unpack those a little bit. Why are
you interested in them? What do they address? Why are they important enough for someone with
a brain like yours to tackle? So as I said before, nowadays we really deal with very high dimensional spaces.
And one key technique to explore those high dimensional spaces is to use randomness. And
in classical convex optimization, randomness was not really a tool that people were looking at.
They were more focused on trying to understand what can you do with deterministic methods. So it seems to me that there was, and there still is to some
extent, a real opportunity to bring in those randomized algorithm tools into convex optimization
and hopefully get a couple of breakthroughs on the way. So that was for the first area.
The second area actually was one of my areas of focus
for a couple of years,
which comes from the observation that nowadays
many data sets come in the form of a graph.
And what you want to do is you want to do
basic statistics with it.
What are, you know, the basic things
that you want to compute with those graphs?
To my surprise, there were
some questions that seemed really fundamental to me that were not looked at. So there are a couple
of models which are very well known now in random graph theory and in network analysis to explain,
say, how the Facebook graph grew or, you know, how the Twitter graph is growing. And one basic
question that I wondered about is, okay, so you have those
evolution processes, but they start with some seed graph. So the seed is not empty. It's not like,
you know, Facebook started with one user. It was actually seeded with hundreds of users or maybe
more. So what I was curious about is that there was this discrepancy between the reality where people were designing
very carefully a good seed graph in order to enhance the growth of the network and the
mathematical analysis, which was thinking about what if the seed is empty. So the first question
that I asked was, is a theory without a seed relevant to the practice with a seed? And what
we proved is that it's not. That,
you know, if you have two different seeds, you can get two completely different graph
evolution processes.
Quelle surprise.
Yeah, it was actually a surprise. Yeah. We didn't expect it. We thought that maybe, you
know, in the very large sample regime, at the end of time, once everything has settled
down, then maybe the seed doesn't matter anymore.
But that's not the case. Interesting. You have a blog called I'm a Bandit,
and it deals with random topics in optimization, probability and statistics.
That's right. So first of all, I'm glad you're not knocking over banks.
But the big topic here is bandit convex optimization. Talk about this in general and then talk about your polynomial time algorithm for bandit convex optimization.
We'll go kind of deep here.
I'd like you to get as technical as you like, but also we'll nod to the fact that you've got a great video on this on the website and three big blog posts on it that kind of goes deep into the weeds.
Yeah.
Yes.
So banded convex optimization is the following problem.
So as I said before,
the new research on the multi-armed banded problem
is really interested in the case
where there are many, many actions.
So now you can ask, okay, what about the limit case
where there is actually an infinite number of actions,
a continuum of actions?
So then, of course, if you don't make any assumption
about how the rewards are shaped over this set of actions,
there is no hope.
You cannot hope to find, you know, what is the best action.
But as we discussed already,
one of the very basic assumptions
which has found a lot of application in machine learning
is convexity.
So then the question becomes, in these bandit scenarios,
where you have a continuum of actions and the loss function is shaped as a convex function, what can you do?
Can you still recover the optimal guarantees that we had before with just two or three actions?
So this was an open problem that was published in 2004. And when I joined Microsoft Research
five years ago,
the theory group and some of the members in the theory group,
in particular, Ofer Dekel,
who is now the manager of the machine learning and optimization group,
and Yuval Peres, who is a distinguished probabilist,
they were talking and thinking about this problem.
And when I joined, you know,
I was coming with my own expertise on bandits,
and I was really excited to join the group and start thinking about the questions that those guys cared about.
Yeah.
And we were very fortunate that we made good progress on it.
So our first stride into the problem was, in fact, to show something a little bit unique in machine learning, which was we showed that the problem is solvable, but we didn't actually give a technique to solve it.
So that was both interesting and disappointing.
That don't make no sense.
Yes, it was strange.
We were all surprised by this result to some extent,
but also very excited because it meant that there is something to be discovered there.
The problem is solvable, but it is certainly not solvable with the current techniques that we have.
I want you to drill in there for a second because you're actually saying something that baffles me.
You could solve it, but you don't know how you solved it?
So it's more as follows.
There was this very concrete question.
Can you get square root T regret
for banded convex optimization?
What we showed is that the answer to this question is yes.
Yes, it is in principle possible
to get square root T regret
for banded convex optimization,
but we did not give an algorithm
that actually achieves this regret guarantee.
So then the question remained,
can you actually get a polynomial time algorithm,
like an efficient algorithm,
something that would actually work on real data
and that gets this quality type behavior?
Okay.
So that was the state of this problem
when Yin-Tat Lee came for an internship with me
in the summer 2016.
And we started thinking about this question.
And again, we were lucky to come up with a new approach based on kernels.
So kernel methods have a long and distinguished history
in machine learning.
In fact, they were the most popular technique
before deep learning overtook the field.
And what we found is a
connection between bandit-type information and kernels, that there is really an interplay between
the two. And while we were exploring this to our own bafflement, we discovered that there is a
connection between kernels and what are called Bernoulli convolution. So Bernoulli convolutions,
they are a mathematical object that were defined in the 1930s that Paul Erdos, the famous mathematician, thought of as one of the most beautiful objects in mathematics. And these
Bernoulli convolutions, they are directly related to the Gaussian kernel in machine learning. And they are exactly what we needed to solve banded convex optimization.
And it turned out that the world expert on this topic was, you know, my office mate sitting
next door, Yuval Perez.
So it was very convenient that I could go to him and ask him all the questions around
this.
And then later, Ronan Eldan joined us on this project, who was also one of the
members of the team with which we proved that there is a solution, but
we don't know what is the solution.
I just have the biggest grin on my face.
Every podcast I hear things that delight me and nobody can see my face.
Let's talk about task systems.
As a level set, tell us what they are and what they're for, and then talk about
metrical task systems and how they're different.
So we have discussed so far the multi-armed bandit problem, which is a very fundamental and important problem, but it's missing a notion of a state. So in most of the real-life problems
that we care about, it's not like you take an action and it has no consequence on the future.
In fact, when you take action, often the state of the world or the state of the system that you're acting on is evolving.
And you have to take that into account.
So task systems are a very fundamental mathematical modelization of this problem.
And it goes like this.
So in task system, you have a finite state space.
I'm going to say that n is the size of the state space.
And you're again going to play a sequential game, just like in Multi-Arm Bandit.
And at every time step, what you're presented with is a task. And what is a task? A task is
a cost function on the state space. So a task just tells you for every state, how long it's
going to take you to complete the task if you're sitting in that state. So now in the game, you're
sitting in a certain state, you know, because of what you have done in the past, you see the new task coming at you,
and the question is, to which new state do you move
to complete the task?
Now, of course, stated like this,
you would just move to the state
that has the minimal completion time for the task.
But that's where the metrical part comes into play,
the metrical task system.
So the metrical part is that play, the metrical task system.
So the metrical part is that states, they can be far away from each other.
It's not like you can easily move, you know, from a state where, say, you know no mathematics to a state where you're a world expert in mathematics.
This is going to cost you a lot to make this movement, you know, to change your own state.
So now you can think of the state space as a metric space,
meaning that there is a notion of distance between two states. And when you're sitting in a certain state
and you get your new task, when you move to a new state, you're going to pay the distance
to travel there, plus the completion time. So these metrical task systems were introduced
in the 80s in the theoretical computer science community. And as I defined it, it's a very discrete problem.
And as expected, people used discrete mathematics technique to try to solve this problem.
So the big conjecture, the big open problem in that subfield is the following.
So you say that an algorithm for the metrical task system problem is competitive if it has
the following property, no matter what the task sequence is,
the algorithm's performance,
the total cost of the algorithm
is going to be comparable to the best it could have done
had it known in hindsight
what was the sequence of tasks to be presented.
That's what it means to be competitive.
It basically means you can see into the future.
So we're going to say that you're alpha competitive
if not only you're comparable to the best you could have done,
but you're at most paying alpha times what the best could have been.
So, you know, maybe you want to be two competitive or three competitive.
And the conjecture in that field is that you can be log n competitive
for any metrical task system.
So we're not very far from proving this conjecture.
We are basically a polynomial
factor off. But the important thing is that people were really looking at this problem from a
discrete mathematics perspective. And what we brought in a year ago with my great collaborators,
Michael Cohen, Yin-Tat Li, and James Li, is that we looked at it from a convex optimization perspective. So we view this very discrete problem
in continuous space and continuous time.
And once you make that modification,
then it's very natural to again talk about gradient descent.
In fact, this other strategy, which is called mirror descent.
And using this new continuous strategy,
we were able to refine the guarantees
that we know about this problem.
So this was for the discrete metrical task system.
Now there is a convex version of the metrical task system.
Because just as for discrete multi-armed bandits, you know, there is a lot of interest into going into the case where there is a huge set of actions, set of arms.
It's the same thing for metrical task system.
You might be interested in a case where the set of states is set of arms. It's the same thing for metrical task system. You might be interested in a case
where the set of states is in fact infinite.
So then if n is infinity,
you know log n is also infinity
and you're not competitive.
So then the question, yes, indeed,
indeed, this is very disappointing.
And so the question is,
just in metrical task system,
when you task as a convex function
over the state space,
can you be competitive at all?
So this was an open problem from the early 90s,
from Lineal and Friedman.
And what they proved is that in dimension two,
you can be competitive.
Very restrictive.
I mean, certainly not talking about the millions of dimensions
that we opened with.
And what we just proved recently,
we just put out the paper a month ago,
we proved that indeed you can be competitive
for convex metrical task system in any dimension,
but the competitive ratio that you get
is going to depend on the dimension.
And a great open question is
whether you can get polynomial in the dimension scaling.
That seems very difficult.
All the rest of it seems very difficult also.
And this then, as you referenced,
solves a 30-year-old problem.
Right.
So in this literature,
they don't talk about convex metrical task system.
Instead, they talk about this problem
that they call chasing convex bodies.
So in chasing convex bodies,
you are moving a point in Euclidean space.
So let's say you're moving a point with D parameters.
And every time step, you're given a convex set of constraints.
And what you have to do is you have to move your set of parameters inside this convex set.
So you have to chase the convex set.
You have to go inside the convex set.
And you see this sequence of convex sets that are given to you and you keep moving.
And then again, you want to be competitive.
So you want to compare yourself to the best you could have done in hindsight
had you known the sequence of convex bodies right from the start.
And so indeed, we do show some of the issues with what we
call the current machine learning culture versus the culture of theoretical computer science,
which is more traditional. What are some of the things we ought to be paying attention to right now?
Is there anything that keeps you up at night? Anything we ought to be thinking about?
Yeah, sure. So I personally believe that science is a fundamentally slow enterprise.
It takes time to digest new concepts. It takes time to really recognize whether something that
looks new on the face of
it is actually new or it's just, you know, a slightly different point of view, a slightly
different perspective. Our time is a little bit different from, you know, say two decades ago.
And one reason is just because the amount of data that we have. So we have this influx of data and
people want to do something with it. So this, I believe, is not going away. And there will be a need for,
you know, data scientists and people who really think deeply about how to use and how to leverage
information that you can acquire from data. So that will not change. What I worry a little bit
more about is the culture which is centered around claims which have no, as far as I can tell, no mathematical basis.
You want science to be based on scientific facts.
And that's what I like about the theoretical computer science community is that they have
well-established, longstanding open problems.
And if you do make progress on those problems, then people will listen to you rightfully
because it means that you have some new techniques, some things that people before couldn't do. So there are pros and cons for these very well-defined
open problems. The cons is, of course, that those open problems, they are not solving burning
practical questions that we need to resolve right now. So then, you know, you come up with a solution
and it's not like people are going to implement your algorithm right away. So these are maybe not the most important question if you really want to get
new algorithm right away. But on the other end, progress on this question, this is deep progress.
This is progress where you really develop new tools, which hopefully maybe a couple of years
from now, some more practical methods are going to emerge out of it. Tell us a bit about your
history, academic and otherwise.
It's always great to hear personal anecdotes about where people came from and how they
ended up where they are.
What's your story?
Sure.
So I grew up in a small town around Strasbourg in France, next to Germany.
What's it called?
It's called Oberhausbergen, which means the house at the bottom of the hill.
And then, you know, there is a city next to it, which is Mittel-Osbergen, the house on the middle of the hill.
And then there is Nieder-Osbergen, the house at the top of the hill.
So these are the three towns where, you know, I was hanging out until I was 18.
So during those first 18 years, yeah, I didn't know anything about mathematics, science.
But then at 18, I went into this thing which is called PREPA. So that's famous in France. That's
where you train to go into the Grandes Ecoles, Engineering Grandes Ecoles or, you know,
Ecole Normale Supérieure, which is where I went, where you learn how to do mathematics. So that's
where I really fell in love with mathematics.
I really always remember it like the very first lecture.
I immediately thought, OK, this is the first time that somebody is saying something that really resonates with me.
So two years of prepa, one year in École Normale Supérieure, and then I moved on to do my PhD.
What I thought at the time I wanted to do was
mathematics of the brain, but then I realized that there was no such thing.
But as I realized that, I also noticed this field of machine learning, which seemed very exciting.
So I started to learn more about it and then I decided to do a PhD in machine learning and specifically in the multi-armed bandit problem. So that's to learn more about it. And then I decided to do a PhD in machine learning
and specifically in the multi-armed bandit problem. So that's what I did in France.
So how did you end up at Microsoft Research?
After my PhD, I did one year in Barcelona for my postdoc, which was just amazing scientifically
and otherwise. And then with my wife, we moved to Princeton, New Jersey, where I was an assistant professor in the OR department.
I stayed there for three years.
And then I did a semester at the Simons Institute in Berkeley.
And that's where I met Yuval Perez.
And that's also where I discovered the field of theoretical computer science, which, again, I fell in love with. And that's really where I could see, you know, my second calling.
Let's say after mathematics, really theoretical computer science, I thought the people in
the field were amazing.
I thought the questions they were looking at were fantastic.
I thought their work ethic was very refreshing.
So that's where I met Yuval Perez.
And then after that, it's history.
That's why I joined Microsoft Research.
I came to visit and they made me an offer
and I decided to join.
So as of two days ago,
you have a pretty cool announcement to make.
Right.
So our paper,
joint work with researchers in RIA in France
and in Tatli, again in Tatli,
just got awarded the best
paper at NIPS. So it's pretty exciting. Tell us a little bit about the paper and what it's adding
to the literature in the field. Right. So it's in fact tying back to one of my main area of interest,
which is convexity and randomness. So the question we were looking at is the problem of distributed
convex optimization. So what happens when the objective function that you want to optimize,
which depends on your data, it's not all in a single place, but the data is distributed among
many servers. Then how do you optimize both the communication between the different servers,
as well as the work that each server is doing? So that's the question we were looking at. Of course,
we weren't very far from being the first ones to look at.
There are literally hundreds, if not thousands of papers written on this topic.
But somewhat surprisingly, it seems that nobody has really looked at optimal methods,
meaning provably optimal, both upper and lower bounds.
So in classical serial optimization, this is again work done by the Russian school,
Arkady Nemirovsky, Yuri Nesterov, and those people.
But they didn't look so much at the distributed scenario.
And one reason, I think, is because in this distributed scenario,
in fact, randomness is key.
Randomization is very, very important.
And that's what our paper proves,
is that if you use randomization,
then you can get optimal guarantees.
That's awesome. Congratulations.
Thank you.
Sébastien, as we close, I always ask my guests to offer some parting advice to our listeners, whether it be in the form of inspiration for future research or challenges that remain to be solved.
And your recent paper kind of teased that up a little bit.
What would you say to researchers
who might be just getting interested in this field?
Maybe a good advice to any starting researcher
is to become an expert in a very narrow field
or even narrow question,
but really become the world expert in that question.
And that will yield rewards for a very long time.
So that's what I did with the multi-armed bandit problem.
And I think this was very successful.
And, you know, later on, I could see connections with this problem that I wouldn't have made
if I didn't know all of this about the multi-armed bandit problem.
Now, in terms of research directions, there are really many, many questions emerging.
So one of them we already mentioned is in this chasing convex body problem.
We showed that you can be competitive,
but our dependency on the dimension is very bad.
It's exponential in the dimension.
So a great open question, but difficult one,
is whether you can be polynomial.
Another open question is this metrical task system
to actually solve the longstanding conjecture
of whether you can be log n competitive.
This is a fantastic question
that relates to how do you define a good notion of entropy for general metric spaces. It's
really a very deep and beautiful question. We don't know how to solve this. Related to
this latest best paper award at NIPS, in fact, you can think about the multi-armed bandit
version of this problem, which is exactly
distributed bandit. This problem is basically entirely open. And I believe it's a very rich
question. And in fact, it ties into a broader question of how do you think about multi-agent
problem? And this is a very rich field yet to emerge. Sebastian Bubeck, thank you very much for coming on
the podcast. It's been really, really interesting.
My pleasure. I really enjoyed the questions.
To learn more about Dr.
Sebastian Bubeck and how theoretical
computer scientists are tackling the
multi-armed bandit and other combinatorial
conundrums, visit
microsoft.com slash research.