CppCast - Trading Systems
Episode Date: February 19, 2021. Rob and Jason are joined by Carl Cook from Optiver. They first discuss an announcement from Khronos that SYCL 2020 has been released, and a blog post from Microsoft on updates to the Visual Studio Code C++ extension. Then they talk to Carl Cook from Optiver about how they use C++ to power everything they do. News: Khronos Releases SYCL 2020 for C++ Heterogeneous Parallel Programming; VS Code C++ Extension: Cross-Compilation IntelliSense Configurations; Modern C++ Tip of the Week. Links: Optiver. Sponsors: PVS-Studio. Write #cppcast in the message field on the download page and get one month license; The Evil within the Comparison Functions; Top 10 Bugs Found in C++ Projects in 2020
Transcript
Episode 287 of CppCast with guest Carl Cook, recorded February 17th, 2021.
Sponsor of this episode of CppCast is the PVS Studio team.
The team promotes regular usage of static code analysis and the PVS Studio static analysis tool. In this episode, we discuss updates to VS Code.
Then we talk to Carl Cook from Optiver.
Carl talks to us about high frequency trading. Welcome to episode 287 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing okay. I guess we should talk a little bit about the awful weather that's going on right now.
Like half of Texas lost power, I think.
It's very scary.
Hopefully they get that resolved soon.
How are you doing in Colorado?
Oh, I mean, we've had winter weather, but it's Colorado.
You're used to it.
We had winter weather.
Yeah.
Like, oh, no, there's some snow on the ground and it's cold.
I mean, we were cold.
We were negative 8 Fahrenheit, negative 23 C.
Wow.
And then when I looked at one point, I actually saw a reading of negative 13 Fahrenheit, which I think was something like negative 30 Celsius.
But still, it didn't set any records.
I mean, it was cold for a few days.
I didn't get anything.
We've just been getting rain over and over.
Well, I mean, on the upside, from our perspective, a little bit of snow got our river valley here.
Our winter snow totals, which had been below normal, are almost up to normal finally, and it's really important for the farmers out here. It's like a big deal.
Well, at the top of every episode I like to read a piece of feedback. This week I actually got a
message from Abigail Martinson on LinkedIn. She connected with me and wrote, I started listening to
CppCast last year to help as I talked to C++ engineers for work, and I now look forward to listening each week.
As a recruiter, I know how most engineers view me, so I think it's important to become as knowledgeable as possible, and your podcast has played a huge part in that.
I just thought that was really cool that non-programmers are listening to the podcast and making use out of it. I know I haven't been recruited in a while, but if a recruiter reached out to me
and knew their stuff with C++, I would probably be a bit more responsive
when talking to them.
Doesn't your boss listen to the podcast?
Maybe. I'm not sure he still does.
I'm not looking to get recruited right now.
If I was looking for a new job
and the recruiter I was working with, you know, knew their stuff with C++, I think that'd be
pretty cool. Yeah. Yeah. I mean, if you don't get like must have five years experience with C++17
kind of, you know, requirements or something like that, then yeah, that's, that's handy.
I know I, I do hate it when I get like those out of the blue emails, like, oh, you know, we're looking for someone with all this experience in like JavaScript or something like that.
I don't even use.
But right.
Yeah.
So I thought that was cool.
We'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Carl Cook.
Carl has a PhD in computer science
from the University of Canterbury, New Zealand.
Since graduating in 2005,
he has worked mainly within finance,
ranging from hedge funds and market makers
through to cryptocurrency trading.
Carl is interested in high performance
and low latency systems
and enjoys the competitive
and challenging nature of trading.
As a hobby, he enjoys recreational flying and sees parallels between the safety built into aviation and the safety in place within regulated financial markets.
Carl, welcome to the show. Thank you very much. Good to be here.
So you mean flying actual airplanes, right? Yeah. Yeah. Although at the moment with COVID,
it's just on flight simulator, but yeah, I want to get a chance, actual airplanes.
Well, that's interesting.
I mean, in all seriousness, why would COVID limit your ability to rent a plane and go flying?
There are curfews in the Netherlands, for example.
There are curfews on sort of, but you can exercise and recreate during the day if you want.
But also, it's not an entirely nice experience with, you know, face masks and things like that right now.
It just takes the fun out of it a little bit.
Okay.
Well, Carl, we got a couple news articles to discuss.
Feel free to comment on any of these and we'll start talking more about your work and high-frequency trading.
Okay.
All right. So this first one is a blog post about Khronos releasing SYCL 2020 for C++ heterogeneous
parallel programming.
And I think we talked about this release when it was still upcoming with Michael Wong a
couple of months ago.
Right, Jason?
That sounds believable.
Yeah.
So I'm not sure how much to go into detail on this,
but if you want to hear about what's in this release,
you can probably go back to that episode
or read into this blog post.
Is there anything you wanted to highlight here, Jason?
Well, I do find it interesting that they jumped
2019 version numbers
because they went from 1.2 point X to 2020.
But in all seriousness,
I've actually been looking at SYCL lately and it's been on my to-do list for
like years to actually play with some of this GPU programming stuff.
And I think this is for me,
it looks like this would be the gateway drug here because it is cross-platform.
Right.
And that's what I would want for the kinds of things that I do.
But I haven't actually tried it yet.
Is that possibly in an upcoming C++ Weekly episode?
Yes, but at this rate, it'll be in like 18 months.
What about you, Carl?
Do you use any of these higher-level abstractions for GPU programming?
Not directly. We do a small amount of GPU programming at work, but I'm fairly well detached from that side of things. I'm very much the same as you, I think; I'm just watching this sort of from the sidelines at the moment, and when I get a rainy afternoon, or a rainy week, one day I might dive in a little bit deeper.
I looked at the programming model and I'm like, I'm just going to give this a try this afternoon. And then I looked at the programming model and I'm like, I'm going to do it next week. Well, what is it called, the one that NVIDIA is working on, that is more like you can just call C++17 parallel algorithms directly?
That is a "just this afternoon, I'll play with it." But I don't have an NVIDIA GPU on a computer that
I have a C++ compiler on. I do have an NVIDIA GPU, but it's in the system I use for gaming and such.
It's got a different purpose for me. And besides, that's running Windows.
stdpar, that's what it was called, right?
Yeah.
So that's limited to Linux with an NVIDIA GPU,
which is not a combination I have direct access to at the moment.
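For anyone following along, here is a minimal sketch of what that looks like; the algorithm call is just standard C++17, and with NVIDIA's nvc++ compiler and its -stdpar flag such calls may be offloaded to a GPU, while an ordinary compiler simply runs them on CPU threads. This is an illustrative example, not something from the show.

```cpp
// Standard C++17 parallel algorithm; nothing NVIDIA-specific in the source.
// Built with "nvc++ -stdpar", par/par_unseq policies may run on the GPU.
#include <algorithm>
#include <cmath>
#include <execution>
#include <vector>

int main() {
    std::vector<double> data(1'000'000, 2.0);
    std::transform(std::execution::par_unseq,
                   data.begin(), data.end(), data.begin(),
                   [](double x) { return std::sqrt(x) + 1.0; });
}
```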
Right.
Yeah.
Okay.
Next thing we have is a post on the Visual C++ blog.
And this one is Visual Studio Code C++ Extension
Cross-Compilation IntelliSense Configurations.
And this is pretty neat.
You know, they keep adding more and more
to the C++ support in Visual Studio Code.
What this post is about is if you're doing cross-compilation,
so you're on a Mac and you're targeting Linux, for one example,
then instead of all the IntelliSense being based
on what it perceives as a default Mac environment,
it'll actually look, oh, you're using a Linux compiler,
then it's going to give you Linux-based IntelliSense.
So that's pretty neat.
Yeah.
Yeah.
Do you make use of Visual Studio Code,
or what is your IDE of choice, Carl?
Yeah, that's a religious war right there, right?
Let's go.
I personally used to use Visual Studio many years ago,
and I know a lot of people who use Visual Studio for C++.
Most of my coworkers are actually using Qt Creator,
which is fine, just awesome.
In saying that, VS Code is definitely turning up in large numbers now.
Most of our new starters seem to gravitate towards it for some reason,
and I haven't seen anything particularly wrong with it.
It seems to be doing the job quite well.
Typically, compilation is actually done off on Linux servers
somewhere, with just your Windows desktop used to run Code.
And it seems to work pretty well.
You mentioned new developers seem to be gravitating towards it.
I wonder if it's becoming popular in university amongst, you know, engineers in training.
That's a good question. I mean, I suspect when new developers turn up and they look at it, they go, oh, what am I going to do? I finally get a chance to start from scratch here. You know, do I really want to be building out a massive, you know, vimrc or something like that? Maybe this is a chance where
they try something new and shiny. And people seem to get started with it pretty quickly.
And they don't stop using it once they've got it set up.
So, yeah, I'm relatively positive about it.
That's an interesting point because I do recall one of my earlier jobs
before I had really set myself in my ways to become an old man set in my ways.
I was like, I'm starting a shiny new job.
And I tried KDevelop at the time and used it for a while until I ended up
moving to Vim actually,
which I think we talked about KDevelop briefly a couple of times on the show.
Yeah.
Which could be seen as a predecessor in a way to Qt creator, I guess.
Maybe.
I mean, don't get me wrong.
I have VI key bindings in Qt.
So I just try to take the best of everything I can,
and it works all right.
Yeah.
Well, yeah, and now I use CLion for most of my development
with the VI bindings enabled.
Yeah.
Is that an option in VS Code?
I'd imagine it probably is.
It's an option everywhere.
It's an option in Compiler Explorer if you really want to.
I do know at work a lot of people do use CLion
because I see the Java processes being spawned all over the dev servers.
It can be a bit of a problem if you have a very large C++ project
as it consumes all available memory.
Yeah, it seems to consume all available CPUs as well.
It's getting better, yes. It's definitely getting
better. Okay, and then the last thing we
have is, this is a post on QuantLab Financial's
GitHub, and I don't know how long ago they started this, but they're doing
these C++ Tips of the Week
and Modern C++ Tips of the Week. And I think it's one of our previous guests who is working on these, right, Jason?
That's two of our previous guests, actually, if you look at it: Lenny Maiorani
and Kris Jusiak. We've had both on the show in the past.
Yeah. So it looks like they're up to 213 tips.
I'm sure there are plenty of really good ones in here.
Did you look through any of the recent ones, Jason?
I started to click around.
Sorry.
And then I got distracted by something else at the moment.
You know how it goes.
But yeah, I was looking at.
No.
Yeah.
I didn't dig into any of them.
Although I do notice now that you say something that it says episode or whatever, 213, but only goes back to 182.
I don't know if they're pruning older ones or what.
Or if they only just started making this thing public in the last six months.
That's not immediately clear to me.
There's 52 commits to the repository.
That is a good point.
I doubt they would just prune them.
Well, but GitHub gets sad once you have several hundred files in a directory.
I work on projects that break GitHub. You can't open certain folders on GitHub. Not ideal.
I'm curious about that now. Maybe we should ping Kris
or Lenny on Twitter and find out what the answer is to that.
All right.
Well, Carl, do you want to start off by just telling us
a little bit about the work you do at Optiver
and how C++ factors into that?
Yeah, yeah, sure.
So I work within the sort of the core auto trading team
within Optiver Amsterdam.
So my team develops all of the auto traders that we trade on,
goodness, probably about 30 or 40 exchanges around the world.
Optiver has probably about five or six offices,
including an office in Chicago, an office in London,
office in Sydney, office in Beijing.
But we tend to trade actually across from a single office.
We'll trade, for example, the Dutch office will happily trade
in the US markets as well, in the South American markets.
So about half the job is working with the exchanges
and writing code to actually function within the exchanges,
to be buying and selling financial instruments. And probably the other half of the job is just working
on the algorithms, the actual trading strategies behind, effectively, what we use to decide
when we buy and when we sell. Yeah, that would be a pretty simple description.
Everything, basically everything is in C++
from within my team
and the vast majority of my office would be C++.
A little bit of Python here and there,
a little bit of C Sharp for some of the user interfaces.
But then again, a lot of the interfaces,
user interfaces are in C++ as well.
Now, this is not at all C++ related,
but when you said that you're happy to trade
in whatever markets you can
from wherever the office happens to be,
and maybe you can't speak to this at all,
but it kind of sounds like Optiver
would end up competing against themselves
if different offices are trading in the same exchanges.
Yeah, to some degree we do.
These are separate trading entities and that's fine.
So I should qualify this by the vast majority of our work that we do is what we call market
making, which is effectively printing what we're willing to buy and what we're willing
to sell for. So it's passive.
And we're there on the exchange, basically providing the liquidity to the market.
You'll find the market makers are actually the companies behind the vast majority of people willing to list or print their prices to the exchange.
And so, yeah, indeed, sometimes we will end up improving on another office's prices.
I mean, there isn't a lot of overlap, but there is something that does happen sometimes.
Then it's actually quite interesting to see who ultimately has the better trading strategy.
Interesting.
Yeah.
So what are some of the unique challenges about programming for low latency trading that cause your whole team to work in almost entirely C++?
Yeah. So the Dutch office has been around for about 35 years.
We've been writing our own auto trading systems for at least 15 years.
So it's quite a big office.
There's a lot of existing systems, a lot of integration points,
a lot of components, a lot of moving parts,
quite a complex sort of risk and assurance set of systems in there as well.
We started with C++ and basically never looked back in that respect.
And what's happened recently within the last few years
is we've actually made even more of a conscious decision
to write more and more of what we can in C++
because for us, it actually really works quite nicely.
And so we have our own, you know, networking libraries,
just for communication within the office, within the different components, and it
works really well. And we have a nice sort of client-server architecture, we have
nice protocol serialization and deserialization libraries,
sort of our own inbuilt reflection and things like that.
The more that we... I should just say it another way:
we just kind of realized that, you know, we've got everything we need here.
And so more and more we throw away components written in other languages
and just have a quite simple C++ stack for most things, I should say.
It does seem sometimes that a problem is easier to express in a different language for whatever reason,
like maybe a functional programming paradigm or something.
I mean, we can do that in C++.
So do you have any strategies at all for when it's, you know,
do you embed scripting languages into C++ or anything like that?
Do you know what I'm trying to say?
Yeah, yeah, it's a good question.
We've definitely gone down that route a few times.
The traders at present do have a scripting language
that they can use,
which then gets converted into C++ for us.
I'm not even sure what the current incarnation is,
but we did use Lua for quite some time.
But again, that all basically gets converted back into,
I think it's parsed and processed by our backing servers,
which are all C++.
Look, I mean, for sure,
functional languages can be really nice for certain problems.
So I guess the exception here is that the vast majority of our research code
would probably be Python. Okay. Because of course, you've got all of the goodies, all the numerical
goodies and all of the graphing, all of these cool things that you can do, such as this, you know,
real-time analysis with Jupyter notebooks and just, you know, the really nice number-crunching libraries.
In saying that, all of our models for predicting, well, for effectively generating prices or
generating other sort of mathematical attributes of shares, they're all in C++, and we just
put Python bindings over everything. So whenever there's a
sort of a need to do something,
either to get access to something
that's written in C++ or
to get the performance of C++, we just
write it in C++, put a Python
binding on top of it, and then that makes it
accessible to everyone else as well.
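As a rough illustration of the pattern Carl describes, and only that (the function, the module name, and the "model" below are invented), a C++ routine with a thin Python binding might look like this, using pybind11 as one common choice of binding library:

```cpp
// Hypothetical example: a tiny "pricing" helper in C++, exposed to Python.
#include <pybind11/pybind11.h>
#include <cmath>

// Continuously compounded discount factor; stands in for a real pricing model.
double discount_factor(double rate, double years) {
    return std::exp(-rate * years);
}

PYBIND11_MODULE(pricing, m) {
    m.doc() = "C++ pricing helpers exposed to Python";
    m.def("discount_factor", &discount_factor,
          "exp(-rate * years), illustrative only");
}
```

From Python this would then just be import pricing and pricing.discount_factor(0.02, 1.5), which is the kind of accessibility being described.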
Kind of the same approach that a lot of
AI research has gone, I believe.
That's interesting.
A lot of the neural network libraries, I believe, are written similarly.
We talked to someone about that at some point.
Yeah, that sounds right to me.
Yeah.
Yeah.
I mean, we used to try to have a dual implementation,
so we might have...
We actually...
Yeah, we did have Python implementations of, you know,
different pricing models and things,
but they just get too hard to maintain, two different copies. You get mathematical
inaccuracies, and it's two code bases. You're better off with one.
Right, makes sense.
Yeah. So one thing we haven't asked yet is, are you able to keep up with the latest standards? Are you,
you know, using C++17 currently? Are you already
looking at C++20? Yeah, it's an interesting one. So we don't get much benefit at all out of using
the latest, greatest compilers or the latest, greatest language features. In fact, to a degree,
we tend to just stand back a little bit and make sure that things
are stable. So that's pretty important for us. The other thing, I mean, not that stability is
an issue. It's not like, you know, there's problems where early versions of the compilers
just have terrible implementations. That's not the case at all. What we do find is a challenge
is to keep our code really clean and keep our architecture clean, keep the code consistent,
keep it understandable, keep it readable.
It's actually a real challenge within trading
because things change very, very quickly.
And something you could be working on today,
you might have to wrap that up tomorrow.
A new opportunity might come along.
It's possible that, hey, we're not going to use the strategy anymore
as of next
week, that one goes.
And so there's a huge churn and turnover, a huge number of commits, quite a lot of people
working on basically the same things.
So we have to be very, very careful not to, for example, go, hey, coroutines are awesome.
Let's just put coroutines into a quarter of our code base and see how that works out.
And we have to be very, very careful in that respect.
I'm curious about what exactly these strategies look like that you're constantly rewriting and then possibly abandoning, you know, one week after another. What are these? Can you describe these a little bit more, maybe?
Yeah, I'm just trying to think how much detail...
That's fine.
I'm just trying to actually think of an example to back up what I'm saying now. Well, an example might be potentially a market that we thought was really interesting, or we thought that a possible new sector on a market was really interesting. We start trading there; it turns out that we were wrong, we just didn't have the numbers right. And so that would be a market where we might go, right, we're not going to trade there anymore, or not for the meantime. Another example could just simply be a strategy that we thought would be profitable. It turns out that we were trading a lot, but the trades weren't profitable in the long run. And, you know, that could be anything from an implementation error, which you can fix, to just fundamentally we got our numbers wrong. In that case, we'd walk away from that strategy. And markets change, markets change all the time, so you've got to keep up with that.
So does the C++ code in any way itself adapt to these market changes? Like, do you
use profile-guided optimization or something like that to help tune your code?
Yeah, that's a good question. We do use profile-guided optimization a little bit, or we have in the past a little bit. Generally we're
not too sensitive to those sorts of market changes. So there's kind of two parts to your question, really. No, we generally just write a one-size-fits-all sort of fundamental
system that should be able to trade on most markets, and we sometimes have to bend the
architecture for special cases, or we have
to take a step back and refactor and have a look and go, you know what, we didn't quite model this
right, let's do this again, and properly, in the next iteration, now that we know a little bit more.
So that's kind of answering how well our code works as markets change. Answering
the question about profile-guided optimization,
we haven't had a huge amount of success with that.
Other trading companies might have.
But the problem that we find, it's a pretty classic problem.
It's just one of basically overfitting the model.
And so we can get an auto trader or an auto quoter
absolutely flying for a day's replayed data. So if we ever get that day again,
with exactly the same trading events, you know, we're going to be pretty damn quick.
Unfortunately, the 99.9 percent of the time when that day doesn't happen to be replayed, it's slower.
Yeah, if you knew the day was going to be replayed, you would do all kinds
of things differently, I would think, not just need to do them faster.
Yeah. So that's
the trade-off that we have: we want to be fast, that's important. You know, if
you're not within a reasonable balance of performance, you're not really in the game at all. But also, on the days
when you do get strange behavior, so a very, very busy trading day,
or something else where the markets are really, really hot, you also need
to be able to handle that. And so there's always that tension there. Because if you push
it too much, you'll probably find that on a day with a huge amount of volume, you're out of the
market, which is not what you want. In fact, with Optiver and any other registered market maker,
you can't be out of the market. There are regulatory obligations to be in the market, to be trading
for the entire time that that market is open to trading.
Oh, interesting. Otherwise you risk getting fined or something?
Yeah, well, indeed, or kicked off the market, because that would be worse.
Yeah. Well, I mean, the markets need... it's a symbiotic relationship. The markets, or the exchanges, need the market makers to provide the prices.
And you need to provide the prices all the time.
Like say with a couple of weeks ago with GameStop,
I mean, that caused a lot of volatility in the market.
It'd be quite easy to just turn off the systems,
you know, and walk away and go,
hey, like this week is just too hard for us.
There's too much craziness going around.
But that's exactly the time that the exchanges need the market makers to be there,
to just be providing a bit of stability to as much of the market as they can.
One thing I was curious about is you mentioned there being a lot of churn of new commits coming in all the time.
Do you have a lot of good practices in place to handle that?
Are you, you know, code-reviewing every single
commit going in? I kind of imagine you must be, so someone's not going to bring down the whole
trading network.
Yeah, yeah. So from a regulatory point of view, in the last five to seven
years the regulation has just increased, you know, threefold. There's been a heck of a lot
more regulation in the U.S. markets,
but actually pretty much all around the world. So you need to be able to prove that your systems
are in control at all times. You need to be able to prove that you've done a prudent amount of
testing for any release of any software that goes near the exchange. Which makes sense, because it has
happened a couple of times where participants and exchanges have had code that hasn't worked,
and it has caused material damage to the markets. It's caused spikes in stocks, it's caused, you know,
flash crashes a couple of times. So you absolutely need to prove, you know, that you're not going to have
a system that's going to do that to the exchange.
What does this proof look like? Like, do you have
a large set of unit test suites or integration tests, or, like, higher level?
Yeah, so there's
guidance on this from the regulators, but for us, we absolutely have a huge amount of unit tests,
but we record every run of unit tests.
We record the output of those unit tests.
That gets committed.
We record any log files that were produced.
We record what the output was versus the expected output,
and that's actually code-reviewed.
So those commits are code-reviewed,
and those commits are committed as well.
Does somebody review those commits?
Oh, sorry.
And then we have a suite of automated tests as well, which takes an hour or two to run
just to make sure that there's been no regressions as well.
Interesting. An hour or two would be considered
a very long test suite by some people's standards.
Yeah. To be honest, I could make a change in the morning and have that running in the exchanges in the afternoon,
as long as I can get other people to review what I've done,
both the actual code changes and the review of the testing
that's been done as well.
There'd be a few eyebrows raised if I did want to get something
into production that quickly.
But it's actually not too bad.
I mean, one, we have to do it from a regulatory point of view.
But two, it's kind of nice being able to sleep easy at night as well,
not lying awake thinking, gee, tomorrow, I wonder if that code's going to work all right or not.
So there's a natural sort of motivation to making sure that things are pretty well covered.
Have you gotten the 3 a.m. "the Japanese market has a problem" phone call that you have to go deal with?
Not with Optiver, but I have with other companies I've worked for.
To be honest, though, the financial system is pretty amazing in that respect.
All trades that we do, we have internal processes to make sure that our orders and
trades look correct. We've got a huge amount of risk checking around that and, you know, just sort
of monetary checking around that as well. But then there are external parties looking at this all the
time as well. So the exchanges are looking at it, but for every trade and every order that we send,
a copy of that gets sent to an external company that's running
a whole bunch of analytics and machine learning on it as well, just to spot, to make sure
that everything's going okay.
Sponsor of this episode of CppCast is the PVS Studio team.
The team develops the PVS Studio Static Code Analyzer, which detects errors in C, C++,
C Sharp, and Java code.
When you use the analyzer regularly, you can spot and fix
many errors right after you write new code. This means your team is more productive during code
reviews and has more time to discuss algorithms and high-level errors. Let the analyzer that never
gets tired do the tedious work of sifting through the boring parts of code looking for typos. For
example, let it check comparison functions. Why comparison functions? Click the link in the podcast description to find out.
Remember that you can extend the PVS Studio trial period from one week to one month.
Just use the CppCast hashtag when requesting your license.
I don't want to derail too much from the technical discussion,
but one thing you said a moment ago interested me
when you talked about all this regulation going into,
changes you're making in the software
and how that regulation has become,
I guess more strict over the past few years.
And you mentioned the GameStop thing.
And I thought one of the things that was being said
in the news was that there was really a lack of regulation and that, you know, there were hedge funds buying like 50% more of the shorts than should have been available.
And I was just kind of curious about that.
Yep.
So we've definitely derailed the technical side.
Okay.
That's fine.
I can handle that.
Yeah.
So, I mean, the markets are very heavily regulated.
Okay.
I can only short sell as much as the market allows, and there's actually a very well-thought-out sort of layered
system of protection here. So when you buy shares, it's not you buying shares directly from the
exchange; it's you buying them from a broker, that ultimately will go to a clearer, which will
ultimately go to the exchange. And so
if, for example, something went really wrong, the worst-case scenario probably is that the broker
fails. I think once in history maybe a clearing house has failed. It would never pull down an
exchange, with something like this. So I'm not entirely sure where the accusations of breaking
regulations come from, because if there are regulations that
have been broken, that'll be investigated. I don't know. I can't imagine a scenario
where people either selling a huge amount of GameStop stock or buying a huge amount of GameStop stock
actually is a regulatory breach. I'm not sure.
I just remember hearing that stat, that, like, you know, 150 percent of the
shorts that should have been available were bought. You know, it does seem... well, okay, we can move on, though. It's okay.
Well, I'll tell you what, I'll say one thing. Being able to sell
one unit of stock that you don't own is somewhat hard to understand, I'd say, but it's even harder to
understand how you can sell more than a hundred percent of the stock that you don't own. So yeah,
short selling is a very interesting, thorny debate that's been going on ever since
markets were invented.
And now I'm just curious if Rob got in on that. Oh no, the GameStop bubble?
No, I mean, I heard about it after it was all happening.
Kind of wish I had, but no, I was not into any of that.
No, I looked at that and I said, yeah, okay, whatever.
Maybe back to the technical a little bit.
I want to maybe just clarify something
because you're talking about how all of the Optiver offices are independent
and you're talking about how you're at Optiver, an almost exclusively C++ shop. I'm just
curious, is the C++ code that you work on shared across the offices? Yeah, it's a great question.
No, we don't. We tried that. It didn't work so well.
Interesting. Really?
So, are you guys serious about the "interesting"? Or was that sarcastic?
No, I'm actually.
I'm a little surprised that an international company doing the same work in all their different offices wouldn't have a shared code base.
I mean, compared to like Google with their mono repo, 100 billion lines of code that everyone works on.
Yeah, it's like the evil opposite.
Yeah.
Yeah.
So, boy, we've tried it a few times,
and I still wake up in sweats at nighttime
sometimes thinking about it.
So Git wasn't around when we first started doing this,
so that's the first point I'd like to make.
Okay.
So using CVS or...
Yeah, we're using Subversion.
Okay, that's better.
Yeah, but it's very difficult because... yeah, sure. So some core
libraries, and we have a lot of really cool libraries, they were sort of shared for a while,
because we gave up on sharing strategies and the sort of high-level things. It just got a
little bit too difficult, because the market traded out of Sydney is often
very, very different from the European market, for example. Just completely different regulations and
just different logic, even if it's relatively the same style of trading. It just got too hard.
We just spent all of our time resolving merge conflicts and fighting bugs that came out of
resolved merge conflicts because, you know, we didn't merge them correctly.
That's quite hard to spot sometimes.
So our very, very sort of lowest level library,
so, you know, the event model that we have,
the logging, the inter-process communication,
shared memory, those sorts of things,
yeah, that was shared for a while
and that just got too hard as well.
And so we eventually went,
we're just running our own fork of the repo per office. Amsterdam and Sydney have roughly the same
code base, more or less, or a fork of the same code base, that looks relatively similar, I think.
The US office is basically purely C.
That's sort of the direction they've gone in now as well.
Interesting. That is interesting.
Yeah, I'm not just saying it.
Yeah, so I think the argument there is that, you know, there are complexities with C++.
You have to be really disciplined when the code is moving around so much.
And in the US, I've just found that actually C seems to be enough for them,
and they're pretty happy with that choice.
Within Amsterdam, there's probably no way we'd do that.
We're very much in the C++ camp.
Do you ever get like a call or a Slack message from another office?
Hey, Carl, you won't believe we just made this change.
It had a huge impact.
You guys need to reconsider this part of the code or something like that.
Yeah.
So now that does happen because the idea sharing is there for sure.
Yeah.
But it is interesting, because if you have,
you know, four or five different offices all trying to solve the same problem pretty much
independently, it's kind of like one of these genetic algorithms, right? Like, the best one wins
and then the information is shared around.
Oh, for sure, the information sharing is there,
but it's not done via code. It's done more at the sort of
face-to-face level.
It's interesting, effectively having the teams compete against each other to
see who's got the... I mean, not really directly, but...
Yeah, yeah. I mean, I use the word compete; it's possibly a stronger word than reality. Right. But yeah, it's by no mistake that we have this model.
Okay.
But just with the amount of code changes we make, particularly across the offices, yeah,
honestly, I would not like to go back to the days of a global repo.
And that's with a relatively low number of developers.
I mean, I don't know the exact
numbers, but I'd be guessing around about an average of 50 to 100 developers per office.
Okay. Can you give us an idea approximately how large your code base is?
Ours? We're not entirely sure, because I ran some numbers the other day, and it's actually a little bit hard, because where are all the repos within Git?
We have a lot of different repos.
We think somewhere between 1 million lines of code to 5 million.
That's pretty wide.
Yeah, that's a wide range.
Big, but not gigantic.
Yeah, and that's for my office.
I think the other offices would be broadly about the same.
And then our applications are,
I just ran some basic word counts, line counts over them.
Seem to be about 200,000, 250,000 lines of code
for an auto trader or for a thing that does this on the market.
About 10,000 to 20,000 lines of code is actually custom
code for that auto trader or auto quoter or whatever it is. The remaining 190,000
lines is common libraries that we use to build up, as the base for our
applications. And of those common libraries, I'm guessing
we only use about 10 to 20% of those per application.
So this was just sort of a shotgun approach
of getting some lines of code metrics.
And then another thing is we build from source every time.
There's no concept of headers and libraries.
We just literally with Git bring everything in,
CMake it, build it, grab a coffee, grab another coffee, come back.
I'm guessing if you're in the Amsterdam office,
you have one of those nice fully automated espresso machines as well,
which I tend to see across the Netherlands.
We actually have an in-house barista.
Oh, that's even better.
I've only been to three offices that had an in-house barista.
Okay.
Yeah, well, the office is in lockdown right now,
so we all got sent a coffee machine to home.
That'll do.
Did they really send you a coffee machine?
For a Christmas gift, yeah.
Which has been getting some serious use
That's cool.
Kind of. At the same time, I also learned that making good coffee is difficult.
Yeah. So I know we asked earlier if you're, you know, on the latest version of C++, and you
said you're not necessarily on the latest and greatest,
is there anything you're looking for from C++
that you would like to be standardized
to make the type of work you do easier?
Yeah, for sure.
So one of the reasons we're not on the latest and greatest
is we tend to be pretty conservative with our servers.
So we typically run a pretty mature version
of Enterprise Red Hat or CentOS
or whatever it happens to be.
That's a little down.
Yeah, yeah.
So we use their dev tool set.
So we're currently on GCC 8
with the option to run GCC 9 if we want.
I still suspect, I haven't checked,
but I still suspect things like small string optimization
are not in there with the
patched version of GCC that we use. I could be wrong, but...
Yeah, the copy-on-write change.
Yeah, for binary compatibility.
Ah. You might have to. Stupid ABIs.
Yeah. So we're a little... we don't take, you know, GCC 10, for example, just because those aren't the cards that we've been dealt.
And that's actually fine.
There are a few features of 20 that we use at the moment.
We use most new features that are available on the compiler.
23 is interesting, with the bit_cast.
We were talking about that today, actually, at work.
We have a few potential issues with alignment,
and then we're like,
oh, how are we actually going to fix this?
And it's like, oh, what do you know?
It's in 23, problem solved. So that's kind of cool.
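For reference, a minimal sketch of std::bit_cast, which in fact shipped with C++20's <bit> header: a value-level reinterpretation that avoids the aliasing and alignment pitfalls of casting pointers at raw buffers. Illustrative only, not code from the show.

```cpp
#include <bit>
#include <cstdint>

static_assert(sizeof(float) == sizeof(std::uint32_t));

// Well-defined way to inspect the bit pattern of a float; the equivalent
// reinterpret_cast through pointers would be undefined behaviour.
std::uint32_t float_bits(float f) {
    return std::bit_cast<std::uint32_t>(f);
}
```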
code is with with lambdas lambdas were awesome for us like lambdas just cleaned up our code base
um you know we can we could just go hey here i'm going to call something i don't want to know when you're ready but when
you are ready well i don't care how long it takes but when you are ready i'm going to pass you you
know this lambda to execute uh you pass a standard function but lambda same thing so our code really
got cleaned up for a while but unfortunately then we've kind of pushed lambdas a little bit too far
we are we have a lambda which creates another lambda that's fine
so yeah but you can't you can't really go far it's it's fine yeah use uses lambdas as much as
you want to so i mean you know you have to scroll right on your screen quite a long way to actually
figure out when you're done um and so it's just the way that we program we program in this
asynchronous kind of style.
I mean, we have no blocking call.
That would be a nightmare, right?
So we always just do work until we can't anymore,
give that to something else to deal with,
and then we do work where we can.
And this is the way our event loop works.
We just spin; whenever there's something to do, we do it.
If there's no event coming in,
then we just spin again until an event comes in,
like a new price arrives in the market
and we need to figure out what to do with that.
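A loose sketch of that callback-driven, spin-until-there-is-work style; the types and names below are invented for illustration and are not Optiver's event model:

```cpp
#include <functional>
#include <optional>
#include <vector>

struct Event { int instrument_id; double price; };

class EventLoop {
public:
    // Hand over a lambda (stored as std::function) to run when an event arrives.
    void on_event(std::function<void(const Event&)> handler) {
        handlers_.push_back(std::move(handler));
    }

    // No blocking calls: poll, handle anything that arrived, spin again.
    template <typename PollFn>
    void run(PollFn poll) {
        for (;;) {
            if (std::optional<Event> ev = poll()) {
                for (auto& h : handlers_) h(*ev);
            }
            // otherwise spin straight back around and poll again
        }
    }

private:
    std::vector<std::function<void(const Event&)>> handlers_;
};
```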
It's like you've written your own task scheduler.
Yeah.
Oh, look, everyone does.
Or you can use ASIO.
I've used ASIO in the past,
and that was fine, using its event engine or whatever
they call it. I can't remember now; it's been too long since I've used it.
Also, actually, I've been impressed with ASIO for event-based programming, except for a couple of weird bugs
around timers which bite you on lifetime, but we'll just park that for now.
So yeah, I mean, we have a handwritten event model,
and I'm sure all trading companies do, because you just want to know exactly what's going to happen
all the time. You don't want, you know, the event model to all of a sudden be doing a memory
cleanup or some housekeeping tasks. But the more and more I look at our code, the more
I think that, in a perfect world, coroutines could actually make our code a bit easier to read, a bit simpler.
If you end up deploying coroutines and have a huge success with it,
then you have to come back on and explain to us what you did.
And if it's a massive failure...
Then come back on and explain to us why.
Yeah, it'd be something that we'd have to do
quite slowly as well. Like, we'd have to be quite careful about it. We'd have to be pretty careful
to make sure that we don't get caught out by object lifetimes or, you know, allocation where
we weren't expecting it. Also, the ability to debug would actually come into it as well,
a little bit. So this would be more sort of trial and error with a small project, and just see how it goes.
But I wouldn't be overly surprised
if in a year or two from now,
we are all about coroutines.
We'll see.
But the other thing that I think is missing,
and I haven't been looking too much at the proposals,
but shared memory, shared memory communication
and process communication.
Yeah, it'd be awesome to have that as standardized support.
I don't know if there's anything like that in the pipeline,
although there's still discussion about transactional memory,
which is tangentially related, I think.
So there's the Boost inter-process library as well,
which we use a little bit,
but we've just found that we've hand-coded
the majority of what we need,
which is just effectively fast
shared memory messaging from process to process on the same box.
I know other trading companies are doing this because I've got friends
in other companies.
Everyone's doing this.
Everyone's hand-coding it.
Everyone's hitting the same mistakes.
In fact, in your previous podcast, I forget his name,
but he was working on the PowerPoint plugins.
He was.
Yeah.
He was also doing his own shared memory kind of work.
Yeah.
Yeah.
I mean, those, you know,
naming files, and what happens if another process with different user
permissions used that same file name and crashed uncleanly,
and the kernel didn't clean it up, and then you start up
and you can't map the file name that you've picked
because Linux isn't going to let you lock that file.
Yeah, I mean, these problems are solved,
but it's interesting that everyone has to solve them themselves.
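For a sense of the building blocks involved, here is a bare-bones POSIX shared memory sketch, the sort of primitive such hand-rolled messaging layers sit on top of; real implementations add ring buffers, sequence numbers, and the careful naming and cleanup discussed above. This is an invented example, not anyone's production code.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstring>

int main() {
    const char* name = "/demo_shm";                     // hypothetical name
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);    // may need -lrt on older systems
    if (fd < 0) return 1;
    if (ftruncate(fd, 4096) != 0) return 1;             // size the shared region

    void* mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) return 1;

    std::memcpy(mem, "hello", 6);   // visible to any other process mapping /demo_shm

    munmap(mem, 4096);
    close(fd);
    shm_unlink(name);               // the cleanup step that is easy to get wrong
    return 0;
}
```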
Right.
Just out of curiosity, what
does it look like to debug this software? Do you have some type of
mock trading system that you're able to work with in order to actually test your code?
Are you talking about testing in particular, or are you talking about debugging some of this?
Debugging, testing, yeah.
So the exchanges will offer test exchanges
with varying degrees of quality.
Some exchanges are excellent.
Some exchanges are bordering on non-existent.
So that's obviously a little bit of a concern.
But there are certainly test exchanges that you can test on
The problem is generating the exact sequence of responses that you want. Some exchanges are
great, and they have really good automated support for doing that, or semi-automated support.
Other exchanges, you have to try to simulate your own sequence of events, such as putting in an order
on a stock that hopefully no one else
in the test exchange is also trying to test on at that point in time. Then send in an opposing order
for maybe only half the volume, see if that trades, see what happens to the rest of the order.
Yeah, that's not a huge amount of fun. So to a degree we have exchange simulators, to just simulate what we expect the exchange to be doing. Obviously that has a little bit
of risk as well. Exchanges, however, when you do a software update, such as using a
new protocol of the exchange, so they might upgrade their API, they typically
have a certification process as well. So we sit there with the exchange on the
phone, go through all of the likely scenarios,
and they tick it off to make sure that they're seeing what they expect to at their end.
For actual debugging, say, for example, we had an issue,
typically we'll just crash out and restart again.
We try to not recover from errors that shouldn't be happening.
That typically means that there's a bug.
And so when I say we restart, often we don't automatically restart, but often we'll roll back
just to the last known stable version. Something happens, we've got a core, we've got the symbols
in there, you know, just trying to work out what's happening. But a huge amount of real-time
metrics is going on as well.
So, I mean, typically we'll catch something before it really goes bad.
If, you know, there's excess memory consumption or excess allocations or excess messages between us and the exchange,
that'll be flagged pretty quickly.
That's actually, that's really interesting.
I'm familiar with systems that, you know,
obviously like a watchdog or something, or if there's a crash,
then they just come right back up again.
But the idea of not just coming right back up again but you know having available and easy to access the last known binary or whatever is that you could roll back to
yeah and look it's all in git you know right any right all of that configuration is done through
git um so if we want to upgrade it's's a code-reviewed pull request,
and then that will automatically, once that's been approved,
that's triggered to upgrade in production.
And so if we need to roll back,
it's very simple to figure out how to do that.
Interesting.
Although we are looking at,
this is an idea that Sydney have, which is quite cool,
is actually just return values when we exit, to indicate: definitely don't restart; do restart, but not
until, you know, this component has been fixed. There are 255 values we can use
there to indicate what, you know, has happened. And so that's quite a nice way of semi-automating
recovery after shutdown.
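A small sketch of that idea: encode restart guidance in the process exit status (0 to 255) so whatever supervises the process can decide what to do next. The specific values below are made up for illustration.

```cpp
#include <cstdlib>

// Hypothetical exit codes a supervisor script could interpret.
enum ExitCode : int {
    kCleanShutdown   = 0,    // normal exit, no restart needed
    kRestartNow      = 10,   // safe to restart immediately
    kRestartAfterFix = 20,   // restart only once a dependency is fixed
    kDoNotRestart    = 30,   // definitely don't restart; needs a human
};

void abort_on_bad_config() {
    std::exit(kDoNotRestart);
}
```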
Yeah.
Yeah.
I mean,
you could take it a step further, more than just the byte, if you
could somehow manage to, what's the word I'm looking for,
touch a file real quick or something, that is, you know,
even more information.
Yeah.
Dependency ordering is also a fun one too,
because we have nearly 15,000
components running in production every day, or 15,000 running processes, and we have to start
them up in the right order. And we've got a degree of automation there to do it automatically,
but of course it gets a little bit difficult sometimes, particularly if you're communicating over a shared message bus and it's not entirely sure if this
other component is up yet or not. So yeah, sometimes it can be a little bit of fun in
the morning, if something goes down, to then figure out exactly the correct sequence to bring things
back up again.
Right. Sorry, you're bringing back flashbacks
of systems that I worked on that, theoretically, we could reboot the whole system
at once, but in reality, we didn't rely on a DHCP server, so each device got its own link-local
address, and, well, I won't go into the details, but they would all come up with the same link-local
address. So in that particular environment, it would sometimes take like three hours for the
system to finish rebooting as they all negotiated what link-local address they were going to use.
Anyhow. Okay, well, I think we're starting to run a little low on time. Carl, is there anything else you wanted to talk about
before we let you go today?
Not really, actually.
It's been fun.
I'm just having a look through some notes there.
Not really.
I think the main thing I want to sum up, I guess,
is that, I mean, for us, C++ has been awesome.
The hard bit is just managing it,
picking and choosing the right language features.
Yeah, it's certainly challenging to keep code
that's constantly changing, you know, readable and bug-free.
I think modern C++ has been great for that, you know,
just to be able to auto out all of the, you know,
iterator declarations.
And we can write some really nice code these days,
but I think that's actually where the challenge is,
is, you know, when you've got big systems
that are constantly changing and, yeah,
it's just, that's where the challenge is,
is just trying to keep the code sane.
But yeah, so far, so good, actually.
Yeah, that's my summary of my life at the moment, I guess.
Well, and you know, since I've been, or one
of us has been, asking pretty much every guest that we've had on lately: it sounds like ABI
compatibility is not a big deal for you. You would be happy if C++ broke ABI?
Yeah, as long as the world
doesn't end. I'd have to think about it quite carefully, from what happens to our servers, are
we actually relying on a library that we didn't realize about. But, given that
we compile everything from source, we have literally no libraries except for,
you know, glibc and the usual suspects.
Right.
Yeah, if we can't do it, no one can do it, because really
we have the simplest ABI constraints: we have none, we compile from source.
Right. Well, so far, Rob,
I don't think we've had a single guest who has said that breaking ABI would break their system.
Sure. And I think we would all agree with Carl that we're good with it as long as
it doesn't cause the world to end.
No one wants to cause the world to end with an ABI break.
Our informal poll here of our guests that is going on.
Okay.
Well,
Carl,
where can listeners find you online?
And you know,
is there anything else you want to plug from
Optiver? Are you actively recruiting or anything like that?
For me, myself, online I tend to keep
a pretty low profile; I don't even have a Twitter handle.
Okay.
Optiver is always hiring,
of course. So I think I've already sent my link, which I guess you guys will add to this podcast, so yeah, Optiver's URL is there, check it out. And of course, we're always looking for people interested, and
there check it out um yeah of course we're always always looking for people interested um uh and
yeah that's about it for me awesome okay thanks girl thanks great thanks guys pleasure thanks so
much for listening in as we chat about c++ we'd love to hear what you think of the podcast please
let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to
hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also
appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me
at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help
support the show through Patreon.
If you'd like to support us on Patreon, you can do so at
patreon.com slash cppcast.
And of course you can find all that
info and the show notes on the podcast website
at cppcast.com.
Theme music for this episode was provided by