The Derivative - The Mysteries and Makings of Machine Learning with Dr. Ernie Chan of QTS Cap

Starting point is 00:00:00 Thanks for listening to The Derivative. This podcast is provided for informational purposes only and should not be relied upon as legal, business, investment, or tax advice. All opinions expressed by podcast participants are solely their own opinions and do not necessarily reflect the opinions of RCM Alternatives, their affiliates, or companies featured. Due to industry regulations, participants on this podcast are instructed not to make specific trade recommendations nor reference past or potential profits, and listeners are reminded that managed futures, commodity trading, and other alternative investments are complex and carry a risk

Starting point is 00:00:35 of substantial losses. As such, they are not suitable for all investors. Welcome to The Derivative by RCM Alternatives, where we dive into what makes alternative investments go, analyze the strategies of unique hedge fund managers, and chat with interesting guests from across the investment world. So let's say you think that the VIX or the realized volatility or the GDP growth might be a useful variable to predict whether the market is going to go up or down. Well, if you just apply linear regression, you will find that the signal is very weak because it doesn't take into account the fact that some of these variables might be conditioned on each other. It could be that, you know, only in a low volatility regime does the stock market depend on GDP growth, whereas in a high volatility regime, there's no such dependence. If you apply linear regression to these variables, you will find it's

Starting point is 00:01:40 just a wash. You cannot find any signal. But if you apply random forest to it, it will tease out this kind of dependence of, you know, under different regimes, under different conditions, these variables would work. And under other different conditions, some other group of variables will work. So it has this kind of hierarchical structure. You pick the most important variable first and then, conditioning on that variable, look for another variable that is maybe less important, but combined they can generate a much stronger prediction than if you just treat them on an equal footing. All right. Hello, everybody. For those watching on YouTube, you'll see I'm coming to you from a different work from home locale today. Turns out I tested positive for COVID this weekend, so the family has quarantined me to the basement.

Starting point is 00:02:46 But the show must go on. And despite the COVID fog, I'm in. We're going to try and dig deep into the nitty-gritty of AI and machine learning with a star in the space, Dr. Ernest P. Chan. Ernie, as we call him, writes the popular quantitative trading blog, authored three books on quant trading and machine learning, founded predictnow.ai, and runs managed accounts and a fund through his asset management firm, QTS Capital Management. So welcome, Ernie. Thank you very much for inviting me, Jeff. Yeah, we were just talking, so you're in the Niagara Falls area of Ontario. Almost every finance person I've ever met from Ontario is in Toronto. So you're a little out of the city. That's because I was never part of the financial community in Toronto.

Starting point is 00:03:34 I started my career in New York. And after I moved back to Canada, I had no particular reason to stay in Toronto because I really never worked there. Right. And were you a Canadian citizen? What's your background? Yes, I'm Canadian. I was originally from Hong Kong, but my family moved to Canada before I was in college. So I went to U of T, U of Toronto, for my undergraduate. And then I went to Cornell for my graduate school. And since then, you know, almost every job I got was in the New York area until I decided to do my own thing and move back to Canada. Got it. And now, are Americans allowed back in there yet or what? It's a very interesting situation.

Starting point is 00:04:26 If you fly, you can do that any day. Air traffic is allowed, but you cannot drive or take the train across the border. Or walk. When I took the kids to Niagara Falls, you do the walk over the, is it called the Rainbow Bridge? Yes, you can. Yeah, I think they can walk across countries, which is fun. Yeah, so that's probably shut down now with the COVID anyway. And for sure they don't want me coming in. Well, we have enough cases on our own. Yeah. And so Ali was telling me we might be joined by a special

Starting point is 00:05:06 guest today, Coco. Yes, I wonder where she is. What is she? A cat? She's a ragdoll cat. It looks like a ragdoll. It's called ragdoll because

Starting point is 00:05:21 when you hold them in your arm they just go limp. They're very relaxed. All right. I want to see that. Get them over there sometime. So, yeah, let's jump into your background. A bunch of those big New York shops.

Starting point is 00:05:36 So give us a quick rundown of how you got to where you are at QTS with all those big names in your bio. Okay. So I, you know, probably a couple of years into my PhD in physics, I knew that I was not cut out to be a physicist academically. So I actually worked hard to find applications for physics. And I found that at IBM Human Language Technologies Group, that's where I joined after my PhD. So that group has produced some very famous fund managers,

Starting point is 00:06:13 some of whom you might see in the Wall Street Journal, such as Bob Mercer and Peter Brown of Renaissance Technologies. They co-ran Renaissance for the last probably decade, until recently. And so he pulled them out of there, like he hired them? They kind of took up the reins for Jim Simons. That is correct. That's correct, yes. So after a few very enjoyable years at IBM in Yorktown Heights, I also decided to get into finance, not because I had a particular love for finance at that time, but because I love New York City. So I couldn't stand the one-and-a-half-hour commute, the reverse commute from New York to Yorktown. So I joined the new data mining and artificial intelligence group at Morgan Stanley, which was in the mid-1990s. But already all the investment banks were heavily invested in AI technologies, not just to trade,

Starting point is 00:07:27 not just for investment management or trading, but also for various other aspects of their business: sales, marketing, operations, customer relationships, whatever. Fraud. Yeah. Yes. Oh yes. Fraud is a big thing too. Although when I joined, Morgan Stanley was purely an investment bank. They had not acquired any retail business yet. So to them it was not a major concern at that time.

Starting point is 00:07:54 So after a short while there, my team, or half of my team, decided they wanted to trade themselves. They didn't want to consult for other business units. So we went over to Credit Suisse and started a prop trading group. It did not go well because actually I had not found AI to work in finance for the longest time since I arrived at Morgan Stanley. Not in fraud detection or other credit,

Starting point is 00:08:20 you know, other things, but in trading the market. It's- And by your team at Morgan Stanley, you were trying to develop actual models, quantitative models. Yes. Part of our job was to develop quantitative trading models using AI. And we continued that at Credit Suisse. And as I said, it did not go well.

Starting point is 00:08:39 Give me an example. Give me a simplistic example of something that the AI spit out that you guys were trying to train. Right. So the AI typically learned complicated things from a time series. We apply it, for example, to the global futures market, 55, 60 different futures. And we try to apply the same model to trade, you know, to define trading rules to trade all these futures using the same model. And the input would be, you know, nothing particularly fancy, different kind of technical indicators, but of course engineered into various fancy mathematical

Starting point is 00:09:20 functions. And, you know, inevitably the backtest looks fantastic. And inevitably when you live trade it, you know, the day you live trade would be the high watermark of the model. And we now know why that is the case. You know, as some of the industry giants, such as, you know, Dr. López de Prado, have written many times in their books. There are many reasons why machine learning is very difficult to apply to finance, one of which is the data snooping bias. It's simply that there's not enough data for a naive application of machine learning. So we did not find a lot of success. And after trying to apply machine learning

Starting point is 00:10:14 to finance during my stay in New York, I grew frustrated and therefore that's why I left the industry and decided to trade on my own in 2006 and completely abandoned machine learning at that time. Just start from the basics. And lo and behold, simple and basic strategy worked, whereas complicated machine learning model did not. So that's how I started my investment management career. After a few years that I had successfully applied to my own account, I started to manage some external investor accounts, started a fund in 2008. That was an auspicious year to start a fund because we actually did very well.

Starting point is 00:10:58 My strategy had a strange tendency to do well when there's a crisis. So it has almost always, except one year, been the case that it did particularly well in a crisis. So, you know, ever since I started to manage my own money and started to manage my investors' money, we have used fairly simple strategies. And most of these strategies are not so different from the ones I wrote about in my book. People say, wow, you know, clearly the strategy you wrote about in your book can't work, right? Because who would write something that worked in a book? Yeah, so that everybody can read about it. But that's not the case, because the basic kind of strategies that I trade are

Starting point is 00:11:51 quite similar to the ones that I wrote about. It's just that, of course, I have bells and whistles, little tweaks here, little tweaks there, that make it better than the plain vanilla version that, you know, everybody knows about. But pair trading is pair trading. The gist of pair trading is the convergence of a spread. And that everybody knows. It's just that even today, I know that many very successful prop trading firms are still using pair trading and are very

Starting point is 00:12:25 successful. It's just that they have all these little tricks and maybe have better data, better execution. They have maybe some other variables that they monitor, which make it more successful than the plain version. So in any case, that has been what we do for the last 10 years or so. Until recently, when we...

Starting point is 00:12:48 Real quick, didn't you have a stop at Millennium in there somewhere? Oh, yes. During my 10 plus years in New York, I did have a brief stop at Millennium. So one of my colleagues at Credit Suisse, the group that was, you know, not particularly successful at Credit Suisse, but one of the colleagues did become a portfolio manager at Millennium. And he asked me to join. So basically, every job I got is because somebody else wanted me to join them. So I joined Millennium and we had an almost weekly chat with Izzy Englander. I guess at that time, Millennium was not huge. So Izzy had time to meet with every portfolio manager and their underlings like myself. But for a reason that is beyond my understanding, he and my boss had kind of fallen out.

Starting point is 00:13:49 And so the group was eliminated within six weeks of my arrival. Oh, really? So it was a quick stint. But Izzy was very kind. You know, I got to give the credit to the man. He personally called my next employer because it doesn't look good on the resume to see that you're dismissed after six weeks. People say, what's

Starting point is 00:14:08 wrong with this guy? And he personally called my next employer and said, hey, Ernie's not a bad guy. Just blah, blah, blah. So I'm quite grateful for his graciousness. Did you know then that they'd be huge one day? Could you tell the structure?

Starting point is 00:14:23 It was already pretty big at that time. And one thing I can tell is that every day you arrive at the lobby, I mean, their lobby is amazing because you have to use palm recognition. You open the door by putting your palm on the detector. And you open the door. This is 15 years ago?

Starting point is 00:14:41 How many years? That is in, let me think, in the years 2000 and just around 2003. Okay, so yeah, pretty far back for palm detectors. Yes, at that time, it was not common. Now, of course, every cell phone has it. But the office is still at 666 Fifth Avenue. It's a famous address. And when you open the door, after you put your palm to it, you will be greeted with a strong smell of lilies. Every day they have this huge bunch of lilies sitting in the reception room.

Starting point is 00:15:24 Just by that, you know that these guys are successful. They never forget to fill their lobby with fresh lilies. They must never have a bad day. That's right. So no, I think they were already very successful at the time. They were hiring all kinds of stat arb groups. And, you know, everybody was huddled in their own little corner, generating this and that. And it's amazing, you know, they have perfected working from home. Even at that time, my boss came into the office once a month or, well, actually,

Starting point is 00:15:59 when I was there, he came in once a week, but I heard that before I arrived, it was once a month, just to meet with Izzy. So everybody just worked from home, you know, not everybody, but at least a lot of people, and all their trades are just executed automatically from the servers. So it's really a technology firm more than a trading firm. You know, you don't hear the buzz on the trading floor. It's very quiet.

Starting point is 00:16:29 On the floor, everybody's whispering. You don't hear traders yelling across the desk. And that's just the way I like it because everybody's quantitative. Nobody gets emotional here. You wouldn't have liked the trading floor. It was the exact opposite. That's right. The Board of Trade in Chicago was big guys yelling and screaming and punching each other.

Starting point is 00:16:50 Yes. Not a lot of time or quiet for peaceful thought. That's right. Yes. Which is surprising because I understand that Izzy came from a market maker or, you know, on the floor. He's probably used to this kind of noisy, hyper environment, but the firm that he founded is quiet like a university library.

Starting point is 00:17:19 I hijacked you a bit there. You abandoned your AI, your machine learning, you went back to simple models. Yes. And until last year, when we observed that a lot of our models are gradually losing alpha. And that is because if you look around, you know, it's like every university now has a quant finance program. You know, it's like every STEM student has thought about applying it to trading.

Starting point is 00:17:46 So all the simple strategies, and, you know, a small part of which is because of my book's popularization, every simple strategy has been exploited by hundreds, if not thousands of people. So they are slowly suffering alpha decay. And as I said, we do apply tricks, bells and whistles, but even the bells and whistles are becoming quite obvious. You know, how many bells and whistles can you add to a strategy if you do it manually? You know, the

Starting point is 00:18:17 obvious bells and whistles are something that many smart people can do. So even those are losing their potency. So finally, I came full circle. I rediscovered my interest in machine learning. I started to read up on it again to bring me up to date because there were huge strides in machine learning that were made in my 10 years of absence from the field. Things like the dropout technique, deep learning, they are all new. So I brought myself up to date and then applied it to our current trading strategies using a method called meta-labeling, which is essentially applying machine learning to your basic strategy, to learn from your own strategy to see when it would not do well. And we found some success with it.

Starting point is 00:19:08 I'm constantly surprised actually by, you know, before, I was constantly disappointed with machine learning. You know, every time you start trading, it starts to draw down. But now I'm constantly pleasantly surprised by machine learning because it seems to be so prescient. I remember two examples that stood out in my mind. One is the period from November to January of this year. The machine learning model told us that there's no tail risk in the market. Stop trading.

Starting point is 00:19:41 We didn't send a single trade to the market. And our investors thought that we had fallen asleep, frankly. And we said no. Were you out of the country? Did you go on vacation? No. Well, actually, I did go to New York to enjoy a nice Christmas there. But the program is

Starting point is 00:20:02 completely automated. So even if I was not there, it should trade. But it says, hey, the economy is great. Everything's great from November to January. There's no risk. Why bother to tail hedge anything? And then suddenly on February 1st, it started to tell us to hedge tail risk.

Starting point is 00:20:20 At that time, there was nobody, at least publicly, nobody thought that the virus would affect the economy, nor the stock market. But the machine learning program was monitoring 150 plus variables at that time; now it's more. Basically, every month we add more variables. But at that time, it was about 150. It detected some risk somewhere. Maybe faint signals, but because it is looking at so many variables, the whole is greater than the sum of the parts. It sort of aggregates all these

Starting point is 00:20:56 faint signals from around the world, it detected tail risk, and it started to ask us to tail hedge and allowed us to capture maybe around 80% gross return in those two months. So that was one surprise, that it knows how to turn off and it knows how to turn on just two weeks before the market hit an all-time high, and then came a huge drawdown from there. And then I read your blog post on that back from August or whatnot, and it made me think back to when I was designing models, and you'd go through these bad periods and you'd be like, you know what, this doesn't work when there's declining true range in the evening or whatever. I'm going to filter that out. Oh, and it doesn't work when there's this. So all you're basically doing is saying, hey, I let the machine figure out what those are. That's right, that's right. I did the same thing as you did before, which is, you know, hey, you know, let's

Starting point is 00:21:50 filter it through the VIX, let's filter it through the, you know, realized volatility, whatever, gold price. But, you know, soon enough you're going to get yourself 150 variables. You know, which one? How do you weight them? Yeah. And so machine learning is pretty much the only practical approach when you get too many of these variables that you need to incorporate. Can we go backwards a step now and kind of explain the basics of machine learning for some of the listeners? So how do you approach that problem of I've got 150 variables and I want to see what works? Yes. So, you know, actually, a lot of people, maybe they didn't know about it, but they have probably experienced or used machine learning in college. They may not know that it's called machine learning.

Starting point is 00:22:46 And the simplest machine learning is called linear regression, right? So in a linear regression, let's say any economics student has probably used it. Yeah, or statistics class. Yeah, how do you predict the change in GDP? Oh, you know, the unemployment rate and whatever, consumption and whatever. So, you know, the basic machine learning is simply fitting a straight line through, well, a plane through multiple variables to predict the dependent variable. And machine learning is simply a more complicated version of that.
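A minimal sketch of the "fitting a plane through multiple variables" idea described above, in plain Python. The function name `fit_plane`, the two predictors (unemployment and consumption), and all numbers are made up for illustration; they do not come from the conversation or from any real data set.

```python
def fit_plane(X, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...] for y ~ intercept + b1*x1 + b2*x2 + ...
    """
    rows = [[1.0] + list(x) for x in X]  # prepend a constant column for the intercept
    k = len(rows[0])
    # Augmented matrix for (A^T A) beta = A^T y.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

# Hypothetical, noise-free toy data: gdp_change = 2.0 - 0.5*unemployment + 0.3*consumption
X = [(4.0, 1.0), (5.0, 2.0), (6.0, 1.5), (7.0, 3.0), (5.5, 2.5)]
y = [2.0 - 0.5 * u + 0.3 * c for u, c in X]
beta = fit_plane(X, y)
print([round(b, 6) for b in beta])
```

Because the toy data contain no noise, the fit recovers the three coefficients used to generate them. With real data the same procedure returns the plane that minimizes squared error, and it treats every input on the same footing.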

Starting point is 00:23:21 But the key difference of machine learning such as random forest, let's be concrete. There are so many machine learning algorithms, but let's say random forest is typically the good old standby for financial machine learning. Random forest introduces an element of non-linearity and also of conditional dependence on these variables. So instead of treating all these inputs on the same footing, it would hierarchically or iteratively pick the most important variable first. So let's say you think that the VIX or the realized volatility or, let's say, GDP growth might be a useful variable

Starting point is 00:24:11 to predict whether the market is going to go up or down tomorrow. Well, if you just apply linear regression, you will find that the signal is very weak because it doesn't take into account the fact that some of these variables might be conditioned on each other. It could be that only in a low volatility regime does the stock market depend on GDP growth, whereas in a high volatility regime, there's no such dependence. If you apply linear regression to these variables, you will find it's just a wash. You cannot find any signal. But if you apply random forest to it,

Starting point is 00:24:51 it will tease out this kind of dependence of under different regimes, under different conditions, these variables would work. And under other different conditions, some other group of variables will work. So it has this kind of a hierarchical structure. You pick the most important variable first

Starting point is 00:25:09 and then conditioning on that variable, look for another variable that is maybe less important, but combined, they can generate a much stronger prediction than if you just treat them on an equal footing. So that's- And the random forest wording comes from like a decision tree, right? Like the base case is decision tree,

Starting point is 00:25:31 and then a random forest is the many iterations later. How do you tie those two together for me? That's exactly true. Decision tree, everybody kind of knows, even if you're not in machine learning, you can kind of understand what a decision tree is. Did I...

Starting point is 00:25:51 You know... If the market opens higher today, then buy. The key insight that machine learners have found in the last 10 years is that if you just build one decision tree, it is very prone to overfitting. So they deliberately generate randomness in the data

Starting point is 00:26:14 by a technique called sampling with replacement, so that they randomize the training data using the same set of data, but sometimes you sample some data point more than once in order to create some variation. And because of this randomness, the tree that was built on this resampled data will be slightly different. So you would generate, let's say, 100 trees using the same data set that we resampled. And so these 100 trees all have slight differences. And you take sort of the average prediction of these 100 trees. And that average is much more robust to noise and to coincidences than just building one tree.
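The resampling-and-averaging procedure described above can be sketched as follows, assuming the simplest possible tree (a single split, often called a stump) and synthetic data. A real random forest grows deeper trees and also randomizes which variables each split may use; that part is omitted here. All function names and numbers are illustrative assumptions.

```python
import random

def fit_stump(data):
    """One-split regression tree: returns (threshold, left_mean, right_mean)."""
    best = None
    xs = sorted({x for x, _ in data})
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x in this resample is identical: predict the mean
        m = sum(y for _, y in data) / len(data)
        return (float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagged_forest(data, n_trees, rng):
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # sampling with replacement
        trees.append(fit_stump(sample))
    return trees

def predict_forest(trees, x):
    # The ensemble prediction is the plain average over all trees.
    return sum(predict_stump(t, x) for t in trees) / len(trees)

# Synthetic data: y is +1 or -1 depending on the sign of x, plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(60)]
data = [(x, (1.0 if x > 0 else -1.0) + rng.gauss(0, 0.5)) for x in xs]

trees = bagged_forest(data, 100, rng)
print(predict_forest(trees, 0.8), predict_forest(trees, -0.8))
```

Each of the 100 stumps sees a slightly different resample, so their split points differ; the averaged prediction is less sensitive to any one noisy data point than a single stump fitted once.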

Starting point is 00:27:03 Because if you just build one tree, you are very much learning only from the particular fluctuation of that data. You are not learning sort of from the average of that data. So that kind of technique is now prevalent in machine learning. And it seems like classic algorithmic trend followers and managed futures people came to that conclusion more than 10 years ago, but without machine learning, without this construct, but they were saying, hey, instead of one model,

Starting point is 00:27:37 let's have five models. Instead of one parameter, let's have 10 parameters and basically do an ensemble of the signal so that the signal itself is stronger than just relying on, oh, it didn't take that high today. I'm not going to get that trade. But I got the average of all those trades over four days. That's right.

Starting point is 00:27:55 Yeah. Yeah. That's the ensemble approach. You can call it diversification, right? So everybody knows that if you just trade one stock, you are very much in danger of, you know, it's almost like buying a lottery ticket. But if you have a portfolio of stocks, buy 100 stocks, and if your stock picking strategy is good, you have, you know, greatly reduced your chance of a big drawdown. You know, it's unlikely that all your 100 bets are wrong. So that's the same principle, indeed,

Starting point is 00:28:27 in going from decision tree to random forest. And the random forest is giving probabilistic, so it's assigning a probability to each outcome? Or is it giving an actual outcome? Yes, so there are two kinds of random forest. One is a classification. The other is regression. So in regression random forest or regression decision tree,

Starting point is 00:28:50 you would produce an expected return with some error bars. So you can say, oh, what's the expected return for SPY tomorrow? And you'll get, oh, 0.3% plus or minus 0.5, right? And the other kind is the classification tree that will give you whether, okay, does the market go up or down? So it's a discrete prediction,

Starting point is 00:29:14 up or down. And then for each up or down prediction, you will have a probability. So you say, well, it's going to be up with a 0.52 probability and, of course, down with a 0.48 probability. So that's the two kinds of trees. Most people, including myself, most often use classification. Well, obviously, you don't expect the market to go up exactly 0.32 percent, even if the expected return is 0.32.

Starting point is 00:29:57 Nobody expects it. It has a big error bar, first of all. But the other problem is that oftentimes it will also get the sign wrong more often than not. So even, you know, you're trying to predict the magnitude and it often gets the sign wrong. Not only the magnitude, of course, is wrong, but the sign is also wrong. So you missed twice. Yeah. Yes. It's a bad miss.
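To make the two kinds of output concrete, here is a small sketch of how per-tree results might be aggregated. It assumes the regression forecast is the mean across trees with the spread across trees used as a rough error bar, and that the classification probability is the fraction of trees voting "up". Real libraries differ in the details (some average per-tree class probabilities rather than counting votes), and the per-tree numbers below are invented.

```python
from statistics import mean, stdev

def regression_forecast(tree_returns):
    """Expected return plus the spread across trees as a rough error bar."""
    return mean(tree_returns), stdev(tree_returns)

def classification_forecast(tree_votes):
    """Fraction of trees voting 'up' (1) is reported as P(up)."""
    p_up = sum(tree_votes) / len(tree_votes)
    return {"up": p_up, "down": 1 - p_up}

# Hypothetical per-tree outputs for tomorrow.
tree_returns = [0.8, -0.2, 0.3, 0.9, -0.3]  # predicted return in percent, one per tree
tree_votes = [1] * 52 + [0] * 48             # 1 = up, 0 = down, one per tree

expected, error_bar = regression_forecast(tree_returns)
probs = classification_forecast(tree_votes)
print(round(expected, 2), round(error_bar, 2), probs["up"])
```

In this toy case the regression forecast is about 0.3% with an error bar wider than the forecast itself, so even its sign is uncertain, while the classifier only commits to "up with probability 0.52".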

Starting point is 00:30:20 Not that great. Yes. Yeah. So we prefer classification tree. But there are some applications where regression are important. And actually, we are writing a paper on that application. It's quite intriguing. I think that application will be useful for many traders.

Starting point is 00:30:36 Even if they're not eager to use machine learning, this application will nudge them towards machine learning. It's just quite interesting. But for that application, it has to do with parameter optimization. That application, you do have to use regression tree. But for meta-labeling, just to straightforwardly assign a probability to whether it's going to go up or down or whether your trade will be

Starting point is 00:31:02 profitable or losing, a classification tree is typically more useful. And would you, I'll just go into this and then we'll come back to something else. But would you say your breakthrough was figuring out, hey, I'm just going to run it on my own model that's designed outside of the machine learning versus are other people just trying to have a model that's based solely on machine learning of whether the market's going to go up or down tomorrow? Yes. Yes, that definitely, I, I, I definitely think that, uh, that's,

Starting point is 00:31:33 that's really, um, uh, the first time we tasted success in applying machine learning to finance: when you already have a successful model that is based on fairly straightforward trading rules and you apply machine learning to learn when it's not favorable to run that model. I mean, this technique called meta-labeling certainly was not invented by us.
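A toy sketch of the meta-labeling idea as described above: keep the base strategy, label each of its past trades as profitable or not, and learn when not to take its signals. To stay self-contained it substitutes a simple frequency count per regime for the classifier (the conversation mentions classification trees and random forests); the regime names, P&L figures, and function names are all hypothetical.

```python
from collections import defaultdict

def meta_labels(trade_pnls):
    """Meta-label = 1 if the base strategy's trade made money, else 0."""
    return [1 if pnl > 0 else 0 for pnl in trade_pnls]

def fit_bucket_model(features, labels):
    """Estimate P(profitable | bucket) by frequency counts (stand-in for a classifier)."""
    wins, totals = defaultdict(int), defaultdict(int)
    for f, lab in zip(features, labels):
        wins[f] += lab
        totals[f] += 1
    return {f: wins[f] / totals[f] for f in totals}

def should_trade(model, feature, threshold=0.5):
    # Conditions never seen in training default to "do not trade".
    return model.get(feature, 0.0) > threshold

# Hypothetical history of trades generated by the base strategy.
regimes = ["low_vol", "low_vol", "low_vol", "low_vol",
           "high_vol", "high_vol", "high_vol", "high_vol"]
pnls = [1.2, 0.4, -0.3, 0.9, -1.1, -0.5, 0.2, -0.8]

model = fit_bucket_model(regimes, meta_labels(pnls))
print(model)
print(should_trade(model, "low_vol"), should_trade(model, "high_vol"))
```

The base strategy still decides what to trade; the meta-model only decides whether to act on a given signal, which in this toy history means standing aside in the high-volatility bucket.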

Starting point is 00:31:56 Again, it's written about by López de Prado and probably existed even before he wrote that book. It just wasn't given perhaps a name. But, you know, a lot of machine learners, in finance or otherwise, have used it. But since, you know, Dr. López de Prado popularized it in his book, more people use it and we definitely learned from it. And we find it to be, you know, quite practical. And now take us from everything we just said about machine learning and tie it into AI.

Starting point is 00:32:31 So are they one and the same thing in this regard? Or does AI get used improperly in that regard? What are your thoughts there? In the old days, there's a distinction between AI and machine learning. So in the old days, AI tend to mean expert systems where people hand craft some rules. And, you know, like a chess game, they program 1000 different moves when to play a chess game. And they call that artificial intelligence. But that's really not artificial intelligence because the human intelligence just Encoded in a program. It's not quite artificial, right?

Starting point is 00:33:09 So it's more automation. I'm sorry, I'd say it's more automation than... Exactly. Nowadays people would just say that it's programming. It's not artificial intelligence, it's not machine learning, it's just coding everything. Hard coding rules, right? So nowadays not many people do that anymore. Nobody uses expert systems anymore. So everything is probabilistic, it's real learning, no hard-coded, no hardwired rules.

Starting point is 00:33:42 And so now the distinction between AI and machine learning is basically zero. Zero. Okay. So they're one and the same for all practical purposes. That's right. So then it comes down to supervised versus unsupervised learning, right? I think what most people might consider more pure AI would be unsupervised learning, where the machine's just running through and figuring out things on its own. Yes, well, yes, supervised learning has a lot of... yeah, unsupervised learning has some usages, for example, clustering. You know, for example, you say, okay, I want to identify some market regimes

Starting point is 00:34:32 out there and how do I identify? I want to look at the volatility of the market, whether it's a bullish or bearish market, or whether the interest rate was going up or down and so on and so forth. Whether the dollar is going up or down and so forth. So you have a whole bunch of variables, maybe five, 10 variables. And for each day in the market, you have this 10 variable with different values

Starting point is 00:34:58 and you want to cluster the market into, let's say, oh, three different regimes automatically. Now you as a human, I say, I don't know how to define a regime. Maybe you don't want to define it. I just want to use these five variables and have the machine automatically find these three regimes. And you turn a clustering algorithm loose on this data with these five variables. And lo and behold, it will find three regimes. Some days belong to regime one, some days belong to regime two, and some days belong to regime three. And then you have to scratch your head and say, OK, what does it mean?
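Turning a clustering algorithm loose on daily market variables, as described above, might look like the following sketch. It assumes k-means as the clustering algorithm (the conversation does not name one), only two variables instead of five, and invented, already-standardized values. The farthest-first initialization is a choice made here to keep the example deterministic.

```python
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    # Farthest-first initialization: start from the first point, then repeatedly
    # add the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = []
    for _ in range(n_iter):
        # Assign each day to its nearest centroid, then move centroids to the cluster means.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical days described by (volatility, interest-rate change), standardized.
days = [(-1.0, -0.9), (-1.1, -1.0), (-0.9, -1.1),
        (0.2, 1.0), (0.1, 1.2), (0.0, 0.9),
        (1.5, 0.0), (1.4, -0.1), (1.6, 0.1)]

labels, centroids = kmeans(days, k=3)
for j, c in enumerate(centroids):
    print("regime", j, "centroid", tuple(round(v, 2) for v in c))
```

The algorithm returns only regime numbers and centroids. Reading a centroid such as "low volatility, falling rates" and giving the regime a name is the after-the-fact interpretation step left to the human.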

Starting point is 00:35:36 What does this first regime mean? Oh, this first regime typically has low volatility, where the dollar is down or where the interest rates are decreasing. Maybe you can do some interpretation, but that's not a priori. You don't know that ahead of time. It's just that the machine happened to pick this regime that you, after the fact, interpret as being a low-volatility, calm regime. And maybe the second regime is inflationary. Oh, interest rates going up, market goes down because of inflation, and gold price going up and whatever.

Starting point is 00:36:10 And you can interpret that as the inflation regime. And maybe there's a third regime where you have slow economic growth. Beautiful. Everything, stock market goes up, inflation is non-existent, interest rate is steady. And so you get this third regime. So that's a particular use of unsupervised learning because you, as a human, don't tell the machine ahead of time what regimes you are going to find. You can only interpret these regimes after they're found. And that's of use when you want to find something that doesn't fit into your brain or the human brain. That is right. Go out and

Starting point is 00:36:50 find some relationships that I can't see. Exactly. And one common technique that has been used in classical finance is called principal component analysis. A lot of people use that, and that is a form of unsupervised learning because it will find sort of market factors. Instead of saying, oh, in the market there's the market factor, there is the value factor, and then there is the momentum factor, the winner-minus-loser factor, well, that's the momentum factor, or the size factor. You know, those are human-determined factors,

Starting point is 00:37:29 right? You think that size is important to determine the stock valuation. You think that the book-to-price ratio is important, but, you know, who's to say you're right? Okay. You,

Starting point is 00:37:43 you think that way and then you find evidence to support your view. That's supervised. But unsupervised would be like principal component analysis. You just compute a covariance matrix, and the machine will find these factors, some of which may resemble the value factor. Another factor might resemble the momentum factor, but you will never know ahead of time. It's not easy to interpret some of these factors, except the first factor,

Starting point is 00:38:11 of course, it's always going to be the market factor. There's no question about it. You know, most stocks will go up when the S&P goes up. So that factor is universal.

Starting point is 00:38:20 That's easy to interpret. But after that first factor, what's the second most important principal component? It's usually a little bit hard to determine. Is that the value factor? Is that the momentum factor? Well, maybe a mix of both. Who knows?
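To make the "compute a covariance matrix and let the machine find the factors" idea concrete, here is a small sketch using NumPy's eigen-decomposition. The returns are synthetic, built on the assumption of one common market driver plus stock-specific noise, so the numbers illustrate the mechanics only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_stocks = 1000, 8

# Synthetic daily returns: every stock loads positively on one common
# "market" driver, plus stock-specific noise. Real data would replace this.
market = 0.01 * rng.standard_normal(n_days)
betas = rng.uniform(0.8, 1.2, size=n_stocks)
returns = market[:, None] * betas[None, :] + 0.005 * rng.standard_normal((n_days, n_stocks))

# PCA: eigen-decompose the covariance matrix of returns.
cov = np.cov(returns, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]          # re-sort so the largest comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
first = eigvecs[:, 0]
first = first if first.sum() > 0 else -first   # the sign of an eigenvector is arbitrary

print("share of variance explained:", np.round(explained, 3))
print("first component loadings:", np.round(first, 2))
```

In this toy setup the first component loads on every stock with the same sign, which is why it reads as a market factor. The later components have no such obvious label, which matches the point made above.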

Starting point is 00:38:34 That's also unsupervised. So let's switch gears a little bit and tell us about what you're doing with predictive.ai or did I mess up the name? Predictnow.ai. Predictnow.ai. So tell us what you're doing there, tying some of this into it. Yes. So, you know, we, as I said, we had found some success in applying machine learning in a particular way

Starting point is 00:39:07 to finance. Now, we run a tail hedge strategy. It's a niche strategy. The AUM cannot go big. It's not a strategy that would make money every day. That's the first thing I tell clients. Do not expect us to make money most of the time. Expect us to make you money once every two years. And that is not a strategy that would be very attractive to a lot of clients. So we remain a small fund. A strategy only a machine could love, right? I'm sorry? A strategy only a machine could love.

Starting point is 00:39:43 That is right. You have to be practically a finance professor to appreciate why you should run this strategy. And most people are not finance professors. So we decided, well, we have this, which we think is very powerful, technology that can improve anybody's strategy. You know, it's not a system that only improves our own strategy, but it can be applied to anybody's strategy. And why not roll it out as a separate product? And we are not the first one. We are by no means the first hedge fund

Starting point is 00:40:17 to think that we can monetize our own technology and roll it out to other people. There are multiple hedge funds that do that. People ask us, why are you doing that? I say, hey, look at this other fund. And so that's the idea behind founding PredictNow.ai: we want to launch a technology platform where other funds and other professional investors

Starting point is 00:40:40 can benefit from machine learning as a risk management tool, as a capital allocation tool. And I don't want to say as a signal generation tool, because I still firmly believe that signal generation should be by simple strategies, by human intelligence rather than machine intelligence. Otherwise, the scope for overfitting is vast. But machine learning can definitely help in risk management and can help in capital allocation,

Starting point is 00:41:07 what we can generally call pre-trade analytics. So we want to offer this pre-trade analytics platform to other professional investors. And that's the reason. And the concept is they can load in their daily trade P&L or something. Yes. And then run through different, take it from there.

Starting point is 00:41:29 So if I load in my trade P&L, what happens? It'll identify factors, or do I have to also identify the factors? There are two ways. Certainly, you know best what kind of factors you suspect might affect your strategy's profitability. So you can certainly upload them, and you don't have to tell us what those factors are. You can name them F1, F2, F3 or F150. And you run it through our system and we will tell you if those factors are predictive of your strategy's outcome. Now, many traders have a hard time

Starting point is 00:42:06 coming up with all these factors. We will help them. So part of our professional service team will help suggest other factors to them. You know, we only need to know, oh, you're trading Forex. Have you looked at this, this, this, and other factors?

Starting point is 00:42:21 Have you looked at interest rate? Have you looked at bond yields and so on and so forth? And we can suggest them and we can help engineer these extra factors for them. So that's the current, you know, we started this offering around April with only about 10 users. So this is still in version one, but in the new year in Q1, we expect to incorporate data in our platforms too. So traders don't have to scramble to find their own data. The first data set we're going to incorporate is US fundamentals. Essentially, all the financial statement data

Starting point is 00:43:04 will be available for free to our users, so that if they're trying to predict stock returns, they can use some features that are engineered from earnings, from dividends, from whatever balance sheet item. You know, wear your CFA hat and dig into the financial statements and apply them to see if they can affect the profitability of the strategy. So that would be the first data set. And then we will incorporate high frequency features, things such as features that basically you will only be able to obtain if you have a $10,000 subscription to the CME data feed.

Starting point is 00:43:47 Yeah. Those will include like the aggressor flag and so forth. So microstructure features. Yeah. We are signing a deal with a high frequency data provider that will allow us to offer that to our premium subscribers. So I think those will be fairly unique. It's not something that you will find on any broker platform, for example.

Starting point is 00:44:13 Yeah. I think you've talked with our Joe Signorelli and David Dahn, right, in our office about some of that microstructure stuff. So cool. So there's 10 users or it's grown a little bit? It has grown. We are now approaching, I think, maybe 80 users over time. But as I said, we are still in the early days. We are creating an API. We just finished an API so that some traders who do not like the no-code service,

Starting point is 00:44:44 the no-code service means that you have to upload a spreadsheet every day. And some people like it, you know, if you are not a programmer and you are comfortable with Excel, that's how you can interact with our service. But if you are already running an automated strategy using some language, using Python or whatever programming language that you are running, and you want to use our service as part of the pipeline of your automated process, we offer this API so that your program can call our service and do the training and the prediction as part of the pre-trade analytics. And now, but it's only financial?

Starting point is 00:45:27 So I can't put in like the football who covered the spread in the football games and have it. It's interesting that you ask that because it's in fact universal. So we have a team member in our company who is a sports fan. And the reason we hire him is that, you know, when we interview him, he said, I had always wanted to use machine learning to bet on sports. And I say, sounds great.

Starting point is 00:45:56 You're hired. And so he's actually, he's actually gathering sports statistics to apply machine learning to it. He's going to write a blog post when it's successful and done. So, yeah, sports betting is not a tool. I'm in a league with some buddies and I keep losing where you have to pick. You get 14 extra points, which seems so easy in the NFL. But every week someone loses by more than 14 points. So I'll hit him up on that.

Starting point is 00:46:23 Yeah. So I'll hit him up on that. Now take us into QTS. So we've danced around it. You had the base models, you added back in the machine learning. So where does it stand today and what's the main program? Right. So QTS started out nine years ago as a Forex fund. We did excellent for six months and then we were hit by the U.S.

Starting point is 00:46:56 Treasury debt downgrade, lost 35%. And then we recovered. Thank goodness we recovered over the years and have now become more prominent as a tail hedge fund. But in the last few years, we also started... So the proprietary strategy we offer is the tail hedge strategy. We also have a short vol strategy called VIX Timer. But that strategy trades even less frequently this year. You know, this year we have found that the volatility structure of the market is just completely different. You know, we have never seen a situation where the market goes up and VIX also goes up. You know, that doesn't happen too often. And the level of VIX, you know, the variance

Starting point is 00:47:47 risk premium is, in my view, quite unprecedented. You know, realized vol is so low and the implied vol so high, and it just never seemed to want to come down. So that's why our VIX strategy is actually kind of not active this year, even though we offer it. What's the word you're using there? Your VIX strategies? It's called VIX Timer. So we trade VIX futures and usually we short the VIX.

Starting point is 00:48:15 So that strategy is not over, but it's not trading much this year. Okay. The Tail Reaper strategy is the one that is a tail hedge strategy. It's a trend-following tail hedge strategy. And it trades very actively this year. And it made very... Yeah. We can't talk about performance,

Starting point is 00:48:32 but we'll put links to it in the, in the show notes. But for sure. So now. But that is the, that is our CTA offering. Okay. So, so that's our proprietary strategy.

Starting point is 00:48:47 But in our fund, we are now a fund of funds. So in addition to trading our own strategy, we invest in other managers and other funds as well. So that it becomes truly an absolute return vehicle. And that's what we run as well. So tell me more about the CTA strategy. So the Tail Reaper is trend following, but you're taking directional trades in E-mini S&P, right? Yes, it's an intraday trend following strategy. And it's very simple. The basic, you know, like I said, I like simple strategy.

Starting point is 00:49:25 So if the market goes up a lot that day, we buy. And if the market went down a lot, we short. That's as simple as that. And we always liquidate by the end of the day.
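The rule as stated (buy after a big up move, short after a big down move, flat by the close) can be written down as a toy function. This is only a sketch of the verbal description: the 1% trigger, the single entry per day, and the function name are hypothetical choices, since the source gives no thresholds or execution details.

```python
def intraday_momentum_pnl(prices, entry_move=0.01):
    """P&L in price points, for one unit, over one day of intraday prices.

    prices: prices in time order; the first is the open, the last is the close.
    entry_move: fractional move from the open that counts as "a lot".
        This is a hypothetical threshold; the source does not give one.
    """
    open_price = prices[0]
    # The close itself is skipped as an entry point: entering there is pointless.
    for price in prices[1:-1]:
        move = price / open_price - 1.0
        if move >= entry_move:           # market is up a lot: buy
            return prices[-1] - price    # always liquidate at the close
        if move <= -entry_move:          # market is down a lot: short
            return price - prices[-1]    # always cover at the close
    return 0.0                           # no big move, no trade

trend_day = [100.0, 100.5, 101.2, 102.0, 103.0]
reversal_day = [100.0, 101.5, 100.5, 99.5]
print(intraday_momentum_pnl(trend_day), intraday_momentum_pnl(reversal_day))
```

The reversal day shows the cost of the approach: when the market mean reverts after the entry, the momentum trade loses and a contrarian would have won.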

Starting point is 00:49:36 But most people, well, that's even right there is somewhat controversial, right? Because a lot of people say when the market's up big, I'm a contrarian. I want to sell into it.

Starting point is 00:49:43 Right, right. So the base of it is that it's a momentum strategy. It's a momentum strategy, yes. And of course, you know, obviously we don't get a hundred percent success rate, right? So yeah, the guys who trade the contrarian side are often right, and in particular regimes they are more often right than I am. For example, October, September of this year, they are definitely more right than I am, because the market always mean reverts for one reason or another. And we lost and they win.

Starting point is 00:50:17 But you're saying you found it's more profitable or you think it's smarter to do, I've got my one strategy, it's going to be a momentum trade, and then I'm going to use my machine learning to basically know when to be on the sidelines. Exactly. Versus knowing when to switch from a momentum trade to a mean reversion trade. We don't switch. We might launch a new strategy in the future that does the reversal side. But at the present time, the machine learning is going to veto

Starting point is 00:50:47 the momentum strategy and tell them that today is likely to be mean reverting. So don't trade. We don't advise you to trade opposite. That will be a separate strategy. So even though in this kind of situation, the contrarian oftentimes wins more often than us, no question. But on those days that they lose,

Starting point is 00:51:14 they're going to lose huge and we are going to win big. And that's the premise of a momentum strategy. We have positive skewness. The probability distribution is not normal. It's skewed to the positive for a momentum strategy. That's why I told my client, we make money once every two years. On those years, we make more money

Starting point is 00:51:36 that will make up for all the losses in the two years that we keep making wrong bets. Right. You take a lot of little small losing bets in exchange for big, lumpy outlier returns, like normal trend following, like long-term trend following. That is right. And what's it looking at, minute by minute or hour by hour? No, we look at it tick by tick on an intraday basis. Okay. But it's coming into the

Starting point is 00:52:02 day, the machine learning saying today is a day where momentum trends should work. So that's why it basically gives you position sizing based on the... Yes, on the probabilities. And how does that work? It's all automated? It's all automated. That's right. And do you see it? Do you sometimes say, oh, no, not today? Well, you know, there are certainly occasions where I think that it shouldn't trade, but it traded and we lost. So, you know, you can't be 100% accurate. But yeah, I was saying there was one amazing day that it asked us not to trade and it was dead right, and that was the day when Pfizer announced the vaccine.
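The combination of a veto and probability-based sizing can be sketched as a single mapping from a predicted probability to a number of contracts. The linear scaling, the 0.5 threshold, and the 10-contract cap are hypothetical; the source only says that sizing follows the probabilities and that a low reading means "don't trade", never "trade the opposite way".

```python
def position_size(prob_profitable, max_contracts=10, veto_below=0.5):
    """Map a model's predicted probability that today's momentum trade is
    profitable to a number of contracts.

    max_contracts and veto_below are hypothetical parameters (veto_below < 1).
    Below the veto threshold the answer is "stand aside" (0 contracts),
    never "take the opposite side".
    """
    if not 0.0 <= prob_profitable <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob_profitable < veto_below:
        return 0
    # Scale linearly from 0 contracts at the threshold to max_contracts at 1.0.
    edge = (prob_profitable - veto_below) / (1.0 - veto_below)
    return round(edge * max_contracts)

for p in (0.30, 0.50, 0.75, 1.00):
    print(p, position_size(p))
```

The probability itself would come from a classifier trained on the base strategy's past outcomes, which is outside the scope of this sketch.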

Starting point is 00:52:45 Ah, good. That was the day where if we were in the market, we definitely got stopped out. Because if you recall, the market was up, what, 5% before the market opened? Yeah. And then it was just like relentlessly down for the rest of the day. And if you look at everybody saying that the momentum factor for a stock investor,

Starting point is 00:53:08 there's this momentum factor. That momentum factor had the crash that is like six sigma. It has never crashed so much for that day. So amazingly, the machine learning program told us days ahead of time to say, stop trading. So how do you protect against something like the machine basically saying momentum doesn't work on Wednesdays, don't take trades on Wednesdays, right?

Starting point is 00:53:30 And that's not really based in anything and that might just be noise. Yes, we actually have seasonality factors in it. So yes, indeed, we have, say, oh, is this the last trading day of the month? Is this Friday? Is this the triple witching day? Right. Which, those are good examples, but say like days that end in three or something. Something that's kind of nonsensical.
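Calendar features like the ones listed are easy to generate with the standard library. The sketch below makes two simplifying assumptions: exchange holidays are ignored, so "last trading day" is approximated by the last weekday of the month, and triple witching is taken as the third Friday of March, June, September, and December. The function names are invented for the example.

```python
import calendar
from datetime import date, timedelta

def is_last_weekday_of_month(d):
    """True if d is the last Monday-to-Friday date of its month (holidays ignored)."""
    last = date(d.year, d.month, calendar.monthrange(d.year, d.month)[1])
    while last.weekday() >= 5:      # 5 = Saturday, 6 = Sunday
        last -= timedelta(days=1)
    return d == last

def is_triple_witching(d):
    """True on the third Friday of March, June, September, or December."""
    return d.month in (3, 6, 9, 12) and d.weekday() == 4 and 15 <= d.day <= 21

def seasonality_features(d):
    """Binary calendar features for one trading date."""
    return {
        "is_friday": int(d.weekday() == 4),
        "is_last_trading_day_of_month": int(is_last_weekday_of_month(d)),
        "is_triple_witching": int(is_triple_witching(d)),
    }

print(seasonality_features(date(2020, 12, 18)))
```

Each of these becomes one more input column. As the speaker says next, no single one is expected to dominate; each only nudges the probability.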

Starting point is 00:53:56 Nothing so bizarre as, you know, the astrological sign. No, nothing as bizarre as that. But certainly reasonable people would think that triple witching day may be different from other days, right? Yeah, we have all that. And I can believe that all this will nudge the probability a little bit. No question, it will nudge it a little bit. But the beauty of machine learning is that not one variable dominates. They all nudge it to a smaller or bigger extent. But my question is more, will it allow anything in there? Will it allow, like, hey, bulk shipping rates are down or something,

Starting point is 00:54:31 right? Like, will it allow any factor in there, or do you have to tell it which factors are going in? Oh yeah, I mean, it's supervised learning, so yes. Well, you know, that's the human part. And I always say that in machine learning, the most difficult part is the feature engineering part. Before you learn, you have to provide the data. And to create the proper data set, to engineer the proper data set, there's a humongous amount of human effort going into it.

Starting point is 00:55:01 You cannot avoid it. And a humongous amount of bias too, for better or for worse, right? Yes. Well, there are known bias that you want to avoid, you know, look ahead bias. Certainly you don't want that. But, you know, let's suppose that you are properly engineering that, but there's still a lot of work to collect all this data and create the proper features out of them. And that's good because that's still, you know, if everything is so automated, there will be no differentiation. Every trader will be equally successful or equally unsuccessful.
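Look-ahead bias usually creeps in through misaligned rows, so one common guard is to pair each day's outcome only with features that were known beforehand. The helper below is a minimal sketch of that alignment; the name `align_without_lookahead` and the one-day lag are assumptions for illustration, not something described in the source.

```python
def align_without_lookahead(features, outcomes, lag=1):
    """Pair each day's outcome with the features known `lag` days earlier.

    features[t] is assumed to be computed from data through the close of day t,
    and outcomes[t] is the strategy's result on day t. Training on the pair
    (features[t], outcomes[t]) would use information not yet available when
    the trading decision for day t has to be made.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1 to avoid look-ahead")
    if len(features) != len(outcomes):
        raise ValueError("features and outcomes must have the same length")
    return features[:-lag], outcomes[lag:]

features = [[0.12], [0.30], [0.25], [0.18]]   # one feature per day, days 0..3
outcomes = [1, 0, 1, 1]                        # 1 = profitable day, 0 = losing day
X, y = align_without_lookahead(features, outcomes)
print(X, y)
```

After alignment, the first training row predicts day 1's outcome from day 0's features, and the last day's features are left over for the next live prediction.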

Starting point is 00:55:39 But even with machine learning, you can still gain an edge if you are a more experienced trader in terms of the features that you pick. So you hire some guy or girl from Google's machine learning group. They are not going to suddenly outperform all your senior traders, because they still have no clue what to look at, what to feed into your machine learning model. But that would tell me, like, just skip that step and go unsupervised and just feed in every piece of data in the world and let it figure it out. And are there shops doing that, in your opinion? I'm just saying skip that trader step, skip that human need, and just have the machine

Starting point is 00:56:27 figure out which factors are important for itself. Well, yes, I mean, for meta-labeling, that is a possible approach. That is to say, if you already have a strategy and you only want those universal factors to predict whether your strategy is profitable, that is an acceptable approach. The more data, the merrier. But if you want to directly use those factors to predict the outcome of the market, not the outcome of the strategy, but the outcome of the market, that runs a big risk of data snooping, overfitting. Right. If I've got $50 million to burn and I build a whole data team and just say, hey, take in all the data

Starting point is 00:57:11 you can and I want to make money tomorrow. Yes. You think it's going to come up with something. There are some that do that. For example, WorldQuant. WorldQuant is an offshoot from Millennium Partners.

Starting point is 00:57:29 They take that approach of generating millions of alphas. I have come across numerous college students that had run a mandate to generate signals for them. They hire just thousands, tens of thousands of consultants across the world. Every college student can apply to be a consultant for WorldQuant and they will generate something. Well, there were a few of those platforms a few years ago, right? Like, where you could submit your quant strategy, Quantopian and a few of those places, right? Yes. Where they were kind of building libraries of strategies and then adding machine learning on top of them.

Starting point is 00:58:08 Right. But, you know, platforms like Quantopian, they don't necessarily encourage you to use machine learning to generate trading signals, whereas WorldQuant definitely does. And the reason is they give you the data. They don't even tell you what's in your managers. They just give you a time series.

Starting point is 00:58:23 Okay, predict. Well, if that's the case, there's no way you can use any trading knowledge or fundamental understanding. You don't even know what market you're dealing with. So it might not even be a market, might be football scoring. Exactly, right. So maybe it's weather, you know, who knows, traffic, whatever. So they took that approach. It's just exactly as you said. I don't know how successful it is, because I have heard that they are not in a great place this year. Rumors had it. I'm not privy to their performance. And there's another firm called Numerai who also take a similar approach. And again, I'm not privy to their performance. I have no

Starting point is 00:59:04 idea how well they do. Well, it seems that's my worry, right? That's the old milk prices in Japan affect, like, Tesla stock over here. Like, you're going to get some spurious correlations, right? Yeah, and of course they will tell you, oh no, we have this and that statistical procedure

Starting point is 00:59:22 to prevent that. Out-of-sample testing, cross-validation, and that's all true. But inevitably, there is still some way that overfitting comes in. So, you know, even with all these techniques to prevent overfitting, to reject spurious correlation, there is inevitably some subtle way, simply because of the sheer number of features and the limited number of rows, the small number of data points that you can use to predict, that inevitably creates a situation where it is easy to overfit.
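The "many features, few rows" problem can be seen with pure noise. In the sketch below, neither the features nor the target contain any signal, yet the best-looking feature gets more impressive as the number of candidate features grows. The row count of 250 (roughly one year of daily data) and the feature counts are arbitrary choices for the illustration.

```python
import numpy as np

def best_spurious_correlation(n_rows, n_features, seed=0):
    """Largest absolute correlation between a pure-noise target and
    pure-noise features. Any value above zero here is spurious by construction."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_rows, n_features))
    y = rng.standard_normal(n_rows)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each feature
    ys = (y - y.mean()) / y.std()                # standardize the target
    corr = Xs.T @ ys / n_rows                    # Pearson correlation per feature
    return float(np.abs(corr).max())

for n_features in (10, 1_000, 10_000):
    print(n_features, round(best_spurious_correlation(250, n_features), 3))
```

Out-of-sample testing catches many of these, but if enough candidates are screened, some will also pass the out-of-sample check by luck, which is the subtle route for overfitting described above.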

Starting point is 01:00:11 And what are you switching? Do you think that the whole, right? It's only becoming more and more prevalent, machine learning usage, right? Like every strategy is going to have some component probably in the not too distant future. Like, do you think that makes the market more robust, more fragile? What do you think it does for the overall structure

Starting point is 01:00:30 if everyone's using these methods? Or is it the same as just we're replacing human brains with machine brains? Well, I think that it will actually create a more diverse ecosystem. Because as I said, in machine learning, it's hard to find two machine learning systems that will make exactly the same prediction, whereas it is fairly easy to do so for a simple trend-following strategy,

Starting point is 01:00:57 buy high and sell low, everybody doing that. But for a machine learning system, even if it goes high, it may not buy. One day it might short. So actually, I find that with machine learning systems, the ecosystem is more diverse. And less likely to cause everybody, like lemmings, to run off the cliff together. So that's like... although back in 2012, there was that big long-short equity... like everyone lost on the same trades at the same time, right? But that was basically everyone using similar factors. So, right, that's based on

Starting point is 01:01:37 traditional factors, yes. Yeah. So if you're all using the same random forest techniques, it seems like you would all come up with pretty similar strategies in the end if you ran all the computers all the time. You're saying no, because there's that... Well, you know, I think that it's actually not so likely, because we have done a lot of experiments where you have the exact same input to a machine learning system. As long as you have a different random seed used in training, it's going to pick different features every time. There's so much randomness in a machine learning system that you can plot it. For every random seed, you will generate a particular performance, a particular Sharpe ratio.

Starting point is 01:02:29 So in a traditional quant strategy, you only get one backtest, right? You say, oh, I buy low, sell high. And you know, that's it. Your 20-year performance is a Sharpe ratio of 1.2. Right. Looks great. Yeah. Well, what if the history is different? Well, you say, well, I don't know if the history is different. Maybe I get 1.7 or maybe a negative 1.2. How would I know

Starting point is 01:02:47 what the different history would be? Because I can only see one history. Right. Where the classic people would Monte Carlo it and add all the losers together

Starting point is 01:02:56 and see what happens. But yes. But the problem with Monte Carlo is you can never generate a realistic Monte Carlo, right? Because a lot of these events are so rare,

Starting point is 01:03:07 like long-term capital management or Russian default. How many times can you generate Russian default? It's very difficult. But machine learning, on the other hand, is different. You use exactly the same history. It's the real history, not a simulated history. So you get all the little tail events. But even with the same history, and with the same input, same features, as long as you use a

Starting point is 01:03:32 different seed to create a random forest, you get a different Sharpe ratio. So you get a broad distribution of Sharpe ratios. You know, every time you throw the dice, you get a different Sharpe ratio. So actually that shows you the diversity. You know, even for the same system, it's clear that if you run it every day, two people having exactly the same system may not have the same trades. Because of the random forest. Because of the random components. Yes, exactly. And so that's actually, I feel, what makes the system less fragile.

Starting point is 01:04:08 Because, you know, and of course, as a trader, you might run 10 different random forests with the same model so that, hopefully, the law of large numbers will get you closer to the mean Sharpe ratio that you expect. And that is a good thing. But essentially,

Starting point is 01:04:30 you cannot expect two traders to generate exactly the same trade even with the same system if you use machine learning. I love it. Next time, we'll talk about how the random generation isn't necessarily random, right? Inside the computers, but we'll save that for another time.

Starting point is 01:04:46 That's right. This has been fun. What have I missed? We'll put links to all your books real quickly, but tell us about your books. When was the last one? The last one is called Machine Trading. And I think it was published maybe three years ago.

Starting point is 01:05:06 Yes. I think it's around a few years ago. Are there new ones in the works? Well, I am publishing the second edition of my first book with a complete update. Oh, great. And because the first book is really targeting to new traders,

Starting point is 01:05:20 people who are new to trading or new to quantitative trading, at least. We don't want heavy mathematics to obscure the point. There's no big formula or anything, but it does have some up-to-date techniques, for example a technique to determine how long a backtest needs to be. When I first wrote that book,

Starting point is 01:05:42 I said, oh, hand-waving, if you have a strategy with five parameters, you need a backtest of whatever, three years. Now we have actually some people, you know, again, Dr. López de Prado

Starting point is 01:05:53 and collaborators have an exact formula to determine how long a backtest you need given a particular Sharpe ratio. So that kind of updated insight is now in the new book, which is probably going to be published in Q2.

Starting point is 01:06:10 Q2, all right. We'll look forward to it. We'll put it out there. We end all our pods with some of your favorites. We'll go rapid fire here. Favorite investing book. Not your own.

Starting point is 01:06:34 I always like Euan Sinclair's options books. So that's only one of my favorites. Of course, I like Advances in Financial Machine Learning by Dr. López de Prado. We get a lot of machine learning ideas from that book. We also like a book that is not so much about trading, but about finance. It's called Statistics and Data Analysis for Financial Engineering. It's a book by finance professors, but it has really all the rigorous time series techniques, mathematical techniques, regression techniques, risk management techniques that any quant trader must know. It's really a must

Starting point is 01:07:25 have. Okay. Oh, and there's one other book, sorry. So it sounds like a page turner. Yep, yep. Well, it's a textbook. It's thick and it's daunting, but you need to get through it. There's no other way around. And actually, there's one book that is less heavy reading, and that's called Asset Management. And it's by Professor Andrew Ang. He used to be a professor at Columbia, a finance professor, but he now heads up quant investment at BlackRock.

Starting point is 01:08:01 Not a bad gig. Yeah. Yeah. That book is great for people who don't like math. It's for MBAs, CFAs, who don't want to have heavy-duty math. That's the book for them. Perfect.

Starting point is 01:08:14 Favorite non-finance book? Oh, well, there are many, but Les Misérables is one of them. I don't read French, I only read the English version. Okay. Did you read it before you saw the play or...? No, I read it only after I saw the play. Right. I've never had the... that's an endeavor I've never thought to actually read it. Yeah, it's less tedious than one would have feared. It's actually quite engaging. I was able to finish it in good time.

Starting point is 01:08:52 All right. I'll check that one out. Favorite Tim Hortons order? Oh! Usually the breakfast sandwich. Although recently, some of the employees are so rude to me that I stopped going there and instead go to McDonald's. That's the Canadian.

Starting point is 01:09:12 That's your spot, right? And we'll finish. Favorite Star Wars character? Oh. Well, I'm actually not a big Star Wars fan, to be honest. I've got R2-D2 right behind me

Starting point is 01:09:36 here. I would say that's one of my favorites. But yeah, I'm much more of a Lord of the Rings and, you know, Harry Potter kind of thing. Okay, I'll take your favorite Harry Potter character then. Oh, favorite Harry Potter character. Okay, well, Hermione is mine. Hermione. All right, perfect. Yes, my daughter has a lot of resemblance to her. So that's why she's my favorite. Great. All right, Ernie, this has been fun. Thanks

Starting point is 01:10:05 so much, and we'll talk to you soon, and best of luck with everything. All right, thank you. You've been listening to The Derivative. Links from this episode will be in the episode description of this channel. Follow us on Twitter at rcmAlts and visit our website to read our blog or subscribe to our newsletter at rcmalts.com. If you liked our show, introduce a friend and show them how to subscribe, and be sure to leave comments. We'd love to hear from you.
