Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Richard Craib: Numerai – A Revolutionary Hedge Fund Built on Blockchain and AI
Episode Date: July 11, 2017
Numerai Founder Richard Craib joined us to discuss his radical project to build a hedge fund with network effects. Numerai manages its portfolio by giving its data in encrypted form to data scientists... who compete to create the best predictions and get paid with cryptocurrencies. Numerai expects to radically alter the structure of the hedge fund and asset management industry.
Topics covered in this episode:
How hedge funds work and what trends affect them
Quantitative trading and the role of AI in investing
How Numerai uses crowdsourcing and AI to manage its portfolio
How Numerai lets data scientists build models without knowing the underlying data
The Function of Numerai’s own token Numeraire
Why Numerai is switching from paying data scientists in Bitcoin to Ethereum
Numerai’s crazy goal of managing all the money in the world
Episode links:
Numerai
Numerai Whitepaper
An AI Hedge Fund Goes Live On Ethereum – Numerai – Medium
A New Cryptocurrency For Coordinating Artificial Intelligence on Numerai
Encrypted Data For Efficient Markets
Building the Numerai Meta Model
Introducing Numeraire
This episode is hosted by Brian Fabian Crain and Meher Roy. Show notes and listening options: epicenter.tv/191
Transcript
This is Epicenter, episode 191 with guest Richard Craib.
This episode of Epicenter is brought to you by Ledger and the Ledger Nano S.
Have peace of mind knowing your private keys are protected by industry standard physical security.
Go to ledgerwallet.com to learn more.
And by Shapeshift.io, the easiest, fastest, and most secure way to swap your digital assets.
Don't run the risk of leaving your funds on a centralized exchange.
Visit Shapeshift.io to get started.
Hello and welcome to Epicenter, the show which talks about the technologies, projects and
startups driving decentralization and the global blockchain revolution.
My name is Brian Fabian Crain.
And I'm Meher Roy.
Today we'll talk to Richard Craib, who is the founder of Numerai.
Numerai is a new kind of hedge fund built by a network of data scientists that is using blockchains
and cryptocurrency in an especially interesting way.
So let's get started.
Great to have you on the show, Richard.
Thank you. Good to be here.
Tell us a bit about your background.
How did you come to be involved? How did you end up doing what you're doing right now?
Well, I originally was always interested in math and finance, and I started working as a
quant after I graduated with a degree in math. And machine learning was just taking off at the time.
And so I started to work on these datasets that, you know, they were using quant at this asset management firm I worked for, but they didn't use machine learning.
And so I got to basically create their first machine learning algorithms and make a whole fund that was based on machine learning.
Around the same time, I was also reading about what was happening in blockchain and encryption,
and I had, you know, invested in the Ethereum crowd sale and the Augur crowd sale,
and I was sort of keeping an eye on that as well.
But mainly my focus was on machine learning.
And I ended up building this fund that worked very well,
but we couldn't really collaborate with anyone
because the way finance is set up is like structurally everything is secret,
and no one's working together.
And so I had this data set that was proprietary that I couldn't share with anyone.
And I started to have people I knew who were better than me at machine learning, but I couldn't collaborate with them.
And so that's where I started to think about the idea for a hedge fund that had a public, open data set that anyone could model.
And so that's where I started to have the idea for Numerai.
Cool, yeah.
I mean, I guess if you arrived at it from that position
that you were in, it's sort of almost an obvious idea, right?
Like you think, okay, it should go in that direction, right?
It would be better if it did.
But when you first had this idea,
how did you think it was possible?
Did you think this was going to turn out?
It was a feeling of knowing that everything made sense.
And I'd really thought through it because I'd been working on this data set for so long.
The big challenge was how do you frame
a data set, a financial time series data set, into a machine learning problem that's actually
quite fun.
And a lot of the machine learning competitions on Kaggle, every time
there was a finance one, it was actually bad.
It ended up being very easy to cheat or overfit.
And time series is somehow not really the right kind of thing for machine learning,
and there's very low signal in these data sets.
So thinking about all that was the hard part,
and the next step was to actually be able to share it
without giving away the data.
And so that's where I started reading about homomorphic encryption
and how we could use that
to basically share the data
while preserving all predictive structure,
but having it be that none of the people
who are looking at the data had any idea
what they were modeling. And that was the big insight, because if you give people raw data,
they end up cheating or overfitting. And really, most machine learning people don't want
to learn about finance either, so they make a model that wouldn't make sense.
But by framing it in this very specific way, we can get models that actually help the
hedge fund.
Cool. Well, I mean, we're going to speak a lot more in detail
about how exactly this works, but just to give some introduction for people who aren't as
familiar with hedge funds and quant trading: when you were managing this fund, was this at a hedge
fund as well, or was this an asset management company? So can you give some overview
of this industry?
Yeah, it was at an asset management firm. They had a number of different
fund products. They were managing about $15 billion.
But a lot of it was long only.
And so that means that you only buy the stocks.
You don't go short.
But when you have a really, really good quant model, the most profitable way to run it is in
a long short style because then you get the alpha or the returns on your longs and your
shorts and are hedged against market risk and other risks.
So when I did build this, it was to create
a long-only product to beat the All Country World Index, which is like the S&P of the world.
And it worked really well, but it was really much more suited for a hedge fund strategy.
So that's why I quit my job, moved to San Francisco, and started Numerai.
And I think there are definitely lots of hedge funds out there using statistics and mathematical
models, but I don't think there are that many using machine learning kind of at their core.
So even just having a great machine learning hedge fund would have been a good idea.
But I think the crowdsourcing and the collecting intelligence from all around the world
took it to another level.
And in general, what kind of big trends are happening in the hedge fund industry?
How is that industry changing?
Well, they definitely have quant investing on the rise.
I think it's close to $1 trillion or something in quant,
and it was like $500 billion a few years ago.
So it's definitely moving out of human fund management
into quant fund management.
But it sounds a little bit cooler than it is, quant.
A lot of it is still people,
but they've just made some mathematical model.
So it's automated, but it actually isn't really artificial intelligence.
So I think what's going on inside of DeepMind and Google, in terms of AI,
is way ahead of where hedge funds are at with it.
And a lot of them are really doing things that they've been doing for many, many years,
certain kinds of statistical arbitrage strategies.
And so a lot of them are actually
extremely correlated. So when they go down, they all go down together, even though
they're all hedged, even though they all claim to be uncorrelated. So it's a strange
thing where you have the shift from fundamental into quant, but now quant itself is
crowded. And that's why I think it's a really good
time for new hedge fund innovations. And there are certain hedge funds that are doing things in
cryptocurrency and AI, and I think right now it's not clear who's going to
be the leaders in this next wave of hedge funds. But I think it is clear that there is a new
wave and that traditional quant can't really last the way it has.
So let's go
into what Numerai does. But before that, I think in many of our discussions we are
coming across this word, which is a data model, right?
So can you explain what a data model is?
Can you give us an example of a data model and how a quant fund would use a data model?
Yeah.
Well, you're just looking for a pattern.
So one of the easiest ways to think about machine learning in terms of modeling a data set is just imagine you have a scatter plot
and there's just a whole bunch of points in 2D.
You can fit a line to that data.
And maybe it's like corn growing over time or something like that. And you could say,
after six weeks, how far would it have grown based on some historical data? And you can just use
a simple linear model to predict that. And then you've used data, created a model, and now you can
predict corn growth. Now, taking that into other domains, it gets more complicated because you have
many, many different variables. So I think on Numerai right now we have 20, 21 features.
So that's like a 21 dimensional scatterplot. And you're not really just trying to fit a line,
you're trying to fit like a very complicated curve in 21 dimensions. And so that's the kind of
principle of it. And what you're looking for is patterns, and finding the best model that's
possible on that data set based on the patterns in it.
But as you can see, like, nothing I've described needed any domain knowledge.
So you don't need to know what the axes are in order to fit a line.
And you don't need to know what Numerai's data is in order to make a great model on it.
So that's what modeling is all about.
And it's gone from being very simple,
having just a linear model, to very, very complicated with the things you can do with
deep learning and TensorFlow neural nets.
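The line-fitting idea described above can be sketched in a few lines of Python. This is only an illustration: the weekly corn heights are invented, and `fit_line` is a hypothetical helper that does an ordinary least-squares fit. Nothing here comes from Numerai's data or code.

```python
# Hypothetical measurements, invented for illustration: corn height (cm) per week.
weeks = [1, 2, 3, 4, 5]
heights = [12.0, 19.0, 31.0, 38.0, 50.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(weeks, heights)

# Use the fitted line to predict the height after six weeks.
predicted_week_6 = slope * 6 + intercept
print(f"slope={slope:.2f} cm/week, predicted height at week 6: {predicted_week_6:.1f} cm")
```

Note that the fit never needs to know the axes are "weeks" and "corn height". That is the point made in the conversation: the same procedure works on unlabeled features.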
So applied to financial trading, could you give us a toy example?
Let's say you're talking about these 21 dimensions.
Could you give us a toy example of maybe something that is like three
dimensions or something and
is a relevant example to the financial industry?
Well, there's two big quant factors.
One is momentum, one is value.
So how cheap is a stock on like a PE basis?
And momentum is how well has it been doing lately in terms of the price.
And so just fitting a model to those two features, you can say,
if a stock has good momentum, does it go up?
If a stock has good value, does it go up?
And if you take lots of different stocks over long periods,
you might find a pattern that actually says, yeah,
if momentum is high and value is slow,
it's a good time to buy.
And the stocks that have those properties will tend to go up.
And that's what you can learn from a dataset.
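A toy version of that two-factor idea might look like the sketch below. All of it is assumed for illustration: the ten rows are invented, the scores are scaled to the range 0 to 1, and the 0.5 cutoff and the `bucket` and `up_rate_by_bucket` helpers are hypothetical. It simply counts how often stocks went up in each momentum/value bucket.

```python
from collections import defaultdict

# Invented rows: (momentum score, value score, 1 if the stock later went up else 0).
stocks = [
    (0.9, 0.8, 1), (0.8, 0.7, 1), (0.7, 0.9, 1),
    (0.9, 0.2, 1), (0.8, 0.1, 0),
    (0.2, 0.9, 1), (0.1, 0.8, 0),
    (0.2, 0.1, 0), (0.3, 0.2, 0), (0.1, 0.3, 0),
]


def bucket(momentum, value, cutoff=0.5):
    """Label a stock as high/low on each factor (the cutoff is an arbitrary assumption)."""
    return ("high" if momentum >= cutoff else "low",
            "high" if value >= cutoff else "low")


def up_rate_by_bucket(rows):
    """Fraction of stocks that went up, per (momentum, value) bucket."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [ups, total]
    for momentum, value, went_up in rows:
        entry = counts[bucket(momentum, value)]
        entry[0] += went_up
        entry[1] += 1
    return {key: ups / total for key, (ups, total) in counts.items()}


rates = up_rate_by_bucket(stocks)
for key in sorted(rates):
    print(key, rates[key])
```

A real model would fit a smooth function over many more features and rows, but the pattern being learned is the same kind of thing: which combinations of feature values tend to precede a rise.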
So I downloaded this data set before,
and I had a look at it,
because I was just, you know, kind of curious.
And yes, as you pointed out, right, there was, you know,
you have this thing of an ID, but which is some number.
And then you have all these features, like 21 features.
And then you have this thing called Target.
So would target be the price of a stock, or is that a way to think about it?
It's more, yeah, it's more complicated.
And we don't really talk about the data.
That's why we
encrypt it the way we do.
But it is worth thinking about it like that.
That a one is like up and then like a zero is down.
So if you have a target of one,
that means the stock is good, target of zero, it's bad,
in some abstract way.
Typically you don't really just want to make money.
You want to actually have, you know,
also hedge out risks.
So we put a lot of things into
the data that actually make the models very nervous about taking risk.
And so you take away all the patterns that are too obvious and actually risky.
But so it's not like, because, you know, for me, I would think, okay, so you want to make money
in the market, right?
And you're going to say, okay, there's Apple shares, right?
So let's say that would be a variable.
And then you're going to take all these things, maybe,
I don't know, consumer price index, GDP growth, exchange rates,
or all kinds of stuff.
And you try to understand, okay, what kind of dynamics are there that affect this share
of Apple.
And then, you know, you give this data and then people figure out, even though they don't
exactly know actually what they're modeling.
Is that wrong?
Is that how it works or is that different?
That's pretty much, pretty much right.
We do global equity.
It's a global equity long, short hedge fund.
So we're actually not too interested in where is the market going to be next month.
We have no idea and we don't need to know.
We care about where individual stocks will be.
And we also don't really care about where interest rates will be or where GDP will be.
So all those things are kind of like macro variables.
And we are really looking for idiosyncratic stock risk that we should take.
So mispricings in stocks.
And it turns out you don't need to know everything about the macro economy to find mispriced things in the market.
So some people may be aware of this, you sometimes see it on Reddit and different places, where you
have this technical analysis, you know, people do these sort of charts and lines. And, you know,
I studied economics in college, so this always looks extremely ridiculous to me. But,
I don't know, maybe it's not. So is a lot of the data coming from that kind of thing,
too, that you say, okay, you're actually going to just look at this price feed, right,
over time and then try to figure out things like momentum, right? Because something like momentum,
you would see purely based on the changes in the price of that asset itself.
Yeah, we're definitely not doing like technical analysis like the pictures you see.
But obviously, momentum is in a sense a technical indicator in that it is based on price, based on total return.
So that's a variable based on price.
Things like PE, it's more about, you know, whether the stock
is cheap for its price versus just to do with its price.
Generally, on shorter time horizons,
you can have technical analysis style things,
like relative strength index and things like that.
They actually work on short time horizons.
But we actually hold things for a very long time,
so we don't really look at that.
But certainly, yeah, I don't think
chart theory has much credibility anymore.
This episode of Epicenter is brought to you by Ledger,
makers of the best hardware key security solution on the planet.
But Ledger is more than just a hardware wallet.
It's your path to eternal bliss and happiness and peacefulness.
Do I look like I'm losing sleep?
I am.
But it's not because I'm worried about my cryptocurrency,
my Bitcoin or my Ether,
and that's because I use a Ledger.
Ledger devices support multiple cryptocurrencies like Bitcoin, Ether, Zcash, and more,
and you can even secure your ERC Ethereum tokens with them,
or you can add the security support from Ledger to some of the wallets you already love
and use, like Electrum, Copay, MyEtherWallet, and others.
All your keys and segregated accounts are derived from one unique seed.
Seeds are generated on the device and are never exposed to the host computer.
So when you make a transaction, your ledger will present you with the details and kindly ask you for your confirmation before signing.
How polite is that?
So the best choice right now for anyone looking to invest in security is the Ledger Nano S.
It's a key chain-sized device that fits in your pocket.
It has a screen and buttons and connects to your computer or Android phone using USB.
Look, if you're holding crypto and you're storing your keys on your computer, on your phone, or worse, on an exchange, you know that's a disaster waiting to happen.
Don't be the person that loses their keys because they were careless with them.
So don't wait any longer.
Secure your Bitcoin, secure Zcash, secure ether.
Go to ledgerwallet.com and get your Ledger Nano S today.
We'd like to thank Ledger for their support of Epicenter.
Can you explain why are these data sets so valuable for hedge funds?
Why is there such a, you know, focus on keeping the data private?
Well, it is just like it's sort of the core.
So if we gave it out, our users could just go and start their own hedge fund with our data, taking everything we've bought and learned and then just running away.
So we wanted to make a system for us in particular where you would actually have no incentive to do that and actually have more of an incentive to work together and to build value together.
But generally, the data sets are very expensive, and so hedge funds don't want to part with them.
And then they also have, there are lots of different kinds of data sets, too.
So there's some hedge funds that talk about buying satellite images to look at the parking lot,
see how many cars are at Walmart.
So there's all these different weird data sets and some of those are very expensive because it's like the hedge funds have to pay for the satellites and stuff.
So it gets kind of crazy, but basically you can't give up your edge and that's why it's so secretive.
But in the tech community and in other industries, actually you can both benefit.
And finance is just so competitive that you can't have a situation where you're helping each
other. And so the best hedge fund managers, they can be friends for 30 years and
they have no idea what each other's hedge funds do. It's kind of weird.
So it seems to be that the key here is to have private data but to enable
collaboration, like collaborative computations on that private
data, without disclosing what that data itself is.
Yeah.
So what kind of technology
does Numerai use for this purpose?
Well, yeah, so this is the key thing.
When you give away a dataset, one thing you could do is anonymize it, right?
So you could just say, okay, well, this stock is Apple.
I'm just going to call it ID number 7.
This stock is Google, I'm going to call ID number 8.
And so you take away all the names of the stocks.
But that actually isn't enough because the patterns
inside that data are still so strong.
Like everybody will know, oh, I know Apple's PE is 10.7
because I can cross-reference it with Yahoo Finance.
And so you can have kind of weird things
where people can do comparative attacks
to basically figure out your anonymization scheme.
So you really want to go much further than that
and into encryption.
And so I started looking at what's called fully homomorphic encryption.
And it's this really cool, futuristic thing.
When it was invented, it was about a billion times too slow.
And then a couple years later, it was like a million times too slow.
And it actually hasn't got much better than that.
So it's this very, very slow thing.
And what it does, it takes one megabyte of data and turns it into 16 gigabytes
of data. And that's a huge weight. Plus, it also makes it much harder to work with machine
learning. So what we ended up doing was figuring out ways to do it differently. And we're using
something right now called neural cryptography in our encryption. So our VP of engineering, this guy
called Jeff Bradway from DeepMind.
And he came up with this way of encrypting the data that was using two neural nets.
So it's really interesting.
You have one that's saying, like, take this data, find some features in it that preserve the
signal of the data.
And then you have another net that's saying, okay, I'm going to try to, I'm the adversarial
network.
I'm going to try to decrypt what you just did.
And then you make them kind of fight with each other.
And when the one is saying, I preserved the signal,
and the other one is saying, I can't decrypt it anymore,
then you know you have something that that neural net can't decipher.
So we do things like that to make sure that we have all the signal preserved,
but also it's very difficult to kind of invert the transformation that we made.
And so far, you know, we haven't had anyone say what the data is or reveal it.
So most of them don't really care about it.
They just want to solve the problem.
They don't really mind.
They don't really care what problem they're solving.
But that is kind of a critical technology that we've had to develop.
And I think we've done some really interesting things.
I don't think there's any company in the world with the problem that we have, which is like,
we want to work with everyone, but we don't want anyone to know anything.
And it creates an interesting dynamic.
I just wanted to sort of add one point here, because I guess even if people decrypt the data,
okay, they would have the data, right?
So that would be, you know, a certain loss.
but they wouldn't really be able to replicate what you guys are doing, right?
Because you also get the predictions from all these different people and presumably those aren't public.
So you are the only one that can then combine all those.
Is that correct?
Yeah, exactly.
And that is an important point.
We are not trying, we're not looking for the best model.
And then we're just going to shut the whole thing down and hire that guy, give him a job at Numerai.
We're really looking for lots of different uncorrelated models built on our data.
So what would happen actually if someone were to decrypt it and say they were going to go start their own fund with their model,
sure, we would lose them, but our combined model will be way better than his.
So it's like even if you're the best user and you quit Numerai, the remaining users are still better than you
when you put them together.
And this principle is called ensemble theory.
And it's kind of a big part of machine learning.
It's like if you have lots of little models that are different,
then you can actually make a much better model than any one model.
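The ensemble effect described above can be seen in a small simulation. This is a sketch under strong assumptions that are not from the episode: ten synthetic "models" each predict a hidden quantity with independent Gaussian noise, and the ensemble is a plain average. Real models are never perfectly independent, so the improvement in practice is smaller than this idealized case.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

N_MODELS = 10
N_SAMPLES = 2000

# Hidden quantity every model is trying to predict (synthetic).
truth = [rng.uniform(-1.0, 1.0) for _ in range(N_SAMPLES)]

# Each synthetic model = truth plus its own independent noise (an idealizing assumption).
models = [[t + rng.gauss(0.0, 1.0) for t in truth] for _ in range(N_MODELS)]


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Ensemble prediction: simple average across models for each sample.
ensemble = [sum(column) / N_MODELS for column in zip(*models)]

individual_errors = [mse(m, truth) for m in models]
ensemble_error = mse(ensemble, truth)

print("best single model MSE:", round(min(individual_errors), 3))
print("ensemble MSE:         ", round(ensemble_error, 3))
```

Because the errors are independent, averaging cancels much of the noise, so the ensemble beats even the best individual model. Removing one model from the average barely changes the result, which is the argument made here about a top user leaving.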
So with these, with these models, there's like,
there's a way by which you source these models,
which is like through your crowd mechanism.
And then you also need a way once you have sourced these models to combine these models.
So let's first talk about how do you decide how to combine these models?
Are the combinations also sourced from the crowd or that's totally in-house?
Yeah, that's where we stop.
People have asked us, why don't we let the crowd combine the crowd models?
And then it's like, well, who's going to combine that?
But so we stop there, and that's the
one model we make, the meta model that combines all of them.
We first have a few different measures. One thing we don't want to do is pay you for giving us
something we already have. So if you submit a linear model, we can run a linear model ourselves,
believe me, but we'll pay you. But if someone else submits a linear model, then he doesn't
get anything because he's very correlated with something that we already have.
So on Numerai, we're really trying to look for original models.
And so if you pass this test called originality, then you can kind of earn Bitcoin.
And if you don't pass, then you kind of don't earn anything.
And the other one is concordance, which is like, it's sort of to do with how much we think
your model is kind of like shifty, like, because there's lots of different ways that users might
try to game the system. And so we have another check for that. And then we rank them all based on
logarithmic loss, which is like a machine learning metric. And after that, we have basically about
300 or 400 really good models that pass all these tests. And then we do a layer of machine learning
on top of that. So we say, okay, let's try to combine these and you can take a simple average,
that does very well, and you can go even further and do machine learning, and then that does even
better. So that's basically how we do it.
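The pipeline described above (an originality check, ranking by logarithmic loss, then combining) could be sketched roughly as follows. Numerai's actual tests are not specified in the conversation, so everything concrete here is an assumption: originality is approximated by Pearson correlation against earlier accepted submissions with an arbitrary 0.95 threshold, the concordance check is omitted, the submissions are invented, and the combination is the "simple average" mentioned rather than the machine-learned meta model.

```python
import math


def log_loss(probs, targets, eps=1e-15):
    """Mean logarithmic loss for binary targets; lower is better."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(targets)


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_original(submissions, max_corr=0.95):
    """Keep a submission only if it is not too correlated with an earlier accepted one.

    The correlation test and threshold are hypothetical stand-ins for 'originality'.
    """
    accepted = {}
    for name, probs in submissions.items():  # dict order = submission order
        if all(pearson(probs, kept) < max_corr for kept in accepted.values()):
            accepted[name] = probs
    return accepted


# Invented example: outcomes (1 = up, 0 = down) and three users' predicted probabilities.
targets = [1, 0, 1, 0, 1, 0]
submissions = {
    "alice": [0.70, 0.30, 0.60, 0.40, 0.80, 0.20],
    "bob":   [0.71, 0.31, 0.61, 0.41, 0.81, 0.21],  # near-copy of alice
    "carol": [0.60, 0.50, 0.40, 0.30, 0.70, 0.45],
}

accepted = select_original(submissions)
ranking = sorted(accepted, key=lambda name: log_loss(accepted[name], targets))
meta_model = [sum(column) / len(accepted) for column in zip(*accepted.values())]

print("accepted:", list(accepted))
print("ranking by log loss:", ranking)
print("simple-average meta model:", [round(p, 3) for p in meta_model])
```

In this toy run the near-copy is rejected because an almost identical submission arrived first, which mirrors the point about the second linear model earning nothing.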
And so the other part of your company is the sourcing of these models, right? So explain
to us how you source these models currently?
Well, all I did was, I just made this website and put a data set up in 2015, and people just came.
It was a very interesting problem.
It's very difficult to do well.
It's extremely hard to do well on the dataset.
There'd never been a data science tournament on encrypted data.
There'd never been a hedge fund that was actually going to trade the predictions.
We also paid in Bitcoin.
And we also, one of the key things is we don't actually get the user's models.
All they do is send us predictions.
So it's like we're licensing predictions from their models, not actually taking their models.
So they're submitting a bunch of predictions to us.
And that means that we don't know what their model is either.
So they have no idea what the data is.
We have no idea what their model is.
And that makes it symmetric in a way.
Users could say, you're not paying me enough.
I'm not going to do this anymore.
And they'd have a bargaining position.
And that's why I think a lot of the blockchain community likes Numerai,
like this sort of trustless part of it, where we don't trust you with our data.
But they don't trust us with their models.
But so if you don't have the models themselves, how are you able to combine these models and apply machine learning on their combinations?
Because you can't apply the model on different data sets indefinitely, or can you?
Yeah, what we do is we make them submit on a big holdout set.
I think it's like 100,000 rows.
And they're submitting probabilities on all of those.
So having it be on this big holdout set, having it be like that they're submitting predictions over this very long period, we can actually train the model on almost like on their back test.
So it's almost like they're giving us probabilities that we can model without actually giving us their model.
But importantly, that means we can only get new insights every time we do a new competition.
So right now we do a new machine learning competition every week.
And that's to get updates from our users.
It's actually to get probability updates on what they think is going to happen.
Cool.
Yeah, this is super, super crazy.
So that means that they go like a year in the future?
Can you say anything about how long? Because you run it every week, but presumably the
predictions that people give are actually beyond that time horizon.
Yeah, we end up holding things for
six to nine months, which is quite long in quant space. But we do like getting weekly
updates on, okay, maybe we should rebalance or sell things if suddenly
all the Numerai data scientists think we should.
And why do you hold them so long?
I think it's harder to compete in the shorter time horizon.
And there are very few people doing machine learning on longer time horizons.
I think there's basically lots of mathematical modeling happening in these short time horizons,
one week, one minute or less,
but going further, it's actually harder to do, and so fewer people do it.
It's harder to do from a machine learning standpoint, but it's actually easier to implement.
So to start a new hedge fund that does high frequency trading,
it would cost you like $100 million just to get set up.
But if you can do longer-term predictions, then you don't have as many trades,
you don't have to worry too much about execution,
you actually don't even have to worry about being ripped off
because you're going to hold long enough to cover those costs
of being ripped off by high-frequency traders.
So it's like a sort of a special time horizon, I think, for AI.
And also, you know, sort of philosophically, morally,
maybe you want to be actually investing, right?
You don't want to be just picking up nickels
in the market, you want to actually be holding stocks that you like.
And you don't want to have an AI that's just fighting to compete in a short time horizon.
You can actually have an AI that's like a real investor that's like making decisions about
buying stocks that actually people would want to hold.
That's very interesting how you sort of, you know, almost make these moral judgments and emotions
about these like predictions that the AI is going to create?
It's not a judgment.
The high frequency people are doing something good for the market.
But I do think it's overdone.
And you never want to be doing the thing that's kind of already happened.
So if there's a book about it, you don't want to be doing it.
There's no good book about blockchain.
So you probably still want to be doing it.
Cool. So tell us how this tournament works. Let's say this week, right? So you run a weekly tournament and I'm somebody who builds a data model. Can I use that same model week after week after week and keep earning payouts?
Yeah, you can.
Or do you need to develop something new?
Yeah, you can actually, because the training set remains the same week to week. We're just asking you to predict on the latest live data,
basically. So because the training set stays the same, you can actually set up a
server that can automatically send predictions as soon as new Numerai data
comes about. And so some people have actually done that. And so it's not actually them
spending hours and hours working on it. They just put a server up, and
we're pinging that server, effectively.
So basically, you're asking all of
these different models. So let's imagine there's like 100 data scientists having 100
different models, and here a model is basically like a function. There's some incoming data
and there's a function and it's going to compute an output, and that output is a prediction
of some form on how a certain segment of the stock market is going to move.
So for example, I have a model here. I'm giving you a prediction about how a
certain part of the stock market is going to move, and there's somebody else around the world that's doing that, and there's hundreds of us.
Basically you're pinging all of our servers and getting all of these predictions, and then you have some method of combining these predictions and converting them into actionable trades on your side.
Exactly, yeah.
So how does your tournament identify, so let's say like 10,000 people submitted models and in the end you maybe want
only 100 that are unique.
So how do you incentivize people to give new models?
How exactly are they paid for it?
And for example, my question is,
say I make a model that's unique and somebody else makes a model that's unique for a different part of the stock market.
How is the magnitude of how much I make versus how much he or she makes determined?
It's based on your live performance.
So what we do is if you submit a model, we don't give you any, no matter how good it is,
we don't give you any money until a month later when we can see how well it worked on live
data.
And a month is a decent amount of time to see whether the model is behaving in the way we
thought it would.
And so everyone is ranked based on their logarithmic loss, which is just like how many times
you are right effectively.
And so what we do is we rank them based on that and then pay them based on like a schedule.
So the person who's coming first makes about $500 and the second $400.
And that goes on all the way down.
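A rough sketch of the ranking step just described, assuming binary outcomes and the standard logarithmic loss formula. Log loss scores not only whether a prediction was right but how confident it was, so lower is better. The user names, predictions, and outcomes are made up, and the payout schedule contains only the two amounts mentioned in the conversation; the lower tiers are not given, so they are left at zero here.

```python
import math
from typing import Dict, List


def log_loss(outcomes: List[int], probs: List[float], eps: float = 1e-15) -> float:
    """Mean logarithmic loss for binary outcomes (lower is better)."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(outcomes)


def rank_and_pay(
    submissions: Dict[str, List[float]],
    live_outcomes: List[int],
    schedule: List[int],
) -> Dict[str, int]:
    """Rank users by log loss on live data and pay according to a schedule."""
    ranked = sorted(submissions, key=lambda user: log_loss(live_outcomes, submissions[user]))
    return {
        user: (schedule[place] if place < len(schedule) else 0)
        for place, user in enumerate(ranked)
    }


# Hypothetical submissions, scored a month later against live outcomes.
submissions = {
    "alice": [0.9, 0.1, 0.8],
    "bob": [0.6, 0.4, 0.6],
    "carol": [0.2, 0.9, 0.3],
}
live_outcomes = [1, 0, 1]
schedule = [500, 400]  # first and second place, in dollars, as stated above

payouts = rank_and_pay(submissions, live_outcomes, schedule)
print(payouts)
```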
And we pay those dollars, we've paid those dollars, in Bitcoin.
But the other piece that's new is our own cryptocurrency, Numeraire, where people are actually earning hundreds of times more money than they are from dollars.
So that's the key innovation, I think, as well.
People wondered, you're not paying these people enough, they could just
go work somewhere else. And I think a lot of people weren't doing it for money; they were doing it
because it was interesting. But now, by having our own cryptocurrency, we can actually
afford to pay them a lot of money, and the top users are making a lot of money right now.
So tell us how this cryptocurrency works, the key thing that the smart contract
can do.
So it's an Ethereum ERC20 token. It's called Numeraire. It's live. It came out about
two weeks ago. And what you can do with it is, you know, we always try to figure out which
models are good. That's our big thing. If we can tell which models are good, we can make a better
meta model and we can make more money in the hedge fund. So the one way you know a model is good
is if someone's willing to back it with their own money.
And so actually, when I was at my old job, I was like, we have to do this machine learning fund.
I promise it's going to work really well.
And they said, okay, well, put your own money into it.
And so I put money in with the firm's money and my boss's money.
And we all put money in because we were prepared to stake our money on it.
And so by having that, you actually work much harder because you have a lot to lose.
You're not just going to say, okay, well, let's see what happens.
I'll give you five models, and one of them will work, and we'll just see what happens.
You have to really commit and say, well, this is the one I want to put my own money on.
And so that dynamic, we weren't capturing at Numerai before Numeraire,
because no one had anything to lose.
So we created this cryptocurrency.
We gave about one million to our users.
And we said, OK, you can use this to stake your models, which means a little bit like taking a side bet that your model is going to work.
And so by doing that, you're expressing your confidence as to how much you think the model is going to work.
And if you're right, we actually pay you dollars.
So if you stake Numeraire, you earn dollars.
And because of that dynamic, that Numeraire is connected to the payouts, if you bet a lot and you have a good model, you'll get this payout.
That actually gives Numeraire value as well.
So it's sort of like people necessarily value it above zero, because it's a way to actually claim dollars in the payouts.
And it has worked extremely well.
I mean, we've seen the staked models be much better than the regular models, because people are staking on them.
So it's like this final piece that we needed.
And in a regular hedge fund, they have the situation where all the employees have money in the fund.
And we couldn't really capture that with our user base.
And that's why we started to think about blockchain and ways you could do that,
where everyone actually is in it together and has a lot to gain and a lot to lose if Numerai wins.
So just to kind of clarify how Numeraire works, have all the Numeraire been created, or are new ones being created continuously?
Or how does that work?
Yeah.
21 million max, that's the cap, and it'll get there, I think, in four or five years. So
every week we mint new ones, and it's in the contract, so we
can never change it. And right now we only gave
away about a million to our users, so there's still plenty more to be won as
time goes on. Yeah, yeah. The idea, though, as well, is
if your model does well and you staked it, you win money, right?
But if your model does badly, we actually destroy your Numeraire,
provably on the blockchain. It's not like we're taking it for ourselves or something.
We just destroy it. And so these dynamics of the minting of new coins
plus the sort of destruction of old coins,
and also the coins tending to be held by the better data scientists over time,
I think are going to play out in a very interesting way, where it becomes more and more valuable to people to have it and to hold it.
So does that mean, because if I don't have Numeraire, I can still participate, right?
And if my model is really good, I can still get some money, dollars and Numeraire, or do I only get Numeraire in that case?
You get dollars and Numeraire for the main tournament.
Anyone can do that.
And then in the staking tournament, you risk Numeraire to win dollars.
Okay, okay. And so let's say I'm risking $10,000 worth of Numeraire.
Does that mean I can then earn double my money, or how does that mechanic work?
It's actually in a kind of complicated multi-unit Dutch auction mechanism.
But it really is the kind of outcome of it is that the more you stake, the more you stand to win of the payouts.
And also, because of the way the auction is set up, there's a sort of property of self-revelation,
which means it's rational to bid the true amount.
So people are participating and, because of the auction mechanism, we can actually say,
oh, given that you did that and given that you're rational, okay,
we know what you've got.
We know you have a very, very good model.
We know that you believe your model is going to work with 90% accuracy.
So that's a very interesting part of this.
And the idea that the bet, the stake, is actually communicating information to us.
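To make the staking outcomes concrete, here is a simplified sketch under loudly stated assumptions. It does not model the multi-unit Dutch auction mentioned above. It only captures two properties described in the conversation: the more you stake on a model that works, the more of the dollar payout you stand to win, and stakes on models that do badly are destroyed. The pro-rata split, the `Stake` type, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Stake:
    user: str
    amount_nmr: float     # Numeraire put at risk on a model
    model_did_well: bool  # only known later, from live performance


def settle(stakes: List[Stake], dollar_pool: float) -> Tuple[Dict[str, float], float]:
    """Return (dollar payouts per winning staker, total Numeraire destroyed).

    Assumption: winners share the pool in proportion to their stake.
    The real mechanism is an auction and is not reproduced here.
    """
    winners = [s for s in stakes if s.model_did_well]
    total_winning_stake = sum(s.amount_nmr for s in winners)

    payouts: Dict[str, float] = {}
    if total_winning_stake > 0:
        for s in winners:
            payouts[s.user] = dollar_pool * s.amount_nmr / total_winning_stake

    # Losing stakes are burned, not collected by anyone.
    burned = sum(s.amount_nmr for s in stakes if not s.model_did_well)
    return payouts, burned


stakes = [
    Stake("alice", 100.0, True),
    Stake("bob", 300.0, True),
    Stake("carol", 50.0, False),
]
payouts, burned = settle(stakes, dollar_pool=1000.0)
print(payouts, burned)
```

In this toy version a larger stake is simply a larger claim on the pool; the information-revealing property attributed to the auction above is not captured.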
Let me understand if I get this correctly.
So let's say it's two or three years in the future and this has become a big thing.
So do you think that most of the money being paid out
is actually going to be paid out through these side bets?
And so that if somebody comes in and they say,
I have really good models, right?
I know I'm doing really well here.
So I'm going to go and buy Numeraire and sort of bet it on my model.
And then I can make much more money, and that this is kind of how the demand for Numeraire
would be driven.
Is that how you see it play out?
Yeah, exactly.
And we're seeing things like this now.
There's a sort of crypto hedge fund, well, just a fund, I guess, that bought a bunch of
Numeraire in our Slack from our users, like OTC trades, before it even existed. And I was like,
why are they doing this? Are they just speculating on the price? And it was actually that
they wanted to use it in the staking company.
They wanted to basically hire data scientists to work with them.
And they would be kind of lending them Numeraire to stake if they didn't have enough but had a good model.
So it's very interesting to see these kinds of things.
And I heard other people say that since the launch of Numeraire, they've quit their jobs,
and now they can actually do Numerai full time.
And they're working with, you know, maybe speculators
who have a bunch but can't use it, because it's kind of useless to them since they can't stake, and then they work together
and swap it and buy it. And so it's very interesting to see, and I think it's only the beginning.
I mean, we've just been going for two weeks and you have these outcomes already. It's amazing.
And I guess what's interesting here, because you guys are switching to paying out in Ether, and since
Numeraire is an ERC20 token, you could even imagine that somebody creates some smart
contract where people can put in Numeraire, and maybe that somehow gets
lent at an interest rate to people submitting models. Do you see that kind of application
being built here?
Yeah. That is, yeah, also a recent announcement. We're not going to use Bitcoin
anymore, just pay everything in Ether. And it really does allow for, it sounds
sort of silly, but it's a much bigger change than you would think, because we can actually do
the payouts on-chain, in the same place where the Numeraire stakes are happening.
So you could be a speculator. I could say, I don't have any Numeraire, but you have a thousand.
Let's make a deal. If my model works, we'll share the profits. You'll say, well, I have
no idea whether you're going to honor that contract, because Numerai is paying you, and how do I know
you're going to share it with me? And you can say, no, I'll build a smart contract, and the smart
contract is actually the entity that is playing on Numerai, and
whenever it wins money, it automatically shares it between you. And so that concept of
making DAOs that are connected to Numerai, which is connected to the real stock market, is extremely
compelling. And I do like that there's a lot of innovation in blockchain and a lot of things
like Prism and things, but a lot of them are on the blockchain. And I think it's pretty cool to have
this one company that's actually connected to the stock market, but pulling out all the
wealth from the real world into crypto.
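The profit-sharing deal described above can be sketched as plain Python rather than as an actual Ethereum contract. This is an illustration only: the class name, the party labels, the percentage split, and the integer payout units are hypothetical, and a real version would live on-chain and receive the payout directly so that neither party has to trust the other.

```python
from typing import Dict


class ProfitSharingAgreement:
    """Toy stand-in for a contract that plays the tournament and splits winnings.

    Amounts are integers (think of the smallest unit of the payout currency)
    so that no value is lost to floating-point rounding.
    """

    def __init__(self, shares_percent: Dict[str, int]) -> None:
        if sum(shares_percent.values()) != 100:
            raise ValueError("shares must add up to 100 percent")
        self.shares_percent = dict(shares_percent)
        self.balances: Dict[str, int] = {party: 0 for party in shares_percent}

    def receive_winnings(self, amount: int) -> None:
        """Split an incoming payout according to the agreed percentages."""
        parties = list(self.shares_percent)
        remaining = amount
        for party in parties[:-1]:
            cut = amount * self.shares_percent[party] // 100
            self.balances[party] += cut
            remaining -= cut
        # The last party receives whatever is left, including rounding dust.
        self.balances[parties[-1]] += remaining


# The modeler brings the model, the staker brings the Numeraire.
agreement = ProfitSharingAgreement({"modeler": 60, "staker": 40})
agreement.receive_winnings(1000)
print(agreement.balances)
```

The key design point is that the split happens at the moment the winnings arrive, which is what removes the need for the staker to trust the modeler.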
This episode is brought to you by ShapeShift, the world's leading trustless digital asset
exchange. Quickly swap between dozens of leading cryptocurrencies, including Bitcoin, Ether,
Zcash, Gnosis, Monero, Golem, Augur, and so many more. When you go to ShapeShift.io,
you simply select your currency pair, give them your receiving address,
send the coins, and boom. ShapeShift is not your traditional cryptocurrency exchange.
You don't need to create an account. You don't need to give them your personal information and they don't hold your coins. So you are never at risk from a hacker or other malicious actor.
ShapeShift has competitive rates and is even integrated in some of your favorite wallet apps like Jaxx. So you can swap your digital assets directly within your wallet just as easily as putting on your slippers. Whenever you see that good looking fox, you know that's where ShapeShift is. So to get started, visit ShapeShift.io and start trading.
And we'd like to thank Shapeshift for their support of Epicenter.
Can you use your data models to trade cryptocurrencies themselves rather than stocks?
We could, but there's not as much data in crypto.
So we have, for stocks, we have data going back 30 years.
But for crypto, it's not as much.
And there are also fewer cryptos.
So it's hard to decide.
Well, you always want more data.
So you have 10,000 stocks per month for 30 years.
But with crypto, there's like 50 cryptos that you could trade.
And then you go back like one year.
So it's not really quite right yet.
I think it would be great if all the stocks in the world were on the blockchain.
Then we would really be at a huge advantage.
But I'm not sure when that's
going to happen, but I'm sure it will.
Why would stocks being on the blockchain be an advantage to you?
The hedge fund industry is very structured for the bigger hedge funds. So it's not really
structured in a very fair way. The biggest hedge funds, who need it the
least, get these huge discounts from the prime brokers like UBS and Goldman Sachs.
And so it's just very
difficult to even start trading, and you need lots of capital.
So I think if it were to be on the blockchain, it would be much more open, and
everything would be visible, and you would actually have a much more fair market.
I mean, speaking about that, to me it feels like there's kind of a piece of
decentralization and crowdsourcing missing here, right?
Which is, you crowdsource the data, you know,
basically the decisions on the trade.
But then on the other side, when it comes to the money that you're actually investing,
you know, you're structured as a traditional hedge fund.
So I think your investors are institutional investors or, I guess, the kind of
people that would invest in hedge funds currently.
So do you also see that happening, that you say, okay, you're going to kind of
crowdsource the pool of capital?
Yeah, we are a proper
US hedge fund. And there are rules around accredited investor status. So you need to be an accredited
investor to invest in a hedge fund. And then there are also very high
minimums. So it's really not worth our while to manage less than $50,000 from someone;
we'd be making a few hundred dollars per year from that person.
So you want to have these much bigger check sizes.
And the hedge fund can really only be profitable at like $300 million AUM.
So you really want to talk to institutions, not individuals.
That being said, right now it looks like you can get $300 million in a crowd sale
if you have good branding or something.
But I just think it's probably the wrong, yeah, probably the wrong thing for us at the moment and will be for a long time because I don't think the regulators really want individuals to be invested in hedge funds.
It's a little bit like they're protecting, protecting investors from risks that they don't understand.
Yeah, although I guess we are also starting to see now with CoinList the first sort of ICO
platform that's limited to accredited investors.
You know, so maybe we'll see that kind of crowd sale too.
Yeah. Yeah, it's interesting. I like what they've done there.
And that could be a way in the future where you could actually raise money.
So you did this fantastic interview with Jason Calacanis on This
Week in Startups. So we'll definitely link to that in the show notes.
But one of the things you said in there was kind of, you know, striking to
me, and that's a crazy statement, but you said in the long run you see Numerai managing all the
money in the world. Can you explain what you mean by that?
Yeah, that
statement, I actually made that to a journalist a few months after starting Numerai. So I had
barely started, and I just said that I want to manage all the money in the
world. And I don't remember saying it, but then it ends up sticking. And it is true, though.
I do think that's the goal here. Okay, think about how many people
are working in finance. At some of the best American universities, about half of the graduates
go into finance. And they're basically playing a zero-sum game
with each other.
And we are not having nearly as much investment in basic science and technology and stuff as we used to.
Because all these people have gone to Wall Street.
And that would be fine if they were all working on something.
But they're basically working on the same thing, just in different offices, competing with each other.
So that's a really big problem.
If Numerai can make it possible for those people to do other things.
And actually, a system like Numerai could do that.
There's many things in the world that have been automated.
Like, think about the phone switching.
There used to be people switching the phones to make calls, connect calls.
And now that happens automatically.
And I think we could have something like that.
Like if you're a public company, there's this thing called Numerai.
It'll
tell you what the price is and it'll give you liquidity, and it'll just be like this thing.
And you don't really need many hedge funds in a world where you have an open hedge fund.
And I think Numerai is the most likely candidate for this, where we can leverage blockchain and AI to actually be the last hedge fund.
And so I think that other hedge funds, it will be very difficult to compete.
Even the best, the biggest hedge fund right now has about 150 PhDs working at it.
That's the biggest one.
And we have 20,000 data scientists on Numerai now.
So that's already, you know, a lot more.
And it'll be hard.
I mean, we'll definitely have 100,000 in a year.
So I don't really see how
it will work for other hedge funds to compete with us.
Your answers were really great.
But with this one, my feeling is,
imagine there was only one hedge fund in the world, right?
And it was managing all the money of the world.
By definition, its returns would be the same as the returns of the market.
It couldn't beat the market if it were managing all the money in the world.
So I just intuitively feel that the bigger a hedge fund gets, the lower the margin by which it beats the market, until it gets to such a big size that it becomes the market.
And do you really think that in the hedge fund industry there can one day be a single hedge fund that wins it all?
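The arithmetic behind this objection can be checked with a toy example. The three assets, their market values, and their returns below are invented for illustration. A fund that holds every asset in full has the same weights as the market, so its return is the market return, while a small concentrated fund can differ from it.

```python
from typing import List


def portfolio_return(holdings: List[float], returns: List[float]) -> float:
    """Value-weighted return of a portfolio over one period."""
    total_value = sum(holdings)
    return sum(h * r for h, r in zip(holdings, returns)) / total_value


# Hypothetical market: total value of each asset and its return this period.
market_values = [100.0, 300.0, 600.0]
asset_returns = [0.10, 0.00, -0.05]

market_return = portfolio_return(market_values, asset_returns)

# A fund managing all the money in the world holds every asset in full.
everything_fund = list(market_values)
everything_return = portfolio_return(everything_fund, asset_returns)

# A small fund can concentrate in the asset it expects to do best.
small_fund = [1.0, 0.0, 0.0]
small_return = portfolio_return(small_fund, asset_returns)

print(market_return, everything_return, small_return)
```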
Why do you need to beat the market if you have all the money in the world?
It's like, if I'm not beating the market, can I not just own index funds? Why do I need a hedge fund?
It is, but I mean, this sort of happened, you know, is happening to Warren Buffett and other hedge
funds. He's got so much money, it's very difficult for him to return anything different
than the S&P 500 return. But I think it won't really be framed in this way.
I guess think about Bitcoin.
Bitcoin is sort of a way to do money transfer.
It's like a bank.
It's a sort of bank, but it's got this weird crypto feel.
And you can keep money there and you can transfer money.
And I think what we're trying to do is actually not really make another hedge fund,
but make the Bitcoin of hedge funds,
where it's like this protocol thing,
and it doesn't really play by the rules.
And it actually is a sort of utility for the world that no one really owns unless they own the cryptocurrency or something.
So I think that's kind of where things are going, that like this idea of making a protocol, something open.
I can see, yeah, I can see something like that working this century.
I mean, I can also see that claim, okay, or that aspiration, we're going to manage all the money in the world.
I can see that direction being possible
if you change something fundamental about how Numerai works, which would be that you'd have to open up so that other people, other funds, could go there with their own pools of capital and their own data sets
and basically, you know, have the same data scientists, the same community, run the thing, right?
Because, you know, you're a U.S. hedge fund, right?
But there will be other ones, first of all, in other markets, that can't put money into a U.S. legal entity, right?
So you'll have that factor.
And of course, the other factor is also going to be, you know, we've already talked about how data sets are this proprietary thing.
You know, some people are going to have satellite data.
Other people are going to have all kinds of other data.
So it also makes sense there.
that, you know, there will be competitive differences in the type of data you have,
and that's not going to go away, right?
So it makes sense if those people can also use Numerai and plug in there.
You know, maybe they'll pay some kind of fees via Numeraire or, you know, somehow drive the value
of the platform, but can compete in some way too.
And maybe even compete in terms of, as a data scientist, maybe
I will be able to choose, okay, you know, this fund pays me better than that fund,
so that's where I'm going to try to do the best model.
Yeah, I definitely see that.
I think first, there is one thing you said there, which is, what if
you have your own data set and you want to bring it to Numerai?
We don't have any way for you to be able to use your own data set or
even your own analysis. You have to use
our encrypted data.
And that's the only way we do it.
But I do think they could, and we have been thinking about things around,
what if we allowed people to submit data sets, you know,
and I have been thinking about that,
but we don't have anything to comment about that yet.
But on the other hand, you know,
letting other funds be on the Numerai platform,
it would be great for them to give us their data.
But I don't think you kind of get a lot of benefits from having it be one fund.
And I think what will most likely happen is we have very, very low fees, because we can charge way
lower fees than any other hedge fund just because of the way we do things.
We have everyone outside the company.
We only pay the people who are good, not the people who are bad.
So I think we could have extremely low fees in our hedge fund.
And actually that will be an incentive to have other hedge funds give us their money, or their LPs switch from that fund to ours if they don't.
And I think how we could have lower fees would actually be a very big innovation too.
And I'm optimistic about that. And, you know, the hedge fund industry is very efficient.
Once something starts working, people really go for it.
And so it can actually happen very quickly.
You have a hedge fund with managing $5 million.
In the second year, they're managing half a billion.
And the third year, $5 billion.
So it's kind of amazing how quickly it can scale.
And so if we can have higher returns because we have more talent and lower fees,
it'll be a really good hedge fund product.
So, one thing that's really interesting about Numerai is that this is one of the first
startups slash projects that is, I think, successfully merging the blockchain and the AI slash neural
network space, right? These two technology areas seem to be like silos. There are
experts in this area and experts in that area; very rarely do you get a mix
of it. Is the particular technology you're building applicable to other kinds of data fitting
problems like computer vision or something like that?
The machine learning we're using?
No, not the particular machine learning you're using, but like this way of building
a data model.
So essentially you're coming up with this function, right?
Like, your company is building this giant mathematical function that takes all
of this data as input and gives out predictions.
Now, I see a lot of problems in neural networks and artificial intelligence are
like that, right?
Like, given all of this, let's say, visual data, what's a function that can result
in the identification of objects inside that particular piece of data?
So is your general approach of crowdsourcing data models and combining them in
order to have a meta model, and that meta model predicting things, is it
applicable just to this particular financial field of stock trading or stock
investing, or can it really be scaled across other kinds of
problems?
Yeah, some people have said, what about healthcare data? Because it's
interesting, you don't want to actually share healthcare data because it's sensitive, so
Numerai seems like the perfect thing.
You can encrypt the healthcare data, let people model it,
and then find the best model.
I actually think Numerai might be one of the only applications
for crowdsourcing machine learning.
I don't actually think the healthcare one would work,
or really any other one.
And the big reason for this is that in the stock market,
a very small edge
really matters. But in other industries, it's actually many other kinds of
frictions that matter. So for example, there are already brilliant neural nets for
diagnostics. So you can have a neural net diagnose lung cancer or whatever based on
an x-ray better than a human. But it's really regulation and, just in general, you
need the doctors to believe it, you need the patients to believe it, you need some new laws,
you need all these different things, and actually what you don't need is just a 1% improvement.
That's not going to move the dial on the friction.
So I think many other industries have these frictions, and finance doesn't.
No one can stop you from trading if you find an edge.
It was kind of amazing to me when I was a kid. I was saying, how come I can buy stock in a company without asking
their permission? How can you buy Apple stock without asking Tim Cook's permission?
And it is this amazing thing. It's public. It's totally up to you
what you buy. And every other industry doesn't really have that. It's always
quite a lot of churn to move things forward, but the stock market's kind of instant.
So we're almost at the end of our episode, but I'll be curious, what's on the roadmap,
what's coming up for Numerai in the next two years? And then where do you see the project in,
like let's say two or three years from now? How, what stage is it going to be at?
We were working a lot on Numeraire, and, you know, we had to get the
smart contract audited and things like that. And
now it's just very interesting to see what people are doing with it and how much it's traded
and staked, and it's amazing.
But I do want to build out that part.
I think that's the key thing: how do we have a whole ecosystem that's kind of plugged
into the Numeraire smart contract and staking models, and just generally develop
it.
This idea that speculators would actually team up with data
scientists, I had no idea that would happen when I was coming up with it. And then definitely
user growth is something we've never focused on. It just happens. But I do think
we could do more. We could also do more data. I think our data set, you know,
there are new data sets coming that we haven't announced.
Yeah, I mean, if we do have good returns in our fund over the next year,
like that will be really a huge kind of proof of everything working,
because we will have had Numeraire out and lots of data.
And we only recently did this neural cryptography thing,
only recently did Numeraire.
So proving these things are working is going to be very important.
And then the question is, how can we manage a lot of money?
And, like you were saying earlier, it's actually more difficult to manage money as the amount goes up.
So figuring out models that haven't ever been invented and that can do huge capacity and take billions of dollars is something we have to do.
And can you share anything about the returns?
So far, is it working? You know, are you beating the stock market or hedge fund averages?
We actually can't talk about the returns or the AUM because of regulations. And so, you know, we're not trying to
pitch when we do a podcast like this. It's to, you know, get people to learn about the company,
not to pitch the hedge fund.
Well, Richard, thanks so much for coming on. That was super
fascinating, learning about Numerai. It's an
incredibly interesting project.
I think it's probably the first of many that we're going to see at this intersection of
blockchain and AI.
And it's just incredible, the surprising, strange things that are coming out of this.
So yeah, so thanks so much for coming on.
And of course, thanks to our listeners for once again tuning in.
It was great being here this week.
So we'll be back with another episode next week.
And of course, we are part of the Let's Talk Bitcoin Network.
You can find this show and other shows on letstalkbitcoin.com.
And if you want to support the show,
the best thing you can do is you can leave us an iTunes review,
which helps new listeners find the show.
So thanks so much, and we look forward to being back next week.
