Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Nansen: AI & Blockchain Analytics - Alex Svanevik
Episode Date: August 9, 2024

The saying goes that knowledge is power, and this perfectly applies to blockchains due to their innate transparency and immutability. However, raw data could seem, at first, unusable. This is where analytics companies, such as Nansen, play a major role in demystifying blockchain data by labelling it. In turn, curated data holds value as it can give an edge to traders, by tracking ‘smart money’ wallets. In the age of AI, most of the heavy lifting of data analysis is performed by LLMs, but human input is equally valuable for discerning nuances and fine tuning the process.

Topics covered in this episode:
Alex’s background, from AI to crypto
Nansen’s values
Querying blockchain data
Supported chains
Ensuring data accuracy
Generating wallet labels
The role of LLMs and AI
Data privacy and monetisation
On-chain transparency, privacy and ethics
Roadmap and further enhancements

Episode links:
Alex Svanevik on Twitter
Nansen on Twitter

Sponsors:
Gnosis: Gnosis builds decentralized infrastructure for the Ethereum ecosystem, since 2015. This year marks the launch of Gnosis Pay, the world's first Decentralized Payment Network. Get started today at gnosis.io
Chorus One: Chorus One is one of the largest node operators worldwide, supporting more than 100,000 delegators, across 45 networks. The recently launched OPUS allows staking up to 8,000 ETH in a single transaction. Enjoy the highest yields and institutional grade security at chorus.one

This episode is hosted by Friederike Ernst.

P.S.: Our friends from @nansen_ai have offered us 10 discount codes for 10% off on their professional and pioneer plans! If you are interested in unlocking Nansen's true power, DM us on Twitter (X - @epicenterbtc) and we'll hook you up with a code (FCFS).
Transcript
So we sort of industrialized the whole EVM chain ecosystem,
and we can onboard EVM chains actually quite fast now.
We have 300 to 400 million addresses labeled.
For every one of those labels, we have evidence and documentation.
And of course, a lot of that documentation is algorithmically generated.
It can get out of hand really quickly, and you can get like a negative spiral if you start getting the wrong labels.
You know, living up to our name, we've started using more AI for the labeling.
The challenge there is like you might end up with probabilistic labels and, like I was saying before,
you want to make sure that the precision is as high as it can possibly be.
The machine is going to be doing like 99.95% of the work in terms of just the
quantity of addresses. But the 0.05% that the human does can be very valuable, and it can also
be used by the machine to label all this stuff.
This episode is proudly brought to you by Gnosis, a visionary collective committed to fostering
and expanding applications for a decentralized future.
Gnosis is at the forefront of innovation with Gnosis Pay, Circles, and Metri, revolutionizing
open banking and creating a superior form of money.
With Hashi and Gnosis VPN, they are building a more resilient and privacy-focused open internet.
Are you seeking a robust L1 to launch your project?
Well, look no further than Gnosis Chain.
Enjoy the same development environment as Ethereum, but with significantly lower transaction fees.
And with a robust network of over 200,000 validators,
Gnosis Chain stands as a credibly neutral and resilient foundation for your application.
Governance of Gnosis is driven by GnosisDAO, where everyone has a voice in shaping the project's future.
Join the Gnosis community today by participating in the GnosisDAO governance forum.
You can deploy your project on the EVM-compatible and highly decentralized
Gnosis Chain, or help secure the network by running a validator with just a single GNO and low-cost
hardware. Embark on your journey towards decentralization today at gnosis.io.
Chorus One is one of the biggest node operators globally and helps you stake your tokens on 45-plus
networks like Ethereum, Cosmos, Celestia, and dYdX.
More than 100,000 delegators stake with Chorus One, including institutions like BitGo and Ledger.
Staking with Chorus One not only gets you the highest yields,
but also the most robust security practices and infrastructure
that are usually exclusive to institutions.
You can stake directly to Chorus One's public node from your wallet,
set up a white-label node, or use the recently launched product, OPUS,
to stake up to 8,000 ETH in a single transaction.
You can even offer high-yield staking to your own customers using their API.
Your assets always remain in your custody so you can have complete peace of mind.
Start staking today at chorus.one.
Welcome to Epicenter, the show which talks about the technologies, projects and people driving decentralization and the blockchain revolution.
I'm Friederike Ernst.
And today I'm speaking with Alex Svanevik, who is the co-founder and CEO of Nansen, which is a blockchain analytics company.
We'll discuss them in a lot of detail in just a bit.
It's a pleasure to have you on, Alex.
Before we get started with Nansen properly, tell us about yourself.
What's your background, and how did you end up where you are now?
Yeah, great to be here.
So, it depends how far back we want to go.
I guess my background initially is in AI; that's my degree from university in Edinburgh, UK.
So I was in AI before AI was cool, is what I like to say.
So I spent a few years working with data science and machine learning, also a few years in management consulting.
And in 2017, I discovered Ethereum during lunch at work.
Some engineers were very excited about it at the company where I was working.
And then I fell down the rabbit hole very quickly because I think several people started talking about Ethereum at the same time.
And so it sort of piqued my interest.
This was the summer of 2017.
And so a few months later, I decided to leave my job as a data science manager at the time.
And I moved to Hong Kong to join a startup in the crypto space.
So that was how I basically got into crypto.
I would say I was definitely not one of the earliest in crypto.
I felt that I was very late at the time joining crypto.
Now I feel like I'm kind of a veteran,
almost, which I guess speaks to how young the industry is.
But basically, I worked a few years in crypto: first with a startup that unfortunately ran
out of money pretty quickly, and then I spent some time with the decentralized exchange protocol
0x, helping them with analytics and understanding slippage across DEXes and things
like that in 2019.
We worked a little bit with Aragon as well, the DAO
platform, mostly as a consultant helping them out with data and analytics.
And then I co-founded Nansen, our company, in late 2019, or at least that's when we started
working on it.
And, you know, we went to market in April 2020, about one month after COVID started, when everyone
was gambling on governance tokens and yield farming.
So that's kind of how I ended up where I am now.
So with your background, it kind of makes sense that you would found a blockchain analytics company.
You also have a background in AI and, I assume, general IT stuff.
So was it the fact that you felt like you had already done this before?
Or was there a larger mission of ordering this mess? Because if you run an archive node, you'll learn nothing unless you
do proper analytics on it, right?
So what was the main motivation to go into this, you know, full time?
I think it's a good question.
I think there are a few different things happening at once.
I will maybe say that firstly, from like a career perspective,
I thought of it as like a Venn diagram of two competencies,
one being data and the other one being blockchain.
I figured that if you are very good at both of those,
you probably end up in an intersection that's pretty small.
So from a career perspective, I figured that it's a good idea to learn about these two things
because not that many people in the world are going to know about those two things.
That was probably from a career perspective.
And then I also very rapidly became like sort of enamored with the crypto industry
because when I was interacting with people on Twitter and Telegram and things like that,
I found that people were very open-minded and they were very
inviting in a way that was almost a bit surprising to me. I kind of thought of crypto as being a
little bit kind of, you know, almost like antagonistic or adversarial because it's very technical
and all that. But I found that people were very open-minded and sort of intellectually quite
interesting. So I think that also appealed to me from like almost like a culture perspective.
And then there was another thing that was I think more specific to data, maybe even more specific
to Europe, you know, which is where I was based at the time, which is GDPR, like privacy regulations
were kicking into full force. And I think I kind of became a bit frustrated as a data scientist
working with data, because there were so many regulations you had to navigate that it became really
hard to do your job, frankly. For everything, you had to, you know, check all the boxes
and all this stuff. And so I kind of wanted to work with data that was not like customer
or user data. I wanted to just work with data sets where you don't need to like fill out a form
to be able to use them. And so blockchains were interesting because so much of it is from public
blockchains, right? The information is just there and the data is there. And so you don't need to ask
permission for someone to dive into like all this exciting activity that's happening on chain.
So those were some of my personal reasons.
And then I think if I look at it from like the market opportunity side, you know,
you didn't have great analytics tools or products at that time.
I think there was really only one game in town for on-chain analytics, which was Chainalysis.
I mean, you had other products, and I don't want to belittle them.
But most people knew about Chainalysis for AML purposes.
But I felt that, you know, people who are just in crypto, who are trading, who are investing,
who are using blockchains, they're not necessarily law enforcement or tax authorities.
They should have great analytics products too.
And so I felt like there was an opportunity there to just provide them with a better
product so that they could understand what's happening on chain and they could make
better decisions investing.
They could, you know, make better decisions building products and protocols and building
blockchains, or L2s now these days, right?
So, yeah, there were many different factors that sort of led me here.
I can chime in here and say that as a blockchain founder
myself, I have used your tool a lot,
particularly for one use case,
for which it just beats
all the alternatives out there.
And that is that you're very good at labeling wallets and saying who you think they belong to, what kind of person or entity it is.
And I'm super interested in how decentralized our token ownership is, right?
So I would go to the Gnosis token and just list it by which address holds how many.
And I mean, in the beginning,
I knew who a lot of the people at the top of the list were, right?
I mean, that's kind of how projects start out.
But kind of, how far can I go down the list
until I find the first person who I honestly don't know who that is?
To me, that's always been very comforting
to know that kind of like there's lots of people out there
who are involved in the project in some way.
and I have absolutely no idea who they are.
And I mean, that's only increased over the years.
So yeah, this is how I first learned about Nansen,
I think when it came out in 2020 or so.
Yeah, you bring up that, that's like a very common use case, right?
Especially among builders who want to just understand their investor base or like who's
holding the token.
And I think you're right that this is also one of the opportunities.
that we saw that, you know, in a way, it wasn't that interesting to just get the blockchain data,
because in theory, anyone could do that. The hard part is to figure out, like, what's the entity
that's associated with the address? And we saw an opportunity there to sort of help people
get more transparency on that front. And, you know, that is one of the core things that we do very well, right? And we
have, at this point, 300, maybe even 400 million addresses labeled.
I can speak to that as well because this is also one of the use cases I use Nansen for.
I check whether I have doxxed myself. So obviously I have assets on different addresses
and I try to keep them apart. So if they're not
labeled on Nansen, I think I'm probably okay.
Yes. Yeah, that is true. I mean, maybe
we should also just call out that, you know, if individuals have their name labeled
on Nansen, they can contact us, and if they want us to remove it, we will remove it.
There's a bit of nuance to that because sometimes people inadvertently dox themselves
on chain. So they might buy a .eth name or something, like an ENS name.
And we can't do anything about that because that's immutable and like etched into the history
of the blockchain. But yeah.
So, you know, there's, this is like kind of a blessing and a curse of blockchains, that they are transparent, they're immutable, et cetera.
So something that I've always wanted to ask you: is Nansen actually named after Fridtjof Nansen?
Yes, it is.
Okay, fantastic.
Yeah, that's right.
Maybe tell us about Nansen and why you settled on that name.
Yeah, I mean, I think a lot about culture in the context
of a company or a project, and I felt that Nansen is kind of an embodiment of the values
we have in our company, values like courage and curiosity. You know, Nansen, for those
who are not aware, is most famous for actually having been a polar explorer. He was the first person to cross
Greenland on skis. He went as far north on the globe as anyone had
ever done, and with the same ship that he used, another polar explorer reached the South Pole first of any
human being. But he was also a scientist, and interestingly, it sounds like you know him for his
work on creating passports for refugees.
Yes.
Yes. Which he did for almost half a million people.
For stateless people, right?
Yes, for stateless refugees. I think mostly around Armenia.
So he was kind of a renaissance person: an explorer, a scientist, a humanitarian. He even played a big role in the creation of the modern kingdom of Norway.
He convinced the prince of Denmark to become the king of Norway so that they could become independent from Sweden in 1905.
But yeah, so he's an embodiment of a lot of the values that we live by at Nansen.
curiosity, courage, transparency, speed,
which is important when you're doing an expedition.
You want to make sure you get there in time before you starve
or run out of what you need.
So, yeah, so he's kind of an icon.
And I also kind of like the idea that,
a bit similar to Tesla, right,
you've named the company after someone who's not the founder,
but an inspiring person.
And then it's two syllables.
It's easy to pronounce in any language, which is nice.
So yeah, those are some of the reasons why we named the company Nansen.
And was the URL nansen.ai from the get-go?
Yes, it was.
But I will say that the .ai was aspirational in the beginning, in the sense that, you know,
as I said in the beginning,
my degree is in AI, and I always knew that we would be making use of AI for what we do, things like labeling addresses.
Now we use AI for estimating the price of an NFT that's fully machine learning powered and is part of our product, which is actually kind of non-trivial, right?
If you have a specific NFT, how much is this one valued based on its traits and transaction history, etc.
We use AI to weed out spam tokens, which there's a lot of, especially on chains that have lower gas fees and things like that.
We use AI for personalizing signals in the product.
So I think, you know, at Nansen, we did have some foresight in that we knew that AI was going to become, you know, a big part of the world.
It happened admittedly a bit faster or more suddenly than I personally expected,
but we're leaning into AI even more now than we were originally.
And so it's not like there's one AI angle with Nansen.
It's more like it sort of powers the whole product in many different ways.
So yeah, that's always been the ambition to make sure that we are an AI trailblazer
and we're making use of AI in great ways in the product and in the organization.
Cool.
Let's maybe dive into the core of your product.
So you started out with,
and that's still very much your core offering,
analytics for on-chain data.
Most people who don't work on actual blockchains themselves
don't appreciate how much engineering effort
actually goes into creating a state and a database and so on.
Can you maybe talk us through that? Say I have an Ethereum archive node;
it's a terabyte or whatever the current size is, depending on what you're running.
How do I get from there to something that I can actually query?
Yeah, so the way we do it mostly is we make use of,
you know, JSON-RPC endpoints from the nodes, and then we
pull out specific data from the nodes.
And so you pull out the blocks and they have transactions, and if you want to go one level
deeper, you parse out the events from smart contract interactions.
Like if you have the ABI of a smart contract, then you would use that to be able to parse
out the data that's included in the transactions.
So, I mean, that's sort of, at a very high level,
how you do it.
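As a rough illustration of the event-parsing step described here, the sketch below decodes a standard ERC-20 `Transfer` event from a raw log entry shaped like the output of the `eth_getLogs` JSON-RPC method. The sample log values and the helper names `topic_to_address` and `decode_transfer` are made up for illustration; this is not Nansen's pipeline, and a real indexer would typically drive the decoding from the contract's ABI rather than hard-code one event.

```python
# Sketch: decode an ERC-20 Transfer event from a raw JSON-RPC log entry.
# The sample log below is invented; only the event signature hash is real.

# keccak256("Transfer(address,address,uint256)"), the topic0 of ERC-20 transfers
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """Indexed address arguments are left-padded to 32 bytes; keep the last 20."""
    return "0x" + topic[-40:]


def decode_transfer(log: dict):
    """Return a decoded transfer, or None if the log is not an ERC-20 Transfer."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    return {
        "token": log["address"],                # contract that emitted the event
        "from": topic_to_address(topics[1]),    # first indexed argument
        "to": topic_to_address(topics[2]),      # second indexed argument
        "value": int(log["data"], 16),          # non-indexed uint256 amount
    }


sample_log = {
    "address": "0x" + "aa" * 20,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(1000, "064x"),
}

decoded = decode_transfer(sample_log)
print(decoded)
```

The indexed arguments live in `topics` and the remaining arguments in `data`, which is why the ABI is needed to know how to split and interpret them.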
And, you know, we
actually started out with a pretty different tech stack and architecture,
and we've changed that recently.
So one of my co-founders is the creator of an open source project called
Ethereum ETL,
which basically does this in an open source manner for Ethereum.
And so you can actually
index all this data, you know, if you have an endpoint or you run your own node and you have
that endpoint, you can index all this into CSV files or into, you know, a database.
And so he built that, and that was one of the building blocks that we used
to get started. Over the years, though, we have basically moved over to a different paradigm
of loading the data.
Initially it was, you know, Ethereum ETL,
so extract transform load,
which many data engineers and so on will be familiar with.
Now we do basically ELT,
extract load transform.
So one of the reasons we do it this way now
is because we integrate with lots of different chains,
and different chains might have slightly different schemas.
So the idea is if you first can just extract the data
from the JSON-RPC
endpoint and you can just load the raw data in, then you can transform it later.
So it's sort of you delay the transformation and the schema harmonization of all the data
and to a later point.
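To make the ETL versus ELT distinction concrete, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in warehouse. The chain names, field names and mapper functions are all hypothetical; the point is only that raw payloads are loaded untouched and the per-chain schema harmonization happens afterwards.

```python
import json
import sqlite3

# Hypothetical raw payloads from two chains that use different field names.
RAW_BLOCKS = [
    ("chain_a", {"number": "0x10", "timestamp": "0x65000000", "transactions": ["0x1", "0x2"]}),
    ("chain_b", {"height": 16, "time": 1694498816, "txs": ["0x3"]}),
]


def load_raw(conn, raw_blocks):
    """Extract + Load: store every payload untouched, as JSON text."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_blocks (chain TEXT, payload TEXT)")
    conn.executemany(
        "INSERT INTO raw_blocks VALUES (?, ?)",
        [(chain, json.dumps(payload)) for chain, payload in raw_blocks],
    )


def from_chain_a(p):
    """Mapper for the hex-encoded, EVM-style payload."""
    return {
        "block_number": int(p["number"], 16),
        "timestamp": int(p["timestamp"], 16),
        "tx_count": len(p["transactions"]),
    }


def from_chain_b(p):
    """Mapper for the second, differently named payload."""
    return {"block_number": p["height"], "timestamp": p["time"], "tx_count": len(p["txs"])}


MAPPERS = {"chain_a": from_chain_a, "chain_b": from_chain_b}


def transform(conn):
    """Transform: harmonize the already-loaded raw rows into one schema."""
    rows = conn.execute("SELECT chain, payload FROM raw_blocks ORDER BY chain")
    return [{"chain": chain, **MAPPERS[chain](json.loads(payload))} for chain, payload in rows]


conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_BLOCKS)
blocks = transform(conn)
print(blocks)
```

Because the raw table never changes shape, onboarding another chain in this sketch only means adding one more mapper, and a mapper bug can be fixed by re-running the transform without re-extracting from the node.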
And then we've also changed the database that we use.
We used to be based on BigQuery, which is Google Cloud's proprietary analytical
data warehouse technology.
Now we use something called ClickHouse, which is
also an analytical database, but it's more performant for the type of use that we have.
So in the past, we might have a dashboard that, I don't know when you were mostly using Nansen,
but Nansen version 1 was actually pretty slow.
And some of the dashboards would load in like 30 seconds, which is kind of hilarious if we look
back at it now.
But with ClickHouse, you know, the same dashboard might load in like, you know, 300 milliseconds
or something like that.
So we've actually
kind of evolved our tech stack and
replaced the whole thing, both the data
pipelines for the extraction of the
data and also how we
store the data and how we query it,
etc.
So which chains
do you currently support?
So we actually have kind of a suite of
different products. So, for example,
if you look at Nansen Portfolio,
which is our portfolio tracker,
we support more than 50 chains.
And so it's kind of
all of the usual suspects, you know,
Bitcoin, Ethereum, even Solana, and then a long tail of EVM chains.
And for Nansen Query, which is kind of the enterprise product where you can write SQL queries
and people make dashboards, we support, I think, 20-plus chains, so a little bit fewer,
but still more than 20.
And then in Nansen 2, which is the product that most people actually know, which is kind of the product
that you've used and where you see your
holders for the token,
Token God Mode, Profiler,
we support,
I think it's now just
over 12 chains, but we're adding
a lot of chains every quarter actually
to it.
And so, yeah, so
it sort of depends a bit on which
products you're actually using, but
the ambition is to be adding
like roughly one chain per month
or more going forward, because
you do have, like, the
the world is very multi-chain at the moment.
And so you want to make sure that you're supporting all the chains that people care about
and that they use.
And, you know, we've sort of invested a lot in our tech to make it both faster and frankly
cheaper for us to integrate new chains.
So we've sort of industrialized the whole EVM chain ecosystem.
And we can onboard EVM chains actually quite fast now.
Yeah, so that's kind of how we look at it.
Interestingly, there are a lot of non-EVM chains that want to integrate with us,
which on the one hand is great because you want to support them.
But on the other hand, it's also quite technically challenging
because you have to sort of build a bespoke solution for every chain, almost.
But the EVM chain use case, we've sort of industrialized in-house.
I am sure you have protocols for that,
but how do you ensure data accuracy
and the reliability of the analysis?
Yeah, you can sort of talk about data accuracy or data quality in a few different ways in our product.
So the most basic data accuracy is about the on-chain data itself, right?
So you want to make sure that you're not missing data, that, you know, you have tests where you see whether the number of transactions today is in line with what you saw yesterday, and that kind of stuff.
So you can have basic, almost like unit tests, data quality checks on that.
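The two kinds of check mentioned here, missing data and day-over-day transaction counts, could be sketched roughly as follows. The function names and the 50% tolerance are assumptions chosen for illustration, not values from the conversation.

```python
def find_missing_blocks(block_numbers):
    """Return block numbers absent from an otherwise contiguous range."""
    if not block_numbers:
        return []
    present = set(block_numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def count_in_line(today: int, yesterday: int, tolerance: float = 0.5) -> bool:
    """True if today's transaction count is within +/- tolerance of yesterday's."""
    if yesterday == 0:
        return today == 0
    return abs(today - yesterday) / yesterday <= tolerance


# Example: one block was skipped during indexing, and volume dropped by 80%.
print(find_missing_blocks([100, 101, 103, 104]))
print(count_in_line(today=200_000, yesterday=1_000_000))
```

A failed check would normally raise an alert rather than block the pipeline, since a large swing in volume can also be genuine on-chain activity.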
I think the harder part, though, is on the attribution, the labeling of addresses, right?
And that's where there's potentially room for error.
And so, you know, our philosophy is that we would rather not have a label than to have a wrong label.
And so that means we have very strict requirements on precision.
And so as an example, you know, I mentioned we have 300 to 400 million addresses
labeled. For every one of those labels, we have evidence and documentation.
And of course, a lot of that documentation is algorithmically generated.
But you can always look up, you know, if this address has this label, why does it have the label?
So there's, you always have the documentation for it.
And I think this is something that we, we take pride in, that we actually take that stuff really
seriously, because it can get out of hand really quickly and you can get like a negative spiral
if you start getting the wrong labels
because typically what happens
is if you're looking at a new address
and you want to label it, you start looking at
what are the labels of the addresses,
the neighbors of that address.
And so if you have a wrong label,
it can propagate very quickly
and it goes out of control
and of course, you know,
you get more wrong labels.
That's the first thing. But secondly, and more importantly,
it can impact the user experience
if people see a wrong label
and they lose trust in your product.
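One way to picture the "no label rather than a wrong label" rule when labeling from neighboring addresses is a function that abstains unless the neighbors agree strongly. This is a simplified sketch; the function name and both thresholds are assumptions, and the actual method used at Nansen is not described in this conversation.

```python
from collections import Counter


def propose_label(neighbor_labels, min_neighbors=5, min_agreement=0.9):
    """Suggest a label from the neighbors' labels, or None to abstain."""
    labeled = [label for label in neighbor_labels if label is not None]
    if len(labeled) < min_neighbors:
        return None  # too little evidence to say anything
    label, count = Counter(labeled).most_common(1)[0]
    if count / len(labeled) < min_agreement:
        return None  # neighbors disagree: prefer no label over a wrong one
    return label


print(propose_label(["exchange"] * 10))                 # strong agreement
print(propose_label(["exchange"] * 6 + ["fund"] * 4))   # mixed neighbors
```

Raising `min_agreement` trades coverage for precision, which also limits how far a single wrong label can spread to the addresses around it.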
So this is something we take very seriously. And of course, you will always
have some errors; it's just not possible to have literally 100% precision. But it's
actually very rare that we have incorrect labels, and even if you do arguably have incorrect labels,
very often there's a very logical explanation for it. So at some point, I remember, we were called out
for having labeled
an address as Do Kwon,
and we were told that that was incorrect,
but it turned out that it was basically
Terra Labs or the company
related to it. So,
you know, is that an error? Like,
maybe it is in a strict sense,
but of course, you know, it's a very
related entity.
And you might have a similar thing with like
Su Zhu or
Three Arrows and things like that.
But,
you know, we take pride in having
the best precision on the labelling that we do.
And this is something that's very important to us.
Which specific heuristics do you actually use to kind of generate the labels?
I mean, and how do you come up with them?
I'm sure you kind of add stuff all the time, right?
Yeah, so it's a combination of man and machine, right?
So the heuristics would be some of them are deterministic and quite simple, right?
So think of wanting to label every Uniswap pool.
Then you can literally just look at the Uniswap factory.
And like we were talking about earlier,
you can look at the events that are emitted,
and the events contain all the information
that deterministically says, here are all the Uniswap pools.
This is like the easy case.
And in theory, anyone who can read blockchain data
and have a system for this could do this.
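The deterministic case can be sketched as a pass over already-decoded factory events. The example assumes a Uniswap V2-style factory, which emits a `PairCreated` event carrying `token0`, `token1` and `pair`; the event dictionary shape, the `Label` class, the addresses and the evidence string are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    address: str
    name: str
    evidence: str  # why the address has this label


def label_pools(decoded_events, factory_address):
    """Turn PairCreated events from one factory into pool labels."""
    labels = []
    for event in decoded_events:
        if event["emitter"].lower() != factory_address.lower():
            continue  # ignore look-alike events from other contracts
        if event["name"] != "PairCreated":
            continue
        args = event["args"]
        labels.append(Label(
            address=args["pair"],
            name=f"Uniswap pool {args['token0']}/{args['token1']}",
            evidence=f"PairCreated emitted by factory in tx {event['tx_hash']}",
        ))
    return labels


FACTORY = "0xFactory"  # placeholder, not a real address
events = [
    {"emitter": "0xfactory", "name": "PairCreated", "tx_hash": "0xabc",
     "args": {"token0": "0xT0", "token1": "0xT1", "pair": "0xPool1"}},
    {"emitter": "0xsomeoneelse", "name": "PairCreated", "tx_hash": "0xdef",
     "args": {"token0": "0xT0", "token1": "0xT2", "pair": "0xFake"}},
]
pool_labels = label_pools(events, FACTORY)
print(pool_labels)
```

Filtering on the emitting contract is what makes the rule deterministic: any contract can emit an event with the same name, but only the factory's events define real pools. Keeping an evidence string next to each label mirrors the point made earlier that every label should be explainable.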
Then there are other things that are more complex,
like exchanges,
centralized exchanges because there
technically the information
is not deterministic
from just the on-chain data. You need to do some
inference and you need to understand like
how these entities manage
private keys and manage
addresses. And so there
you typically have kind of a
baseline
heuristic that is sort of
somewhat universal for any exchange.
So you might say, actually, if you
send funds to an address and the address
automatically forwards it to
what we call a main wallet, like a Binance main wallet,
then you can be pretty sure that that wallet is a deposit wallet for the exchange, right?
And so this is going to be, you know, correct in most cases,
but you may have to tweak it and you need to curate the main wallets because those can update,
right? Let's say HTX or, you know, Gate.io might get a new main wallet.
You need to make sure that you're on top of that.
And you need to have sort of alerting in-house if you see lots of funds,
move because maybe they move to a new
cold wallet or new hot wallet and
so on and so forth.
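A stripped-down version of the deposit-wallet heuristic might look like the sketch below. The transfer format, the `min_forwards` threshold and the exchange name are assumptions; a production rule would need more conditions (timing, amounts, how the address was funded) and a curated, regularly updated list of main wallets.

```python
from collections import defaultdict


def find_deposit_wallets(transfers, main_wallets, min_forwards=2):
    """Flag addresses that receive funds and forward only to one known main wallet.

    transfers:    list of {"from": address, "to": address}
    main_wallets: dict mapping a curated main-wallet address to its exchange name
    """
    outgoing = defaultdict(list)
    receivers = set()
    for t in transfers:
        outgoing[t["from"]].append(t["to"])
        receivers.add(t["to"])

    deposits = {}
    for address, destinations in outgoing.items():
        if address in main_wallets or address not in receivers:
            continue  # skip main wallets and addresses that never received funds
        if len(destinations) < min_forwards:
            continue  # not enough forwarding behavior to judge
        targets = set(destinations)
        if len(targets) == 1 and next(iter(targets)) in main_wallets:
            deposits[address] = main_wallets[next(iter(targets))]
    return deposits


main_wallets = {"0xmain": "ExampleExchange"}  # hypothetical curated list
transfers = [
    {"from": "0xuser1", "to": "0xdep1"},
    {"from": "0xdep1", "to": "0xmain"},
    {"from": "0xuser2", "to": "0xdep1"},
    {"from": "0xdep1", "to": "0xmain"},
    {"from": "0xuser3", "to": "0xmain"},
    {"from": "0xuser3", "to": "0xother"},
]
deposit_wallets = find_deposit_wallets(transfers, main_wallets)
print(deposit_wallets)
```

The output depends entirely on `main_wallets`, which is why a stale main-wallet list quietly degrades the heuristic and why alerting on large fund movements matters.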
And so you can state these heuristics programmatically, and
you can label a lot of addresses in this way. So it's kind of like
an inventory of many different heuristics, and sometimes the heuristics
can build off of each other.
But, you know, living up to our name, we've started
using more AI for the labeling recently.
But the challenge there is like you might end up with
probabilistic labels and, like I was saying
before, you want to make sure that the precision
is as high as it can possibly be.
And so it comes back to the
same point where you have this man and machine
setup where the machine is going to be doing
like 99.95% of the work
in terms of just the quantity
of addresses. But the
0.05% that the human does
can be very valuable, and it can also
be used by the
machine to label all this stuff.
And so, yeah, so
it's interesting, right? Because the
AI approach we've started making use of
now, they kind of like almost sit
in between the sort of the man and
machine, like the human and the machine.
But you can also, by the way,
look at, I mean,
it depends how far down the rabbit hole you want to go,
but you can also look at the
economics of it, like how much does it cost us to label an address, right? If you just
think of the human labor or even the cloud cost of the heuristics, and then you start looking
at optimizing that and saying, actually, you know, the heuristics are very cheap, so you want to
make sure that the heuristics label as much as they can. And you want to be very selective about
what you use human labor for, because that can be like $10 per label, maybe, depending
on, you know, many different factors. So, yeah.
Yeah, this is kind of an interesting optimization problem over time,
where you have to balance out different things.
You want to make sure precision is very high.
You want to make sure that the recall, or the coverage, is also very high.
You want to label as many addresses as you can.
You also want to make sure that you can do it in a timely manner
so that you can label addresses very fast.
And you want to make sure that the economics are permissible
so you don't break the bank.
If it costs us like $20 million to label 20 million addresses,
yeah, that's probably not going to work, right?
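The economics can be made concrete with a small calculation. The 0.05% human share and the roughly $10 per human label come from the conversation; the 300 million address count is the lower figure quoted earlier, and the machine cost of $0.001 per label is purely an assumption for illustration.

```python
def labeling_cost(total_addresses, human_share, human_cost_per_label, machine_cost_per_label):
    """Split addresses between human and machine labeling and price each part."""
    human_labels = total_addresses * human_share
    machine_labels = total_addresses - human_labels
    human_cost = human_labels * human_cost_per_label
    machine_cost = machine_labels * machine_cost_per_label
    return {
        "human_labels": human_labels,
        "human_cost": human_cost,
        "machine_cost": machine_cost,
        "cost_per_address": (human_cost + machine_cost) / total_addresses,
    }


estimate = labeling_cost(
    total_addresses=300_000_000,
    human_share=0.0005,            # 0.05% labeled by humans
    human_cost_per_label=10.0,     # "like $10 per label, maybe"
    machine_cost_per_label=0.001,  # assumed, not from the conversation
)
print(estimate)
```

Under these assumptions about 150,000 addresses go to humans, costing around $1.5 million, and the blended cost stays well under a cent per address. The $20 million for 20 million addresses mentioned above would be $1 per address, which shows why the human share has to stay tiny.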
So there are many interesting factors here and, in a way, this is kind of the most
unique thing we do at the company, right, if you think about it.
And it's exciting because it's one of the areas that
probably can be, you know, enhanced the most with AI,
in my opinion.
yeah absolutely
How fast do you label these addresses and big movements?
I'm asking because obviously this, if you're a trader,
this can actually give you a lot of alpha, right?
So if someone moves funds from a known cold wallet to a hot wallet,
chances are they're going to sell them, possibly on an exchange.
So you might want to front-run them in the traditional sense,
not in the blockchain sense.
So how fast do you do this?
And do people explicitly use it for this sort of use case?
Yeah, so we have a feature called Hot Contracts.
And what Hot Contracts does is it looks at newly deployed smart contracts
that have a lot of funds going into them.
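In spirit, a "newly deployed and receiving a lot of funds" filter could be as simple as the sketch below. The record fields, the one-week age window and the $1 million inflow threshold are illustrative assumptions; the actual Hot Contracts criteria are not given in the conversation.

```python
WEEK_SECONDS = 7 * 24 * 3600


def hot_contracts(contracts, now, max_age_seconds=WEEK_SECONDS, min_inflow_usd=1_000_000):
    """Return recently deployed contracts with large inflows, biggest first."""
    hot = [
        c for c in contracts
        if now - c["deployed_at"] <= max_age_seconds and c["inflow_usd"] >= min_inflow_usd
    ]
    return sorted(hot, key=lambda c: c["inflow_usd"], reverse=True)


now = 1_000_000  # pretend current Unix time
contracts = [
    {"address": "0xnew_big", "deployed_at": 990_000, "inflow_usd": 5_000_000},
    {"address": "0xold_big", "deployed_at": 0, "inflow_usd": 9_000_000},
    {"address": "0xnew_small", "deployed_at": 995_000, "inflow_usd": 100},
    {"address": "0xnew_bigger", "deployed_at": 999_000, "inflow_usd": 8_000_000},
]
ranked = hot_contracts(contracts, now)
print([c["address"] for c in ranked])
```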
And Hot Contracts now, actually very soon,
like probably in a matter of weeks,
is going to be enhanced.
with AI labeling.
And so that means the idea is like,
probably, I don't know if it will be minutes
because you kind of need to accumulate a bit of data
on the address in terms of transactional patterns
and stuff like that.
But yeah, maybe minutes, at most hours,
you'd let loose the army of AI labelers
on these hot contracts.
And because we will have tuned
and quality-assured the precision,
you'll be able to get
pretty descriptive labels
of what these contracts actually are, right?
It's interesting, right,
because in a way,
some of our users who are
power users and advanced users
sort of see the alpha in
us not having labeled an address, because they know
that that's a new address,
and because it's not labeled yet,
if they figure out what this
address is, they might be one step ahead of the game. And so if you look at the Hot Contracts
table, ironically, a lot of the addresses are not labeled. But I think that's going to change, literally
in a matter of weeks, when we roll this out. And so I think some of the people who are using that
feature, Hot Contracts, are probably going to see almost like a night-and-day
change in that view. For other types of addresses, it depends, right?
A fund, typically, it takes a while to actually figure out what the fund is.
So if they move like a VC fund or like a liquid venture fund or something,
if they move funds from one address to another,
that's maybe one of the more clear cut cases.
But if you have a totally new address that is like providing funds in a seed round
or something like that, it can be quite tricky.
You need to have multiple data points to figure it out.
And so in those cases you can't talk about, like, minutes or
hours; that's like, you know, days or weeks or maybe months.
So it really depends on like what kind of address or what kind of entity we're talking about.
So for the newly deployed smart contracts, do you also speculate about kind of what it's going to do?
I mean, is there a label that says we think this is a newly
launched
perp exchange or something?
Yeah.
So the idea
is like
with labels, right?
There are a few different ways
to think about labels.
One is just give it a name.
So like, you know,
Gnosis Chain
something,
you know,
Gnosis Chain
bridge or something like that, right?
But then there's a category too,
which I guess is what you're getting at.
So you have a category description.
So this is like a staking
contract. It's a bridge.
It's a DEX. It's a
DeFi pool. It's a
yield farm. There's
sort of a taxonomy of different things
it could be. And the idea is
what we're aiming to do is both.
Give it a name, a specific name,
and also give it a category.
And
in fact, the category,
also, it doesn't always need to fall into one category. So you might
want to give it like multiple different labels from
like multiple different indicators.
Right? So this is both a staking contract and it's a token. Think of staked
ETH with Lido, for example; it doesn't neatly fall into one category.
So the idea is to do both: give it a name and give it a category.
So I've not tried this yet. If you were to put a newly deployed smart contract, or any
smart contract, into ChatGPT or any of its competitors, will it be able to
tell you what it does?
No. If it were that easy, then we would have just done that. We have tried. No, but you have to,
you don't want to, I guess, give away too much of the secret sauce, but you have to sort of,
you do have to use ChatGPT. It's a good idea to use ChatGPT, but you have to
sort of guide it with the right prompts and make it use the right sources for it. Okay, so
you want ChatGPT to kind of figure out what's the business logic behind this smart contract?
Yeah, kind of.
And you have to sort of chain it, like do multiple steps, right?
So it's like, yeah, I don't want to give away too much.
But the idea is like roughly you want to try to make it understand.
You want to make sure, I want to make sure that it has all the information, A, and then
B, that it can synthesize all of the information, and then, you know, put it into like a meaningful
category or give it a meaningful name.
And then, you know, so you can, you can think of this as like, you have LLMs and then, you know,
you could fine-tune LLMs, but actually, in practice, you end up getting more out of making good use of
the context than out of fine-tuning the LLM.
And then you also do it iteratively.
So you kind of ask it to solve multiple different problems iteratively or like in a sequence.
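[Editor's sketch: the general idea of asking a model to solve several problems in sequence, feeding each answer into the next prompt, might look roughly like this. The guest does not disclose Nansen's actual prompts or pipeline, so every step, prompt string, and name here (`label_contract`, `fake_llm`) is a hypothetical illustration. `fake_llm` is a deterministic stand-in so the example runs without any model.]

```python
from typing import Callable


def label_contract(source_code: str, neighbor_labels: list[str],
                   llm: Callable[[str], str]) -> dict:
    """Run a fixed sequence of prompts, feeding each answer into the next."""
    # Step 1: understand the contract.
    summary = llm(f"Summarize what this contract does:\n{source_code}")
    # Step 2: classify it, using extra context supplied in the prompt.
    context = ", ".join(neighbor_labels) or "none"
    category = llm(
        f"Summary: {summary}\nLabels of related addresses: {context}\n"
        "Pick one category: staking, bridge, dex, pool, yield farm, other."
    )
    # Step 3: name it, given the earlier answers.
    name = llm(f"Summary: {summary}\nCategory: {category}\nSuggest a short name.")
    return {"summary": summary, "category": category, "name": name}


def fake_llm(prompt: str) -> str:
    """Hypothetical stub that returns canned answers instead of calling a model."""
    if prompt.startswith("Summarize"):
        return "Locks tokens and pays rewards."
    if "Pick one category" in prompt:
        return "staking"
    return "Example Staking Contract"


result = label_contract("contract Stake { ... }", ["Example: Deployer"], fake_llm)
print(result["category"], "|", result["name"])
```

[Passing the model in as a plain callable keeps the chaining logic testable on its own; a real model client would replace `fake_llm`.]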
And then at the end, you kind of get
something that is useful. So the very short
form of putting it is: it's a form of prompt engineering, but it's actually
pretty involved prompt engineering. Yeah, I can imagine. Sorry, just one more thing on that, right? Because
it's not enough. ChatGPT doesn't have our existing 300 million address labels, right? And so
that's where you get kind of an edge too, because our own version of this can also tap into
the existing labels we have.
And because of that compounding effect, I said earlier,
where like if you know the neighbors,
it can help you figure out what the label of this one is,
you get this kind of sort of a moat that's built around
kind of being able to label stuff with high precision and very fast.
Yeah, absolutely.
You guys also use AI on the other side.
So kind of if I may use it, I search for stuff.
You have smart search and similar things.
What does it allow me to do and how does it work?
Yeah, so maybe on this end, probably the best example is signals,
which is a feed, it almost looks like a Twitter feed or something,
and you have sort of these cards,
and each card is a signal that we've observed on chain.
And so this could be, you know, PEPE token has, you know,
this amount of million dollars going into centralized exchanges.
That's 20 times more than an average day.
That's an example of a signal.
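[Editor's sketch: a signal of this kind, today's exchange inflow compared against an average day, can be computed roughly as follows. The function name, the 10x threshold, and the sample figures are hypothetical; the episode does not describe how Nansen actually computes its signals.]

```python
from statistics import mean


def inflow_signal(token, daily_inflows_usd, today_usd, threshold=10.0):
    """Return a message if today's inflow is at least `threshold` times the historical mean."""
    baseline = mean(daily_inflows_usd)
    if baseline <= 0:
        return None
    ratio = today_usd / baseline
    if ratio < threshold:
        return None
    return (f"{token}: ${today_usd / 1e6:.1f}M into centralized exchanges, "
            f"{ratio:.0f}x an average day")


history = [1_000_000, 1_500_000, 500_000, 1_000_000]  # mean is 1,000,000
print(inflow_signal("EXAMPLE", history, 20_000_000))
# EXAMPLE: $20.0M into centralized exchanges, 20x an average day
print(inflow_signal("EXAMPLE", history, 2_000_000))   # None (only 2x)
```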
And these signals are personalized based on what you have done in our platform.
So if you have saved certain tokens to your watch list, if you have maybe added certain addresses to your watch list or NFTs, and then soon, this is something we're rolling out in the next few months, we're bringing together our portfolio tracker with the analytics product.
So if you have your portfolio tracked with us, we can personalize the signals that you see in your feed based on your own portfolio and your history, the history of trading and so on and so forth.
So it's actually a toggle in the product where you can switch on and off personalization.
So either you could just get the kind of vanilla feed that everyone gets or you can get a personalized feed based on, you know, what you have indicated to us that you're interested in through
your behavior in the platform, what you search for, what you've saved, and so on and so
forth. So this isn't new, right? This is just, you know, what Amazon has been doing
since almost the 90s or at least early 2000s in like people who are interested in this
are also interested in that, like recommender systems. So this part isn't necessarily that
new. But interestingly, you haven't seen a lot of personalization in Web3 yet, which is something
that's a bit puzzling. I think it's maybe because, firstly, it's a very young space, but
secondly, we didn't have enough data for it to be needed.
People could still just go on CoinGecko and, like, find the coin.
But I think we've entered the era now where you have literally millions of assets.
And so it's no longer feasible to just search for the asset you care about.
You actually need to get stuff recommended to you because the inventory has become so large that you can't just look through it in a catalog or, like, on a
ranking. And so that's why I think now personalization is probably going to play a big
role in crypto, and that's what we're trying to lean into. And obviously you make use of machine
learning and AI to make that happen at scale. So that actually puts you in a super powerful
position because not only do you have a really well organized repertoire of data that's
on chain, you also have the private user data that they share with you.
Is there some sort of ethics codex for how you treat the user data that's
shared with you? Do you monetize that? Do you use that to cross-reference
things behind the scenes? Yeah, I mean, it's a great question, right?
naturally you have to first of all respect just general privacy regulations right GDPR and so on
and so forth. And that, you know, has its own sort of set of rules
and things to make sure that you're not violating, right?
And that there's consent and so on and so forth.
Secondly, we have a Chinese wall.
This is because this is a concern, I think, that many people have,
and it's a valid concern.
It's like, if I use Nansen,
are you going to use what I search for to label, like, my addresses,
for example?
If I search for my own address, you're going to use that?
And the answer is no, because there's literally a Chinese wall
between the department that has access to any user data
and the department that has access to labeling wallets.
And it's kind of hard for us to prove this
because we're not like an open source company, obviously, or a project.
But that's the reality.
And so that's in our privacy policy,
and it's also how the company is structured,
and literally people don't have access to both of those two things at once.
We don't monetize that data.
I don't think we need to.
Like, we don't have an ads-based business model.
Like our business model is very straightforward.
You just pay for the subscription.
And, you know, you get access to the product.
So in a way, I kind of like that business model because it's the most transparent business model.
Yeah, it's very honest.
Many people don't like it because they're so used to getting stuff for free.
But on the back end, their data is being sold, too.
Yeah, exactly.
So in a way, I feel like we're like the dumb honest people.
Like, we're just charging you to use the product.
And like, that's it.
we don't need to have some nefarious way to exploit their data on the back end.
You could give people the choice.
You could say, do you want to be private and you pay?
Or do you want us to kind of monetize your data in some other way and you get to use it for free?
Yeah, maybe that's an idea.
Yeah.
I mean, I will, like on the topic of business models, right, I think I sort of think of ads as the default business model of Web 2.
and I think that the default business model of Web 3 is going to be transactional.
So, you know, it's not unlikely that our subscription model at some point will get displaced by a more transactional business model.
So does that mean maybe you have CowSwap integrated into Nansen, and when you find tokens in Token God Mode, you click buy and you execute the trade through CowSwap,
and maybe CowSwap and Nansen share any fees that are involved in that.
So, you know, that to me seems like a more future-proof business model.
And I'm not super excited about the ads business model.
But yeah, that's a big kind of strategic topic on its own.
Kind of thinking forward, how do you see the rise of privacy?
I mean, a lot of your business model kind of hinges on the fact that things on chain are inherently transparent, right?
So with the rise of privacy preserving technologies on chain, how do you think that's going to change?
Yeah, I mean, I think of this as you can't have both at the same time to the maximum extent.
And so it becomes a trade-off, as with many things in technology.
you can't have full privacy
and you can't have full transparency
at the same time.
And so
our product makes the most sense
obviously when there's room to have transparency.
And so I think
the reality is that many people
value the transparency of blockchains
because it gives them a sense of comfort.
If you know that,
hey, the funds that are sitting in Aave,
I can actually see
all of the transactions
that have ever happened with Aave
and I can see all the funds
sitting in the smart contract
and so on and so forth. That gives people
a sense of comfort. They might not
actually do it, but the fact that
they know they can do it
gives them
sort of more trust. And if
you contrast that to say a
bank or an
FTX, then
you kind of quickly realize that
the lack of transparency can become
an issue.
So I think our product naturally works best when you have chains that are public and transparent.
It seems obvious to me as like a consumer that blockchains in their current form don't really work really well for payments, for example,
and things you might want to do in your daily life where you do want to have more privacy.
And I think it makes sense that you'll probably get
some world where either protocols
or even chains or L2s
have full privacy
but maybe there are some guardrails
or like some rules around it
so I'm not saying this is what I want
but I think like one way you might imagine this
is what if you had an L2
that basically had privacy
somehow, but you could not make transactions over some certain amount, you know, size, right?
Again, I'm not saying this is necessarily the world I want, but I could see that being something
that regulators might be more comfortable with than one where there's like no limit and, you know,
Lazarus from North Korea can, you know, potentially transact hundreds of millions of dollars in
volumes.
So I think you can look at it from sort of an ethical slash moral perspective.
And then you can also look at it just from a sort of pragmatic perspective of, like, what are regulators actually going to allow.
And then finally you can look at it just from a tradeoff perspective.
Like if you interact with something that has full privacy, what are you giving up in terms of transparency?
And then there are, like, interesting solutions around zero-knowledge proofs and so on, which in some cases can give you sort of the trust you want and, like, some form of transparency without revealing everything that's going on.
But I think it's a really interesting space.
I don't think you will ever get to a point where everything you do on chain is totally private.
And I think that also defeats the object to a certain extent, right?
So I mean, what ideally you kind of want is kind of transparency for the man and kind of like privacy for the little guy, right?
So kind of you want transparency to kind of hold power accountable.
But you don't.
So you want to know what your government spends its money on.
You don't need to know what your neighbor spends his money on.
And the crazy thing is it's like the inverse in, you know,
the world in many countries, it's the inverse, right?
Where, like, governments can see everything you do in practice.
Like, they could just reach out to a bank, get all your data or whatnot.
Or they could sort of have a, you know, some sort of back channel into your Web 2 products,
you know, whether that's Google or Twitter or whatever.
But then, you know, there's really no transparency on like how they're, like, where did all
that money that was spent on initiative X
by the government go, and so on
and so forth. Yeah, totally.
So I think you do want transparency
for
the people you elect
for sure, right?
That's, in a way, it's kind of crazy
that you don't have that in like every democracy,
literally down to every transaction
that they make with taxpayers' money.
Right? So, so exactly.
Like that, you know, could there be like a Nansen
for all government spending? Yeah, that would be amazing.
I would love that.
I would absolutely love that.
So you alluded to this in the very beginning.
So you label well-known people on chain.
So obviously there's ethical considerations kind of that come with it.
So say I'm dope one and I feel like I don't, I mean, you have rightly labelled me,
but I don't want this to be known on chain.
Can I send you an email and you will delete the label, or what's your policy?
Yeah, you can. I mean, that's the short answer: you can. But there's always a bit of nuance.
It might be helpful just to explain like how you get there in the first place like why someone an individual might get their name on an address.
And that's typically because there's information in the public domain that we can point to.
So for example, someone says on a governance forum, hey, this is my address.
I'm voting on, you know, initiative X or proposal X.
And they are basically declaring that they own this address, right?
And of course, there are caveats to that.
Like someone could just be pretending to be them and so on and so forth.
But if that is credible, we would label that and then you could point to that in your evidence, right?
If they then decide that, actually, I don't want that to be labeled, I want it to be deleted,
yeah, then we will do that.
But at least that's the explanation of like how the information ended up in our database
in the first place.
Right.
So it's not like we go around trying to sniff out, you know, normal people's sort of individual
names and, like, label addresses with them.
You don't look over people's shoulders when they do transactions at EthCC.
No.
Absolutely not. And in fact, you know, if we wanted to do stuff like that, I mean, maybe not
that exact thing, but if we wanted to sort of go on Twitter and hunt down every time someone
is a little bit silly in declaring something, for example responding to tweets about
"post your address and you'll get an airdrop", or, you know, "here are my new NFTs that I bought", which
again uniquely,
you know, basically doxes your wallet,
if you wanted to do that kind of stuff
systematically like we could but we just
don't think that's the right thing
to do firstly and secondly
I don't think it's like newsworthy
in the sense like people
what we do can be seen as a form of
journalism right so
then you have to kind of ask yourself
like okay if Vitalik has
a wallet that has like you know
a billion dollars in it yeah that's newsworthy
people should probably know about that if the
founder of a project, you know, has a lot of money in that token. That's newsworthy
people should know about that. If some person has 200 bucks, you know, and bought some
NFT on base and then they told someone about that on Twitter, we don't systematically track down
that information and put it into our system. We could in theory, because it's public information,
but we don't really do it. Sure, absolutely. So tell us about what's coming for Nansen. What kind of,
So you already talked about the hot contracts update.
What else do you have in store?
Yeah.
So the hot contracts update is kind of a smaller example of how the labels are just going to get a lot better
because we are investing a lot in AI-driven attribution.
And that's going to happen, I think, faster than we initially anticipated, actually.
The second thing that I also mentioned is portfolio is going to get integrated.
into Nansen 2.
And this is actually kind of a big deal
because it allows us to personalize the product even more.
And I think, you know,
the ambition there is to be the preferred portfolio tracker
of any on-chain investor.
And so we are very much on-chain oriented, right?
So we believe that if you have the best coverage
of chains and assets and protocols,
and we can give you signal on what you might be interested in
investing in, that's a really potent combination. And so bringing portfolio into Nansen 2,
I think is going to be a game changer, frankly, and it's going to happen in the next few months.
The third thing is we are integrating lots of new chains. The one chain that we have been
asked about the most is Solana. And so we're going to launch that hopefully within two months.
And I think that's going to be pretty big. And we have some exciting
ideas to try out with Solana
because it's kind of its own
little ecosystem and
pocket, so we're going to try out some more experimental
ideas with Solana.
And in addition to that, I think
we're going to strengthen some of the things that we're already
known for, like smart money tracking is getting
better. Now we have a new
squad internally that's just focused on
taking that to the next level. So things
like really good P&L tracking
of traders, finding
wallets that you might want to monitor because they're really good at trading. That's something
that we're leveling up and improving, making the overall product easier to use because it can be
a bit overwhelming for people, but it's, I think, constantly getting simpler to use in a way.
Like we're trying to strip away stuff, like eliminating stuff that's not absolutely necessary
in the user experience. So those are some of the things. But I think, like, you know, overall
We've put Nansen 1 behind us.
In fact, we're switching off Nansen 1 literally tomorrow.
So we've kind of firmly made the transition to Nansen 2.
And that means we can travel lightly.
We can put behind us a lot of the tech debt that we had from the first version of the product.
And then we can really just focus on the innovations for the new version of the product.
So, yeah, actually, literally in the next three months,
there's a lot to look forward to if you're a Nansen user.
Wonderful.
So where can we send listeners to kind of check out Nansen?
Yeah, you can go to nansen.ai.
That's the best place to start.
And you can also follow Nansen underscore AI on Twitter.
Perfect.
Thank you so much for coming on, Alex.
It's been a pleasure.
Thanks for having me.
Thank you for joining us on this week's episode.
We release new episodes every week.
You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud,
or wherever you listen to podcasts.
And if you have a Google Home
or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast.
Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen.
And while you're there, be sure to sign up for the newsletter, so you get new episodes
in your inbox as they're released.
If you want to interact with us, guests or other podcast listeners, you can follow us on Twitter.
And please leave us a review on iTunes.
It helps people find the show, and we're always happy to read them.
So thanks so much, and we look forward to being back next week.
