Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Nansen: AI & Blockchain Analytics - Alex Svanevik

Episode Date: August 9, 2024

The saying goes that knowledge is power, and this perfectly applies to blockchains due to their innate transparency and immutability. However, raw data could seem, at first, unusable. This is where an...alytics companies, such as Nansen, play a major role in demystifying blockchain data by labelling it. In turn, curated data holds value as it can give an edge to traders, by tracking ‘smart money’ wallets. In the age of AI, most of the heavy lifting of data analysis is performed by LLMs, but human input is equally valuable for discerning nuances and fine tuning the process.

Topics covered in this episode:
Alex’s background, from AI to crypto
Nansen’s values
Querying blockchain data
Supported chains
Ensuring data accuracy
Generating wallet labels
The role of LLMs and AI
Data privacy and monetisation
On-chain transparency, privacy and ethics
Roadmap and further enhancements

Episode links:
Alex Svanevik on Twitter
Nansen on Twitter

Sponsors:
Gnosis: Gnosis builds decentralized infrastructure for the Ethereum ecosystem, since 2015. This year marks the launch of Gnosis Pay, the world's first Decentralized Payment Network. Get started today at gnosis.io
Chorus One: Chorus One is one of the largest node operators worldwide, supporting more than 100,000 delegators, across 45 networks. The recently launched OPUS allows staking up to 8,000 ETH in a single transaction. Enjoy the highest yields and institutional grade security at chorus.one

This episode is hosted by Friederike Ernst.

P.S.: Our friends from @nansen_ai have offered us 10 discount codes for 10% off on their professional and pioneer plans! If you are interested in unlocking Nansen's true power, DM us on Twitter (X - @epicenterbtc) and we'll hook you up with a code (FCFS).

Transcript
Starting point is 00:00:00 So we sort of industrialized the whole EVM chain ecosystem, and we can onboard EVM chains actually quite fast now. We have 3 to 400 million addresses labeled. For every one of those labels, we have evidence and documentation. And of course, a lot of that documentation is algorithmically generated. It can get out of hand really quickly, and you can get like a negative spiral if you start getting the wrong labels. You know, living up to our name, we've started using more AI for the labeling. The challenge there is like you might end up with probabilistic
Starting point is 00:00:35 labels and like I was saying before you want to make sure that the precision is as high as it can possibly be. The machine is going to be doing like 99.95% of the work in terms of just the quantity of addresses. But the 0.05% that the human does can be very valuable and it can also be used by the machine to label a lot of the stuff. This episode is proudly brought to you by Gnosis, a visionary collective committed to fostering and expanding applications for a decentralized future. Gnosis is at the forefront of innovation with Gnosis Pay, Circles, and Metri, revolutionizing open banking and creating a superior form of money.
Starting point is 00:01:29 With Hashi and Gnosis VPN, they are building a more resilient and privacy-focused open internet. Are you seeking a robust L1 to launch your project? Well, look no further than the Gnosis Chain. Enjoy the same development environment as Ethereum, but with significantly lower transaction fees. And with a robust network of over 200,000 validators, Gnosis Chain stands as a credibly neutral and resilient foundation for your application. Governance of Gnosis is driven by GnosisDAO, where everyone has a voice in shaping the project's future. Join the Gnosis community today by participating in the GnosisDAO governance forum.
Starting point is 00:02:05 You can deploy your project on the EVM-compatible and highly decentralized Gnosis Chain, or help secure the network by running a validator with just a single GNO and low-cost hardware. Embark on your journey towards decentralization today at gnosis.io. Chorus One is one of the biggest node operators globally and helps you stake your tokens on 45 plus networks like Ethereum, Cosmos, Celestia, and dYdX. More than 100,000 delegators stake with Chorus One, including institutions like BitGo and Ledger. Staking with Chorus One not only gets you the highest yields, but also the most robust security practices and infrastructure
Starting point is 00:02:46 that are usually exclusive for institutions. You can stake directly to Chorus One's public node from your wallet, set up a white-label node, or use the recently launched product, OPUS, to stake up to 8,000 ETH in a single transaction. You can even offer high-yield staking to your own customers using their API. Your assets always remain in your custody so you can have complete peace of mind. Start staking today at chorus.one. Welcome to Epicenter, the show which talks about the technologies, projects and people driving decentralization and the blockchain revolution.
Starting point is 00:03:22 I'm Friederike Ernst. And today I'm speaking with Alex Svanevik, who is the co-founder and CEO of Nansen, which is a blockchain analytics company. We'll discuss them in a lot of detail in just a bit. It's a pleasure to have you on, Alex. Before we get started with Nansen properly, tell us about yourself. What's your background and
Starting point is 00:03:45 how did you end up where you are now? Yeah, great to be here. So, depends how far back we want to go, I guess. My background initially is in AI; that's my degree from university
Starting point is 00:04:00 in Edinburgh, UK. So I was in AI before AI was cool, is what I like to say. So I spent a few years working with data science and machine learning, also a few years in management consulting. And in 2017, I discovered Ethereum during lunch at work. Some engineers were very excited about it at the company where I was working. And then I fell down the rabbit hole very quickly because I think several people started talking about Ethereum at the same time. And so it sort of piqued my interest.
Starting point is 00:04:35 This was the summer of 2017. And so a few months later, I decided to leave my job as a data science manager at the time. And I moved to Hong Kong to join a startup in the crypto space. So that was how I basically got into crypto. I would say I was definitely not one of the earliest in crypto. I felt that I was very late at the time joining crypto. Now I feel like I'm kind of a veteran, almost, which I guess speaks to how young the industry is.
Starting point is 00:05:13 But basically, after working a few years in crypto, both with a startup that unfortunately ran out of money pretty quickly, I spent some time with the decentralized exchange protocol 0x, helping them with analytics and understanding slippage across DEXs and things like that in 2019. We worked a little bit with Aragon as well, the DAO platform, mostly as a consultant helping them out with data and analytics. And then I co-founded Nansen, our company, late 2019, or at least that's when we started working on it.
Starting point is 00:05:49 And then we went to market in April 2020, about one month after COVID started, when everyone was gambling on governance tokens and yield farming during COVID. So that's kind of how I ended up where I am now. So with your background, it kind of makes sense that you would found a blockchain analytics company. You also have a background in AI and I assume general IT stuff. So was it kind of the fact that you felt like you had already done this before? Or did you kind of, was there a larger mission to kind of ordering this mess that kind of actually is kind of like if you run an archive node, you'll learn nothing unless you kind of do proper analytics on it, right?
Starting point is 00:06:36 So what was the main motivation to kind of go into this, you know, full time? I think it's a good question. I think there are a few different things happening at once. I will maybe say that firstly from like a career perspective, I thought of it as like a Venn diagram of two competencies, one being data and the other one being blockchain. I figured that if you are very good at both of those, you probably end up in an intersection that's pretty small.
Starting point is 00:07:06 So from a career perspective, I figured that it's a good idea to learn about these two things because not that many people in the world are going to know about those two things. That was probably from a career perspective. And then I also very rapidly became like sort of enamored with the crypto industry because when I was interacting with people on Twitter and Telegram and things like that, I found that people were very open-minded and they were very inviting in a way that was almost a bit surprising to me. I kind of thought of crypto as being a little bit kind of, you know, almost like antagonistic or adversarial because it's very technical
Starting point is 00:07:43 and all that. But I found that people were very open-minded and sort of intellectually quite interesting. So I think that also appealed to me from like almost like a culture perspective. And then there was another thing that was I think more specific to data, maybe even more specific to Europe, you know, which is where I was based at the time, which is GDPR, like privacy regulations were kicking into full force. And I think I kind of became a bit frustrated as a data scientist working with data and there were so many regulations you had to navigate that it became really hard to do your job, frankly. Like it was because everything, you had to, you know, check all the boxes this and all this stuff. And so I was, I kind of wanted to work with data that was not like customer
Starting point is 00:08:33 or user data. I wanted to just work with data sets where you don't need to like fill out a form to be able to use them. And so blockchains were interesting because so much of it is from public blockchains, right? The information is just there and the data is there. And so you don't need to ask permission for someone to dive into like all this exciting activity that's happening on chain. So those were some of my personal reasons. And then I think if I look at it from like the market opportunity side, you know, you didn't have great analytics tools or products at that time. I think there was really only one game in town for on-chain analytics, which was Chainalysis.
Starting point is 00:09:17 I mean, you had other products, like I don't want to belittle them. But like most people knew about Chainalysis for AML purposes. But I felt that, you know, people who are just in crypto, who are trading, who are investing, who are using blockchains, they're not necessarily law enforcement or tax authorities. They should have great analytics products too. And so I felt like there was an opportunity there to just provide them with a better product so that they could understand what's happening on chain and they could make
Starting point is 00:09:53 better decisions investing. They could, you know, make better decisions building products and protocols and building blockchains or L2s now these days, right? So, so yeah, there's kind of, there were many different factors that sort of led me, led me here. I can chime in here and say that as a blockchain founder
Starting point is 00:10:11 myself, I have used your tool a lot, particularly for one use case, for which it just beats all the alternatives out there. And that is kind of, you're very good at labeling wallets and kind of saying who you think they belong to, what kind of person or kind of it is. And I'm super interested in how decentralized our token ownership is, right? So kind of I would go to kind of like the Gnosis token and kind of just list it by kind of like which address holds how many. And I mean, in the beginning
Starting point is 00:10:53 I knew who a lot of the people at the top of the list were, right? I mean, that's kind of how projects start out. But kind of, how far can I go down the list until I find the first person who I honestly don't know who that is? To me, that's always been very comforting to know that kind of like there's lots of people out there who are involved in the project in some way, and I have absolutely no idea who they are.
Starting point is 00:11:24 And I mean, that's only increased over the years. So yeah, but I, yeah, so this is how I, this is how I first learned about Nansen, I think when it came out in 2020 or so. Yeah, you bring up that, that's like a very common use case, right? Especially among builders who want to just understand their investor base or like who's holding the token. And I think you're right that this is also one of the opportunities. that we saw that, you know, in a way, it wasn't that interesting to just get the blockchain data,
Starting point is 00:11:57 because in theory, anyone could do that. The hard part is to figure out, like, what's the entity that's associated with the address? And we saw an opportunity there to sort of help people get more transparency on that front. And, you know, that is one of the core things that we do very well, right? And we have at this point like 300, maybe even 400 million addresses labeled. I can speak to that as well because this is also one of the use cases I use Nansen for. I check whether I have doxxed myself. So obviously kind of I have assets on different addresses and I try to kind of keep them apart. So kind of if they're not labeled on Nansen, I think I'm probably okay. Yes. Yeah, that is true. I mean, maybe it,
Starting point is 00:12:49 you know, we should also just call out that, you know, if individuals have their name labeled on Nansen, they can contact us, and if you want to remove it, we will remove it. There's a bit of nuance to that because sometimes people inadvertently dox themselves on chain. So they might like buy a .eth name or something like an ENS. And we can't do anything about that because that's immutable and like etched into the history of the blockchain. But yeah. So, you know, this is like kind of a blessing and a curse of blockchains, that they are transparent, they're immutable, et cetera. So something that I've always wanted to ask you, is Nansen actually named after Fridtjof Nansen?
Starting point is 00:13:33 Yes, it is. Okay, fantastic. Yeah, that's right. Maybe tell us about Nansen and kind of why you settled on that name. Yeah, I mean, I think a lot about culture in the context of a company or a project and I felt that Nansen is kind of an embodiment of the values we have in our company, and so values like courage, curiosity, you know. Nansen, for those who are not aware, is most famous for actually having been a polar explorer. He crossed
Starting point is 00:14:12 Greenland on skis as the first person. He went as far north on the globe as anyone had ever done, and with the same ship that he used, another polar explorer reached the South Pole first of any human being. But he was also a scientist, and interestingly it sounds like you know him for his work on creating passports for refugees. Yes. Yes. Which he did for almost half a million people. For stateless people, right? Yes, for stateless refugees. I think mostly around Armenia. So he was kind of a renaissance person, an explorer, scientist, a humanitarian, even played a big role in the creation of the modern kingdom of Norway. He convinced the prince of Denmark to become the king of Norway so that they could become independent from Sweden in 1905. But yeah, so he's an embodiment of a lot of the values that we live by at Nansen.
Starting point is 00:15:15 curiosity, courage, transparency, speed, which is important when you're doing an expedition. You want to make sure you get there in time before you starve or run out of what you need. So, yeah, so he's kind of an icon. And I think, like, I also kind of like the idea that, a bit similar to Tesla, right, where you've named the company after someone who's not the founder,
Starting point is 00:15:41 but it's like an inspiring person. And then it's two syllables. It's easy to pronounce in any language, which is nice. So yeah, those are some of the reasons why we named the company that. And was the URL nansen.ai from the get-go? Yes, it was. But I will say that the .ai was aspirational in the beginning, in the sense that, you know, so I said that in the beginning.
Starting point is 00:16:11 My degree is in AI, and I always knew that we would be making use of AI for what we do, things like labeling addresses. Now we use AI for estimating the price of an NFT; that's fully machine learning powered and is part of our product, which is actually kind of non-trivial, right? If you have a specific NFT, how much is this one valued based on its traits and transaction history, etc. We use AI to weed out spam tokens, which there's a lot of, especially on chains that have lower gas fees and things like that. We use AI for personalizing signals in the product. So I think, you know, Nansen, you know, we did have some foresight in that we knew that AI was going to become, you know, a big part of the world. It happened admittedly a bit sort of faster or more suddenly than I personally expected, but we're leaning into AI even more now than we were originally.
Starting point is 00:17:11 And so it's not like there's one AI angle with Nansen. It's more like it sort of powers the whole product in many different ways. So yeah, that's always been the ambition to make sure that we are an AI trailblazer and we're making use of AI in great ways in the product and in the organization. Cool. Let's maybe dive into the core of your product. So kind of like you started out with, and that's still very much your core offering,
Starting point is 00:17:42 is kind of analytics for on-chain data. Most people who don't work on actual blocks themselves, they don't appreciate how much engineering effort actually goes into kind of creating like a state and a database and so on. Can you maybe talk us through that? Kind of, say I have an Ethereum archive node, it's a terabyte or whatever kind of the current size, depends on kind of what you're running, and how do I get from there to kind of something that I can actually query? Yeah, so the way we do it mostly is we make use of
Starting point is 00:18:26 you know, JSON-RPC endpoints from the nodes, and then we pull out specific data from the nodes. And so you pull out the blocks and they have transactions and, if you want to go one level deeper, you parse out the events from smart contract interactions. Like if you have the ABI of a smart contract, then you would use that to be able to parse out the data that's included in the transactions. So that's kind of the, I mean, that's sort of like at a very high level,
Starting point is 00:19:07 you know, how you do it. And, you know, we actually started out with a pretty different tech stack and architecture, and we've changed that recently. So we used to use, so one of my co-founders is the creator of an open source project called Ethereum ETL, which basically does this in an open source manner for Ethereum. And so you can actually like
Starting point is 00:19:31 index all this data, you know, if you have an endpoint or you run your own node and you have that endpoint, you can index all this into like CSV files or into, you know, a database. And so he built that and that was one of the building blocks that we used to get started. Over the years, though, we have basically moved over to a different paradigm of loading the data. Initially it was, you know, Ethereum ETL, so extract transform load, which many data engineers and so on will be familiar with.
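As a rough illustration of the extraction step described here (not Nansen's or Ethereum ETL's actual code), the sketch below builds a standard `eth_getLogs` JSON-RPC request body and decodes a raw ERC-20 `Transfer` log into a flat row. The function names and the sample log are assumptions made up for this example; only the JSON-RPC method name and the `Transfer` event signature hash are standard.

```python
import json

# keccak256("Transfer(address,address,uint256)"): topic0 of the standard ERC-20 Transfer event
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def get_logs_request(from_block: int, to_block: int, request_id: int = 1) -> str:
    """Build the JSON-RPC body for an eth_getLogs call filtered to Transfer events."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
            "topics": [TRANSFER_TOPIC],
        }],
    })


def topic_to_address(topic: str) -> str:
    """Indexed address arguments are left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_transfer(log: dict) -> dict:
    """Turn one raw log entry (as returned by a node) into a flat, queryable row."""
    topics = log["topics"]
    # ERC-20 Transfer has exactly three topics: signature, from, to.
    if topics[0] != TRANSFER_TOPIC or len(topics) != 3:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        "token": log["address"],
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        "value": int(log["data"], 16),
        "block_number": int(log["blockNumber"], 16),
        "tx_hash": log["transactionHash"],
    }


# Hypothetical log entry, shaped like an eth_getLogs result item.
sample_log = {
    "address": "0x" + "aa" * 20,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(1000, "064x"),
    "blockNumber": hex(19_000_000),
    "transactionHash": "0x" + "ab" * 32,
}
row = decode_transfer(sample_log)
```

Rows like `row` are what would end up in the CSV files or database tables mentioned above; a real pipeline would decode arbitrary events from a contract's ABI rather than hard-coding one event.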
Starting point is 00:20:08 Now we do basically ELT, extract load transform. So one of the reasons we do it this way now is because we integrate with lots of different chains, and different chains might have slightly different schemas. So the idea is if you first can just extract the data from the JSON-RPC endpoint and you can just load the raw data in, then you can transform it later.
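A minimal sketch of the extract-load-transform idea, assuming an in-memory SQLite table as a stand-in for the warehouse: raw node responses are stored untouched, and per-chain transforms harmonize them into one schema afterwards. The chain name `examplechain`, its field names, and all function names are hypothetical; the Ethereum fields (`number`, `timestamp`, `transactions`) follow the standard block object returned by `eth_getBlockByNumber`.

```python
import json
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create a raw landing table: one row per block, payload kept as-is."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_blocks (chain TEXT, payload TEXT)")
    return conn


def load_raw(conn: sqlite3.Connection, chain: str, payload: dict) -> None:
    """Load step: store the node's response untouched, tagged with its chain."""
    conn.execute(
        "INSERT INTO raw_blocks (chain, payload) VALUES (?, ?)",
        (chain, json.dumps(payload)),
    )


# Transform step, applied later: one small function per chain-specific schema.
TRANSFORMS = {
    "ethereum": lambda p: {
        "number": int(p["number"], 16),
        "timestamp": int(p["timestamp"], 16),
        "tx_count": len(p["transactions"]),
    },
    # Hypothetical non-EVM chain with different field names and integer values.
    "examplechain": lambda p: {
        "number": p["height"],
        "timestamp": p["time"],
        "tx_count": len(p["txs"]),
    },
}


def harmonized_blocks(conn: sqlite3.Connection) -> list:
    """Read raw payloads back and map each one onto the shared schema."""
    rows = conn.execute(
        "SELECT chain, payload FROM raw_blocks ORDER BY rowid"
    ).fetchall()
    out = []
    for chain, payload in rows:
        row = TRANSFORMS[chain](json.loads(payload))
        row["chain"] = chain
        out.append(row)
    return out
```

The design point is that supporting a new chain only requires a new entry in `TRANSFORMS`; the extraction and loading code stays the same, and a transform can be corrected and re-run without re-fetching anything from the node.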
Starting point is 00:20:36 So it's sort of you delay the transformation and the schema harmonization of all the data to a later point. And then we've also changed the database that we use. We used to be based on BigQuery, which is a Google Cloud, sort of proprietary analytical data warehouse technology. Now we use something called ClickHouse, which is also an analytical database, but it's more performant for the type of use that we have. So in the past, we might have a dashboard that, I don't know when you were mostly using Nansen,
Starting point is 00:21:09 but Nansen version 1 was actually pretty slow. And some of the dashboards would load in like 30 seconds, which is kind of hilarious if we look back at it now. But with ClickHouse, you know, the same dashboard might load in like, you know, 300 milliseconds or something like that. So we've actually kind of evolved our tech stack and replaced the whole thing, both the data
Starting point is 00:21:32 pipelines to sort of extraction of the data and also how we store the data and how we could query it, etc. So which chains do you currently support? So we actually have kind of a suite of different products. So, for example,
Starting point is 00:21:48 if you look at Nansen Portfolio, which is our portfolio tracker, we support more than 50 chains. And so it's kind of all of the usual suspects, you know, Bitcoin, Ethereum, even Solana, and then a long tail of EVM chains. And for Nansen Query, which is kind of the enterprise product where you can write SQL queries and people make dashboards, we support about, I think it's 20 plus chains, so a little bit fewer,
Starting point is 00:22:19 but still more than 20. And then in Nansen 2, which is the product that actually most people know, which is kind of the product that you've used and where you see your holders for the token, Token God Mode, Profiler. We support, I think it's now just over 12 chains, but we're adding
Starting point is 00:22:37 a lot of chains every quarter actually to it. And so, yeah, so it sort of depends a bit on which products you're actually using, but the ambition is to be adding like roughly one chain per month or more going forward, because
Starting point is 00:22:53 you do have, like, the world is very multi-chain at the moment. And so you want to make sure that you're supporting all the chains that people care about and that they use. And, you know, we've sort of invested a lot in our tech to make it both faster and frankly cheaper for us to integrate new chains. So we've sort of industrialized the whole EVM chain ecosystem. And we can onboard EVM chains actually quite fast now.
Starting point is 00:23:25 Yeah, so that's kind of how we look at it. Interestingly, there are a lot of non-EVM chains that want to integrate with us, which on the one hand is great because you want to support them. But on the other hand, it's also quite technically challenging because you have to sort of build a bespoke solution for every chain, almost. But the EVM chain use case, we've sort of industrialized in-house. I am sure you kind of have protocols for that, but how do you ensure kind of data,
Starting point is 00:23:55 and the reliability of the analysis. Yeah, you can sort of talk about data accuracy or data quality in a few different ways in our product. So the most basic data accuracy is about the on-chain data itself, right? So you want to make sure that you're not missing data, that, you know, you have tests where you see, you know, the number of transactions, is it in line today with what you saw yesterday, and that kind of stuff. So you can have basic sort of almost like unit tests, data quality checks on that. I think the harder part, though, is on the attribution, the labeling of addresses, right? And that's where there's potentially room for error. And so, you know, our philosophy is that we would rather not have a label than to have a wrong label.
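The basic checks on the on-chain data mentioned here (no missing data, today's transaction count in line with yesterday's) could look something like the sketch below. The function names and the 50% tolerance are assumptions chosen for illustration, not values from the episode.

```python
def missing_block_numbers(block_numbers: list) -> list:
    """Return block numbers absent from an ingested range (gaps in the data)."""
    present = set(block_numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def count_in_line(today: int, yesterday: int, tolerance: float = 0.5) -> bool:
    """True if today's transaction count is within `tolerance` (relative) of yesterday's."""
    if yesterday == 0:
        return today == 0
    return abs(today - yesterday) / yesterday <= tolerance
```

For example, `missing_block_numbers([100, 101, 103, 105])` reports blocks 102 and 104 as gaps, and `count_in_line(200_000, 1_000_000)` fails because an 80% drop from one day to the next more likely signals a broken pipeline than a real change in activity.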
Starting point is 00:24:42 And so that means we have very strict requirements on precision. And so as an example, you know, I mentioned we have 3 to 400 million addresses labeled. For every one of those labels, we have evidence and documentation. And of course, a lot of that documentation is algorithmically generated. But you can always look up, you know, if this address has this label, why does it have the label? So you always have the documentation for it. And I think this is something that we take pride in, that we actually take that stuff really seriously, because it can get out of hand really quickly and you can get like a negative spiral
Starting point is 00:25:23 if you start getting the wrong labels because typically what happens is if you're looking at a new address and you want to label it, you start looking at what are the labels of the addresses, the neighbors of that address. And so if you have a wrong label, it can propagate very quickly
Starting point is 00:25:40 and it goes out of control and of course, you know, you get more wrong labels. That's the first thing. But secondly, more importantly, it can impact the user experience if people see a wrong label and they lose trust in your product. So this is something we take very seriously, and of course you will always
Starting point is 00:26:00 have some errors, like it's just not possible to have literally 100% precision, but it's actually very rare that we have incorrect labels, and even if you do arguably have incorrect labels, very often there's a very logical explanation for it. So at some point I remember we were called out for having labeled an address Do Kwon and we were told that that was incorrect, but it turned out that it was basically Terra Labs or the company
Starting point is 00:26:31 related to it. So, you know, is that an error? Like maybe it is in a strict sense, but of course, you know, it's a very related entity and you might have a similar thing with like Su Zhu or Three Arrows and things like that,
Starting point is 00:26:47 but you know we take pride in having the best precision on the labelling that we do. And this is something that's very important to us. Which specific heuristics do you actually use to kind of generate the labels? I mean, and how do you come up with them? I'm sure you kind of add stuff all the time, right? Yeah, so it's a combination of man and machine, right?
Starting point is 00:27:13 So the heuristics would be, some of them are deterministic and quite simple, right? So think of you want to label every Uniswap pool. Then you can literally just look at the Uniswap factory. And like we were talking about earlier, you can look at the events that are emitted and the events contain all the information that deterministically say, here are all the Uniswap pools. This is like the easy case.
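The deterministic case can be sketched as follows, assuming the factory's `PoolCreated` events (whose fields in Uniswap V3 include `token0`, `token1`, `fee` and `pool`) have already been decoded into dictionaries. The label text, the evidence string and the `tx_hash` field name are assumptions for illustration; the evidence string mirrors the point made earlier that every label should carry documentation.

```python
def label_pools(pool_created_events: list) -> dict:
    """Map each pool address to a label plus the evidence that justifies it."""
    labels = {}
    for ev in pool_created_events:
        pool = ev["pool"].lower()
        labels[pool] = {
            "label": f"Uniswap V3 pool {ev['token0']}/{ev['token1']} (fee {ev['fee']})",
            "evidence": f"PoolCreated event emitted by the factory in tx {ev['tx_hash']}",
        }
    return labels


# Hypothetical decoded event, with placeholder addresses.
events = [{
    "token0": "0x" + "a1" * 20,
    "token1": "0x" + "b2" * 20,
    "fee": 3000,
    "pool": "0x" + "C3" * 20,
    "tx_hash": "0x" + "d4" * 32,
}]
pool_labels = label_pools(events)
```

Because the factory is the only contract that emits these events, every address produced this way is a pool by construction, which is why this kind of label needs no inference.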
Starting point is 00:27:40 And in theory, anyone who can read blockchain data and have a system for this could do this. Then there are other things that are more complex, like exchanges, centralized exchanges because there technically the information is not deterministic from just the on-chain data. You need to do some
Starting point is 00:27:57 inference and you need to understand like how these entities manage private keys and manage addresses. And so there you typically have kind of a baseline heuristic that is sort of somewhat universal for any exchange.
Starting point is 00:28:14 So you might say actually if you send funds to an address and the address automatically forwards it to what we call a main wallet, like a Binance main wallet, then you can be pretty sure that that wallet is a deposit wallet for the exchange, right? And so this is going to be, you know, correct in most cases, but you may have to tweak it and you need to curate the main wallets because those can update, right? Let's say HTX or, you know, Gate.io might get a new main wallet.
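A minimal sketch of this baseline heuristic, under simplifying assumptions: transfers are given as dictionaries with `from`, `to` and `value`, the curated main wallets are a mapping from address to exchange name, and an address counts as a deposit wallet if nearly all of its outgoing value goes to one exchange's main wallets. The 95% threshold, the function name and `ExampleExchange` are hypothetical, and a production heuristic would need more signals than this.

```python
from collections import defaultdict


def find_deposit_wallets(transfers: list, main_wallets: dict,
                         min_forward_ratio: float = 0.95) -> dict:
    """Label addresses that forward (almost) everything they send to a known main wallet."""
    sent_total = defaultdict(int)
    sent_to_main = defaultdict(lambda: defaultdict(int))
    for t in transfers:
        sent_total[t["from"]] += t["value"]
        if t["to"] in main_wallets:
            sent_to_main[t["from"]][main_wallets[t["to"]]] += t["value"]

    labels = {}
    for addr, by_exchange in sent_to_main.items():
        # Main wallets moving funds between themselves are not deposit wallets.
        if addr in main_wallets or sent_total[addr] == 0:
            continue
        exchange, amount = max(by_exchange.items(), key=lambda kv: kv[1])
        if amount / sent_total[addr] >= min_forward_ratio:
            labels[addr] = f"{exchange} deposit"
    return labels


# Hypothetical data: a user deposits 100, the deposit address sweeps it to the main wallet.
main_wallets = {"0xmain": "ExampleExchange"}
transfers = [
    {"from": "0xuser", "to": "0xdep", "value": 100},
    {"from": "0xdep", "to": "0xmain", "value": 100},
    {"from": "0xuser", "to": "0xother", "value": 50},
    {"from": "0xother", "to": "0xuser2", "value": 50},
]
deposit_labels = find_deposit_wallets(transfers, main_wallets)
```

The result depends entirely on `main_wallets` being current, which is the curation problem described here: if an exchange rotates to a new main wallet and the list is stale, its deposit wallets silently stop being recognized.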
Starting point is 00:28:43 You need to make sure that you're on top of that. And you need to have sort of alerting in-house if you see lots of funds, move because maybe they move to a new cold wallet or new hot wallet and so on and so forth. And so you can state these heuristics programmatically and
Starting point is 00:29:02 you can label a lot of addresses in this way. So it's kind of like an inventory of many different heuristics and sometimes the heuristics can build off of each other. But you know, living up to our name,
Starting point is 00:29:16 we've started using more AI for the labeling recently. But the challenge there is like you might end up with probabilistic labels and like I was saying before you want to make sure that the precision is as high as it can possibly be.
Starting point is 00:29:32 And so I kind of, but yeah, it comes back to the same point where you have this man and machine set up where the machine is going to be doing like 99.95% of the work in terms of the just the quantity of addresses. But the
Starting point is 00:29:49 0.05% that the human does can be very valuable and it can also be used by the machine to label all this stuff. And so, yeah, so it's interesting, right? Because the AI approach we've started making use of now, they kind of like almost sit
Starting point is 00:30:05 in between the sort of the man and machine, like the human and the machine. But we've seen anything, you can also, by the way, you can look at, I mean, depends like how far down the rabbit hole you want to go, but you can also look at the economics of it, like how much does it cost us to label an address, right? Like, if you, if you just
Starting point is 00:30:25 think of the human labor or even like the cloud cost of the heuristics, and then you start looking at like optimizing that and saying, actually, you know, the heuristics are very cheap, so you want to make sure that the heuristics can label as much as they can. And you want to be very selective of what you use human labor power for, because that can be like $10 per label, maybe, depending on, you know, many different factors. So, yeah, this is kind of an interesting optimization problem over time that you have to balance out different things. You want to make sure precision is very high.
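To make the trade-off concrete, here is a small sketch of the quantities involved: precision and recall in their standard definitions, plus a blended labeling cost. The $10 human cost per label and the 0.05% human share come from the conversation; the machine cost of $0.001 per label is purely an assumed placeholder.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of assigned labels that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Share of labelable addresses that actually received a correct label (coverage)."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def labeling_cost(n_total: int, n_human: int,
                  human_cost: float, machine_cost: float) -> float:
    """Total cost when n_human addresses are labeled by people and the rest by heuristics."""
    return n_human * human_cost + (n_total - n_human) * machine_cost


# 20 million addresses, 0.05% (10,000) labeled by humans at $10 each,
# the rest by heuristics at an assumed $0.001 each.
blended = labeling_cost(20_000_000, 10_000, human_cost=10.0, machine_cost=0.001)
# Same volume labeled entirely by humans, for comparison.
all_human = labeling_cost(20_000_000, 20_000_000, human_cost=10.0, machine_cost=0.001)
```

Under these assumptions the blended approach costs roughly $120,000 while the all-human approach costs $200 million, which is why human effort has to be reserved for the small share of addresses where it adds the most value.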
Starting point is 00:30:57 You want to make sure that also the recall or the coverage is very high. You want to label as many addresses as you can. You also want to make sure that you can do it in a timely manner so that you can label addresses very fast. And you want to make sure that the economics are permissible so you don't break the bank. if it costs us like $20 million to label
Starting point is 00:31:19 20 million addresses like yeah that's probably not going to work right so yeah so there are many interesting factors here and like this in a way this is kind of the most unique thing we do at the company right if you think about it
Starting point is 00:31:34 and it's exciting because it's one of the areas that probably can be you know enhanced the most with AI in my opinion yeah absolutely How fast do you label these addresses and big movements? I'm asking because obviously this, if you're a trader,
Starting point is 00:31:54 this can actually give you a lot of alpha, right? So if someone kind of moves funds from a known cold wallet to a hot wallet, chances are they're going to sell them, possibly on an exchange. So you might kind of want to front run them in the traditional sense, not in the blockchain sense. So how fast do you do this? And do people explicitly use it for this sort of use case? Yeah, so we have a feature called Hot Contracts.
Starting point is 00:32:26 And what Hot Contracts does is it looks at newly deployed smart contracts that have a lot of funds going into them. And Hot Contracts now, actually very soon, like probably in a matter of weeks, is going to be enhanced. with AI labeling. And so that means the idea is like, probably, I don't know if it will be minutes
Starting point is 00:32:53 because you kind of need to accumulate a bit of data on the address in terms of transactional patterns and stuff like that. But yeah, maybe minutes, at most hours, you'd let loose the army of AI labelers on these hot contracts. And because we will have tuned and quality assured the precision
Starting point is 00:33:18 you'll be able to get pretty descriptive labels of what these contracts actually are, right? It's interesting, right, because in a way, some of our users who are very sort of power users and advanced users, they sort of see the alpha in
Starting point is 00:33:36 us not having labeled an address, because they know that that's like a new address, and because it's not labeled yet, if they figure out what this address is, they might be sort of one step ahead of the game. And so if you look at the Hot Contracts table, ironically, a lot of the addresses are not labeled. But I think that's going to change literally like in a matter of weeks when we roll this out. And so I think some of the people who are using that feature, Hot Contracts, are probably going to see like almost like a night and day, you know,
Starting point is 00:34:08 change in that view. For other types of addresses, it depends, right? A fund, typically it takes a while to actually figure out what the fund is. So if they move like a VC fund or like a liquid venture fund or something, if they move funds from one address to another, that's maybe one of the more clear-cut cases. But if you have a totally new address that is like providing funds in a seed round or something like that, it can be quite tricky. You need to have multiple data points to figure it out.
Starting point is 00:34:41 And so in those cases you can't talk about like minutes or hours, that's like, you know, days or weeks or maybe months. So it really depends on like what kind of address or what kind of entity we're talking about. So for the newly deployed smart contracts, do you also speculate about kind of what it's going to do? I mean, is there a label that says we think this is a newly launched perp exchange or something? Yeah.
Starting point is 00:35:18 So the idea is like with labels, right? There are a few different ways to think about labels. One is just give it a name. So like, you know, Gnosis chain,
Starting point is 00:35:31 something, you know, Gnosis Chain bridge or something like that, right? But then there's a category too, which I guess is what you're getting at. So you have a category description. So this is like a staking
Starting point is 00:35:43 contract. It's a bridge. It's a DEX. It's a DeFi pool. It's a yield farm. There's sort of a taxonomy of different things it could be. And the idea is what we're aiming to do is both. Give it a name, a specific name,
Starting point is 00:36:00 and also give it a category. And in fact, the category, also, it doesn't always need to fall into one category. So you might want to give it like multiple different labels from like multiple different indicators, right? So this is both a staking contract and it's a token. Like, think of staked ETH with Lido, for example. That kind of doesn't neatly fall into one category.
Starting point is 00:36:22 So the idea is to do both, like give it a name and give it a category. So I've not tried this yet. So if you were to kind of put a newly deployed smart contract or any smart contract into ChatGPT or any of its competitors, will it be able to tell you what it does? No. If it were that easy, then we would have just done that. We have tried. No, but you have to, you don't want to, I guess, like, give away too much of the secret sauce, but you do have to use ChatGPT. It's a good idea to use ChatGPT, but you have to sort of guide it with the right prompts and make it use the right sources for it. Okay, so kind of
Starting point is 00:37:12 you want ChatGPT to kind of figure out what's the business logic behind this smart contract? Yeah, kind of. And you have to sort of chain it, like do multiple steps, right? So it's like, yeah, I don't want to give away too much. But the idea is like roughly you want to try to make it understand. You want to make sure that it has all the information, A, and then B, that it can synthesize all of the information, and then, you know, put it into like a meaningful category or give it a meaningful name.
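The general shape of what is described here (give the model all the information, then chain several prompts to arrive at a name plus one or more categories) might look roughly like the sketch below. This is only an illustration under stated assumptions: the guest explicitly does not reveal the real prompts or pipeline, so `label_contract`, the prompt wording, and the `ask` callable (a stand-in for any LLM client) are hypothetical. The category list is taken from the examples mentioned in the conversation.

```python
from typing import Callable

# Categories mentioned in the conversation; a real taxonomy would be larger.
CATEGORIES = ["staking", "bridge", "dex", "defi pool", "yield farm", "token"]


def label_contract(context: dict[str, str], ask: Callable[[str], str]) -> dict:
    """Hypothetical three-step prompt chain; `ask` sends a prompt to some LLM."""
    # Step A: make sure the model sees all the available information.
    facts = "\n".join(f"{key}: {value}" for key, value in sorted(context.items()))

    # Step B1: synthesize the information into a plain summary.
    summary = ask(f"Summarize what this contract does.\n{facts}")

    # Step B2: map the summary onto the taxonomy (several categories allowed).
    raw = ask(f"Pick every fitting category from {CATEGORIES}, "
              f"comma separated.\nSummary: {summary}")
    picked = [c.strip().lower() for c in raw.split(",")]
    categories = [c for c in picked if c in CATEGORIES]  # drop anything off-taxonomy

    # Step B3: ask for a specific name.
    name = ask(f"Propose a short, specific name.\nSummary: {summary}").strip()
    return {"name": name, "categories": categories, "summary": summary}


def fake_ask(prompt: str) -> str:
    """Canned answers so the sketch runs without any API access."""
    if prompt.startswith("Summarize"):
        return "Users deposit ETH and receive a liquid staking token."
    if prompt.startswith("Pick"):
        return "Staking, token, casino"
    return " Lido stETH "


result = label_contract({"chain": "ethereum", "verified_source": "yes"}, fake_ask)
print(result["name"], result["categories"])
```

Filtering the model's answer against a fixed taxonomy is one simple way to keep probabilistic output inside known categories; here the made-up "casino" answer is discarded while "staking" and "token" are both kept, matching the staked ETH example of a contract that falls into more than one category.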
Starting point is 00:37:44 And then, you know, so you can think of this as like, you have LLMs and then, you know, you could fine-tune LLMs, but actually in practice you end up making better use of the context than you do by fine-tuning the LLM. And then you also do it iteratively. So you kind of ask it to solve multiple different problems iteratively or like in a sequence. And then at the end, you kind of get something that is useful. But it's more, so you could, you know, the very sort of short form of putting it is, it's a form of prompt engineering, but it's actually like
Starting point is 00:38:20 pretty involved prompt engineering. Yeah, I can imagine. Sorry, just one more thing on that, right? Because it's not enough. Like, ChatGPT doesn't have our existing 300 million address labels, right? And so that's where you get kind of an edge too, because our own version of this can also tap into the existing labels we have. And because of that compounding effect I said earlier, where like if you know the neighbors, it can help you figure out what the label of this one is, you get this kind of sort of a moat that's built around
Starting point is 00:38:52 kind of being able to label stuff with high precision and very fast. Yeah, absolutely. You guys also use AI on the other side. So kind of if I may use it, I search for stuff. You have smart search and similar things. What does it allow me to do and how does it work? Yeah, so maybe on this end, probably the best example is signals, which is a feed, it almost looks like a Twitter feed or something,
Starting point is 00:39:20 and you have sort of these cards, and each card is a signal that we've observed on chain. And so this could be, you know, PEPE token has, you know, this amount of million dollars going into centralized exchanges. That's 20 times more than an average day. That's an example of a signal. And these signals are personalized based on what you have done in our platform. So if you have saved certain tokens to your watch list, if you have maybe added certain addresses to your watch list or NFTs, and then soon, this is something we're rolling out in the next few months, we're bringing together our portfolio tracker with the analytics product.
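To make the signal example concrete (exchange inflows that are many times an average day, optionally filtered by a watch list), here is a minimal sketch. It is not how Nansen computes signals; the function name, the data layout, and the 20x threshold used as a default are assumptions chosen to mirror the example in the conversation.

```python
from statistics import mean


def inflow_signals(history, today, watchlist=None, min_ratio=20.0):
    """Hypothetical detector: flag tokens whose inflow today is >= min_ratio x their average day.

    history:   token -> list of past daily exchange inflows (USD)
    today:     token -> today's exchange inflow (USD)
    watchlist: optional set of tokens; None means the non-personalized feed
    """
    signals = []
    for token, inflow in today.items():
        if watchlist is not None and token not in watchlist:
            continue  # personalization: only tokens the user follows
        past = history.get(token, [])
        if not past:
            continue  # no baseline yet, so no ratio can be computed
        baseline = mean(past)
        if baseline <= 0:
            continue
        ratio = inflow / baseline
        if ratio >= min_ratio:
            signals.append((token, ratio))
    # Strongest deviations first.
    return sorted(signals, key=lambda s: s[1], reverse=True)


history = {"PEPE": [1.0, 1.0, 1.0, 1.0], "ABC": [5.0, 5.0]}
today = {"PEPE": 20.0, "ABC": 6.0, "NEW": 3.0}

vanilla_feed = inflow_signals(history, today)
personal_feed = inflow_signals(history, today, watchlist={"ABC"})
print(vanilla_feed, personal_feed)
```

Passing `watchlist=None` corresponds to the feed everyone gets, while passing a set of tokens gives a simple form of the personalization described; a production system would presumably draw on far richer behavior than a watch list.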
Starting point is 00:40:05 So if you have your portfolio tracked with us, we can personalize signals that you see in your feed based on your your own portfolio and your history, the history of trading and so on and so forth. So it's actually a toggle in the product where you can switch on and off personalization. So either you could just get the kind of vanilla feed that everyone gets or you can get a personalized feed based on, you know, what you have indicated to us that you're interested in through, your behavior in the platform, what you search for, what you've saved, and so on and so forth. So this isn't new, right? This is just, you know, what Amazon has been doing since almost the 90s or at least early 2000s in like people who are interested in this are also interested in that, like recommender systems. So this part isn't necessarily that new. But interestingly, you haven't seen a lot of personalization in Web3 yet, which is something
Starting point is 00:41:00 that's a bit puzzling. I think it's maybe because, firstly, it's a very young space, but secondly, we didn't have like enough data that it was needed. People could still just go on CoinGecko and like find the coin. But I think we've entered the era now where you have literally millions of assets. And so it's no longer feasible to just search for the asset you care about. You actually need to get stuff recommended to you because the inventory has become so large that you can't just look through it in a catalog or like on a ranking. And so that's why I think now personalization is probably going to play a big role in crypto and that's what we're trying to lean into. And obviously you make use of machine
Starting point is 00:41:44 learning and AI to make that happen at scale. So that actually puts you in a super powerful position because not only do you have a really well-organized repertoire of other data that's on chain, you also have kind of like the private user data that they share with you. Is there kind of some sort of ethics codex of kind of like how you treat the user data that's kind of shared with you? Do you monetize that? Do you kind of use that to kind of cross-reference things behind the scenes? Yeah, I mean, it's a great question, right? Naturally, you have to, first of all, respect just general privacy regulations, right? GDPR and so on and so forth. And that, you know, has its own sort of set of rules
Starting point is 00:42:32 and things to make sure that you're not violating, right? And that there's consent and so on and so forth. Secondly, we have a Chinese wall. This is because this is a concern, I think, that many people have, and it's a valid concern. It's like, if I use Nansen, are you going to use what I search for to label, like, my addresses, for example?
Starting point is 00:42:52 If I search for my own address, are you going to use that? And the answer is no, because there's literally a Chinese wall between the department that has access to any user data and the department that has access to labeling wallets. And it's kind of hard for us to prove this because we're not like an open source company, obviously, or a project. But that's the reality. And so that's in our privacy policy,
Starting point is 00:43:20 and it's also how the company is structured, and literally people don't have access to both of those two things at once. We don't monetize that data. I don't think we need to. Like, we don't have an ads-based business model. Like our business model is very straightforward. You just pay for the subscription. And, you know, you get access to the product.
Starting point is 00:43:39 So in a way, I kind of like that business model because it's the most transparent business model. Yeah, it's very honest. Many people don't like it because they're so used to getting stuff for free. But on the back end, their data is being sold to. Yeah, exactly. So in a way, I feel like we're like the dumb honest people. Like, we're just charging you to use the product. And like, that's it.
Starting point is 00:44:00 We don't need to have some nefarious way to exploit their data on the back end. You could give people the choice. You could say, do you want to be private and you pay? Or do you want us to kind of monetize your data in some other way and you get to use it for free? Yeah, maybe that's an idea. Yeah. I mean, on the topic of business models, right, I sort of think of ads as the default business model of Web 2,
Starting point is 00:44:36 and I think that the default business model of Web 3 is going to be transactional. So, you know, it's not unlikely that our subscription model at some point will get displaced by a more transactional business model. So does that mean maybe you have CowSwap integrated into Nansen, and when you find tokens on Token God Mode, you click buy and you execute the trade through CowSwap, and maybe CowSwap and Nansen share any fees that are involved in that. So, you know, that to me seems like a more future-proof business model. And I'm not super excited about the ads business model. But yeah, that's a big kind of strategic topic on its own. Kind of thinking forward, how do you see the rise of privacy? I mean, a lot of your business model kind of hinges on the fact that things on chain are inherently transparent, right?
Starting point is 00:45:33 So with the rise of privacy-preserving technologies on chain, how do you think that's going to change? Yeah, I mean, I think of this as you can't have both at the same time to the maximum extent. And so it becomes a trade-off, as with many things in technology. You can't have full privacy and you can't have full transparency at the same time. And so our product makes the most sense
Starting point is 00:46:05 obviously when there's room to have transparency. And so I think the reality is that many people value the transparency of blockchains because it gives them a sense of comfort. But if you know that, hey, the funds that are sitting in Aave, I can actually see
Starting point is 00:46:22 all of the transactions that have ever happened with Aave and I can see all the funds sitting in the smart contract and so on and so forth. That gives people a sense of comfort. They might not actually do it, but the fact that they know they can do it
Starting point is 00:46:38 gives them sort of more trust. And if you contrast that to, say, a bank or an FTX, then you kind of quickly realize that the lack of transparency can become an issue.
Starting point is 00:46:54 So I think our product naturally works best when you have chains that are public and transparent. It seems obvious to me as like a consumer that blockchains in their current form don't really work really well for payments, for example, and things you might want to do in your daily life where you do want to have more privacy. and I think it makes sense that you'll probably get some some world where either protocols or even chains or L2s have full privacy but maybe there are some guardrails
Starting point is 00:47:33 or like some rules around it so I'm not saying this is what I want but I think like one way you might imagine this is what if you had an L2 that basically had privacy somehow, but you could not make transactions over some certain amount, you know, size, right? Again, I'm not saying this is necessarily the world I want, but I could see that's being something that regulators might be more comfortable with than one where there's like no limit and, you know,
Starting point is 00:48:06 Lazarus from North Korea can, you know, potentially transact hundreds of millions of dollars in volumes. So I think you can look at it from sort of an ethical slash moral perspective. And then you can also look at it just from a sort of a pragmatic perspective of like what are regulators going to always allow. And then finally you can look at it just from a trade-off perspective. Like if you interact with something that has full privacy, what are you giving up in terms of transparency? And then there's like interesting solutions around zero-knowledge proofs and so on, which in some cases can give you sort of the trust you want and like some form of transparency without revealing everything that's going on. But I think it's a really interesting space.
Starting point is 00:48:58 I don't think you will ever get to a point where everything you do on chain is totally private. And I think that also defeats the object to a certain extent, right? So I mean, what ideally you kind of want is kind of transparency for the man and kind of like privacy for the little guy, right? So kind of you want transparency to kind of hold power accountable. So you want to know what your government spends its money on. You don't need to know what your neighbor spends his money on. And the crazy thing is it's like the inverse in, you know,
Starting point is 00:49:39 the world in many countries, it's the inverse, right? Where, like, governments can see everything you do in practice. Like, they could just reach out to a bank, get all your data or whatnot. Or they could sort of have a, you know, some sort of back channel into your Web 2 products, you know, whether that's Google or Twitter or whatever. But then, you know, there's really no transparency on like how they're, like, where did all that money that was spent on initiative X by the government go and so
Starting point is 00:50:12 and so forth. Yeah, totally. So I think you do want transparency for the people you elect for sure, right? That's, in a way, it's kind of crazy that you don't have that in like every democracy, literally down to every transaction
Starting point is 00:50:28 that they make with taxpayers' money. Right? So, exactly. Like, you know, could there be like a Nansen for all government spending? Yeah, that would be amazing. I would love that. I would absolutely love that. So you alluded to this in the very beginning. So you label well-known people on chain.
Starting point is 00:50:50 So obviously there's ethical considerations kind of that come with it. So say I'm dope one and I feel like I don't, I mean, you have rightly labelled me, but I don't want this to be known on chain. Can I kind of send you an email and you will delete the label, or what's your policy? Yeah, you can. I mean, that's the short answer, is you can. But there's always a bit of nuance. It might be helpful just to explain like how you get there in the first place, like why someone, an individual, might get their name on an address. And that's typically because there's information in the public domain that we can point to. So for example, someone says on a governance forum, hey, this is my address.
Starting point is 00:51:40 I'm voting on, you know, initiative X or proposal X. And they are basically declaring that they own this address, right? And of course, there are caveats to that. Like someone could just be pretending to be them and so on and so forth. But if that is credible, we would label that and then you could point to that in your evidence, right? if they then choose that actually I don't want that to be labeled, I want it to be deleted, yeah, then we will do that. But at least that's the explanation of like how the information ended up in our database
Starting point is 00:52:15 in the first place. Right. So it's not like we go around trying to sniff out, you know, normal people's sort of individual names and like label addresses with them. You don't look over people's shoulders when they kind of do transactions at EthCC. No, absolutely not. And in fact, like, you know, if we wanted to do stuff like that, I mean, maybe not that exact thing, but if we wanted to sort of go on Twitter and hunt down every time someone
Starting point is 00:52:45 like is a little bit silly in declaring something, for example responding to tweets about post your address and you'll get an airdrop, or, you know, here are my new NFTs that I bought, which again uniquely, you know, basically doxes your wallet. If you wanted to do that kind of stuff systematically, like we could, but we just don't think that's the right thing to do, firstly. And secondly,
Starting point is 00:53:09 I don't think it's like newsworthy, in the sense, like, what we do can be seen as a form of journalism, right? So then you have to kind of ask yourself, like, okay, if Vitalik has a wallet that has like, you know, a billion dollars in it, yeah, that's newsworthy.
Starting point is 00:53:26 People should probably know about that. If the founder of a project, you know, has a lot of money in that token, that's newsworthy, people should know about that. If some person has 200 bucks, you know, and bought some NFT on Base and then they told someone about that on Twitter, we don't systematically track down that information and put it into our system. We could in theory, because it's public information, but we don't really do it. Sure, absolutely. So tell us about what's coming for Nansen. So you already talked about the Hot Contracts update. What else do you have in store?
Starting point is 00:54:06 Yeah. So the Hot Contracts update is kind of a smaller example of how the labels are just going to get a lot better because we are investing a lot in AI-driven attribution. And that's going to happen, I think, faster than we initially anticipated, actually. The second thing that I also mentioned is portfolio is going to get integrated into Nansen 2. And this is actually kind of a big deal because it allows us to personalize the product even more.
Starting point is 00:54:34 And I think, you know, the ambition there is to be the preferred portfolio tracker of any on-chain investor. And so we are very much on-chain oriented, right? So we believe that if you have the best coverage of chains and assets and protocols, and we can give you signal on what you might be interested in investing in, that's a really potent combination. And so bringing portfolio into Nansen 2,
Starting point is 00:55:03 I think is going to be a game changer, frankly, and it's going to happen in the next few months. The third thing is we are integrating lots of new chains. The one chain that we have been asked about the most is Solana. And so we're going to launch that hopefully within two months. And I think that's going to be pretty big. And we have some exciting ideas to try out with Solana because it's kind of its own little ecosystem and pocket, so we're going to try out some more experimental
Starting point is 00:55:34 ideas with Solana. And in addition to that, I think we're going to strengthen some of the things that we're already known for, like smart money tracking is getting better. Now we have a new squad internally that's just focused on taking that to the next level. So things like really good P&L tracking
Starting point is 00:55:52 of traders, finding wallets that you might want to monitor because they're really good at trading. That's something that we're leveling up and improving, making the overall product easier to use because it can be a bit overwhelming for people, but it's, I think, constantly getting simpler to use in a way. Like we're trying to strip away stuff, like eliminating stuff that's not absolutely necessary in the user experience. So those are some of the things. But I think, like, you know, overall we've put Nansen 1 behind us. In fact, we're switching off Nansen 1 literally tomorrow.
Starting point is 00:56:29 So we've kind of firmly made the transition to Nansen 2. And that means we can travel lightly. We can put behind us a lot of the tech debt that we had from the first version of the product. And then we can really just focus on the innovations for the new version of the product. So, yeah, actually, literally in the next three months, there's a lot to look forward to if you're a Nansen user. Wonderful. So where can we send listeners to kind of check out Nansen?
Starting point is 00:56:58 Yeah, you can go to nansen.ai. That's the best place to start. And you can also follow Nansen underscore AI on Twitter. Perfect. Thank you so much for coming on, Alex. It's been a pleasure. Thanks for having me. Thank you for joining us on this week's episode.
Starting point is 00:57:15 We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google Home, or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen. And while you're there, be sure to sign up for the newsletter, so you get new episodes in your inbox as they're released.
Starting point is 00:57:39 If you want to interact with us, guests or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
