This Week in Startups - Neeva CEO Sridhar Ramaswamy on AI chatbots and the search revolution | E1686

Episode Date: February 27, 2023

Neeva CEO Sridhar Ramaswamy joins Molly to talk all things search and AI chatbots, including Neeva’s technology and the difficulty of building a search engine (1:41), Neeva’s consumer focus and business model (20:00), and the search-to-answer revolution (41:31).

(0:00) Molly kicks off the show
(1:41) Neeva’s Purpose
(5:06) The technological underpinning of Neeva
(8:00) The hurdles building Neeva
(10:26) Microsoft for Startups Founders Hub - Apply in 5 minutes for six figures in discounts at http://aka.ms/thisweekinstartups
(11:53) The problems search aims to solve
(18:31) LinkedIn Marketing - Get a $100 LinkedIn ad credit at https://linkedin.com/thisweekinstartups
(20:00) Why choose Neeva
(24:01) Neeva’s business model
(29:18) Moving from search to answers + citing sources
(35:53) The ad supported publishing ecosystem
(37:58) Pilot - Get 20% off the first 6 months at https://pilot.com/twist
(39:13) Leaving Google
(41:31) The search to answers revolution
(44:34) Compensating publishers and dealing with content moderation
(48:51) Open loop vs. Closed loop
(54:29) Neeva copycats

FOLLOW Sridhar: https://twitter.com/ramaswmysridhar
FOLLOW Jason: https://linktr.ee/calacanis
FOLLOW Molly: https://twitter.com/mollywood

Transcript
Starting point is 00:00:00 Hey, everybody. Happy Monday. Jason is still away. If you follow him on Instagram, you can see his ski adventures in Japan. It's incredible. I am back, though, today with another great interview. Today on the show, I'm joined by Sridhar Ramaswamy, the CEO and co-founder of Neeva, an AI-powered search engine and chatbot. It's private. It's AI-powered search, but it's a very, very different model from OpenAI's ChatGPT or Google's Bard, and Sridhar himself is a long-time Googler who's trying to find a different way forward in the search world. It is a fascinating conversation. You do not want to miss it.
Starting point is 00:00:38 It's going to be a great show. Stick with us. This Week in Startups is brought to you by Microsoft for Startups Founders Hub. It helps all founders build a better startup at a lower cost from day one. Startups get up to $150,000 in Azure credits, access to OpenAI APIs, free dev tools like GitHub, technical advisory, access to mentors and experts, and so much more. There is no funding requirement and it only takes minutes to join. Sign up today at aka.ms slash this week in startups. LinkedIn Marketing. To redeem a $100 LinkedIn ad credit and launch your first
Starting point is 00:01:16 campaign, go to LinkedIn.com slash this week in startups. And Pilot. Grow your business sustainably and operate more effectively. Pilot provides the most reliable accounting, CFO and tax services for startups and small businesses. Head to pilot.com slash twist and get 20% off the first six months. All right, everyone, I am having a great day. I am delighted to be joined by Sridhar Ramaswamy,
Starting point is 00:01:47 the co-founder and CEO of Neeva, the next generation search engine that is using the power of AI to just give us answers with maximum privacy and efficiency, and that I personally am already willing to pay for. Welcome to the show and thanks for coming on. Thank you, Molly. Thank you, Jason. Super excited to be here.
Starting point is 00:02:06 So you probably heard us, everybody, audience, talk about Neeva back on episode 1674. Sridhar is also a former SVP of Google's ad business, currently a VC at Greylock. And if you wouldn't mind, I guess, let's start for those who may have missed that episode. I hope no one did. Tell us what Neeva and NeevaAI are and what you're building.
Starting point is 00:02:26 And also, are you building that while you are a VC at Greylock? I'm a venture partner, so it's very much a part-time job. I'm on a board of like a synthetic data AI company on Greylock's behalf. Most of my time is spent at Neeva. Yeah, just unwinding a little bit, I was at Google for close to 16 amazing years, early part of the search ads team, but sort of grew with Google to run the ads and commerce teams. Left about four years ago with my co-founder Vivek, and we embarked on this crazy mission of reimagining search. We love the problem of search, but we just felt like Google
Starting point is 00:03:08 was trapped in its own success. And so Neeva is about rethinking, you know, what is a core part of our daily lives. Most of us search without even thinking about it. So we wanted to create a product that was all about the user, all about actually just, you know, creating a great product serving the user. So that's why we had an early focus on privacy, on being ads-free. But just as importantly, we thought we could build a better product. Like privacy is not a product. Privacy is like, it's a feature of a product. So over the first three years of Neeva, we've been busy building up a search stack,
Starting point is 00:03:42 running a crawl at Internet scale, building the search system, widely acknowledged to be one of the hardest problems out there. And some nine, ten months ago, we saw, you know, early versions of things like GPT-3 and realized that this was a magic power that we could harness in the context of search. And so we've been working on deeply integrating language models into search. But, you know, while ChatGPT took the world by storm, we again took a contrarian approach and said language models should enhance search, things that you and I love about
Starting point is 00:04:19 search, which is you want quick authoritative information, you want timely information. We said we want those characteristics to be present. So we've been working on this technique to basically run a search engine and language models in parallel. And while there's work to do, we're pretty proud of what we have accomplished as a 50-person team being able to generate answers, fluid responses to more than half the queries that people put into Neeva. And we expect that number to keep growing with time.
Starting point is 00:04:47 We are pretty confident about showing you answers for 75, 80% of queries. So fundamentally, Neeva's now sort of, again, reinventing itself to be an answer engine, but provide you with reliable, cited, timely information about things that you might be interested in. So you sort of unpacked some parts of that. That gets into my next question, which is sort of what's the technological underpinning. Are you using the kind of large language learning models that we keep hearing about with GPT-3? It sounds like you're not using that exclusively. You're combining that with some special sauce to make sure that the answers are more than
Starting point is 00:05:27 maybe what we have seen recently from Bing and ChatGPT. Turns out it's not all as accurate as one might have imagined. That's right. That's right. So, you know, ChatGPT is what I would describe as like open loop, meaning that it is in purely generative mode. Now, these large language models have been trained pretty much on every document that exists in the world.
Starting point is 00:05:51 But they don't understand things like provenance. When you and I take a course or when you and I want to learn about something, we pay attention to, well, who is an authority on this topic? Who is saying what? Who should we believe? Like, you know, we go through a process of constructing our belief systems. But what ChatGPT is doing is, it has just like understood literally every sentence on the planet, but at a superficial level.
Starting point is 00:06:14 And if you ask it a question, it'll start saying things. Some things will be true, but some things will just not be. What we do is, you know, when you put in a query, we run the search stack, and what that gives us is things like, you know, we know what the authoritative sites are. We know that the New York Times is more believable for news than someone's blog. And we then take the top results for that query, look at the contents of those pages. Often it's a lot. We extract and summarize portions that are relevant to the query that you have. And then we do a second level of summarization, which is what generates the answers.
Starting point is 00:06:55 But it's basically constrained generation. We try very hard to make sure that these models are not in this open loop mode where they are making up things. And we try to, you know, we would rather not give you an answer that's wrong than, you know, like we don't want to be in the business of providing bad answers. So we refrain from answering questions that we are not really sure about. We do get things wrong. Jason pointed out earlier this morning that if you ask us about how the Knicks are doing this year, we take old articles. To a certain extent, this is like a problem with how search engines have operated
Starting point is 00:07:29 where they return results for every query no matter what. The fact of the matter is like there are very few articles that are talking about how the Knicks are doing this year, but we just picked like the top ones that there are, some of them are from last year, and we don't understand the temporal meaning of this query, and that's why we generate a bad answer. These are things that the team is busy fixing. But this combination of a search engine and a generative model, we think, provides a good balance for making AI useful.
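The flow described above (run the search stack, take the top results, extract the query-relevant portions, build a cited answer from only those portions, and decline to answer when the evidence is thin) can be sketched roughly as follows. This is a toy illustration, not Neeva's implementation: the `Page` type, the `authority` score, the word-overlap matching, and every function name are assumptions made up for the example, and a real system would use a ranking stack and language models where this sketch uses set intersections.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    authority: float  # hypothetical 0..1 trust score supplied by the search stack
    text: str


def tokenize(s):
    """Lowercased words longer than two characters, punctuation stripped."""
    words = (w.strip(".,?!").lower() for w in s.split())
    return {w for w in words if len(w) > 2}


def relevant_sentences(query, page, min_overlap=2):
    """First-level 'summary': keep only sentences that overlap the query."""
    q = tokenize(query)
    kept = []
    for sentence in page.text.split(". "):
        if len(q & tokenize(sentence)) >= min_overlap:
            kept.append(sentence.strip().rstrip("."))
    return kept


def answer(query, pages, top_k=3, min_evidence=1):
    """Second level: assemble a cited answer from retrieved text only.

    Returns None (abstains) when no retrieved sentence supports the query,
    rather than generating something unsupported.
    """
    q = tokenize(query)
    ranked = sorted(
        pages,
        key=lambda p: p.authority * len(q & tokenize(p.text)),
        reverse=True,
    )[:top_k]
    cited = [(s, p.url) for p in ranked for s in relevant_sentences(query, p)]
    if len(cited) < min_evidence:
        return None
    return " ".join(f"{s}. [{url}]" for s, url in cited)
```

Called with a query that the retrieved pages actually cover, `answer` returns sentences with their source URLs attached; called with a query nothing covers, it returns `None` instead of guessing, which is the "closed loop" behavior the conversation contrasts with purely generative output.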
Starting point is 00:08:01 Without exactly knowing the baseline for this question, how hard is this to build? If the baseline is, I don't know, right? Zero to 10. Or if the baseline is Google to the moon. Like, how hard is it to integrate? I mean, search technology already, all of that signals processing, all of that determination of noise versus value, plus integrating these language models,
Starting point is 00:08:30 which are somewhat new, at least. This must be, like, you must have a really good team. Well, we have an amazing team. It's very small. It's a little over 50 people. But, you know, we've been working at search for a while. We have many of, you know, the brilliant engineers that helped build search that are part of our team.
Starting point is 00:08:52 Search is widely acknowledged to be one of the hardest problems to crack because it just runs at a scale that is like not really fathomable to most people. And then integrating the large models, also doing all of these at a budget that works for us. Now, don't get me wrong, we are a well-funded startup, but we're not OpenAI, we are not Bing. We don't have, you know, hundreds of millions of dollars to throw at the problem. Our annual infrastructure budget is less than $10 million. So part of what we have had to do is be very inventive about distilling models, about making them smaller.
Starting point is 00:09:32 How do you get more bang for the buck? So we pre-trained a lot of our own models. We fine-tune them. And so a lot of effort and sweat has gone into making this work at scale, but within the constraints of how much budget we have. And a lot of it goes back many years. You know, from early on, we've been building a search stack. What we released would not really have been possible
Starting point is 00:09:57 if you were just like a thin shim on top of someone else's API. At this point, for pretty much any meaningful page on the planet, we can generate summaries for you, and we will soon have some of these pre-computed. So it's a very technical problem, but obviously the result is a simple, easy to understand, hopefully brief answer that gets to the heart of what you're looking for. Doing more with less is more important than ever.
Starting point is 00:10:28 You know this, especially for startup founders. We all have to be efficient. And if you're running a startup, I want you to know about the Microsoft for Startups Founders Hub. It's a no-brainer. They're going to help you scale efficiently while preserving your runway. How are they going to do that? Well, they're going to do it with the best startup program
Starting point is 00:10:45 we've ever seen. Up to $150,000 in Azure credits. Plus, access to OpenAI's APIs. Think about that, as well as the new Azure OpenAI Service. And listen, there's other things that Microsoft has available to you as part of this program. How about free access to GitHub and Visual Studio? How about one-to-one technical advisory and expert help on topics like product, fundraising, and go to market? How about access to a network of mentors that are plugged into the startup world, plus free access to partners like LinkedIn and Bubble, which can help you build your MVP even quicker. Microsoft, so generous. The Microsoft for Startups Founders Hub is open to everyone. Whether you're in the idea phase or you're further along, there's no funding requirement. They want any founder
Starting point is 00:11:32 to be able to get access because they want all founders to succeed. It's really a no-brainer. It takes five minutes to apply and startups can get massive benefits immediately. Go sign up right now. Let's take a pause and write this down, everybody: aka.ms slash this week in startups. Like I said, most generous program I think we've ever seen on this podcast, aka.ms slash this week in startups. Thanks again, Microsoft. Let's break down. We're talking about sort of parallel technologies working together and then parallel sets of problems to solve with each one of those technologies. So let's go back to Google for a minute and talk about the problems with search that you wanted to solve and dive into those a little bit
Starting point is 00:12:10 more, we can start with privacy, but it's a longer list than that. Yeah, so I would broadly put them into two buckets, and the irony does not escape me that I certainly had a large role to play in one of those buckets. One is that ad load just keeps going up. Text ads are one of, you know, it's pretty much the most incredible business invented. No one in their right mind for like the first 10 years of this decade thought that Google would make more than $100 billion of revenue just in search ads. It's remarkable, but that's the business it is. And it's also a reflection of our times that, you know, there's no limit to expectation. While Google's a very successful business, it is judged by how much it can grow.
Starting point is 00:13:03 And so, as I said, I approved many of the decisions that increased ad load, but I also felt ultimately that there was no limit to it. It would just keep going up and up unless some externality like the one we are, you know, going through happens. And then on the organic side, you know, they are hemmed in their own way by the ads model. The dirty secret behind the ads model, and this actually goes back to even TV, is that ads have to stand out. And so broadcasters, for example, would increase the volume of the ads as they started playing
Starting point is 00:13:42 because they knew that that would get a little bit more attention. Similarly, you can't make organic search too attractive because that's going to kill how much money you make on the text ads. It's the same reason why on your Facebook feed or Instagram feed, you're going to see video ads interspersed with mostly pictures because moving video gets more attention than a static picture. It's the same theme. And so it's hard for the organic team to say,
Starting point is 00:14:09 we can create the best product that there is while they're constrained by but don't kill ads monetization too much. These are some of the problems. The ads ecosystem, you know, with things like the DoubleClick acquisition, essentially Google became like this purveyor of ads for the entire internet. The other side of ads is measurement. This conversion tracking is basically measuring to see which ads are successful in getting the customer to take the action that you want. And that sort of unleashed a whole ecosystem of thousands of companies keeping track of
Starting point is 00:14:45 every single thing that you and I do. And Google's a big part of that. And that's what people talk about when they talk about this rampant loss of privacy, every single thing we do goes into all of these databases to be used for sort of monetizing at a later time. And so, you know, a lot of Neeva is really about, okay, start with a clean slate, start with no constraints, focus on the product, what can you create? In terms of Google, there are also sort of other maybe complaints about search, like the idea of a filter bubble that is created almost as a result of that data collection.
Starting point is 00:15:28 Is that something that Neeva has an opportunity to avoid? I mean, there still is even some controversy around ranking and ranking pages and the idea, like there is somebody out there who just heard you say that the New York Times is a trusted source who was losing his mind, which, heaven help us, but still, it's happening. Yeah. I think, honestly, that filter bubbles are less of an issue in a search-based system because it's not a recommendation system.
Starting point is 00:15:58 YouTube definitely has filter bubble issues where if you start on a topic, I'm sure you've run into this, all of a sudden there'll be a whole bunch of exciting material about that topic that start appearing in your feed, and it can be a vicious cycle. My mom, who is 80 years old, got super energized about one of our last two presidential candidates for reasons I simply did not understand. I'm like, how do you know this and how do you have these opinions? You barely watch anything outside of religious videos. It turns out that she was in a loop that fed her a bunch of videos from one candidate.
Starting point is 00:16:36 And so that's obviously extreme, but a lot of us fall victim to these subtle ways in which content that we see is engineered. And the way we think about this is we give control back to the user. So as part of Neeva's design early on, we said, you can express your preferences about which sources you prefer. We don't think we should be in the business of deciding that one newspaper is better or more trustworthy than others if it roughly looks like, you know, they are, you know, they are high quality.
Starting point is 00:17:13 We've also launched features like Bias Buster, which is a fun little feature that lets you see viewpoints from different ends of the spectrum. So you can, you know, if you search for something political, you'll see a slider and you can decide, ah, I want more, you know, right-wing opinions on this, or I want more left-wing opinions on this, and we change the news results based on what you pick.
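One way to picture the two controls just described (preferred sources and a left/right slider that changes the news results) is as a re-ranking step applied to results the search engine has already scored. The sketch below is purely illustrative: the `Result` fields, the `lean` value, the `boost` weight, and the scoring formula are all assumptions for the example, and nothing in the conversation says how Neeva actually implements these features.

```python
from dataclasses import dataclass


@dataclass
class Result:
    source: str
    score: float  # hypothetical base relevance score from the ranker
    lean: float   # hypothetical political lean: -1.0 (left) .. +1.0 (right)


def rerank(results, preferred=frozenset(), slider=0.0, boost=0.5):
    """Re-order results using user-stated preferences.

    preferred: sources the user has marked as preferred (get a fixed boost).
    slider:    -1.0 asks for more left-leaning results, +1.0 for more
               right-leaning ones, 0.0 leaves the ordering alone.
    """
    def adjusted(r):
        s = r.score
        if r.source in preferred:
            s += boost
        # When the slider and the source's lean point the same way, the
        # product is positive and the result moves up; otherwise it moves down.
        s += slider * r.lean * boost
        return s

    return sorted(results, key=adjusted, reverse=True)
```

The point of the design is that the preferences are explicit inputs supplied by the user at query time, not signals inferred from a stored history of past queries.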
Starting point is 00:17:39 So our approach to a lot of this is to return choice back to you. People, you know, did ask us for features like not having Amazon or not having Walmart in their search results because they wanted to shop from the little retailers out there. An ad-supported search engine cannot do these things. On the other hand, a customer-supported search engine, you know, we are all about serving customers. So we are like, oh, you don't want that particular site?
Starting point is 00:18:06 That's fine. We'll give you all the others. And so very much an aspect of Neeva is this point: personalization is giving that control back to you. But doing that again, with privacy in mind, we don't track your search queries. By default, there's like, you know, there's no list of queries that you have issued. Search is a deeply personal product for most people, and we want to respect that. Let's talk about marketing to senior-level executives. You know, the ones that make the purchasing decisions, the ones that listen to this podcast. Well, when you're selling business to business
Starting point is 00:18:39 solutions, you want to market to decision makers. It's really hard to find these executives on social platforms where people are dancing and arguing about politics or movies. I get it, right? Social media is like a, it's a whole spectrum of conversation. But I have a solution for you. I think you know what it is. LinkedIn has 180 million senior-level executives and 10 million C-level executives on the platform. That's out of the 875 million people, right? These are the crème de la crème. That's a ton of purchasing power waiting for you to use LinkedIn ads. And LinkedIn ads is built specifically for B2B marketers. That's you. They know what you're trying to do. Trying to get a demo, trying to get a meeting, trying to get a white paper out there.
Starting point is 00:19:22 It's all built around the business to business use case and no other platform in the world can offer these kind of eyeballs. LinkedIn is going to help you reach them in a very respectful environment. Context matters, right? And when people are on LinkedIn, they're doing business. LinkedIn equals business. Business equals LinkedIn. Let's back that up with some data. Audiences that are exposed to brand messages on LinkedIn, six times more likely to convert above average. Okay? Make B2B marketing everything it can be and get a $100 credit on your next campaign. Go to LinkedIn.com slash this week in startups to claim your credit.
Starting point is 00:19:57 That's LinkedIn.com slash this week in startups. Terms and conditions do apply. When people use Neeva and when you interact with your customers, what do they tell you is kind of the number one reason? Is it privacy? Is it customization? Is it just simplicity? I mean, I feel like just the idea of an ad-free page that gives you what you want all by itself, if I didn't care about privacy, it might be enough.
Starting point is 00:20:21 But I'm curious why people come to Neeva. It's changed over time. A set of people definitely were attracted by the customer focus, by how we actively prevent tracking by third parties. People have said things like, after a few days of using Neeva, the world just feels like a quieter place. It's just, there's not, you know, stuff chasing you all across and demanding your attention. Andrew Orlowski, who's a journalist for The Telegraph, he once wrote an article about us,
Starting point is 00:20:58 he's like, ah, this feels like breathing clean alpine air after you have lived in the smog of a city for, you know, for years. But, you know, privacy by itself doesn't, in my mind, attract enough of a mainstream audience. Privacy is a little bit like being fit or eating clean. All of us want it, but, you know, not so many people are actually going to do it. Part of what is really exciting about the current moment for us with AI is we are able to harness its power to truly create better experiences. I know we are going to get into things like, you know, copyright and fair use.
Starting point is 00:21:40 But these short answers are super attractive to people, especially in this day of not that much attention. And so these answers are a big step forward in the experience of search. And in a bizarre way, we are able to do this quickly, much faster than others, because we've been at this problem for a long time, and we have a business model where I don't have to worry about, oh, is this going to lose 5% of revenue? Because this is all about making the product better. So it becomes, Neeva's original part was about ads free and private, but it is very rapidly transitioning into it's just a superior product experience in addition. It's just efficient. You can also, side note, there's all kinds of things you can search, right?
Starting point is 00:22:32 You can attach your email to it and different accounts and actually be able to search things that have otherwise been a little bit hard to. It's almost like organize, instead of the world's information, like, your life. That's right. That's right. So we've built these apps, these connectors to various things. We have not pushed those, especially in a work context, as much as we should be. We're looking into things like how do we get more companies to adopt these things,
Starting point is 00:22:57 as you can imagine. Especially with AI thrown in, there's just a lot of time efficiency to be had, and there's talk of things like enterprise language models. But part of what we have already figured out is how to pay attention to permissions, how to make sure that like your docs don't get mixed up ever with anyone else's docs, how do we make sure that that stuff is kept strictly segregated
Starting point is 00:23:22 and then this retrieval augmented generation, this search and large language models working in parallel are also very privacy safe because we don't have to ship information from your private documents out over the API or have a language model that mingles these things. So there's a lot of other exciting stuff
Starting point is 00:23:44 that we can add on top of the base product. Yeah, once you have like local indexing and you can just sort of index all the 75 sticky notes that I have all over my desktop for various things and I can search those, please. I'll pay extra for that. Let's talk about the business model, and then I do want to move into some of the kind of more controversial aspects of large
Starting point is 00:24:05 learning. How do you make money? So we are a freemium product. We, you know, the base product is free to use. It's a pretty generous, you know, basic product that is, that is free. For much of Neeva's existence, we've been focused on user growth. We have a, we have a paid subscription product, it's roughly $50 a year, about $6 a month. And so in addition to essentially getting unlimited Neeva, unlimited number of connectors, unlimited searches, and so on, we also work with partners to give you additional benefits like a VPN and a password manager.
Starting point is 00:24:47 But it's really, it's a freemium model. And the people that end up converting are the ones that use search a lot, use things like our personalization features, like setting preferred providers for news, for example. But yeah, there's a lot of focus on getting people to try the product. And we have a pretty decent conversion rate from people that try the product over to people that become paid subscribers. How does that, I mean, knowing that we live in a world where Google, as you mentioned,
Starting point is 00:25:20 has nearly unlimited growth potential and revenue potential, how do you pitch to VCs that this is a Google-esque 100x investment? Well, as you know, multiples for companies and expectations have changed dramatically over the last year. And yeah, it is a very different, you know, environment. The important thing to remember, though, is that Google is so large, that search is so large because it's everyone on the planet, that even a small fraction of people becoming Neeva subscribers is enough to make us a self-sustaining company.
Starting point is 00:26:08 And so, you know, I've done these estimates, something on the order of 5 to 10 million subscribers. I mean, now, that's a lot of ARR. Mind you, we're talking $250 to $500 million a year. By our estimates, that would be enough for us to run a search engine sort of for the whole world, so to say, you know, like being available in all countries. And that is enough to get going. So even a few percentage points of market share is still very large,
Starting point is 00:26:39 simply because, you know, the population of the world is very large. In addition, we are also looking into, we have active conversations about taking our technology and licensing it, whether it is the search APIs that sites like Reddit or others can use, or being a search provider for large language models, because other people have realized that doing the kind of constrained generation is actually very useful for generating authentic output
Starting point is 00:27:13 that users will like. So there's a lot of basically business-to-business opportunities that have also come about just within the last three months, given how much excitement that there is in the area. A direct answer to your question is like VCs care because the potential market is so gigantic. Everyone needs search at the end of the day, right?
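The revenue estimate quoted a moment ago follows directly from the subscription price mentioned earlier in the conversation ("roughly $50 a year") and the 5 to 10 million subscriber range. A minimal check of that arithmetic, using only those two figures from the interview (the function name is just for illustration):

```python
PRICE_PER_YEAR = 50  # "roughly $50 a year", as stated in the interview


def annual_recurring_revenue(subscribers, price_per_year=PRICE_PER_YEAR):
    """ARR in dollars for a flat yearly subscription price."""
    return subscribers * price_per_year


low = annual_recurring_revenue(5_000_000)    # 250_000_000
high = annual_recurring_revenue(10_000_000)  # 500_000_000
print(f"${low:,} to ${high:,} per year")
```

This ignores discounts, monthly billing at about $6 a month, and churn, so it is a rough consistency check on the quoted range rather than a forecast.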
Starting point is 00:27:37 And people want better search. I think it's also a reasonable question about whether the ad-supported model, which obviously makes insane amounts of money now, but it's like under attack from a million different directions. Like one assumes that you just cannot continue to make infinite money forever on a model. that may be outlawed in entire countries. More than that, right?
Starting point is 00:27:59 It's, you know, if you get into sort of the excitement of last week with Microsoft and Google, a sufficiently deep pocketed competitor is basically saying, I am going, the state of the art for search is no longer going to be a wall of links, many of which are ads, the state of the art is going to be this succinct answer that people can consume and they will be much happier with it than a wall of links. I think that is something that is true if somebody can do a good job. That's number one. They also have the clout then to be able to take large amounts of market share.
Starting point is 00:28:45 This is a problem with very successful business models. when there is a fundamental change in the assumptions of the model, you are very, very vulnerable. And that's the place Google is at. Google is amazing as a search engine, as a search company. But if you rapidly move to a world in which you want a paragraph of text and that is the state of the art, it's not really clear where the wall of ads fit into that. And that, as much as anything else, might be the thing that drives change into this whole ecosystem. Right. And this gets back to your tagline and our kind of accidental tagline.
Starting point is 00:29:22 We're moving from a paradigm of search to a paradigm of answers. That's right. Fundamentally. Yeah. At what point did you realize that that could be the disruption? Like at what point were you like, nobody wants to search. Everyone wants answers. That's the key.
Starting point is 00:29:36 This is actually, this has been known for some time. Yeah. Google has this feature called featured snippets. Right. It's an edgy feature where they would pull out a snippet of text from a site. and show it directly on the search result. And every experiment that's been done to raise coverage of that feature
Starting point is 00:29:58 would have a wildly user-positive effect. I mean, users always love it. And it's very, you know, to a certain extent, this is like instinctive, which is, would you, like, do you want to read three sentences and have your question be answered? Or do you want to tap on a site and go to that site?
Starting point is 00:30:18 and try and figure out where that answer is. You know, a few years into Google, I sort of realized it's dumb, but I was like, yeah, no one wakes up and says they want to click on an ad. Similarly, no one wakes up and says, yeah, I want to click on a link to find out something. If you can just give it to them, they're much happier. We've always known this, but the question, the real breakthrough is, oh, wait, we can do this at scale for billions of web pages. and essentially assemble a single answer on the fly. We are like constructing mini Wikipedia pages on the fly for every query. That required a lot of things to come through.
Starting point is 00:31:01 And that happened over the course of like Q3 and Q4. We had a lot of stumbles just trying to figure out how to make this thing work at scale. Our first systems would take eight seconds to, you know, do like the first level of summaries. We're like, ah, that's never going to work. So a lot of things had to come together, but the basic insight still is answers trump links all the time. And that gets us, I think, directly to this question about publishers. Even when Google introduced those snippets and the summaries, publishers, you know, somewhat rightly. I mean, I have worked at these publishers.
Starting point is 00:31:34 Plus freaked out. I mean, it was a massive drop because the link-to-an-answer ecosystem created its own secondary and tertiary ad cottage industry. They couldn't have exactly. I'm thinking of all of the recipe pages right now that are about to die because it's like 1,500 words, 5,000 words of personal story with an ad every paragraph and then finally the recipe at the bottom. That only exists because of the search paradigm that we live in now. What happens when, even when you cite your sources, no one clicks on those links anymore?
Starting point is 00:32:10 They have their answer. It's a real question. We try to be very thoughtful with how we do these snippets. We want to keep the length short. We want to convey the answer. And clearly, if people want to take action, they're going to go, you know, click on the link and actually go to the site. But I do think that we are at a moment where the traditional search engine publisher model
Starting point is 00:32:46 is just going to go through a lot of stress. The second order, in many ways, you can think of this as like the ads ecosystem sort of coming back full circle. What I mean by that is that the contract between search engines and sites was that they would make their content available to search engines, and search engines would deliver traffic. Now, for the past 20 years, Google slowly but surely has been lowering the amount of organic traffic that goes out, especially for commercial
Starting point is 00:33:21 clicks. You show more and more ads, then fewer and fewer people are clicking on those organic links. So this is a slow but steady degradation. And similarly, because of the ads model and the kind of things that search engines have prioritized, which is like how much time do people spend on site, bloggers, for example, rather than just show you the recipe, they have made them long because you want more space to put in more blocks of ads because you know that that drives more click-through. I think we are headed to a world in which a lot of these things are going to be short-circuited.
Starting point is 00:34:01 And I won't pretend that I know exactly what the consequence is going to be. But I think one important consequence will be that a lot of sites that simply thought of their role as just creating lots of content and traffic would just roll in from search engines and ads would monetize, are going to think long and hard about what does it mean to actually attract, acquire, keep customers, and have them coming back. So for example, one of our predictions is that the same technology that's used to generate answers on Neeva can be used as a publisher product so that a publisher like Vox Media can essentially offer a conversational interface to content that's on Vox Media. We have the tech to already do this.
Starting point is 00:34:51 So that's a much more engaging experience right on the site, where right now most of these sites don't even bother to put a search box because no one's going to search here. They're just going to search on Google, and that's how we get people back. So I think this is a little bit of back-to-basics of every customer that shows up at our site is precious. We have to figure out how to convert that into a meaningful long-term relationship and what are the tools that we need to have in order to do that. And I, for one, you know, have to say like, I will not be sad to see the anonymous internet and pages full of ads designed to grab your attention go. I'm old enough that I remember the time when I used to like subscribe not just to magazines,
Starting point is 00:35:38 but also newsletters that I paid for. And, you know, I think there will be quite a bit of back to basics, plus also consolidation. I think that is something else that will definitely happen. It's great to have this honest conversation about it, because it reminds me of the conversations about something like minimum wage. Like we always say, okay, if we want people in America to be able to afford houses, we have to raise the minimum wage to, you know, $15 minimum, maybe $20,
Starting point is 00:36:07 and companies will go out of business. And that is just a fact. And it is a hard fact. And we are sorry that that will happen. But it's very interesting because the Google ad-supported search model created in some, I mean, the ad-supported publishing model obviously predated Google. But because you had this huge ecosystem that got even bigger, exponentially bigger under Google, it sounds like what you're saying is that a lot of the ad-supported publishing ecosystem,
Starting point is 00:36:34 as we know it now, only exists because of the Google ad-supported search ecosystem. Exactly. That's right. That's right. And by the way, the thing that I'll also, I don't think anyone planned this. The thing that I'll also say is that ultimately, I don't think that the ad-supported ecosystem turned out to be a great deal for publishers, for creators of great content. I think it's the platforms.
Starting point is 00:37:02 If you think about it, it's the Facebooks. It's the Googles of the world that are the largest media companies on the planet because they became aggregators of all of our collective attention. So, you know, again, this is me being a little philosophical, but I think there is something fundamentally unjust about, you know, one company like spending a billion or so to run a product and making $120 billion of money on it, and essentially advertising became a winner-take-all sort of game, and there are only three winners. They're called Amazon, Google, and Facebook. And I don't think, as I said, anyone predicted this,
Starting point is 00:37:49 but definitely in an answer world, I think one of these players is going to be under heavy stress. All right, everybody, I'm here with Waseem Daher. He is the CEO and founder of Pilot. You guys know Pilot. They help everybody with their accounting, CFO, and tax services. Welcome to the program. Thanks for having me. Let's talk about one or two of the serious mess ups that founders make when it comes to doing their taxes and their
Starting point is 00:38:16 books properly. Lots of good stories here. The first is simply just not doing your tax return at all. I think a lot of people think if the business didn't make any money, I don't owe any tax, which is true. And therefore, that I don't need to file a tax return. That part is false. You still have to do the return no matter what. This is a classic place people get bitten. Yeah, and if you don't do your tax returns, what about R&D credits? Right. You're missing out on a bunch of good stuff, including the R&D credit. The second one that people really get burned on is a big no-no is paying people without using a payroll system. Like, you can't just write someone a check. You've got to do payroll withholding,
Starting point is 00:38:52 got to pay payroll tax, and you've got to set up one of these systems. All right, pay your taxes, even if you lost money this year, and make sure you use a payroll system critically important. All right, everybody, here's your call to action. Twist listeners can get 20% off their first six months at pilot.com slash twist. That's pilot.com slash twist for 20% off your first six months. It's so, it is so interesting to be you. Like how, what has it been like to sort of undergo this transition? And you must have been realizing this at,
Starting point is 00:39:23 at Google for all of those years that there could be this better way that it would be massive. Like, it's very brave to, you know, I'm not trying to blow smoke. Like, it's scary. It's a scary thing to be the person at the center. And now you've got some cover because other companies are headed in this direction also. But to be saying, this is going to be a huge disruption. This is going to be a massive change, right?
Starting point is 00:39:48 Billions of dollars of value will be at best redirected under this new regime. And that even though it will be better, revolutions are always painful. Revolutions are very painful. I think, uh, yeah, I mean, leaving Google itself to start Neeva was and is a scary thing. You know, you realize, I'm often reminded by life that a 50-person company is nothing. And it's very hard to get the browsers to work with you. It's very hard to like, you know, stop Chrome from putting up these obnoxious screens that try to get people to convert back to Google after they have made Neeva their search engine.
Starting point is 00:40:33 But, you know, in many ways, I think of like what we have done with NeevaAI answers as the ultimate vindication that like competition matters, that a small team can create a product that is state of the art in some way in my mind tells me that the more competition that there is, the better that we'll all be. And I'm not just saying it in a trite fashion. As I said, it's very hard for the organic team at Google to launch a lot of things because they will have a big impact on ads revenue. I also like to remind people that AT&T invented the answering machine in the 1930s, but basically canned it for a long time because they thought it would reduce the number of phone calls that people made. Successful people always prevent things that are going to undermine that model. What is really interesting about this moment is this dam is, this dam is about to break quite open.
Starting point is 00:41:32 Yeah, let's talk a little bit more about the revolution that we're in, because one of the things that Jason has said many times on the show, and I'm sure you've heard, is that publishers are lawyering up, that the lawsuits are going to come flying, that the copyright questions are intense. We've had, he and I, a little bit of an argument about how citations can even work in a world of neural networks, where in theory, they're learning and ingesting information and repackaging it.
Starting point is 00:41:57 Where do you sit there? What do you think that the next couple years of this upheaval look like? So I think there's going to be all kinds of lawsuits. Jason's absolutely right. Yeah. And I do disagree that language models cannot cite. If you run them open loop, yes, they cannot cite. But there are other ways to run them.
Starting point is 00:42:24 As I said, we provide context for the generation. That also protects us because we are less likely to hallucinate and just make up, you know, make up something. So this is how we are able to provide citations for, you know, each sentence that we write. We went to a lot of trouble to make sure that we were able to do that rather than, say, provide like, you know, four links at the bottom. And we also try to keep our summaries short. We've been playing around with length. You know, sometimes it's gone to like 300 words, but we're trying to reduce that back to 80 to 100 words. You know, but fair use is a complicated legal concept, not just involving things like the nature of the work or the purpose of the use, but also things like the effect of the use on the potential market.
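[Editor's sketch] The per-sentence citation idea described above can be illustrated with a small example. This is only a rough sketch under stated assumptions, not Neeva's actual method, which the conversation does not describe: it assumes the answer text and the retrieved passages are already available, and it picks a citation for each sentence by simple word overlap. The function names `tokens` and `cite_sentences` and the example URLs are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a real relevance signal."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def cite_sentences(answer: str, sources: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each answer sentence with the source URL that shares the most words.

    Assumption: word overlap is good enough for illustration. A sentence that
    shares no words with any source gets None instead of a made-up citation.
    """
    cited = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        words = tokens(sentence)
        best_url, best_overlap = None, 0
        for url, passage in sources.items():
            overlap = len(words & tokens(passage))
            if overlap > best_overlap:
                best_url, best_overlap = url, overlap
        cited.append((sentence, best_url))
    return cited


# Hypothetical retrieved passages, keyed by URL.
sources = {
    "https://example.com/a": "Sourdough starter needs flour and water fed daily.",
    "https://example.com/b": "Bake the loaf at 450 degrees for 40 minutes.",
}
answer = "Feed the starter flour and water daily. Bake the loaf for 40 minutes."
for sentence, url in cite_sentences(answer, sources):
    print(sentence, "->", url)
```

Attaching a source to every sentence, rather than listing links at the bottom, is what lets a reader check each claim individually; a production system would presumably use a much stronger matching signal than word overlap.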
Starting point is 00:43:20 Now, Neeva's so small that I can tell you that we're not going to have an impact on the potential market, but Google doing this at a large scale clearly is not going to be that. So I do think that there is going to be litigation on this topic. And I don't, I agree with Jason. I don't think it's okay for a model to just ingest all of somebody else's content and have it be duplicated very easily. So for example, image models, I think, are particularly vulnerable if they're going to ingest every painting that someone's made so that you can easily say,
Starting point is 00:43:56 ah, you know, draw a sunset in the style of Molly and off it goes and creates a painting that's just like one that you have created. I would say that looks the weakest. You know, on the other hand, if sort of open-loop generation becomes a big business, I think that can also be subject to attack. We are trying to, for our own reasons, we are trying to create a space in which,
Starting point is 00:44:23 A, we are being thoughtful. Citations are mutually beneficial, but there's obviously a lot more to come as the big players get up to change their products. Are you compensating publishers? So we made an early commitment to compensate publishers. That was part of the whole subscription model. So, you know, we have arrangements with people like Quora and Medium
Starting point is 00:44:50 where we show their content on the search result page. So we committed to sharing 20% of our subscription revenue. Now, we are a pretty small company. We are going to, you know, we have millions of users, but in terms of subscription dollars, we're going to hit like a million in ARR shortly. So we are pretty small. And, but there are informal arrangements already with publishers where we pay them a fixed token amount of money every month in exchange for the privilege to be able to show, you know, essentially these are right-hand side panes that show more content than what would be deemed fair use. So, yes, we've made a commitment. We are paying out small amounts of money. If we grow, this will absolutely become a more meaningful thing. But will that be sustainable for every publisher or will you have chosen partners and then will that sort of distort what gets presented?
Starting point is 00:45:47 Seems like it could get messy in a hurry, huh? 100%. Can and will get messy in a hurry. Again, the rule that we have, we've actually talked about this, there's even an internal doc that we have about this. The nice thing is like, you know, part of the benefit you get from running ads for a decade is you've sort of run into every policy problem that there is on the planet. And so we have a document internally, for example, that says that we are not allowed to take into account whether we have a partnership with somebody when it comes to ranking their content. The ranking has to be organic and things like an answer pane on the right-hand side is more, that's the consequence of the partnership, but it's not that we are going to, that we're going
Starting point is 00:46:33 to change the ranking. And, but I agree with you that this is going to become a, this is going to become a complicated area. And then other hard problems you have decided to tackle because you do not make your own life easy, involve content moderation. Now we're starting to have this question about what should be allowed to show up from these models, to what extent content is moderated or filtered or censored, depending on what kind of word you want to use. Obviously, that has been a problem. Certainly a policy to grapple with at Google for a really long time.
Starting point is 00:47:06 How are you approaching that? So roughly the way a search engine thinks about content is that a search engine's job is to present you with all the legal content in response to like a query from you. We represent the internet. We don't pretend that we know what is right or wrong or that we know who is, you know, whose position is better.
Starting point is 00:47:35 So in that sense, it is a little bit different from, say, a hosted platform. Like Twitter and YouTube are often held to a different standard because they also host the videos that express these opinions. So a search engine in some sense is very much of show all the content that's out there. And by the way, European people who think about things like content moderation and free speech differently would be livid with us for showing response. This is Google, for showing responses to things like how to make a bomb. They're like, you're helping terrorists. And the answer from Google would always be like, hey, listen, as long as the page that tells you how to make a bomb is legal and can exist, it's our job to help you find it. We are not in the business of saying these queries are wrong. Obviously, this does not apply to illegal things.
Starting point is 00:48:25 But hopefully that gives you an idea of like how does a search engine think about content moderation. We do not do open loop content generation. So none of the stuff that, for example, you've seen in a NeevaAI answer is something from a language model that is not constrained to, okay, please focus on the content from these sites. This is what they say. Now summarize it for the Neeva user. Can you give it, this seems like a good place and probably that good place was 42 minutes ago, but to give us a baseline definition of open loop versus closed loop,
Starting point is 00:49:00 is that the difference between generative and not? So, generative AI is a broad term. It is used to, you know, it's used to refer to models that can, like, synthesize, generate content that has not existed before. I think the primary breakthrough in these models is that they understand language really, really well. I think a few years from now, while there is hype about ChatGPT and Sydney and other things, I think what we are also going to realize is that these are amazing, like, interfaces between
Starting point is 00:49:39 us and the world of computers. So traditionally, like things like websites, they're hard to deal with. They want information in a very specific format. God help you if you like, you know, change the order of months and days when you're entering a birthday on some site. Like, they're very finicky. I remember the days of Boolean. Yeah.
Starting point is 00:49:56 They're very finicky. These models are much more flexible. And they are able to understand our input much better. They're also able to generate output that we can consume much better. They can also be used in a way where you just query them and they will tell you about whatever topic that you ask them. What we do is basically constrained generation.
Starting point is 00:50:19 That's what I mean by open loop versus constrained. Open loop is the model has learned once and you're treating it like an oracle and asking it questions. While in our case, as I said, we do constrained generation because we essentially fix the belief system
Starting point is 00:50:35 of the model to content from a small set of pages and say, answer this question within this context. If you can't answer, like say so. Don't try to make up facts. I think it is a revelation to people that these models can make up facts.
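[Editor's sketch] The constrained generation described above, where the model is told to answer only from a small set of pages and to say so when it cannot, can be sketched as a prompt-building step. This is a minimal illustration under assumptions, not Neeva's actual prompt or pipeline: the function name `build_constrained_prompt`, the refusal wording, and the example URL are hypothetical, and the call to a language model is deliberately left out.

```python
from typing import List, Tuple

# Hypothetical refusal text the model is asked to return when the
# provided pages do not contain the answer.
REFUSAL = "I can't answer that from the provided sources."


def build_constrained_prompt(question: str, passages: List[Tuple[str, str]]) -> str:
    """Assemble a prompt that restricts the model to the given (url, text) passages."""
    if not passages:
        raise ValueError("constrained generation needs at least one passage")
    lines = [
        "Answer the question using only the numbered sources below.",
        "Cite the source number, like [1], after each sentence.",
        "Do not use outside knowledge and do not make up facts.",
        "If the sources do not contain the answer, reply exactly: " + REFUSAL,
        "",
        "Sources:",
    ]
    # Number the passages so each sentence of the answer can point back to one.
    for number, (url, text) in enumerate(passages, start=1):
        lines.append("[%d] %s: %s" % (number, url, text))
    lines.extend(["", "Question: " + question, "Answer:"])
    return "\n".join(lines)


prompt = build_constrained_prompt(
    "How long should the loaf bake?",
    [("https://example.com/b", "Bake the loaf at 450 degrees for 40 minutes.")],
)
print(prompt)
```

An open-loop call would send only the question; here the retrieved passages travel with it, so the model's "belief system" for that query is limited to what the search engine found.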
Starting point is 00:50:53 That seems like a thing that maybe not quite enough people realize that open loop means, you know, people are encountering these hallucinations where they're getting these kind of rants or machine freakouts. Well, these models do not understand, as I said, they don't understand provenance. They don't understand, you know, what is right and wrong. They don't understand which site is more trustworthy than others. They don't really understand, like, which author wrote what book. They might understand a little bit of it.
Starting point is 00:51:25 So they're language savants. I think it is pretty amusing what they can do if you just ask them questions. But it's like someone gave you a box that, like, knew all of the books in the world, but didn't, like, really understand it at any deep level. You're asking it questions. It's giving you answers. Sometimes it's right, sometimes it's amusing, sometimes it's horrifying.
Starting point is 00:51:50 And sometimes it's subjective. It feels like that's sort of the important difference too. Constrained models are going to give you facts. They cannot be subjective. That's correct. That's correct. They're subjective. There are also all of these artifacts in how these models are trained
Starting point is 00:52:05 that we don't even begin to understand. Now, we used to run in my ad team, we ran these massive machine learning models like in 2007, and it would freak out everybody that came into contact with that team, like how little we understood about what the models did. They were right on average, but it is impossible to predict any particular action that's taken. So it won't surprise you, for example, that ChatGPT will not write a poem about Donald Trump's
Starting point is 00:52:34 accomplishments, but will happily write a poem about Joe Biden's accomplishments. And I don't think, like, the people that created the model understand why it does that. So these models are pretty mysterious, but... So they don't know why. This is a Jason point that I need to clip and take back to them. They don't know why that's happening. That was not like a decision by some engineer to say, don't do this.
Starting point is 00:52:57 No, no, no. It's like it could have been what they trained on. It could have been feedback data that they got. Like, you know, maybe there were poems of Donald. I don't know. There were poems generated. And somebody pressed like the, you know, the thumbs down button lots of times. and that made the model realize maybe it should not be generating those.
Starting point is 00:53:17 We don't really understand that. But again, I think our obsession on sort of the strange things that these models can create, I mean, it's fine, but there are lots of really good practical uses for these models. It's a little bit like, if you go back and think about evolution, clearly language was a big breakthrough for humans. All of a sudden, you can express concepts. Similarly, you know, tool usage was a big breakthrough. I think what you're seeing with Neeva, for example, is very much just like, oh, a language
Starting point is 00:53:55 model, oh, a tool, that's a search engine. Let's see what they can do together. But these models are now going to be integrated with browsers, with different APIs that can get you facts with structured information. So we are very much at the beginning of what these can accomplish. As I said, it's fun to look at them and see, oh my God, they're saying crazy things. But I think it takes away from all of the practical uses that you can put these models to. And, you know, as a business, that's the thing that we focus most on.
Starting point is 00:54:29 I mean, I wonder, does it start to create a marketing problem for you? If everything that you see, you know, an engine that looks a little bit like Neeva produced is made up or somehow biased in someone's eyes or do you just think this is a long game and we will win it with accuracy in the end? It must be kind of annoying. Yeah, it's a long game and it feels very much
Starting point is 00:54:52 like, you know, so much can happen in one week. You know, all of this was super quiet and then Microsoft came out with all guns blazing last week and they were like, they could do no wrong. And today Kevin Roose writes an article about how deeply disturbing
Starting point is 00:55:08 interacting with Sydney is. It's a wild ride. But as I said, there is so much raw power in these models that, like, for Vivek and me, every week we are like, we can get more done in one week than we could have in like three months of sweating it with the search engine even last year. So there's a lot of benefit that you can get from it. And, you know, this, you know, as a startup, you have to roll with the punches. If answers are a problem, then, you know, we're going to rebrand.
Starting point is 00:55:40 as believable answers and work hard to earn that trust. These are things that you just have to be, you have to be reactive. We used to make quarterly like goals. Early this year, we said like, no more than six weeks. The world is just like got a whole lot more unpredictable. You know, as a technologist, it's exciting, it's exhausting. But as I said, all technologies have positive and negative sides. We are focused on how do we actually use them to create better.
Starting point is 00:56:10 products and you're going to see a lot of people do that. What a fascinating time. You're right at the eye of the storm. Sridhar Ramaswamy is CEO and co-founder of Neeva, found at neeva.com. I think you should all check it out. Thank you so much, Molly. Thanks for your time. Thank you.
