This Week in Startups - 24/7 autonomous agents for everything with Induced AI’s Aryan Sharma | E1854

Starting point is 00:00:00 Number one, you consumed a bunch of information. Two, you did the work and learned how to code and build products. And number three, you got on a plane and you went to where the action was happening and you met people. You networked. Yeah. This is so easy. And the industry is so wide open. So for people in America who are saying, oh, my God, I don't know how to break into tech, consume every bit of content you can about what you want to do and being an entrepreneur.

Starting point is 00:00:28 Number two, learn how to build products. All the information is on the internet. And number three, go to wherever the most action is. It happens to be the Bay Area, but there's also stuff happening in Dubai. There's stuff happening in Tokyo. There's stuff happening in Sydney and Melbourne. Sometimes you've got to get on a plane and meet people and network. It's very simple. One, two, three.

Starting point is 00:00:48 You figured it out, kid. I love it. This Week in Startups is brought to you by Miro. Miro helps take ideas from in your head to out there in the world with its ability to democratize collaboration and input. Sign up for free at miro.com slash startups. Northwest Registered Agent. When starting your business, it's important to use a service that will actually help you. Northwest Registered Agent is that service.

Starting point is 00:01:19 They'll form your company fast, give you the documents you need to open a business bank account, and even provide you with mail scanning and a business address to keep your personal privacy intact. Visit northwestregisteredagent.com slash twist to get a 60% discount on your next LLC. And nuts.com is your one-stop shop for the highest quality foods for business. They offer delicious office snacks, corporate gifts, and wholesale ingredients. Nuts.com is offering new business customers a free gift with purchase and free shipping on orders of $125 or more at nuts.com slash twist. All right, everybody, welcome back to This Week in Startups.

Starting point is 00:02:06 We've been covering this AI trend, well, for over 10 years. But the last year has been absolutely extraordinary since ChatGPT launched, specifically 3.5 and then later 4.0. Everybody else has gotten in on the action. And AutoGPTs and BabyAGIs were all the rage a couple of months ago. What were those? They were like little user agents. They would perform tasks autonomously, maybe string a couple of them together. Like, go do a search for a flight and then buy it for me. This is what the internet and intelligent agents were always supposed to do, but they never worked. And it's a brilliant idea, obviously, a little bit scary to some people, but it's the obvious future. And as is the case, founders always try to make the future get here a little

Starting point is 00:02:53 bit quicker. And today, we're having a founder on the show, Aryan Sharma. He is the CEO and co-founder of Induced AI. And what they're building is agents that live in a browser. And it's called, as I said, Induced AI. Welcome to the show, Aryan. Thanks for having me. I'm scared to be here. Yeah. And you heard my, thanks for coming. You heard my intro there. Explain to the audience what you're building and why it's important. Yeah, I can give you a quick background on why this is even a consideration and how we came to the concept of agents. I think one of the first traces of this, the whole agent idea, stems from the RL world, which is reinforcement learning. It's one of the earliest AI research areas. And what it basically means is you can have these

Starting point is 00:03:47 models that are learning how to do different things and they can try a bunch of different patterns and they get rewards when they're doing something right and then they get penalties, think of it as penalties, when they're doing something wrong. If you're teaching one of these models how to play tennis, every time they hit a good shot, they get plus one. Every time they do something wrong, they get a minus one. And the way intelligence is baked into these systems is you just allow them to do a bunch of patterns and then you keep rewarding and sort of giving them penalties when they're doing

Starting point is 00:04:15 wrong. And over time, they try out a bunch of different patterns and they become smarter. That was kind of the earliest traces of what agents meant, where they have kind of agency to figure out and then self-develop a little bit of intelligence. We used to do all of this. The whole agent premise was restricted to reinforcement learning for a while because you want to try out a bunch of things. And that was kind of the only architecture that allowed people to work and build these agents.
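
A minimal sketch of the plus-one, minus-one reward loop described above. The shot names, their success probabilities, and the epsilon-greedy rule are all assumptions chosen for illustration; this is not how any particular RL system is implemented.

```python
import random

def train_shot_picker(success_prob, episodes=5000, epsilon=0.1, seed=0):
    """Learn which shot to play from +1 / -1 rewards alone."""
    rng = random.Random(seed)
    value = {shot: 0.0 for shot in success_prob}  # running average reward per shot
    tries = {shot: 0 for shot in success_prob}
    for _ in range(episodes):
        # Explore occasionally; otherwise exploit the best-looking shot so far.
        if rng.random() < epsilon:
            shot = rng.choice(list(success_prob))
        else:
            shot = max(value, key=value.get)
        # Reward: plus one for a good shot, minus one for a bad one.
        reward = 1 if rng.random() < success_prob[shot] else -1
        tries[shot] += 1
        value[shot] += (reward - value[shot]) / tries[shot]
    return value

# Hypothetical chance that each shot lands; the learner never sees these numbers.
SHOTS = {"lob": 0.2, "slice": 0.5, "topspin": 0.8}
values = train_shot_picker(SHOTS)
best = max(values, key=values.get)
```

The learner is never told which shot is good. It only sees rewards, and its preference emerges from trying many patterns.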

Starting point is 00:04:40 There's this paper called World of Bits that OpenAI wrote in 2016 and it was Andrej Karpathy and a bunch of others. That was the first version of these agents where they were teaching agents how to control the web, how to play some games, Minecraft, and a bunch of environments. And they kind of tried building it in a reinforcement learning first way. What has happened since then is obviously we've had transformers and we've had these amazing language models come in. And they've opened up a new set of capabilities.

Starting point is 00:05:10 And after ChatGPT came out last year, it was evident that it's got a bunch of interesting capabilities. You can ask questions. It will give you responses. It can sort of reason on its own. There were interesting patterns that the developer community was creating, AutoGPT and BabyAGI versions of this, where you have a prompt itself and then ChatGPT can talk to itself and have these loops, and a little bit of that loop that is important in reinforcement

Starting point is 00:05:36 being created. Give an example of that. Yeah. So a great example of that is, let's say, you give ChatGPT a task, come up with a copy for Jason's new fund, and you give it some description. These are the highlights of the fund. This is what we're focused on, et cetera, et cetera. And then it comes up with a three-line copy.

Starting point is 00:05:56 And then you take the three-line copy and then you just give it back to the system: analyze this copy that you came up with and make it better or give me criticisms. It gives you like three criticisms. These are the things that you can improve. And then you just kind of keep continuing the loop. It takes those three criticisms, goes back. And it's sort of trying to self-heal or self-correct in some sense. But it's all happening at the prompt level.
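
The draft, criticize, revise loop above can be sketched with nothing but prompts. `refine`, `fake_model`, and the prompt wording are hypothetical; `fake_model` stands in for a real language model call so the loop runs offline.

```python
from typing import Callable, List

def refine(task: str, ask: Callable[[str], str], rounds: int = 3) -> List[str]:
    """Draft, critique, and revise using prompts only (no training involved)."""
    drafts = [ask(f"Task: {task}\nWrite a three-line copy.")]
    for _ in range(rounds):
        critique = ask(f"Give three criticisms of this copy:\n{drafts[-1]}")
        drafts.append(ask(
            f"Task: {task}\nCopy:\n{drafts[-1]}\n"
            f"Criticisms:\n{critique}\nRewrite the copy to address them."
        ))
    return drafts

def fake_model(prompt: str) -> str:
    """Stand-in for a real model call: returns a numbered placeholder reply."""
    fake_model.calls += 1
    return f"response #{fake_model.calls}"

fake_model.calls = 0

history = refine("Copy for Jason's new fund", fake_model, rounds=2)
```

Each round costs two model calls, one for the criticisms and one for the rewrite, so the loop is cheap to build but can get expensive to run.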

Starting point is 00:06:17 So this is super easy for anybody to do. There's no reinforcement learning, there's no model architecture, nothing involved. It's just smart use of language models. You're stringing them together in interesting ways to create this sort of agentic behavior. And what was interesting for a lot of people in the developer community

Starting point is 00:06:33 was that now that we have these language models and we have these interesting new capabilities, maybe we can go back to the premise of agents, but design them in a language model first manner instead of reinforcement learning or some of the older versions of agents that were created. And AutoGPT and BabyAGI are the first versions of that, where you have a language model sitting at the center.

Starting point is 00:06:53 You go and give it a command, maybe go to Google, search up for something, and then give me a summary, and then extract the top three highlights from the summary. And then that model that is sitting and taking in your input has access to a bunch of tools. It can choose which tools it wants to use. It can decide to go to this website,

Starting point is 00:07:12 perform this action, capture this data, process this data, and so on. So it has a little bit of this reasoning engine built into itself, and it can execute the task using those tools. So the language model is no longer restricted to its training corpus. It's no longer restricted to just giving you text responses. It has a little bit of agency to use just text input and output to interact with the outside world, to interact with the internet, leverage APIs as different tools and do more stuff for you.
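
A rough sketch of the tool-choosing pattern just described: a model sits at the center and decides which tool to call next. The tools, `run_agent`, and `scripted_choice` are hypothetical. A real agent would replace `scripted_choice` with a language model call and the tools with real API or browser actions.

```python
from typing import Callable, Dict, Optional

# Hypothetical tools; real ones would call a search API, a browser, and so on.
def search(query: str) -> str:
    return f"results for {query!r}"

def summarize(text: str) -> str:
    return f"summary of {text}"

TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "summarize": summarize}

def run_agent(goal: str,
              choose: Callable[[str, str], Optional[str]],
              max_steps: int = 5) -> str:
    """Let `choose` (the model) pick the next tool until it returns None."""
    state = goal
    for _ in range(max_steps):  # hard step limit so the loop always ends
        tool_name = choose(goal, state)
        if tool_name is None:
            break
        state = TOOLS[tool_name](state)
    return state

def scripted_choice(goal: str, state: str) -> Optional[str]:
    """Stand-in for the model's reasoning: search first, then summarize, then stop."""
    if state == goal:
        return "search"
    if state.startswith("results"):
        return "summarize"
    return None

answer = run_agent("best coffee", scripted_choice)
```

The step limit is one simple way to keep an agent from wandering indefinitely when the model never decides it is done.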

Starting point is 00:07:37 Now, you can't do that currently with ChatGPT. If you try to get it to do things on the internet, I did the other day. I said, hey, tell me the best pairs of boxer shorts. It gave me six different types of boxer shorts. And I said, okay, buy me one pair of each in medium, 32 waist, and it was like, I can't do that. So in your world, it could do that. It could go search the web for those six brands, put it in the cart, check it out, use my credit card, know my address, and then ship it.

Starting point is 00:08:05 That's what you were building, correct? Essentially, we are building, we kind of think of them as mini digital workers. They are these language models that are sitting and you can give them instructions and they have full access to a browser. We're only doing browser stuff now. There's different ways of doing this where you can have it at desktop level, you can have it at mobile level, you can have it at the EDI level, but we sort of think that the browser is the most exciting and the broadest way to do it.

Starting point is 00:08:32 So I could create a personal shopper using your software. I could tell the personal shopper, your job is to find the best item, make a list of those, report back on this, and then I'll tell you which ones I want to order and then you go order them and make sure they get shipped based on this criteria. Or I could even take that out. Just say, get me the three best coffees, the highest rated ones

Starting point is 00:08:57 and ship them to my home address and it would actually do it. Yeah, so the one caveat there is there are two ways to think of these agents. One very exciting premise of these agents is running them fully autonomously, which is what AutoGPT and a bunch of other agents originally started with. You just give it a text description like this.

Starting point is 00:09:15 Go find me shoes, or go find me flight tickets, and then it automatically figures out where it needs to go. It'll ask you questions to figure out what you want and then automatically figure out whatever needs to be done to get you your output. That is great. That's kind of the future that we are building towards. But in the near term, in the next four, five, six months, based on the current capabilities of the models that we have, it is a great vision.

Starting point is 00:09:39 It's a great demo. But when you actually implement it, there's problems with reliability because fundamentally these models are non-deterministic and they come up with new outputs every time. So you cannot be sure that every time you ask, you know, go and purchase shoes for me, it's going to get you shoes from the exact same place. And more likely than not, it will get lost because there are no rules surrounding them. There are no guardrails around these models. All right. Founders always ask me for pitch deck punchups.

Starting point is 00:10:10 And you know what? I got some great news for you. We worked with the team at Miro, the awesome whiteboarding software I've been talking about, to create an amazing pitch deck template for founders, which you can see if you're watching the video right now. This is going to help bring your pitch deck from zero to hero, from zero to VC ready. And our founder university participants love this template. We use it all the time.

Starting point is 00:10:29 It saves them time and it gets them more meetings. So head to miro.com slash miroverse, M-I-R-O dot com slash miroverse, and search for pitch deck to check it out. And if your team is hybrid or fully remote, Miro is so useful for you. It's like an old school in-person whiteboarding session, but distributed and asynchronous, so you can do it on your own time.

Starting point is 00:10:48 Miro lets you brainstorm ideas and collaborate on projects from anywhere in the world. When you think Miro, think zero to one, but faster. And Miro is so much more than a simple digital whiteboard. Your team can collaborate on important stuff like research, design, planning, and feedback cycles. And faster inputs equals faster outcomes. And we all know, product velocity and startup velocity is how your company is going to win. So to access our new Miroverse template and thousands of others, sign up today for a free

Starting point is 00:11:15 Miro.com slash startups, M-I-R-O dot com slash startups. That's miro.com slash startups to sign up for free. So you're trying to do this with repetitive tasks like SDR, a sales development rep, is a perfect example of a job people hate. It's repetitive. You go and find targets to sell some SaaS software to. Everybody gets these annoying, you know, email sequences. But they obviously work. People are still doing them. So that's one of the first use cases that you're building. Yes.

Starting point is 00:11:46 So the simplest analogy, the example of that is, it's all stuff that was done previously with RPA software. So UiPath and, you know, some of these like large RPA companies that have existed for several years and decades. Their idea was let's string together tools that don't have APIs and we can connect them together and build these bots. What did you refer to those as? What kind of companies?

Starting point is 00:12:09 It's called robotic process automation. It's the RPA industry. RPA, robotic process automation, as opposed to business process automation. RPA is these repetitive tasks, robotic process automation. Interesting. Yeah, and there it's just, you have these robots that are created, mini digital scripts and workers that do a bunch of repetitive tasks. And that has existed for a long time. So it's a large-spend category for a lot of enterprises. That's how enterprises automate work, especially when they're dealing with tools that don't have APIs

Starting point is 00:12:42 and that you cannot just string together through Zapier or existing API tools. LinkedIn would come to mind, right? LinkedIn slows you down. It doesn't have an API. Everybody wants an API. They don't want to give an API because they know people would go fast. But with robotic process automation,

Starting point is 00:12:58 you set up a browser, you search, it goes and does these things automatically looking for, I don't know, CTOs, chief technology officers, puts them into a database, you know, and sends them a link, an email, whatever, tries to guess their email and then validate it with an email validation service. And that's how people have these databases,

Starting point is 00:13:18 if they ever try to sell you databases based on LinkedIn data, it's these RPAs that have done that searching. And of course, they get turned off if they load too many pages, so it's a bit of a cat and mouse thing, correct? LinkedIn is a great example. You can look at a bunch of legacy industries like healthcare

Starting point is 00:13:33 that have insurance platforms and claims processing platforms. All of these don't have APIs, so a lot of real legacy industries rely on RPA. The problem is that RPA has existed for a long time. It used to be done in a very manual way. Even though the eventual goal is to automate a workflow, it takes a lot of effort to kind of set up these processes because these RPA companies go top down.

Starting point is 00:13:56 It's almost like for every dollar you spend on implementing RPA, you have to spend $5 or $6 on consultants, who will actually come and understand your process. They will buy one of the software from one of the vendors. And then the reason it's so expensive and time-consuming is that traditionally, if you want to string together and automate a workflow on the browser, you have to script every step. So a simple Google search for Launch or This Week in Startups is: go to Google. You will have to click on the field, get the selector, the HTML selector, which is behind the scenes, the DOM or whatever of the webpage. Then click on that field, type in your text, get the identifier of the button, then go to the page.
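
To make "script every step" concrete, here is a toy sketch. The page is modeled as a plain dictionary keyed by selector, and the selector names are made up; real RPA tools drive an actual browser, which this does not attempt.

```python
# A toy page: CSS-style selectors mapped to elements. Selector names are invented.
page = {"input.search-box": {"value": ""}, "button.submit": {"clicked": False}}

# The hand-written script: every interaction is pinned to one exact selector.
SCRIPT = [
    ("click", "input.search-box", None),
    ("type", "input.search-box", "This Week in Startups"),
    ("click", "button.submit", None),
]

def run_script(page, script):
    """Replay each scripted step against the page, in order."""
    for action, selector, text in script:
        element = page[selector]  # raises KeyError if the site renamed this selector
        if action == "type":
            element["value"] = text
        elif action == "click":
            element["clicked"] = True
    return page

run_script(page, SCRIPT)
```

Every field and button in the workflow needs its own line in `SCRIPT`, and each line depends on a selector the site owner is free to change.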

Starting point is 00:14:35 It's just every field, button, element that you'll interact with on the web in completing your workflow, you have to manually go and script it. And the problem with that is scripting takes time, but also these scripts can break. Because these websites keep changing layouts all the time, they keep changing selectors and class names all the time. And because you're hard coding it to selectors and class names from the HTML, if any of that changes, your scripts will break. So you have to constantly keep maintaining these scripts. So that's kind of one part of why. And this is where a language model might come in, because the language model would look at the page if your profile page gets updated,

Starting point is 00:15:09 or I should say LinkedIn changes profile pages, and they just move the HTML around and people who are looking for what city you're in, and they called it location instead of city, well, that breaks the RPA, right? And so now I just said, what's the location? The language model should be able to figure out, what's the location in the first, I don't know,

Starting point is 00:15:27 that, you know, 500 words of text on the page. Yeah, the language model is doing real-time inference on every run, so it can basically handle these changes. You don't have to spend as much time scripting. You just give broad directional input, like the city. You don't need to point to where the city is. It looks at the page and gets the thing reliably. You can obviously set guardrails around that as well.
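
A hedged sketch of the contrast in this exchange: a lookup hard-coded to a label breaks when "City" becomes "Location", while a lookup driven by a description survives, with a guardrail on the result. `toy_match` is a tiny synonym table standing in for model inference, and every name here is hypothetical.

```python
def scripted_lookup(fields: dict, key: str) -> str:
    """Traditional approach: the exact label is hard-coded."""
    return fields[key]

def described_lookup(fields: dict, description: str, match, allowed=None) -> str:
    """Ask `match` (standing in for a language model) which label fits the description."""
    label = match(description, list(fields))
    value = fields[label]
    # Guardrail: reject values outside an explicit allow-list instead of trusting the match.
    if allowed is not None and value not in allowed:
        raise ValueError(f"unexpected value {value!r} for {description!r}")
    return value

def toy_match(description: str, labels: list) -> str:
    """Toy stand-in for model inference: a small synonym table."""
    synonyms = {"city": {"city", "location", "based in"}}
    for label in labels:
        if label.lower() in synonyms.get(description, {description}):
            return label
    raise LookupError(description)

old_page = {"Name": "Ada", "City": "Austin"}
new_page = {"Name": "Ada", "Location": "Austin"}  # the site renamed the field

city = described_lookup(new_page, "city", toy_match, allowed={"Austin", "Denver"})
```

The guardrail matters because the match step can be wrong. Checking the result against known-good values turns a silent error into a visible one.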

Starting point is 00:15:47 That's kind of one problem of RPA that gets solved in this new world. The other problem is that with traditional RPA, because it didn't have any reasoning skills, you can only automate things that are, you know, rule-based. Let's go to Google, put in this exact text, click on the first link, then go to this exact page. You cannot do things like, you know, go to this LinkedIn,

Starting point is 00:16:07 analyze if it fits my ideal customer persona, or analyze if this falls into the five cities that I want to target, and then based on that, you know, do this action or draft a custom message. So any level zero cognitive reasoning tasks that language models can be pretty good at, you cannot do them with traditional RPA. So we've kind of taken this whole industry of how RPA was done, and redesigned it in an AI-native way: how can we make setup and maintenance easy?

Starting point is 00:16:37 You have a demo, you can show us actually how this works, yeah? Yeah, I can pull up a quick demo. It should give you a good idea of how the base version works. So for people who are listening, we'll describe for you what's happening on the screen. Yeah, so you just go to the Induced platform. It's just an empty screen with no workflows right now. I'm going to click on new workflow on the top right and it just asks you, you can either design the whole workflow from scratch where you give step-by-step input or you can use

Starting point is 00:17:06 AI assistance, and I'll go through what that means, but it just asks for a workflow name. I'm going to put in employee timesheet and I'll run through what workflow I'm making. So I just put in employee timesheet and then I'm going to put in a step by step. This is just an English description of a workflow that I want to design, and for context, the workflow is basically, think of a construction company or a company that has warehouses or physical centers across the country and they have employees coming in and filling in paper timesheets. These are basically a log of when they're coming into work, how many hours they're working, what breaks they're taking, etc., etc. And this company takes in all of these paper timesheets,

Starting point is 00:17:45 puts them on an Airtable, and they have to manually calculate payroll for every employee because it's all on paper. So you have to take whatever is on paper, understand it, then run it against a company policy doc to calculate. This guy worked five hours. These are the deductions. This is overtime. And then go and enter whatever payroll you've calculated back into an Airtable. So we did this for one of our early pilot customers. They have a real back office of 15 people in their finance and ops team that does this. But in the new world with some of these agents,

Starting point is 00:18:15 you can just describe this on our platform. It's crazy. So it has, hey, identify, access the Airtable payroll, navigate to the employee payroll base from stored variables, identify a relevant employee, search for employee with the payment status, review timesheet data, open the employee's timesheet, et cetera, and then it says calculate the payment, access the employment payout info sheet on Google Docs. Using the timesheet data, note the total hours worked, end time minus start time, deduct any breaks, multiply the hours worked

Starting point is 00:18:47 by the hourly rate mentioned in the Google Doc to get the gross amount, deduct any break time expenses or other deductions. And so this is the step-by-step process that some human did. You're just describing it in essentially plain English, not code. Yeah. And then you just click Create Workflow. What we do with the AI assistance is, unlike a lot of traditional, traditional is a bad word to use because it's all very new, but unlike a lot of other autonomous agents,

Starting point is 00:19:14 we don't directly start running it based on English. And this is actually the first time we are ever showing this product on media, like a podcast or a video. So this is a This Week in Startups exclusive. Thank you. But it takes in whatever input you've given. It compiles it.

Starting point is 00:19:31 We have this middle layer in between, which is just the input that you've given, but structured into smaller steps. So all of the steps that I put in, it has just broken them down, chunked them. So it's basically access Airtable, then loop through the entries, pick one entry, just smaller, smaller chunks of whatever I described. And the reason this is useful is one for visibility, for whoever is designing this workflow. They can see, whatever the input, what's the final translation. And then what you can also have visibility.

Starting point is 00:19:58 is we split it into a bunch of different action types. So it's going to the web page is one action, clicking, filling, all of those are standard web actions. Then there are a bunch of data actions like looping, identifying, filtering that you can do on the page. And then we have the agent blocks, which are all of the smart actions. So once you go to an employee's timesheet,

Starting point is 00:20:19 any calculation that you want to do, you can use the agent block to delegate and get input from a model. So it's basically a bunch of different block types, depending on what your workflow step is, that we automatically identify and put in. You can obviously edit, you can obviously make changes to this. And at the end, you can have standard, you know, ELT or outputs in whatever format. So after the workflows is run, you know, you can have triggers that if you have API

Starting point is 00:20:41 call, put it in a Google sheet or anything that you need. So it just puts it into these formats. And if you notice, for those of you who can see the screen, it automatically puts in these variables across the steps. So if your workflow involves capturing data from one place and then using it in another step later on, it can do that. I'll talk more about the browser environment, but this is basically a runtime that is designed for agents.
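
One way to picture the compiled middle layer is as a list of typed steps that pass variables to one another. The sketch below is an assumption about shape only: the `Step` class, the kind names ("web", "data", "agent"), and the action names are illustrative and not Induced AI's actual format.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    kind: str                                   # "web", "data", or "agent" (illustrative)
    action: str                                 # e.g. "goto", "capture", "calculate"
    reads: list = field(default_factory=list)   # variables this step needs
    writes: str = ""                            # variable this step stores, if any

WORKFLOW = [
    Step("web", "goto_payroll_table"),
    Step("data", "capture_timesheet", writes="timesheet"),
    Step("data", "capture_policy", writes="policy"),
    Step("agent", "calculate_payroll", reads=["timesheet", "policy"], writes="payroll"),
    Step("web", "fill_payroll", reads=["payroll"]),
]

def check_variables(workflow):
    """Return (action, missing names) for steps that read a variable nobody stored yet."""
    stored, problems = set(), []
    for step in workflow:
        missing = [name for name in step.reads if name not in stored]
        if missing:
            problems.append((step.action, missing))
        if step.writes:
            stored.add(step.writes)
    return problems

issues = check_variables(WORKFLOW)
```

Having a structured layer like this is what makes the workflow inspectable and editable before anything runs, which is the visibility benefit mentioned above.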

Starting point is 00:21:05 So it has access to its own file system, its own memory, it can store data, retrieve data. So it's basically firing up a computer in the cloud, essentially, or a browser session. I don't know if you're using Chrome or Chrome OS or Windows. You can light up a virtual browser or a virtual machine anywhere, AWS, etc. So you're basically popping up a desktop,

Starting point is 00:21:27 and then running this stuff, yeah? Yeah, so we spin up a Chromium fork on the cloud. So it's a virtual machine with a Chromium fork. It's a custom browser that we've designed specifically for running bots and autonomous agents and systems like that, which we can go into. But it spins that up, which is why it has access to all of these tools that it can use in the flow. And then once you're happy with the workflow, just click run.

Starting point is 00:21:50 And this is kind of interesting because the way it runs is, the browser has spun up on the cloud, and you can see a real-time live stream of the browser when the workflow is running. So it's as if you're watching a team member's screen. You can see the stuff that's happening on the remote browser, streamed live here on the left. And on the right, you can see step by step what's being done. So you can see the stream, goes to Airtable, it loops through the entries,

Starting point is 00:22:13 it picks one of the employees, opens the employees data, opens up this time sheet. And then now it's going to run OCR. And it can use tools on its own as long as you're giving it step descriptions. I'll use OCR document processing to capture. text input from this document, store it in memory, then I'll go to this Google Doc, which has information about how you calculate payroll based on the timesheet data. It'll capture this in memory as well.
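
The payroll rule read out in the workflow description (end time minus start time, minus breaks, times the hourly rate, minus deductions) is simple arithmetic. A small sketch, assuming same-day shifts in "HH:MM" format; the function name and the 25.0 hourly rate are hypothetical.

```python
from datetime import datetime
from typing import Tuple

def gross_pay(start: str, end: str, break_hours: float, hourly_rate: float,
              deductions: float = 0.0) -> Tuple[float, float]:
    """Return (hours worked, amount payable) for one same-day timesheet entry."""
    fmt = "%H:%M"
    worked = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    hours = worked.total_seconds() / 3600 - break_hours  # deduct breaks
    return hours, hours * hourly_rate - deductions

# Hypothetical entry: 08:00 to 18:00 with a one-hour break at 25.0 per hour.
hours, payable = gross_pay("08:00", "18:00", break_hours=1, hourly_rate=25.0)
```

The arithmetic is the easy part. In the demo, the work is in reading the timesheet off paper and finding the rate in the policy document, which is where OCR and the reasoning step come in.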

Starting point is 00:22:38 And then once it has both of these things in memory, it's going to compare both the data points, run a reasoning step, which is why reasoning steps usually take longer. But it's going to take both of those things in, calculate a final payroll amount, and you can see the variables being referenced here. It'll go back to Airtable, and it's just going to fill in. We've got a bunch of fields that were to be calculated. It's just going to come back, fill in. Total hours worked is nine, break duration is one, overtime zero.

Starting point is 00:23:05 Total payable is two seven. It'll put in the hours. It'll put in the total amount. And the interesting thing is because this is all running remotely, we kind of call this the mission control view of the platform where you can have a bunch of different browser instances doing different things, or all the same thing, running at the same time. So it's like a real back office.
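
A loose sketch of the "mission control" idea: many independent sessions run concurrently and one view collects a status row per session. `run_session` is a stand-in that only yields control; a real session would be driving a remote browser. The session names and step counts are made up.

```python
import asyncio

async def run_session(name: str, steps: int) -> dict:
    """Stand-in for one remote browser session working through its steps."""
    for _ in range(steps):
        await asyncio.sleep(0)  # yield control, as real network or browser I/O would
    return {"session": name, "steps_done": steps, "status": "ok"}

async def mission_control(jobs: dict) -> list:
    """Run every session concurrently and collect one status row per session."""
    return await asyncio.gather(*(run_session(n, s) for n, s in jobs.items()))

rows = asyncio.run(mission_control({"airtable": 5, "linkedin": 3, "aws": 4}))
```

Scaling up or down in this picture just means passing more or fewer jobs, since no session depends on another.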

Starting point is 00:23:26 You can scale up, scale down, you can have 15 instances running at once, you can have five. You've got the Airtable instance, you've got one running LinkedIn, we've got one on AWS. There's a TechCrunch thing here and you can kind of see live streams of all of them. So the task of a manager who's managing a 15 member back office team goes from actually delegating tasks to real people to just sitting and watching 15 of these screens, and if something goes wrong, just go in and flag it. But otherwise you can have a top level view of everything. So if I'm running some insurance company, team, customer support, whatever,

Starting point is 00:23:59 my startup has these six or 15 agents running, doing the tasks that humans previously did, and you just watch them and make sure they're doing it correctly. And it feels like you've been working on this for less than a year. How many months into this are you? We started working on this around April of this year. March, April is when we started. So you're six months into this process?

Starting point is 00:24:21 Yes. And we've kind of identified. So just as a disclaimer for this, this run was very well described. So I gave explicit input, I described everything, and that's how it ran. It doesn't require me to manually script the whole thing and it runs reliably once I've given input, but it requires explicit input. And I think the approach that we've taken architecturally, which I was talking about, is like the most important thing in this six-month process that I think has helped

Starting point is 00:24:50 us a lot. Instead of designing an automation product that sits in a Chrome extension and then you use it synchronously on your computer, where I have an extension, I record something, and then I can replicate, or I have an extension that is an assistant and I give it commands and micromanage it doing things, we designed everything to be remote and have its own environment. So we took Chromium, forked it, made some changes to it in how the DOM comes out, how the HTML comes out.

Starting point is 00:25:15 Has anybody started using this yet in like a real-world situation or are you still in the laboratory? We launched early October and then we went live with a couple of folks. We are live with about 15 now and that is across different sizes. Small companies, mid-sized companies. We've kind of found our sweet spot in these mid-market to upmarket companies that are operating in sort of older industries like healthcare, you know, financial services, etc. So it's live with a few of them. We are constantly, we're changing the form factor a little bit because the text input that you

Starting point is 00:25:50 saw where you have to describe your workflow and then it translates to a workflow, which is just the actual workflow that's running. That is not, it's the easiest way to start. That's how most AI language model products start, but still not very intuitive if you think of it from the user's perspective, because when you're typing the text description for your workflow, you basically have to have a browser screen that's open and then you're typing out every step where I'm going to this and I was just like type out, okay, go to this. Then what am I clicking on?

Starting point is 00:26:16 I just type out, you know, click on this. And what we are designing now, and this is the video that I'll share, so you can probably edit it in because it's not even an alpha right now. We're going to put it out in a few weeks. It's sort of like, think of a Colab notebook, a Google Colab notebook, where you have a bunch of cells. And if you want, I can show you Google Colab as a reference. So this is how Colab works. It has this bunch of cells and then you can run each cell. So I can be like, you know, print hello world.

Starting point is 00:26:49 This is how regular Colab works. And then each cell can be run. And then you can have a bunch more cells that come together. And you can string together cells to create something. We are creating a new interface for how we set up workflows. But instead of you having to type everything at once and then edit them and then run it and then come back and debug, we have this environment that is set up for you to create workflows. So you kind of just come in and instead of typing out code snippets, you just type out

Starting point is 00:27:15 your English steps. And in the place of this section on the right, we have the stream that we had opened up. So you just come in and go to Google, and then you can actually see it go to Google. And you're like, search for This Week in Startups, and actually see it happen. So it's basically real time, you know, setting it up.

Starting point is 00:27:32 Once you're happy with every step, just click confirm and then it automatically migrates to a stored workflow. It's much more intuitive. It's easier. If you had a press monitoring service, you could say, search the web for this person's last name or this person's name, go to Google News, click on the link, summarize it,

Starting point is 00:27:48 email it to this person, let them know they were in the news, you know, or mentioned in a news story. That's a job that PR people do for a living. Now, are you going to get the, you know, it'll probably make some mistakes and probably will put old stories in. It's going to, you know, not be perfect in your world, but it would certainly make the person who does PR clipping and what they call media monitoring or web monitoring,

Starting point is 00:28:12 it would make them bionic. They would be able to do 100 times what they do every day. So essentially, that's what you're doing. And then anybody who has a business process can basically script this. Starting a business used to be such a painful process. You needed to get a lawyer. There were tons of fees. It was a mess, but not anymore. Just check out Northwest registered agent. They're going to help you form your company fast. Remember, speed matters. And then they're going to get you the docs you need to open a business bank account instantly. Then they're going to provide you with mail scanning and a business address.

Starting point is 00:28:47 They're going to do all that while keeping your personal privacy intact. Northwest can form LLCs, corporations, and nonprofits. And here's why founders love Northwest. There are no hidden fees. There's no upselling. You can call them or cancel at any time. And Northwest has the best-of-both-worlds solution. It's simple and self-serve, but they can be hands-on if you need help with their amazing registered agent service. Northwest provides everything you need to start and maintain your business. And they're giving Twist listeners a 60% discount. For just $39 plus state fees, they'll form your LLC, corporation, or nonprofit. So visit NorthwestRegisteredAgent.com slash twist today. NorthwestRegisteredAgent.com slash twist.

Starting point is 00:29:30 So what happens to the business process outsourcing industry in India, where you're from? And I know you have investment from Sequoia India, or what is now, there's a new name for that firm. Yeah, Peak XV. Peak XV. Yeah. So you have an investment from them. When you show this to the people who are in business process outsourcing, do their heads just blow up and go, oh my God, what's going to happen to these 100 million people employed in the business processing world? Or has this just been a continuation and there's always going to be more business to process? Yeah, I think, one, it's an extremely large market where this, where outsourcing happens. And it's just

Starting point is 00:30:14 increasingly, we're all, the tech is growing, but it's like the legacy industries, and there's all sorts of unique things that keep coming up, which keeps growing the business process outsourcing, or the back office, or the Philippines, India, all these kinds of work, things that get delegated outside, and you have these economies that are created for people that are remotely doing these tasks. So I think the market is huge. They always need more automation. So every time we go, and we've had a lot of people who run these global, they're called GCCs, so the global back office centers, and there's like a bunch of different words for them. We have a lot of them reach out. And they are always looking for automation, even though they have, you know, 50,000 people sitting and doing these things, because they want to make it more efficient. They want to be able to take on more business.

Starting point is 00:30:56 So a lot of them want to be customers of this so they can improve internal processes. But I think in general, the trend that we're seeing is this will kind of take the low-level things first. So the extremely repetitive things that have very little cognitive involvement are almost always easy to automate, even with traditional RPA. With these models coming in, we'll be able to do a little bit of cognitive tasks as well, where a little bit of thinking, filtering, you know, profile validation, lead enrichment, things like that will go in.

Starting point is 00:31:24 But then there's always, you know, complex tasks or sensitive tasks that you want these back offices to do, like payments, et cetera. So I think it'll be a slow migration, but this is kind of just the trajectory of tech. Incredible. Yeah, GCC stands for global capability centers. This is in India. When you want to outsource a back office, say you're Uber or Airbnb

Starting point is 00:31:44 and you have a huge amount of, I don't know, refunds to process or claims to process or complaints, whatever. You would hire a GCC to work with you on the best practice, hire some people in a lower per hour location and to just get it off your plate in America

Starting point is 00:32:02 where, you know, people candidly don't even want to take these jobs. No matter what you pay them, they just wouldn't take them. And, you know, that's what's been happening for, I don't know, for all time, but certainly in the last 30 or 40 years, this process of outsourcing. So I guess, since you're so close to all of this and you're building it, you're talking about very low-level things, people who are doing data entry, I guess, data cleanup, SDRs. These are, you know, the very lowest-paying white-collar jobs, I would say, right?

Starting point is 00:32:35 So, you know, in three years, what jobs do you think you could do? Can you do a bookkeeper and accountant in five years? Could you do a paralegal? In 10 years, can you do every job? Could you do a salesperson's job in full? You're sending stuff, negotiating, et cetera. Where do you see this winding up? I think the way I think about this, and I can share a few tweets if you want.

Starting point is 00:33:03 It should be interesting to see as a reference. There was this paper that Jim Fan from Nvidia released, and this blew up on Twitter, you've probably seen it before, where it was basically GPT-4 playing Minecraft. It's called Voyager. The way it operates is it does a bunch of things. And then once it does a bunch of things, it analyzes those things and it buckets them into skills. So, you know, chopping the tree, it's done a bunch of times.

Starting point is 00:33:31 It sees what the reactions are, and then it's going to bucket that into a skill of, you know, this is chopping. And then every time it does chopping, it kind of feeds into that loop of refining the chopping skill. And basically what's written here is it unlocks a new training paradigm where training is execution. The runs that this agent is doing in Minecraft help it iteratively compose a bunch of skills that it can slowly learn. But it's all confined to Minecraft. It's a constrained environment where this agent is operating and developing skills and continuously improving its skills. I think the way to think about how slowly

Starting point is 00:34:07 these agents become powerful is we will not have, in my view, generally capable, super autonomous global agents that can do everything. The way we'll get closer to these agents being more powerful is the way you said. Bookkeeping: you take three tools that are involved in bookkeeping, maybe QuickBooks, maybe, you know, Excel or Google Sheets, and maybe like a bunch of code interpreter calculation stuff. You get these agents to use these tools. You build this repository of skills across these tools. Or you kind of let them explore and build this repository of skills,

Starting point is 00:34:38 and then the more they run, the better those skills get. As much training data as we can put into them, they slowly get better at using those three, four tools, doing those three, four kinds of interactions on those tools. And then when you kind of go to that agent and say, do this calculation for me on my QuickBooks and then run this math function or do a prediction or a regression for me, it's able to use those three tools well and start doing things. So I think it's going to happen sooner.

Starting point is 00:35:03 I don't think it's two, three, four, five years out. We'll start seeing specialized versions of these agents, and specialized agents that can use a bunch of tools, come in in the next few months. It's just, the way to approach that is get as many real-world use cases, get as many real-world tools, get these agents to learn these real-world use cases and tools, run them a bunch of times, and then kind of keep improving how they can. So I think that it will slowly go in skill by skill. Makes sense.
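
The repository-of-skills idea can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not how Voyager or Induced AI actually stores skills: the class names `SkillStats` and `SkillLibrary`, the tool name, and the skill names are all hypothetical, and "getting better" is reduced here to tracking a success rate per skill. Every run is bucketed under a (tool, skill) pair, and the more a skill is run, the more trustworthy its record becomes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SkillStats:
    """Running record for one skill on one tool."""
    runs: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.runs if self.runs else 0.0


class SkillLibrary:
    """Hypothetical repository of skills, keyed by (tool, skill name)."""

    def __init__(self) -> None:
        self._skills: Dict[Tuple[str, str], SkillStats] = defaultdict(SkillStats)

    def record(self, tool: str, skill: str, succeeded: bool) -> None:
        # Every run feeds back into the record for that skill.
        stats = self._skills[(tool, skill)]
        stats.runs += 1
        stats.successes += int(succeeded)

    def best_skills(self, tool: str, min_runs: int = 2) -> List[Tuple[str, float]]:
        # Rank the skills for one tool, ignoring those with too few runs to judge.
        candidates = [
            (skill, stats.success_rate)
            for (t, skill), stats in self._skills.items()
            if t == tool and stats.runs >= min_runs
        ]
        return sorted(candidates, key=lambda item: item[1], reverse=True)


library = SkillLibrary()
for outcome in (True, True, False):
    library.record("spreadsheet", "sum a column", outcome)
for outcome in (True, True):
    library.record("spreadsheet", "filter rows", outcome)
library.record("spreadsheet", "pivot table", True)  # only one run so far

print(library.best_skills("spreadsheet"))
```

Keeping the library scoped to a handful of tools mirrors the point made in the conversation: a specialized agent improves skill by skill inside a constrained environment rather than trying to be generally capable everywhere.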

Starting point is 00:35:29 And so what happens when some bad actor pops up 15 windows to go cause chaos on the open web, across services? What do you think the potential there is? Because unlike ChatGPT, if you ask it, you know, ChatGPT is not going to go out and start trolling somebody on Twitter or Reddit or harassing them, let's say. But you could start very easily with your software. And I'm not saying you would do this, obviously, but it's obviously on the path.

Starting point is 00:35:57 And people do this already. But you could fire up 15 accounts or 15 browser windows, 15 different accounts, and then try to maybe swing an election, right? We saw the Russians had boiler rooms doing this. And they were, you know, all, we found out about it. That was part of the Mueller report. Trigger warning, Russia Gate. But they were actually doing this, but they were using humans, right?

Starting point is 00:36:23 And you do have those boiler rooms, I think, in Manila, India, and other places where people do fake reviews of products. So somebody here could fire up 15 of these windows and start posting pro-Palestinian, pro-Hamas, pro-Israel, whatever, comments, or just generally causing chaos. So how do you think about that? Because your tool would allow a neophyte, a non-intelligent person, you know, a bad actor, to go absolutely buck wild and destroy everybody else's experience on the web. I think so. First part is, a lot of this is what OpenAI has also dealt with, with their browsing

Starting point is 00:37:03 being limited to only a few websites, and they had to take down browsing in between because it was bypassing authenticated pages and giving you paywalled content through its scrapers without actually. So there's like a bunch of things that are happening there. So I think it's important to add guardrails. And the way we think about this is, because we've designed this as a browser environment that is meant for bots, we can build that in. So we don't need to build it up like a human browser environment. We can add a bunch of guardrails that are specific to these bots that allow them only a limited set of capabilities,

Starting point is 00:37:34 and a bunch of websites are just out of bounds, a bunch of capabilities are just out of bounds, and the users that are controlling these things, they can define a bunch of rule sets, but there are a bunch of global rule sets that just prevent bad things from happening. So your terms of service and then what it's allowed to do, you could say, hey, listen,

Starting point is 00:37:51 we don't want you using these bots to go post to social media sites and ruin them. So you could just, I assume you're just banning the ability to do that. These bots can't go out and post on Reddit or whatever. Yeah, and we're banning a bunch of these. We are going safer than we should be going right now just because it's easier to build up the safety spectrum than come down. And the other kind of way we think about this is slowly it's going to evolve with the platforms where maybe if a bunch of bots are being run on Airbnb, they should be involved in the decision-making process of what kind of bots are allowed or not allowed. And there will be some sort of transaction that will eventually happen given the rate of progress that these autonomous agents and browser bots are seeing.

Starting point is 00:38:34 that websites will have to define, and this has happened for a long time. So robots.txt files exist on the internet for a lot of websites. And OpenAI recently, a couple of months ago, open-sourced their scrapers, and they opened their signatures.
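
As a rough illustration of how a well-behaved bot could honor robots.txt, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `example-agent` name are made up for the example; `GPTBot` is the user-agent token OpenAI published for its crawler. The sketch only shows the site-side opt-out being respected, not any broader standard for agents.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: block OpenAI's crawler entirely, and keep
# every other bot out of the checkout pages only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent: str, url: str) -> bool:
    """Return True if this robots.txt lets the given bot fetch the URL."""
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("GPTBot", "https://example.com/listings"),
    ("example-agent", "https://example.com/listings"),
    ("example-agent", "https://example.com/checkout/pay"),
]:
    print(agent, url, allowed(agent, url))
```

A bot that checks `allowed(...)` before every page load is acting as the good actor discussed in the conversation; nothing in robots.txt technically stops a bot that chooses to ignore it.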

Starting point is 00:38:47 So a website can choose if they want to let OpenAI scrape them or not. And I think we'll see similar versions of that with these bots and agents where you'd be able to allow bots to run on certain parts and some sort of web standard. Somebody wanted to go to Airbnb, and they thought,

Starting point is 00:39:03 oh, let me just go to the checkout and check out with a fake account, then cancel the reservation just to ask the person a question, right? So once you have the booking, you can, I guess, have a dialogue with the person, and you wanted to fish for information or whatever. You could just say, you know what? That's not an allowed use case. And then you have a conversation with Brian and the team over at Airbnb, and they say, yeah, we don't mind somebody building an agent that looks at up to 50 pages per week or something.

Starting point is 00:39:30 But after that, we want you to go through the API or we want you to get a license. or something, and you'll just be a good actor watching that. Of course, bad actors will do what you're doing and not require that. So there's going to be a bit of chaos, I think, we'll all predict on the open web in the coming year or two. And this is a little bit of sci-fi,

Starting point is 00:39:50 but I think it'll sort of be a new version of CAPTCHA where, the way we have CAPTCHAs differentiating humans and bots, you probably have a version of CAPTCHAs that differentiates good bots and bad bots, and you have some way of proving what you're going to do. Maybe there's some digital signature exchange that happens there. But I think we're all in this new world.

Starting point is 00:40:09 And I think the reason a lot of the larger companies are also moving slower now is because they're afraid of infringing policy or content policy and terms of service, etc., of different web products. And it'll slowly evolve on both sides, where they get more clarity on what they want to allow. And then, yeah, builders of bots like us get more clarity on what should be allowed for users. If you run a business, you know that having reliable vendors is non-negotiable. And whether you need office snacks, holiday gifts, or wholesale ingredients, you need to check out Nuts.com. That's it, Nuts.com.

Starting point is 00:40:43 That's a crazy, amazing domain name, N-U-T-S.com. And Nuts.com is your one-stop shop for the highest quality foods for your business. Again, they offer delicious office snacks, corporate gifts, and wholesale ingredients. I got a gift pack. I have been eating these beautiful roasted nuts and other amazing premium products like chocolate-covered sweets. I love the trail mix, popcorn. You know, I stopped eating the candy, so I went for the dried fruit.

Starting point is 00:41:11 But they also have wrapped candy as well, and my favorite jerky. And of course, they have all the gluten-free stuff or whatever dietary option you're into. Over 50,000 companies choose nuts.com for their business needs, from offices to hotels, to restaurants, to retail stores. nuts.com has something for every business. And so here is your call to action. Nuts.com makes ordering for your business quick and easy. And right now, nuts.com is offering new business customers a free gift with purchase and free shipping on any orders of $125 or more at Nuts.com slash twist. Go check out all the delicious options at nuts.com slash twist and you'll receive your free gift and free shipping when you spend $125

Starting point is 00:41:56 or more. That's nuts.com slash twist. Here's a crazy idea. People who are building these bots have credit cards and real names and are validated as real humans and have a tax ID.

Starting point is 00:42:12 And if you want to have a bot doing things in the real world, you have to put a credit card in, you have to have a social security number, or whatever the business number is, and you have to have a valid phone number, and you have to have a valid email and you have to be authenticated on the phone or have a driver's license on file. So to use these things, you could just have, like if you want to buy a gun or have a car,

Starting point is 00:42:35 you have insurance, you have a driver's license. So if it's deemed to be too dangerous, you can look at the amount of danger this causes in the world or chaos it could cause. And then just like cars and guns are treated differently than pens and paper and, you know, maybe there's fertilizer that, you know, if you buy it, we know that people can make bombs out of fertilizer, you know, you have to have your passport and driver's license, and you can only buy a certain amount of fertilizer, and there's a waiting day, right? So all of these things. We have a little world.

Starting point is 00:43:05 Yeah, we have some version of KYC or, you know, throttling when people can get it. We have cool off periods with guns and, you know, hopefully the good folks in the world, like yourselves who are building this stuff are thoughtful about it. What I love about this is I think this is work that people don't want to do. It's soul-crushing work in most cases. Repetitive tasks. It's, you know, going out and chopping wood would be more pleasurable, I think, for most people and healthier than doing some of the jobs that you're going to eliminate. And I think there are jobs that should be eliminated.

Starting point is 00:43:38 Nobody wanted to be a phone operator and sit there and plug in cables all day for 40, 50 hours a week. It was arduous and painful. Same thing here. Nobody wants to be in the fields, down on their knees, picking strawberries. Robots should do that better. It's the same kind of analogy. I wish you great success with this. I know you're very young.

Starting point is 00:43:56 I didn't want to bring that up because when I was young and people referred to me as the 23-year-old founder of this magazine, I always found it kind of annoying that that was in the first sentence. But you are 19 years old. You have raised a couple million dollars from Sam Altman and the former Sequoia India. And you got that AI Grant, right, from Daniel Gross, too, I understand. Yeah. So for folks who are young founders out there, how the heck did you do it? I think I had a very interesting story because I didn't grow up in the Bay Area or the U.S.

Starting point is 00:44:27 I was sort of an outsider in that sense, but I grew up in India and I always, I used to see your podcast, I used to see YC videos. I used to see a bunch of things from outside and I used to be like, you know, something cool is happening. What was the youngest age you watched one of my podcasts? I'm curious. I think 12 or 13. When I was 12 or 13 in India, watching This Week in Startups. You have no idea

Starting point is 00:44:51 how much that fills my heart with joy. Because I always said, you know, I think there are people around the world who might hear this podcast and be inspired to start a company or just get stoked to be an entrepreneur and to hear you actually say you listen to this at 12 or 13 halfway around the world. And now you're on the program six years later is just mind-blowingly joyful for me. So thank you for that. No, thank you for doing the show. I remember the clip that you did with Patrick.

Starting point is 00:45:21 where he spoke about how they started Stripe and how they raised money from Sam and how they came to the Valley and started doing stuff. So it's like a bunch of these things that I used to see from outside. And this was stuff that I wanted to do. I started writing code very early. So I was already working in tech while I was here. I was taking up remote jobs while I was still in school. I had built like a bunch of projects. And then as soon as I made a little bit of money, I started making trips to the US.

Starting point is 00:45:44 So just to come to the Bay Area, stay in these hacker houses, try to meet people, go to these events. That's how I met a lot of these investors and people who eventually kind of invested or became a part of the company. But I just started, that was, I sort of had this brute force way of breaking in. Twitter was in fact also super useful, where I would just cold DM a bunch of people, reach out to them, pitch them. Yeah, it's just like a lot of brute forcing. Number one, you consumed a bunch of information. Two, you did the work and learned how to code and build products. And number three, you got on a plane and you went to where the action was happening and you met people.

Starting point is 00:46:24 You networked. Yeah. This is so easy and the industry is so wide open. So for people in America who are saying, oh my God, I don't know how to break into tech, consume every bit of content you can about what you want to do and being an entrepreneur. Number two, learn how to build products; all the information is on the internet. And number three, go to wherever the most action is. It happens to be the Bay Area, but there's also stuff happening in Dubai. There's stuff happening in Tokyo.

Starting point is 00:46:54 There's stuff happening in Sydney and Melbourne. Sometimes you've got to get on a plane and meet people and network. It's very simple. One, two, three. You figured it out, kid. I love it. I think the other thing that's great about the Valley in general is, culturally, everybody's open to taking meetings.

Starting point is 00:47:09 And that's sort of, I think, an advantage for young founders. So even though we're building for a very old industry, like legacy industries, you can have all these questions around, how do you get to these customers? How do you talk to them? How do you break in, stuff like that? But I think on the other side, everybody gets excited when they see a large market opportunity and a young team that wants to move fast.

Starting point is 00:47:31 It's kind of the classic story that people get excited about. So I think everybody has been very open. We're grateful for everybody who took meetings with us and helped us in the process. But I think it's, yeah, it's very doable if you put in the work and show up. People like backing people who are doing the work and have interesting stories. You know that people of action, people who are doing stuff in the world, are one in a hundred of the people who generally interact with us. So I get hundreds of emails where people tell me their ideas, tons of DMs, people tell me their ideas. And then once in a while I get a link to a product or a screenshot of a product or a Loom or a

Starting point is 00:48:15 you know, a quick demo or a Figma. And I click the link and I go, wow, what you built is freaking cool. Let's get on a Zoom or meet somebody on my team, you know, or come to our accelerator or maybe we can invest. And that really does differentiate you. I think you figured it out, Aryan. And, man, your parents must be so proud of you.

Starting point is 00:48:36 May I ask, what do your parents do? And what do they think of all this? Because they're in India and you are 19 years old and you raised over $2 million. Are your parents like entrepreneurs themselves? No, they're both doctors. So they come from the opposite end of the spectrum. They were pretty disappointed when I was not going to college. And they, it's still not off the books for them.

Starting point is 00:49:00 Like some point, maybe you want to reconsider and, you know, maybe apply and get into some school and do it. But I think they are generally, they've kind of become more supportive over time. It's like, you're doing, you're not doing something wrong. So as long as you're not like a criminal. And you're happy, you're doing whatever. That's a pretty good benchmark. You're not a doctor, but you're also not a criminal.

Starting point is 00:49:20 So there's something in between those two things that is acceptable. They don't have to bail you out. So shout out to your parents, but message to your parents. Not everybody is going to just go through and do the standard thing. Some people have a lot of creativity and they have more energy than those career paths allow for. And so I think I would have just been FOMO'd out of college even if I would have gone, just because I think the rate at which stuff is accelerating,

Starting point is 00:49:49 it's almost like the opportunity cost of going is, it's just, it's huge. Like I would, the same thing happened, by the way, in the dot-com era.

Starting point is 00:49:58 And I told everybody, like, if you're going to college during the dot-com era when all this was changing, it's a big mistake because I've never seen a gold rush like this. And then I saw a second one, which was in mobile in 2008, nine,

Starting point is 00:50:11 10, 11, 12, and now this is the third one I've seen in my lifetime. It was really, three very unique ones. The internet and the dot-com era, the mobile shift, and then now AI. And they come along every

Starting point is 00:50:23 10 years and it's like this incredible season where there's a ton of snow on the mountain and you can really ski really well or there's just great waves to surf. There's not always great waves to surf. I mean, you can build a great company anytime, but listen, I am so proud of you. And I'm not your parents, but I'm super

Starting point is 00:50:41 proud of you that you're doing it. And I wish you great success. And my only regret is I didn't get a chance to be in this seed round. But, hey, maybe you'll raise money again and I can slide a quick. Maybe your Uncle J-Cal could slide a quick $100K or $250K into this. I think you're going to knock it out of the park, by the way. Congratulations. And take your time, focus on product.

Starting point is 00:51:01 You know what to do. You've listened to all these talks and podcasts and you've got great investors. Sam's amazing. It's all about just focusing on the product and the customers. And you seem incredibly product and customer focused. You must have picked that up from Sam and just watching our videos and Y Combinator videos and blog posts, yeah? Yeah, and I think that's the only way to do it.

Starting point is 00:51:23 I've had versions of doing things. Otherwise, it just doesn't work. The only two things that matter: just be heads down, studying the space. I like to be very analytical. So all of these tweets and, like, data points that I consume, they are actually how I think. And I can translate a little bit of this macro reasoning of what's happening with just product, what the customer is saying, and how that kind of sort of strings together in a narrative, which I think is equally important as doing the work.

Starting point is 00:51:47 So it's, yeah, that's the only way that I think things can happen. And we had a good launch. So it's just now iteration. We have a lot of backlog of demand. So we're kind of slowly serving up and figuring out how to, yeah, get the capacity. All you got to do is delight those customers. Everything will be fine. Everybody check out induced.ai, I-N-D-U-C-E-D dot A-I.

Starting point is 00:52:08 And you can follow Aryan. He's on Twitter, as he mentioned, X. R-A-R-Y-X-N-S-H-A-R-M-A. Go ahead and follow him, and all the links are in the show notes, and we'll see you all next time on This Week in Startups. Bye-bye.
