Software Huddle - AI Engineer, Web Frameworks, & more with Tejas Kumar

Episode Date: September 24, 2024

Today we have Tejas Kumar on the show. Tejas is part of the Developer Relations team at DataStax. He's really good at frontend, has a great podcast, and he has written a book called Fluent React. He spoke recently at the Shift Conference in Croatia, where he talked about AI engineering and what that means. So we talked about AI engineering, React, content creation, education, and much more. This episode is full of value and we think you'll love this one.

Transcript
Starting point is 00:00:00 A lot of people don't understand this term. And a lot of people often conflate it with, I don't think I could ever be an AI engineer because I didn't go to university for linear algebra and machine learning. And they couldn't be further apart. What models are you using most of the time? If I want to look up movies to watch, if I want to do something like that, I'll go and talk to Claude 3.5 Sonnet.
Starting point is 00:00:22 This is like the premier, the greatest model of all time for end users. But in coding, like if I'm building a software system, I use GPT-4o mini just because OpenAI's developer experience is leaps and bounds above anything else I've ever seen. And for the most part, it does the job. What about any, like, are you doing any local model running, maybe like Llama or anything like that, messing with those? I do run a ton of models and workflows locally just because of the security guarantees and the cost savings, and it's not dependent on the internet. Tell me about, um, you know, when you're back in that React world, how do you feel about React Server Components? Hey folks, this is Alex. I'm excited because today we have Tejas Kumar on the show. Tejas is someone I've followed for a while. I watch his talks on YouTube. He's got a great
Starting point is 00:01:08 podcast, all this sort of stuff. Very broad-based guy. Really good at front-end, lots of React stuff. He's got a React book that I've bought and I really enjoy. But also lately, he's been doing a lot of AI engineering work. He's going to be speaking at the Shift Conference in Croatia, talking about AI engineering and what that means. So we just went through a lot of that today. We talked about AI engineering. We talked about React. We talked about just content creation, education, and even him running his consultancy and things like that.
Starting point is 00:01:34 So a lot of great stuff on here. If you have any questions for me, if you have any guests you want to have on, anything like that, feel free to reach out to me or to Sean. With that, let's get to the show. Tejas, welcome to the show. Hey, thanks for having me. Yeah, well, I'm very excited to have you on because I've followed you for a while. And I consider myself like more of a backend engineer generally, but I've been doing more like full stack work and especially like React stuff lately. So I've been trying to like follow more React
Starting point is 00:01:59 people and your stuff has just been super helpful for that. So thanks for that. And I feel like you also like bridge the gap into backend pretty well, which is great. But I guess for people that don't know you, you can maybe introduce yourself and what you're up to. Sure, yeah. Thanks for the intro, Alex. I'm Tejas.
Starting point is 00:02:18 I've done lots of things. As you mentioned, bridging the gap between frontend and backend. I've traditionally come up as a design intern and then was a front-end engineer, engineering manager, did some back-end, did some DevOps. And I just tend to follow the trail of things I don't know, things I'm not good at, or things that are emergent. And so lately, I've been an AI engineer,
Starting point is 00:02:40 an AI developer, relations engineer at Datastacks. Again, AI is somewhat emergentgent and I have an interest in it and I have the privilege to actually go find a job in it and do it now full-time. So that's what I've been up to. So you mentioned coming up in graphic design. Are you self-taught on the developer aspect? You're more in the design initially? I never went to school for anything. It started with Photoshop.
Starting point is 00:03:10 I think this is actually the experience of many people. They'll start with Photoshop. When I was growing up, the OS X Aqua UI was in vogue. And everyone was into these glass balls where you have these reflections and things. And so we were making a lot of those on photoshop and there was a small community of folks making tux so the linux penguin but like glassy glossy versions of them and i did a lot of that and then thought wait a second a user interface a website is just is just a picture with buttons and so i made that got into the slicing tool and then you know dreamweaver front page oh i can actually make
Starting point is 00:03:44 it do something when you click on it. And this was sort of the foray into the web. And the cool thing is, this is not unique at all. I love it. A lot of people do this, or they go the visual basic route if they're coming from Windows, and then they'll make desktop apps. But I love that this is sort of, we all grew in this direction for the most part. For sure.
Starting point is 00:04:03 And I laughed a little bit because I was thinking of you like not going to school at all. Like, like even as a, as a small child, but of course like even for programming, I'm, I'm the same way.
Starting point is 00:04:10 I'm self taught. Like I didn't have any computer science background at all. Um, went like a different route into backend. Like I always wish I could do design stuff. I'm jealous of people with design. Cause I feel like, I feel like the most powerful people on earth are like people can design well and can write front-end and stuff.
Starting point is 00:04:28 I feel like backend is kind of generic. All the front-end stuff and making it magical, you can make a demo for someone and put it on the web and make it live, and it just clicks with people so much more than a backend. This is where AI makes things very interesting. I feel like backend doesn't get as much glamour as frontend because it's not something that users interact with daily. Even among engineers, it's not something... Engineers are slowly being taught to edge out backend
Starting point is 00:04:56 in lieu of using a serverless platform. Oh, just use DigitalOcean, don't worry about it. Just use AWS, just use Vercel, just use Netlify. And in exchange, you don't actually understand the whole spectrum of backend, which is more than just a database and an API that someone hosts for you. DevOps is part of backend. And I think
Starting point is 00:05:15 we're sort of losing that. But with the advent of GenAI, I think this threatens the frontend side of this as well, where frontend is just less glamorous because, hey, V0 or some like Claude can just make a nice front-end for you now. It really is amazing how good those things are in terms of like, hey, just describe that first cut of a component or something like that
Starting point is 00:05:35 and get it out there. And then you can tweak it if you need to and that stuff. But just going from blank page to getting something out there. So I'll have to say, you maybe, as a back-end, predominantly back-end person, may be able to experience some of the things that a lot of the front-end folks who use serverless databases get to experience now with just having most of the work done for you by a system.
Starting point is 00:05:56 Yeah, for sure. It has been. I've had a lot of fun becoming more full-stack and doing that with the Gen AI stuff over the last couple of years. And that segues into, the first thing I want to talk about is AI engineering. So you're talking at ShiftConf coming up about AI engineering today and tomorrow. Maybe give us a preview of that talk, and then I just want to ask you a bunch of questions about AI engineering and things like that. Yeah, I got into AI engineering straight, I think it was November of 2022.
Starting point is 00:06:24 So GPT-3 was released, but I don't think ChatGPT was released. I've been following the space for a long time, mainly because I do what's called conference-driven development. I speak at a lot of conferences, and conferences will give you hints of like, hey, these talks are interesting for our audience. And AI was often on that list, so I was like, okay, what's happening with AI? And so I've been in the space sort of crudely. But then ChatGPT showed up, and I was like, okay, what's happening with AI? And so I've been in the space sort of crudely. But then, you know, Chad GPD showed up and I was like, oh, wow, okay. And got deeper and deeper ever since.
Starting point is 00:06:51 So I've been doing AI for like two and a half years now. But when you do something like that, I think it's not, again, it's not unique. It's not like front end or back end. I mean, it is like front end or back end where when you work with these things day in, day out, you start to recognize, ah, this is where it sort of falls apart. This is where it's not actually useful. This is where there's a lot of educators trying to grift people. You start to understand the nuance of the space,
Starting point is 00:07:18 more so than people who aren't exposed to this on a daily basis. And so that's my job. My job is, I'd say say 50% research and 50% teaching, where it's researching the stuff that a lot of us don't know because we don't have the exposure to it. I thankfully do as part of my job. And teaching based on the things we've learned. And so I think one thing that I'll touch on pretty concretely in my talk is even just the definition of the role AI engineer is something that a lot of people still, and I know this because I talk to a lot of people at conferences. So it's like boots on the ground, like not, you know, in your bubble on Twitter.
Starting point is 00:07:53 A lot of people don't understand this term. And a lot of people often conflate it with, I don't think I could ever be an AI engineer because I didn't go to university for linear algebra and machine learning. And they couldn't be further apart, right? So part of the talk is, let's just together establish what AI engineering even is. And spoiler alert, it has nothing to do with machine learning research or machine learning engineering. These are separate beasts. And from there, we look at common problems in the space of generative AI. And then this is really centered around the things many of us have been exposed to already,
Starting point is 00:08:24 which is hallucinations. They'll often just like make up stuff. And then we as human beings attribute like an ownership or confidence to them, where we tend to anthropomorphize them to be like authoritative sources, but it's just a word prediction machine. So there's the hallucinations.
Starting point is 00:08:39 There's the fact that they can't really contain or hold a lot of context. Like the largest context window in the market is 2 million English words roughly, which is Gemini. And so that's a problem. If you're trying to build something powerful, like Jarvis from Iron Man needs more than 2 million tokens of context, you know. And the fact that they have knowledge cut off, where like past a certain trading date, there's just no data to pull from.
Starting point is 00:09:06 So we look at some of these problems and how you can solve them in the real world to drive actual positive business and or personal outcomes. Meaning if you want to build, look, I, because I've been such a generalist and I, you know, I can do front end, I can do back end, I can, I can do whatever. Um, I, I will say this pretty confidently. I could build any software that I wanted to that does anything and I think a lot of us feel this way
Starting point is 00:09:31 but in the case of AI there's a lot of people who don't feel this way and that's my job is to show you actually you can and here's how so that's the whole point of the talk there's maybe things you think are not yet possible meaning building an assistant that you can ask for real-time stuff. Like, hey, should I take an umbrella outside? And most, actually, every time you ask this of ChatGPD, it'll be like, I don't know,
Starting point is 00:09:53 I don't have access to the current weather. Even more concerning is sometimes it will actually look up the current weather, but this is non-deterministic. I teach people, how do you actually build useful stuff that works 10 out of 10 times in my talk? Yeah, yeah. Okay, so when we're talking about AI engineers, it's not the linear as-a-word person,
Starting point is 00:10:11 but it's also not me with Copilot in my editor. That doesn't make me an AI engineer, right? Yeah, that's a really good point. Yeah, it's not somebody who uses AI as the end, but who uses AI as the means to an end. I think that's the nuance there. Yeah, you're totally right. Okay, so let's talk a little bit about challenges you're seeing and integrating with this product.
Starting point is 00:10:35 You mentioned hallucinations and cutoff time. Has knowledge cutoff, I feel like it's gotten better. Am I wrong on that? Are they updating more frequently, or how is that sort of working? Do you have a sense of that? Yeah, this is a really great question. It's gotten better, you're absolutely right. And it's gotten better because the vendors, so OpenAI and Anthropic,
Starting point is 00:10:53 they're just using techniques that I talk about in my talk. They're using a technique called RAG, that is Retrieval Augmented Generation. One of my biggest gripes is the semantics of all of this. It sounds complicated, it's really not you retrieve data from somewhere, literally anywhere, it doesn't matter as long as it's accurate and you use it to augment the generated output of the LLM or the large language bundle and how do you do that? You just literally pass it as text
Starting point is 00:11:18 you get data, maybe some JSON data of today's weather or whatever you convert that into a text string and you front load your prompt. You say, hey, here's some context. Now answer the question. That's it. And that's exactly what ChatGPT does for sure. If you ask for some real-time information, you'll actually see it will say,
Starting point is 00:11:36 visiting the website, searching the web. It will do that. It will go search the web, get the information, and then augment your prompt with that information. Yeah, cool. Okay, another issue I'm seeing, like building with some of these things, is just like basic API availability and uptime.
Starting point is 00:11:50 I feel like OpenAI, that's the one I've used the most, but I feel like it's sort of like uptime is worse than like a lot of services that you would expect. Is that something you're seeing? Is that different across different models you're seeing? Or like, are you having that issue? Am I, is this a skill issue on my end? Yeah, no, it's a valid issue. It's definitely not a skill issue on your side.
Starting point is 00:12:10 It's just because of the economics of it all. It's also the reason NVIDIA is now a trillion dollar company. There's very specific hardware requirements to do these inferences and also the training. And this is something that maybe people should know, is that inference and training are extremely different and often done on different architectures. Obviously, you need more power for training and less power for inference, right? In any case, you still need significantly powerful machines to perform that inference. And these machines are just in short supply, literally. And so that's the whole reason. The data centers are often maxing out capacity,
Starting point is 00:12:44 especially like ChatGPT with over 100 million users, etc. They're literally maxing out the cloud vendors, in this case, Azure. Yeah, for sure. What models are you using most of the time? Do you have some favorites? That's a really good question. That's a very good question,
Starting point is 00:13:02 because the sub-question here is, or if I was to answer your question, I would ask you, to what outcome? Because I use AI daily in my life as an end user, and I use AI as an engineer. And I use different models for different things. For my daily driver, just like every day, if I want to look up movies to watch,
Starting point is 00:13:23 which we could talk about how search is just completely revolutionized by this. But if I want to do something like that, I'll go and talk to Claude 3.5 Sonnet. This is like the premier, like the greatest model of all time for end users. And I have multiple reasons of saying that. But in coding, like if I'm building a software system, I won't use that. I use GPT-4-0 Mini just because OpenAI's developer experience is leaps and bounds above anything else I've ever seen. And for the most part, it does the job. The only time I've not used OpenAI's models is when cost was a question. Because GPT-4 at one point was very expensive,
Starting point is 00:14:02 and I needed GPT-4's capability for generating very opinionated HTML. And this is when I deviated and actually trained my own large language model that did what GPT-4 did, but way cheaper because it was smaller, it cost less to run, and it was trained on my specific data. But for most, I'd say nine out of 10 things,
Starting point is 00:14:23 if you're building software, OpenAI's models are good enough. You mentioned Gemini a little bit earlier about the two million context window. Are you using Gemini at all? I haven't tried any of the Google products, really. Yeah, I live in Europe. Is it just not available there? It was not available for the longest time. And it may be available now, but honestly,
Starting point is 00:14:44 the time horizon for interest plateauing and declining had passed before it became available you know so i don't care about gemini honestly because i my needs are met by other things yeah yeah what about any like are you doing any like local model running maybe like llama or anything like that messing with those yeah um yeah when I mentioned I trained my own model, that was local, that still runs locally. And I think something that's interesting is, so I'll use AI to build software, I'll use AI as an end user,
Starting point is 00:15:15 but I also run a podcast, and the entire thing runs itself. And by that I mean there's literal AI agents that will execute tasks. A lot of it runs locally, like on my device. And so it also runs unreliably for that reason. I sometimes will lose power or something. It's nothing mission critical, and so I'm okay with that.
Starting point is 00:15:37 But yes, I do run a ton of models and workflows locally just because of the security guarantees and the cost savings. And it's not dependent on the internet. Are you running those just on your MacBook? just because of the security guarantees and the cost savings. And it's not dependent on the internet. Are you running those just on your MacBook? Some on my MacBook, some in the cloud. We have on my podcast, not to plug shamelessly. No, go ahead. I love your podcast. I listen to it, so it's good stuff.
Starting point is 00:16:04 Thanks. There's an episode with one of the engineers over at OpenSauce where they were running into, they were spending like $20,000 to $40,000 a month on OpenAI. And they were like, you were an early startup. We can't afford this. They built their own inference infrastructure on Kubernetes, on AWS, everything. So you send a query and an LLM generates the text and serves you all, basically all the chat GPT infrastructure they built it themselves on AWS. And we have this long discussion where they explain how they did that. So I have that also.
Starting point is 00:16:31 It's orders of magnitude cheaper, but more than being cheaper, you have more control. I'm sure you can appreciate this as a backend engineer yourself. Being able to choose the size of your instance, the size of your database, how many vCPUs, where do we, like all of this is super important, especially for where money's involved. So yes, I do have some local, I have some in the cloud and I also just use SAS's whenever. It's just picking the
Starting point is 00:16:52 right tool for the job. Yeah, for sure. Are you, I guess like, are you finding areas where like certain things are better, certain like transcription with Whisper or something like that? Are there areas where you're like, man, I wish i had an agent that could do this but it's just like not very good at this type of task yet or like where is it sort of most effective for you that's a really good question um i i think it's it's effective across the board i think like one thing i wish my agents could do better is actual discovery. Because oftentimes, so one of the things, one of the agents does a job where if, you know, you, Alex,
Starting point is 00:17:32 schedule an episode on my podcast, it will like do what we call discovery. It will literally like search things about you and find out who you are and what you do and what are you excited about? What have you created? What can we talk about that would get you going, that would get you passionate, right? And it creates really great outlines, discussion outlines based on this, but sometimes it pulls from sources that are just really old. And so sometimes I'll ask you questions about this thing you did two years ago. And this is fairly easy to fix. What's not easy to fix is the non-determinism of it all. When will it feel the need to adjust the timelines and so on?
Starting point is 00:18:10 Because again, the end is not some rule-based AI where I define all the logic, but to just literally be like, Alex Debris, go. And it will just figure it out. That is challenging. Yeah, interesting. Sometimes we hear people say, talk about like models getting nerfed
Starting point is 00:18:27 or like getting worse over time. Do you think that's true? Like, is that happening? Or is that just like our expectations change? Like when it's new, we're like, holy cow, this is great. But then like, what do you think's going on there? You know, this is the biggest call
Starting point is 00:18:42 and impetus for open source AI, period. Because, for example, with any of the open source models, with Lama, with Mistral, we just know because they're open source. And a lot of people say, oh, what does that mean? You don't know because it's just a bunch of weights. Indeed, it's a bunch of weights. But you can see when the versions are published. And you can go use an older version if you want to. So all of that's there.
Starting point is 00:19:05 Sure, you can't like open the document and look at all these weights and go like, ah, see, this number is different. But that's not the point. The point is, it's transparent. Pushes are transparent, commits are transparent, publishes. With OpenAI, with Anthropic, none of this is transparent. So who knows? They could be, they could literally over time, make GP4 oh mini worse just so that then they can have a big launch and say look at gpt5 it's so much better when in actuality it's not it's just good gpt4 right they could do that um i wouldn't put it past them honestly i i think it would serve as well to be like suspicious of anything that's not open source and and i am and like for example can i can i share something like i i think a lot of
Starting point is 00:19:45 folks um some folks have like rose tinted glasses with open ai oh my gosh they released chat gpt and it was so great um and indeed it was it's a truly great product but they needed the human feedback like that's part of the reason they released it it wasn't just like for the good of mankind it was like no we need people to use it to generate output and click on the little thumbs up or thumbs down. So then they get all of these data points that they can then use to fine-tune 3.5 and get 4 and so on. 4 is just a fine-tune of 3.5.
Starting point is 00:20:18 And so we're helping them by giving them that feedback. In exchange, we get this tool. I think that's fair, honestly. I think that's totally fair. but i think it's just worth acknowledging that the motives aren't always like for the good of mankind and there's often more to it so yeah yeah for sure so tell me a little bit more about you training your own model so first of all is this like on your website you can sort of like chat with with you is that the model that like your trained model that's doing that that's a really great question.
Starting point is 00:20:45 And let me speak back into you to answer. This is like asking when I access your website, am I hitting the server? But there's usually a load balancer. There's like 18 servers. So it's
Starting point is 00:21:02 like that. The architecture of my AI pipeline is multifaceted that way in that there's multiple paths to an output. And I can walk you through those because it's really fascinating. And this is what I really love doing. My job is teaching people about AI. So what happens? You send a prompt.
Starting point is 00:21:20 First thing that happens is we're going to check, have people asked this or something like this before? And this is a technique called semantic caching in the Gen AI space. And it's exactly what it sounds like. You cache the responses, but the cache key is a vector of the input and output. Or it's actually just the input. So what that means is when somebody sends a prompt, we can literally convert that prompt into a vector or a list of numbers and compare its similarity to other prompts.
Starting point is 00:21:49 And if it's similar enough, meaning if it's like 90% similar or more, then we just serve you some cached response. And of course, we stream it over the web, and the words appear one after the other, so it looks like this is coming from an LLM. It's really not. It's just a cache response. It's this trick with UX and the backend, you know? So first you hit the semantic cache
Starting point is 00:22:13 and most traffic goes through the semantic cache. If it's a cache miss, then what do we do? There's multiple players here. So my fine-tuned model is there. It's for those interested, it's a Mistral 7b. It's the cheapest, smallest open source model from Mistral. And just as a side note, I think this is the future of Gen AI, is cheap, small, specialized models, as opposed to big, large, general models. In any case, this thing costs, I think like one twentieth of the price of GPT-4,
Starting point is 00:22:43 if I'm not mistaken. I have this really great graph I tweeted one time. So it's peanuts, right? So if it's a cache miss, first we generate an output from this, also because it's so small, inference is extremely fast. And so we generate some outputs, and then we use a technique called LLM as judge, where there's a very, very capable large language model that can look at something and make a judgment. Is this good? Is this bad?
Starting point is 00:23:07 Is this what the user actually wants? Is it helpful? Et cetera. And so the reason you can do LLM as judge is because it's usually very cheap. You can say, if this is good, then respond with a one. Otherwise, respond with a zero. And so your output tokens, you have almost no output tokens. And that's the billing.
Starting point is 00:23:23 That's what you pay for, input and output tokens. So you take the generated output from the cheap model and go like, hey, LLM is judged. This is the most sophisticated model usually. Cloud 3 OPA is GP4O. Is this close enough to what the user wants? And then if you get back one, you say, cool, serve it over HTTP. And same, the UX is the same on the front end.
Starting point is 00:23:43 We just stream word after word. If it's a zero, then we say, okay, now you answer this. And then we stream the LLM output, right? So there's multiple layers to save money at every layer and also save compute. But at the end, yes, you will probably hit some super large sophisticated model. But if I've done my job right, my bills are super manageable. And indeed they are. I haven't paid much at all for it. Very cool. On that last point of zero and one, have you had difficulties getting even just predictable outputs
Starting point is 00:24:13 or structured outputs or anything like that? What patterns are you using for getting predictable output shapes or anything like that? Yeah, that's a really great question. This is such a great episode, I have to say. You're very good at this. I can't wait for this to be published because I think people will learn a lot about structured outputs. I believe the OpenAI API has JSON mode as a flag, as a configuration
Starting point is 00:24:38 operator. I don't know that this is true. I've seen something, I don't use this, but I've seen something like that. For me, from GPT-4 onwards, OpenAI's models have been extremely good at system prompt adherence, meaning you specify system prompt, they almost never deviate from that. It's very good. Obviously, you can't trust that because it can fall apart. But that alone has actually been pretty good for me. What has been really great for me is adding a two-step approach to this. One,
Starting point is 00:25:05 in my system prompt, all capital letters, afflammation blocks, like you only respond with this. And then you send your prompt as usual, you get a response, you take that response, and you pass it through Zod. For those who don't know, Zod is a validator, a schema validator of JSON and other types. And so you get a response from OpenAI, you take it, you pass it through Zod, literally, you just pass it to Zod with a given schema and say Zod.parse. And if that fails, then you just run the request again, and again, and again, and again. Because of the really good system prompt adherence, I almost never rerun requests, ever.
Starting point is 00:25:41 Like, it's always first time. And so that's been pretty good. I think this is obviously not extremely, I think if, if I had problems with this, my next logical step would be to go look at JSON mode with OpenAI, if that exists. Again, I don't know if that exists because I've never really felt the need for it. Yep. Yeah, man. I can't remember the thing on this either. So this is probably bad podcasting, But I feel like they had JSON mode for a little while and then recently just came out with like a stricter structured outputs or something
Starting point is 00:26:10 where you're like, I think you actually pass in a Zod schema directly and it's just like, hey, make it conform to this. And it guarantees almost that it will. Yeah, indeed, it does that. I looked at the docs. I think what they may be doing under the hood is just try catch. What you're doing? Yeah, yeah, yeah does that. I looked at the docs. I think what they may be doing under the hood is just try catch.
Starting point is 00:26:26 What you're doing? Yeah. Yeah, exactly. And I saw some stuff from Rafal Valinsky, just some of his tests on this. He was noticing just some performance or cold start issues, especially, which is kind of interesting. The first couple of times you call it with a given schema, which I don't know what that's about. I'll link to that in the show notes. But like there's some interesting stuff there.
Starting point is 00:26:52 But yeah, like I remember that being a problem, especially like there's this guy, Riley Goodside, that like early on he's like showing, hey, just give me JSON. He's like using instructions. Just give me JSON back, nothing else. And then the LLM would always respond like, okay, here's some JSON. And then it like gives the JSON. And he's like, instructions, just give me JSON back, nothing else. And then the LLM would always respond like, okay, here's some JSON. And then it like gives the JSON. And he's like, he's like, no, no.
Starting point is 00:27:08 And he's like getting more and more explicit in his instructions until he finally has to like threaten the LLM. And he's like, if you give me a single character that's not valid JSON, someone will die or something like that. And then the thing just gives him JSON back. It was hilarious. This was like two years ago, but yeah. I will say this, what really helps me
Starting point is 00:27:27 when it comes to really, really, really, really structured outputs is not doing any of this at all and going a separate way, which is using the tool calling functionality of most LLMs. For those who don't know, it's a way for an LLM to,
Starting point is 00:27:42 I love that they modeled the human brain so well because if you ask me to like multiply three prime numbers, I won't know. I'll a way for an LLM to... I love that they modeled the human brain so well, because if you ask me to multiply three prime numbers, I won't know. I'll reach for a tool, a calculator. I will know the extent of my ability and go reach for a tool. And you can do this with LLMs. With OpenAI, with GPT-4, you can.
Starting point is 00:27:56 With Mixtral 8x22b, that's the open source variant that has dual calling. I use this, it's great. And so how it works is you tell the LLM via JSON configuration, here's a bunch of functions you can call if you ever need to when you recognize that you need to call them. And you specify
Starting point is 00:28:12 a schema of parameters, input parameters, and it will call the function and you can return whatever you want from the function. And the cool thing is the inputs, ideally, actually, if it's a function, should influence the outputs. And so the outputs are non-deterministic because the inputs are non-deterministic, but they're deterministic enough as you want as a developer for structured data.
Starting point is 00:28:32 So that, for me, in extreme cases is what I would do. But so far, system prompts plus Zod have been actually really good. Yeah, okay, cool, that's good to know. What about how have you been using LLMs as a developer? I want to walk through some of the tools that you're using. Do you use Copilot in your editor? What's been trending lately is Cursor and 3.5 Sonnet. I've never used Cursor.
Starting point is 00:28:56 I'm not closed off to it. I don't think it's bad or anything. I'm the kind of person who has to feel pain and then I go reach for a band-aid. But what I see on tech Twitter is there's a lot of folks who just wear band to feel pain and then i go reach for a band-aid yeah but what i see on tech twitter is there's a lot of folks who just wear band-aid costumes without any wound yeah band-aid costumes i mean that's great i don't get it um i don't knock it i don't criticize it i just i it has to be born out of pain like another example if i go on a tangent is the local first
Starting point is 00:29:21 movement i've never felt the need for local i I understand the, yes, you don't want your data to be with Amazon or Azure or whatever, but I've never actually felt the need to build something local first. Like no one has been like, hey, I'll pay you a ton of money or like I've never felt self-motivated. In any case, yeah.
Starting point is 00:29:37 Coming back to how I'm using LLMs, I think one thing that I keep, I've said this in May of this year, and now people are like creating LinkedIn posts about it. Agentic rag, I've said this in May of this year, and now people are creating LinkedIn posts about it. Agentic RAG, I think, is huge. Because I just mentioned function calling that an LLM can do. And right now, how RAG works is you retrieve the data, and you augment your prompt, and then it generates text
Starting point is 00:29:58 that is augmented by your retrieval RAG. I think this is outdated, actually. I think the better way to do it is to create a rag tool. Like, hey, if you're missing information, go retrieve it. And then the LLM itself, if and when, performs the rag operation, right? I think that's something that people are starting to talk about. I think that's what I'm doing. I mean, I think a lot of folks would benefit from that as well. Interesting. Okay. Wow, that's super interesting. Okay, so you use Copilot in your editor, and then do you ever be like,
Starting point is 00:30:28 hey, something's not right for Copilot here, it's a more complex problem, I've got to go kick out to either ChatGPT or Cloud or something like that? Never. Never? Oh man. I don't even trust Copilot very much. But oftentimes, for example, the most value I get out of Copilot very much. But oftentimes, actually, for example, the most value I get out of Copilot is when I'll want to do some complex iteration
Starting point is 00:30:51 over something. For example, I have a hash map with keys and values, and I'll want to get the key from this and then map it to this other value, like something really complicated. In this case, I'll just rely on Copilot. I'll just do a comment. The next few lines do this, and then it fills it. And it just works literally every single time. I have never reached for anything more sophisticated.
Starting point is 00:31:13 I do reach for more sophisticated things when I want to do like bash scripts or some automation layer. So I did that today, actually. I wanted to convert a big video, so like full HD video to a GIF or GIF. And of course you could just FFmpeg-i source destination, but this doesn't create an optimized thing
Starting point is 00:31:35 for the internet that has like color deduplication and, you know, progressive enhancement or whatever. And so I went to Perplexity and I was like, write me a big script and it did. And it was super complex. I could never write that. So that for me is where this type of stuff helps. Yeah, that's a great one where it's like, hey, there's this super detailed, in-depth tool that has great documentation and is like well written about all over the internet. And it's like, I don't want to figure out all the options, but just like, you know, condense that for me for sure.
Starting point is 00:32:03 Exactly. Like FFmpeg for example also like i think um kubernetes right anyone who works with like helm charts or any of that just this is the the gold mine really yeah or like jq do you ever use jq on the command line for like all the time yeah yeah yeah it's just like there's so many options i'm just like i have this json i want this out of it write me me the JQ for that. I would say like sometimes I'll kick out to cloud. Again, I'm not the best front end developer. So it's like if I know I need this component,
Starting point is 00:32:32 and especially if it's like a very easily contained component, it's like visual only. It's like pure and all that stuff. I'll just go to chat GBT and like describe chat GBT or cloud, usually cloud now, and just like describe it. This is a component that I want. It'll build me that whole thing and i'll copy it in usually and so i like cloud for that because it's like a little bit of back and forth it's a little bit more like easier to chat with rather than just like doing tab complete with with copilot and because it's just like a pure visual component
Starting point is 00:32:59 for me it's easy to be like oh is it working or not i can also like inspect it pretty quickly to to make sure it's what I want. But I like kicking out to Cloud or ChatGPT for those sorts of things. Yeah, I will say this. In a language that I don't know or I'm not familiar with, then 100% of the time I'm with Cloud or ChatGPT. One example is I needed to do URL sanitization in Python. I've never written Python in my life, except for
Starting point is 00:33:26 some very basic import from Transformers, FineTune, whatever. And so this was a conversation with Claude. Explain how you do this. And the cool thing is, I actually wrote it in JavaScript because I know JavaScript pretty well. And I was like, look, I want to do this, but in Python.
Starting point is 00:33:41 How do I do it? And it was a really great conversation. Yeah, yeah. I want to go back to but in Python. How do I do it? And it was a really great conversation. Yeah, yeah. I want to go back to something you mentioned a while ago about just like working with the, like, I guess you're talking with people at conferences or just like working with LLMs and getting a sense for like what they're good at and not good at. And I think about that a lot for like I have kids and especially as they're moving up into like middle school, high school. I want them to like know how to use this stuff without just like writing all their papers with it you know but it's like i think this is gonna be super useful
Starting point is 00:34:08 for you but don't just like cheat with it i don't know do you have any ideas on how to um i guess like you know it could be for any age but like how to get people to a sense of like hey using this stuff to understand like what it's good at where it can be good and where it can't be yeah i i think to answer this question we'd have to think about where it can't be good um and honestly i given now that a lot of them perform rag i don't know where they can't be i guess they can't be good if you want like fast facts just give me a list of links but i think this is just my my like legacy-ness my boom boomer-ness, my age-ness, um, holding me back. Like, I think the younger generations will never want the list of, they're like, what? You get a
Starting point is 00:34:52 list of links and you have to click on them? That's so weird. You don't get an answer. So I, I think, um, honestly, I can't think of anything that provided the right safeguards and RAG is in place, that LLMs are bad at. And so, yeah, I think this will actually change the way that a lot of folks use the internet. For me, I still will go to Google and search for things, but that's just because that's what I know. But we have to remember that Google was also, like, we don't talk enough, Alex, about generational users, generational usage.
Starting point is 00:35:24 Because before the Google generation, there was directories. You just browse through folders on the World Wide Web. And then Google came, and this was a literal generational shift. And I think we're in the middle of another one of those where I saw this video on X the other day where there was this eight-year-old using Cursor and Cloud. It's like, hey, now make me this web page. And it did. And she made this eight-year-old using cursor and plod. It's like, hey, now make me this web page. And it did. And she made this eight-year-old child. She's on Forbes now. She made a web page by just like having a conversation with an LLM. And I think that's, yeah, I think browsing the
Starting point is 00:35:54 web search, I think perplexity should overtake Google in the next generation. It probably will. I don't use Google actually as much anymore. And to come back to your Gemini question, I don't know, man. I've been watching Google closely and just wondering, like, what are they going to do? Because a lot of folks just don't care about Gemini. At least that's, maybe I'm in my European bubble. I don't know. But yeah, it's an interesting time.
Starting point is 00:36:17 That's how I feel that like a lot of folks don't care about Gemini. But also I thought, I don't care about Anthropic. And I was just opening it for a long time. And then people kept saying, hey, Cloud 3.5 is like really good. And I went over there and I was like, holy cow, this is like really good. And I switched. So like, you know, maybe there'll be some moment there. But yeah, I feel like Google feels like an also-ran at this point in a lot of ways. Yeah, I think honestly, that has to be recognized as probably the biggest mistake of the 21st century, right? Where Google invented this architecture.
Starting point is 00:36:46 Like the paper, Attention Is All You Need, was written by Google brain scientists and now used against them because of open publication, but also because of Google's bureaucracy. It's quite sad. Yep. Yep, for sure. But that is like an interesting analogy to like when Google search came about. And I even remember like in, you know, like late elementary school using like Ask Jeeves and they would teach us like all like the,
Starting point is 00:37:07 the like Boolean logic in like your search filters. Like you sort of had to use some of that stuff sometimes to search. And like we got away from that, but just like think how much better you and I can Google search compared to our parents.
Starting point is 00:37:18 And it's gonna be the same thing with these LLMs as well. Exactly. And so I look forward to being replaced by younger people with skills I probably will never have. That's just so cool. What a thought.
Starting point is 00:37:29 Yeah, for sure. Okay, concluding this AI stuff, tell me, you work for Datastacks as an AI DevRel engineer. Tell me, I guess, what's going on there
Starting point is 00:37:38 with Datastacks. Yeah, so we, it's the best job I've ever had. And I want to be cautiously optimistic saying this because I've been there for, I joined it on the best job I've ever had. And I want to be cautiously optimistic saying this because I joined on the 26th of April, which is like four months now.
Starting point is 00:37:52 So it's not very long, and maybe it's the honeymoon phase, whatever. People can say whatever they want. But it's truly some of the best work I've ever done because it's partly the team. So the team is led by Phil Nash, who was formerly the head of developer relations at Twilio. The team is led by Carter Abasa, my bad, Carter.
Starting point is 00:38:09 But Phil is part of the team. And so I mixed them up because they both worked at Twilio. They both literally built up Twilio to be what it is today among developers. It's a real success story. I think a lot of teams use and get a lot of value out of Twilio. So Phil and Carter, there's a bunch of folks from Datastacks as well.
Starting point is 00:38:25 It's just a really cracked team. They create really great stuff. But also, the work we're doing is somebody said, I think it was our CEO who said, the real mission of Datastacks this time is to democratize AI, to make it available to everybody. That's the mandate
Starting point is 00:38:42 for the DevRel team. It's like, hey, this stuff is new and misunderstood, and we want everybody to just know it. To know what they can do with it, to understand its limits, to build a great software. We want to make it extremely accessible. The main goal is accessibility. And you might be thinking, wow, that's so altruistic. Yeah, but the more people who build on this use our stuff,
Starting point is 00:39:00 which gives us money as well. So it's a win-win. What do we have going on? We have a database. So we have a database that is actually the greatest database for very, very large applications at scale, as far as I know. Apache Cassandra, under the hood. As a backend engineer, I'm sure you've done
Starting point is 00:39:19 a fair amount with Cassandra. So I do a lot with DynamoDB, which is basically like, you know, right on with Cassandra, like very similar in a lot of ways with Cassandra. So I do a lot with DynamoDB, which is basically right on with Cassandra. Very similar in a lot of ways with Cassandra. I definitely understand the pros of Cassandra, for sure. I'll let you go. I'm showing my gap here. Is Dynamo open source? No, it's purely internal to Amazon. I believe the history is
Starting point is 00:39:42 within Amazon retail, basically, there was a database called Dynamo, just Dynamo, and that they use. And then some of those folks left and went to Facebook and made what is now Cassandra. And then also at Amazon or in AWS, they made DynamoDB. Both of them are different from original Dynamo in certain ways, but both of them have like a lot of overlap. And then there's some distinctions between DynamoDB and Cassandra itself.
Starting point is 00:40:13 But like a lot of- That is so cool. Yeah, like the partitioning and like basically partition range key, like all that is very similar across Dynamo and Cassandra. They just have like a few other differences as well. Yeah, so Datastack actually started as a consultancy for Cassandra.
Starting point is 00:40:28 And then we're like, ah wait, we're really good at this, we can just host it for people, serverless style. And so that's where AstraDB came from. Astra now has support for vector search. So you can imagine, you just store these, because it's this highly scalable database, you can store billions and billions and billions of records, and then they become vectors and you can do similarity search across everything your organization has ever had. And then you find similar results to whatever a user asked for and you feed it to an
Starting point is 00:40:56 LLM as context. And then you've got, wow, we've got this real-time generated text output. It's banana. So that's what we're doing with Astra. It's an extremely scalable database among the ranks of DynamoDB, but also I want to honorably mention Scylla. I think ScyllaDB is pretty scalable as well, right? Yeah, in the same category, yeah, for sure. And so we have that, and it's got this ridiculous vector engine for RAG. So we enable RAG workflows there, but also we have
Starting point is 00:41:25 another tool that is newer. So, I mean, Cassandra is pretty old, but we have Langflow. Langflow is sort of like Flowwise. It's an open source. It's open source and you can self-host it. I need to emphasize that. But it's a diagramming tool where you draw these workflows between various components of an AI system. What does that mean? For example, you have a text input, and you can pipe this text input component. It's a diagram. So you draw a little line between this text input
Starting point is 00:41:54 and an AstraDB search. But maybe you want to vectorize this text before that. So then you put a little block that says, convert the text to a vector, then send the vector to my database. And then you can add another component, again, another block in your chain, your diagram, to get the results from the database,
Starting point is 00:42:11 convert them into text, create a prompt, and send it to an LLM. So you can visualize this whole flow and make sure it works reliably. And then when this visual flow of data works, there's a little button that will just expose that over an API. And you just send a post request with the right payload, meaning the right text input,
Starting point is 00:42:31 and you just run that deterministically every time. And the cool thing about this is because it's visual, you can swap out just the LLM component for another one whenever you want. So you can keep your text input, your AstroDB, your RAG store, whatever. But if OpenAI decides to be extremely expensive, you could just delete that LLM and put Anthropic in there, plug in your API key, and your flow works just the same. And it's all exposed over the same API serving production traffic to whatever front end you're building. So we have the database and we
Starting point is 00:42:59 have the API layer. We don't yet have a front-end framework, and we maybe will do something about that. But all of this is just to make building AI things extremely accessible to as many people as possible. And so all that compute and all that, that's handled by Datastacks? Taking that, piping it to an LLM, doing all that, that's all fully managed? Yeah, so we have Langflow as a hosted SaaS-style product. In that case, yes, we do handle all the compute, all the networking, all the security, all the upgrades.
Starting point is 00:43:30 But if you want to self-host Langflow, you can. In this case, it's equivalent to just a big Docker Compose file. And so you just run that, spin it up wherever you want. Yeah, very cool. Okay, let's talk React a little bit because I know you know a lot about React.
Starting point is 00:43:45 I know that you wrote this great book on React. Oh, wow. You have my book? I have the book. It's a great book. I loved it. So nice work there. So I want to talk a little bit about React.
Starting point is 00:43:56 I'd say, first of all, let's talk just like current events. I feel like there's a lot of consternation and fighting about React 19. Is this just par for the course? Like I haven't been around for like when it's, I wasn't like that close to it when it's shipped or switched from my class components to function components and hooks. Like was there similar levels of fighting?
Starting point is 00:44:14 Is this like a new level of fighting? Like what's going on? Like, should I be worried about this? Like, how do you think about this? Yeah, I think nobody should be worried about React. I think React is just like established. And the big problem, the reason people keep fighting is because, unfortunately, React requires...
Starting point is 00:44:33 To get the most out of React, and this is getting even worse with AI, you would have to re-architect existing React applications. To be, instead of client-first, to be server-first. Just like PHP and Rails used to be, you know, and that's causing a lot of friction. That's people going like, oh my gosh, how I can't do this. It's too difficult. And that's usually where the tension is. But also there's been the advent of really faster, much faster libraries and frameworks that do what React does, but they are
Starting point is 00:45:04 extremely more performant. And the reason they're more performant is because React relies on recursive function calling to make updates. Meaning if you have some component or some function at a very high level, and the user changes the text input at this high level, then all of its children to the depth of n in your component tree, the functions will be called. Whether they update, whether don't it doesn't matter you're just like recursively
Starting point is 00:45:28 calling all these functions on the stack um and and you know very smart developers were going like why why is that what if we didn't have to do that what if just the values that change those functions are called only right um and And so this led to SolidJS, which is syntactically the same as React, literally the same, except for a few somethings are function calls that are identifiers with React. That's it.
Starting point is 00:45:53 But besides that, it's the same syntax. There's also QUIC. QUIC is a framework that uses the reactive primitive from Solid, so signals is what they're calling them. But QUIC does something very, very drastic, which again, syntactically, almost indistinguishable from React. You can't really tell a React and solid component apart. And you also probably can't tell a React and Quick component apart. And I like this, that they're keeping the syntax so similar. But what Quick will do is defer, first of all, it will code split every component. Everything is code split into its own module
Starting point is 00:46:27 and pushed to whatever web server it's being served from. That's one thing. Second thing is Quick has an engine that lazy loads each module that was code split at the time that it's needed. Meaning your first load JavaScript is always deterministically one kilobyte or less. And this is huge, right?
Starting point is 00:46:48 Because if you go right now to reddit.com, you're going to be waiting like two minutes. Reddit is just bad. It used to have this like Hacker News, Y Combinator style minimal UI, but then they added React. They literally added React and it got way slower. And that's because they're loading
Starting point is 00:47:05 megabytes and megabytes of JavaScript that I will never call as a user. And so Quick says, wait a second, what if we don't load the JavaScript for this user menu popover until they click the avatar? And so that
Starting point is 00:47:21 has been huge. And so this is where the industry is trending. There's also Angular and Vue. Wait a second on that one. So like if I click the avatar, is it first making a web request to even just go get that JavaScript? That's a great question. No. So it makes your website load, be ready and idle in its initial state. All of that happens actually on the server. So the entire UI is generated on the server and the markup, plain HTML is sent across the wire, step one. Then as soon as the user has something useful, it starts in the background preloading all the JavaScript, but preloading. So it's not loaded, it's not occupying any compute on your machine, it's just sort of like keeping it warm. And then
Starting point is 00:47:58 when a user clicks that pop-up, it then parses and executes the JavaScript in real time. So yeah, so the initial load is always a kilobyte or less. But the cool thing about this is you only download the JavaScript that you actually use. That's huge. That just doesn't exist on the web at scale. And it needs to, I think. Because we all know we ship way too much JavaScript.
Starting point is 00:48:21 Yeah, for sure. So then where are you at with, I would say, the non-React frameworks? Are you using those? Are you still just like, those are cool, I like those, I hope some of those ideas make it to React, but I'm still React first and foremost? That's a really great question.
Starting point is 00:48:35 This has been changing. Part of it is, I wrote this book, and now I'm like, okay, cool. It's done. But I do see merit in these other approaches. I think they're too early for me to use seriously in any mission-critical production workloads. But I think there's an actual meta framework here that doesn't get enough love.
Starting point is 00:49:00 This is what I use. This is my main daily driver for all production-grade workloads is Astro. Are you familiar with Astro? Am I going to be redundant here? I've heard about it. Pitch me on Astro, because I don't know that much about it. So Astro is a framework.
Starting point is 00:49:15 It's not a library. It's not like a script tag you insert into your... And I think the semantics around libraries and frameworks also need to be clarified, right? Where a library is just... It's code that you import in your thing. And a framework is like a frame within which you work. It's like, put your folders here and your files there. It's a frame of working.
Starting point is 00:49:35 Anyway, so Astro is that. It's a framework with opinions about your directory structure and so on. And if you follow the framework, it gives you the most performant websites of all time. And the reason it does this is because it doesn't include any JavaScript at all, at all. And so then you go, okay, but what if I need something to be interactive? Then you install React, Svelte, Vue, whatever you want, any UI library that does interactive updates. And Astro will give you primitives to say when to make them interactive. So you could say, I have this component, this chat widget.
Starting point is 00:50:11 I want it to be interactive when they click on it. Cool. So then Astro will just not load its JavaScript until they click on it. You can say, I want it to load the JavaScript when the client is fully loaded, but idle. And so it gives you this control over when to become interactive. And since it's a framework, it doesn't actually ship anything extra. If you want to use React with Astro, you can. And the whole thesis of Astro is instead of your entire site being a big bundle of JavaScript,
Starting point is 00:50:38 instead you have these little, they call them reactive islands. So in an ocean of markup, HTML, you have islands of JavaScript. And these islands, you should be in control of when your user downloads them or not. And so the cool thing is Astro, it's just a frame. It's a skeleton on which you put meat like React or Angular or whatever. In fact, if you want to be really brutal, you could have all the frameworks in an Astro site, each one being a different island. I don't think that's really helpful because then you load a lot of JavaScript. But the cool thing is, yeah, you can easily swap out islands and you can defer loading to when it's actually needed.
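In code, that control is expressed with Astro's client directives, roughly like this (a minimal sketch; ChatWidget is a hypothetical React island):

```astro
---
// index.astro — this front matter runs only on the server; zero JS ships by default.
import ChatWidget from '../components/ChatWidget.jsx';
---
<h1>Mostly static HTML</h1>

<!-- No directive: rendered to HTML on the server, never hydrated -->
<ChatWidget />

<!-- Hydrate once the page has loaded and the main thread is idle -->
<ChatWidget client:idle />

<!-- Hydrate only when the component scrolls into view -->
<ChatWidget client:visible />
```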
Starting point is 00:51:15 Gotcha. And so when you're indicating when you want this loaded or different things like that, what does that look like? Is it like a use server declaration in a server component? Or what's the mechanism, or how much do I have to know and understand about that? Yeah, it's a really good question. Astro has its own syntax for components. Again, this is basically JSX. It looks almost exactly the same as React. However, Astro does this really great combo. And this is specifically for Astro components, where you declare front matter using a code fence.
Starting point is 00:51:46 A code fence is three hyphens: hyphen, hyphen, hyphen. And you have JavaScript. This is JavaScript that runs on the server to fetch data from somewhere, whatever, read props, read query params, et cetera. And then another hyphen, hyphen, hyphen. So between these code fences, you put whatever server logic you want.
Starting point is 00:52:01 And this is usually to retrieve data. And then underneath the second code fence, you just write HTML. It's literally just HTML. And if you wanted to make it interactive without including a library, you can literally just write a script tag and write JavaScript, like document.querySelector or whatever, inline, and it will work because it's just HTML, right? Now, if you wanted to include an island here, a React island, what you would do is you would have a React component that is the default export of its file, so a .jsx or .tsx file with an export default. You can import that into your Astro component and just place it in your syntax tree.
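Put together, an Astro component might look something like this (a sketch; the route and API URL are made up):

```astro
---
// e.g. src/pages/posts/[slug].astro — between the code fences is
// server-only JavaScript: read route params, fetch data, and so on.
const { slug } = Astro.params;
const res = await fetch(`https://api.example.com/posts/${slug}`);
const post = await res.json();
---
<article>
  <h1>{post.title}</h1>
  <p>{post.body}</p>
</article>

<!-- Plain inline JS works too, since the output is just HTML -->
<script>
  document.querySelector('article')?.addEventListener('click', () => {
    console.log('clicked');
  });
</script>
```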
Starting point is 00:52:40 So in your big ocean of HTML, you just write angle bracket, component name, and close it. And it will just render there. You bring up server components. This is a good question. So Astro has no support for server components at this point in time. But honestly, because you're deferring so much stuff, you don't actually need server components, I would argue. Because the whole point of server components is to load as little JavaScript on the client as possible. You load your big markdown renderer on the server, render your huge text to HTML, and then send that to the client. Therefore, you don't load a markdown renderer on the client. That's the whole point.
Starting point is 00:53:14 And with Astro, you can already do that. Astro has support for API routes, meaning routes that execute only on the server. And just return JSON, essentially, or something like that? Yeah, you can return whatever you want. You can return text, you can return a stream. But how it works is, from the front end, you call one of your API routes, and this you would use to do sensitive things,
Starting point is 00:53:37 like speak to a SQL database, or anything that includes a bearer token. I will take this opportunity, if it's okay, to emphasize: look, I travel a lot, I speak to a lot of teams, and a very common mistake is people will fetch data on the client side, including bearer tokens, in
Starting point is 00:53:53 client-side fetch. We still see this at large, and so I want to take this opportunity to be like, look, if you're including a bearer token, always do it only on the server side. In any case, yeah, Astro... You're talking about calling a third-party API and putting that... Okay, not like if someone's calling to your own backend,
Starting point is 00:54:12 including your JWT as a bearer token, but actually like a secret key or something like that. Yeah, even so, if you're talking to your own backend, there's a discussion to be had around why aren't you using same-site cookies, which is probably more secure. In any case, what I see is, especially with people building all these Gen AI apps, they'll build a chatbot and you can just inspect the page, look at the network tab, and you get their OpenAI key.
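A sketch of the server-side pattern he's recommending, written as an Astro API route (the file path and env var name are assumptions):

```js
// src/pages/api/chat.js — an Astro API route; this code only ever runs on the server.
export async function POST({ request }) {
  const { prompt } = await request.json();

  // The secret lives in a server-side env var and never reaches the browser.
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${import.meta.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: prompt }],
    }),
  });

  return new Response(await res.text(), {
    headers: { 'Content-Type': 'application/json' },
  });
}
```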
Starting point is 00:54:37 Yeah, that's an expensive bill right there, if you leave that. Tell me about when you're back in that React world, how do you feel about React server components? Yeah, it's a good question. So I used to work at Vercel. I think Next.js is sort of the premier framework for React server components. I think if you're using Next.js, great.
Starting point is 00:55:04 It's a framework, again, a frame within which you work, and it does a good job of bridging that gap pretty well. I still have friction, even at that level of abstraction, because you have to do things like invalidate caches, which you never had to do before. And so if I'm submitting a form and it affects something else on some page, I'll need to like imperatively call that, which really messes with the whole React paradigm of
Starting point is 00:55:32 being a declarative abstraction. I shouldn't have to do this, right? So it's frictionful, but it's not impossible. Do I think it's the best way? I think for the end user, yes. And honestly, they should be the ones that matter the most. For me as a developer, I don't like it. I think we can do better. And I think while I say that, it's such a beautiful innovation because there's no other implementation of this
Starting point is 00:55:59 at this point in time in the wild at all. What is this? Components that execute on the server and run server-side logic, like talking to a database or whatever it may be, and then find themselves in the client in the right place every time. That's just really cool.
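In Next.js App Router terms, what he's describing looks roughly like this (a sketch; the file paths, data source, and choice of the `marked` library are invented):

```jsx
// app/posts/page.jsx — a server component: it runs only on the server,
// so the heavy markdown renderer never ships in the client bundle.
import { marked } from 'marked';

export default async function PostsPage() {
  const post = await fetch('https://api.example.com/posts/1').then((r) => r.json());
  const html = marked(post.markdown); // markdown -> HTML, rendered server-side
  return <article dangerouslySetInnerHTML={{ __html: html }} />;
}
```

And the imperative cache invalidation he finds frictionful looks something like this (persistComment is a hypothetical helper):

```jsx
// app/actions.js — a server action that must invalidate a cached route by hand.
'use server';
import { revalidatePath } from 'next/cache';

export async function saveComment(formData) {
  await persistComment(formData); // hypothetical DB write
  revalidatePath('/posts/1'); // imperative, which cuts against React's declarative model
}
```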
Starting point is 00:56:16 Nobody's doing this. Astro has no server components. A lot of frameworks don't have them because they're difficult. And so I do appreciate the complexity, but I think in terms of developer experience, we can do a little bit better. Do you think by the time we get to React 20 or 21 or something like that, most React sites
Starting point is 00:56:31 will be mostly React server components? I think Next.js sites will be React server components. But Greenfield? I don't think so. I don't think so, because even if you're starting a Greenfield project and you're not using Next.js, well, the advice from the React core team is, if you're starting something new,
Starting point is 00:56:57 always only use a framework. React is no longer for everyday JavaScript developers. I think this is a very important point, actually. React is not to be used for anyone building anything directly anymore. And so, yeah, if they're using Next.js, for sure, we'll see server components a lot. If they go against the official React advice
Starting point is 00:57:17 and sort of raw-dog React, then it's just going to be a bad time. You have to wire together a router, a server. You need to think of where to deploy this. It becomes very, very difficult. So I think overall, in the future, direct use of React is going to disappear, and use of Next.js, and React via Next.js, is going to grow.
Starting point is 00:57:38 Interesting. Okay. So you kind of threw me for a loop when you said, hey, Astro is my go-to. Because I was going to say, what's your React framework of choice? I don't know, like, is it Next? Remix? It's still Next, okay. I mean, Remix is even going away, right?
Starting point is 00:57:53 They said it's now going to be React Router 7. And I think this is a wise choice. Because honestly, the core team that builds React works at the company that makes Next.js. They're so close, they're in the same room. They're probably getting lunch together. I'm very comfortable in saying Next.js is not just my preferred framework for React. It's the best framework for React, just by virtue of who works on it.
Starting point is 00:58:13 Do you worry about that closeness and just having fewer options? In one sense, options are good because we can find out different patterns, but also it's bad because it fragments the ecosystem. It makes it hard to choose what you want. How do you think about that sort of closeness now? Yeah, I think the closeness is... I don't care. And the reason is because everything's dispensable given Astro. If the closeness turns out to be really bad
Starting point is 00:58:43 and there's a company that's gatekeeping React progress because all the power belongs to them, then I have three React components in my Astro code base. I just turn them into Solid components and I go about my life. And I think if we start thinking about altruism and should it, should it not, I don't know. This is philosophy. This isn't engineering. And I think the cool thing with engineering is if we can keep everything dispensable and open source, then we're good. And I think React at this point in time is super dispensable, which I'm here for, thanks to the good work of Ryan Carniato and Miško Hevery and Fred Schott and all these folks.
Starting point is 00:59:19 Yeah, cool. What about any React libraries that you're particularly loving right now? Examples I would say would be like TanStack Query or any of the context or state management ones. Any sort of favorites you have at the moment? No. I'm sorry. I feel like I'm super boring about this. I haven't written any React in months. There we go.
Starting point is 00:59:46 Okay. Yeah. So I will teach React to people sometimes, because I do keep in touch, right? The core team are friends. I speak at React conferences. And I care. I care about the ecosystem.
Starting point is 00:59:58 So I do know about what's new in React 19. There's a lot of form primitives. There's useFormState. There's useFormStatus, and why are they named so close? I don't know. But there's useOptimistic. There's all these great things that they're doing.
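For reference, the primitives he names look roughly like this (a sketch based on the React 19 docs; the todo form and the `addTodo` server action are invented):

```jsx
import { useOptimistic } from 'react';
import { useFormStatus } from 'react-dom';

// useFormStatus reads the pending state of the nearest parent <form>.
function SubmitButton() {
  const { pending } = useFormStatus();
  return <button disabled={pending}>{pending ? 'Saving...' : 'Save'}</button>;
}

function Todos({ todos, addTodo }) {
  // useOptimistic shows the new todo immediately, before the server confirms it.
  const [optimisticTodos, addOptimistic] = useOptimistic(
    todos,
    (current, newTodo) => [...current, newTodo]
  );

  async function action(formData) {
    addOptimistic({ title: formData.get('title') });
    await addTodo(formData); // hypothetical server action
  }

  return (
    <form action={action}>
      <input name="title" />
      <SubmitButton />
      <ul>
        {optimisticTodos.map((t, i) => (
          <li key={i}>{t.title}</li>
        ))}
      </ul>
    </form>
  );
}
```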
Starting point is 01:00:12 But they're polishing something that's already really shiny. And I think that's great, but I'm more drawn to the cruder things. Yeah. Yeah. Okay, cool. Well, that does lead me into
Starting point is 01:00:24 the next area I want to talk about, which is just around the education and content work and things like that you do. Because I think you're very prolific across a lot of areas: podcasting, writing, videos, conference talks. I guess the first one is conference talks. I feel like I see you speaking at a conference almost every week or something, it seems like. How many new talks do you do a year? I guess, how many talks do you do? And then how many of those are new talks versus, hey, sometimes you're using the same ones
Starting point is 01:00:53 because not everyone saw it the first time or things like that. Yeah, I reuse talks a lot. I will say that. And I think it's important to reuse talks a lot because people don't watch videos for one thing um and not only that people don't um get the privilege of attending a conference like i was i go to conferences where it's sometimes 300 400 people other times 15 000 people right um and not that's still like such a tiny drop in the ocean that is the world. And so when I do repeat talks, a lot of times like 90th
Starting point is 01:01:27 plus percentile of folks just haven't heard it before. So lots of value there. So I'm looking at my big Google sheet of talks this year. And this year I'm at 22 conferences for the whole year. Holy moly. What are we, like 35 weeks into the year?
Starting point is 01:01:43 Maybe not, probably not even that, like 30 weeks into the year. So yeah. Yeah. No, no, that's for the whole year though. So 22 through December this year, planned, confirmed. Oh, I see. I got you. Okay.
Starting point is 01:01:55 But still, even then it's like, you know, every other week, every third week, something like that. Yeah. That's pretty good. That's a good pace. Every second week, yeah. I mean, if it gets to 26, it will literally be one every two weeks.
Starting point is 01:02:06 Yeah, but it's a lot. The thing is, I have a really hard time saying no. This is a very good tangent to go down, because it's not that I want to be seen and famous and on stage. I got over that very quickly. Not that I am famous or seen or whatever, but I think I'd be remiss if I didn't mention, there's a certain glamor that comes with,
Starting point is 01:02:34 look, there's 400 people in this room who came, who paid money to come listen to you speak. That's crazy talk. And so I felt all gassed up earlier on. But that wears off very quickly. And so I'm not in it anymore to be perceived as this expert who's here to teach you. Right? I don't care very much about that. What I do care about is the fact that I know a lot of stuff and a lot of people don't. Like, nowadays I do a lot of talks about AI. In the one I'm doing at Infobip, I'll probably talk about RAG. You know,
Starting point is 01:03:08 I see the bubble in which I live so clearly, because every single time I do the talk about RAG, I do this thing where I ask for a show of hands: how many of you have heard of RAG before?
Starting point is 01:03:22 Nobody. Maybe three hands go up, and it blows my mind, because this is a repeated talk, but there are things that I know that people don't know, because they don't have access to them, because they're busy building things day in, day out: things of value, things for their employer, things for their startup. And so they don't have the luxury of just, like, I spend three hours a day doing research, Alex, literally just exploring what's out there. Websim.ai, let's go. And people don't have this, right? So I get to research and learn and then go teach. And this is why I have a hard time saying no, is because I
Starting point is 01:03:53 actually want to democratize this stuff. I want to show people, hey, there's room at the table. Especially, a lot of people talk about this new wave with AI being like the beginning of the iPhone. They go, oh, the iPhone just got invented, and that spawned the App Store and responsive web design and this whole thing. A lot of people are saying that's what we have with AI now. And if that's the case, I want people to get in early and thrive. So here's the actual struggle inside my head.
Starting point is 01:04:20 Let's say this happens. Somebody sends me an email. Hey, Tejas, I'd love you to speak at my conference. It's on November 15th. And, you know, we'll cover your travel and expenses and whatever. Can you come do it? Immediately my unfiltered thought, my first thought, is: ah, again? I'm just being fully honest with you. But then I read the email and I go, okay, it's in somewhere, it's in Poland, it's in Gdansk, it's in wherever. And, well, these people probably would benefit from this. And I do have things to teach. And then I start thinking about,
Starting point is 01:04:53 okay, well, there's value I can add here. There's lives I can literally change here. I don't take that lightly. And then I say, okay, you know what? Yeah, let's do it. I'll do it. And then I go, okay, cool, it's committed, it's in the calendar. And then I go look at the conference, I see the speakers, and it turns out most of the speakers are my friends anyway. So now I'm getting excited. I'm like, oh, this is going to be like a high school reunion. Great. I'll see all my friends. And then, you know, I'll go do it. People will learn. And oftentimes the feedback is very positive. Now, I will say this: I've noticed this in myself, and it's not something I'm proud of, but the more privileged a community is that I'm invited to speak at, the less I want to do it. You know? Like, if I get invited,
Starting point is 01:05:34 so I'm speaking at GitHub Universe in San Francisco. And it's a huge honor. It's a big privilege. Absolutely. I do love GitHub, I love the work they do, and I'm thankful for it. At the same time, if, like, JSConf Budapest calls me and says, hey, we have a really small community and we'd love it if you'd teach our, like, 20 people, they don't stack up the same, you know? Anyway. Yeah, for sure. So one thing I've noticed: you do a lot of talks. You do what I would say are regular talks, like purely educational ones, but you also do keynotes. How do you approach
Starting point is 01:06:11 doing keynotes differently than, sort of, maybe the purely educational ones? Yeah, it's a good question. It really depends on what the organizer wants. And it's very interesting, because in India they'll often use the word keynote for just a talk. And so this word even means different things in different communities. And so I often go to the organizer and I'm like, what do you mean by keynote? What do you expect out of this keynote? And so I asked, for example, Clark Sell.
Starting point is 01:06:40 He organizes a conference called That Conference. I watched his talk, and that's what made me think of it, just how different that one was than some other ones. But anyway, sorry to interrupt there. Go ahead. Yeah. No, no, it's good. So I went to Clark and I was like, what do you want from this? I mean, it's an hour and a half. It's the longest talk I've ever done. And I said, what's your expectation?
Starting point is 01:07:00 Because this is what I'm kind of thinking. And I did watch the previous keynotes to also understand, you know, what do people talk about? And it seemed like James Q Quick spoke the year before me, and he was talking about going from being an employee to a full-time content creator. And it was the journey. And so what I noticed was a lot of these talks are just the journey. They're not really, here's a deep dive into Kotlin with Neovim, you know? And so I was like, okay. So I went to Clark and I was like, this is what I'm thinking. What do you think? And he said yes, he's like, 100%. We're here for the
Starting point is 01:07:29 people. Tell a human story, connect with human beings. And I was like, oh, okay, that's cool. It's a tech conference, but I can do that. And so I ended up doing that. And that talk is on YouTube. For some, they want a keynote to be like a state of the union. Like, here's all the front-end frameworks. Claude is doing this and Anthropic is doing that, and they're both the same, and Gemini, you know. So they want a lay of the land. And others just want the regular technical deep dive. So it really depends on what they want. And I think it's super valuable to just ask the organizer, what are you looking for here?
Starting point is 01:08:08 How do you think it's changed while you've been doing it? It's a good question. It's difficult, this question, because it's very easy to offend people. And I don't want to, but when I started, I used to go to these conferences and meetups. One of the best earliest conferences I went to was a conference called Zeit Day Berlin, September 27th, 2018.
Starting point is 01:08:33 And it was ZEIT. ZEIT is now Vercel. ZEIT got acquired, not acquired, excuse me, they raised a round and they rebranded to Vercel. And so this was where I met Guillermo, and, you know, we really hit it off, and I was like, why is it called ZEIT? He's like, I don't know, I just liked the name, zeitgeist. And eventually that interaction would lead me to go work at ZEIT. But this conference was so cool because it was just people who were building great stuff, talking about what
Starting point is 01:08:57 they built. And they gave their lives to build these things, you know, like codesandbox.com. You may know CodeSandbox. Ives, he was 19 years old when he built this. And he gave a talk at ZEIT Day Berlin about how he built it, why he built it. And he had this great visualization of, like, here's a huge mountain, and here's a person at the base of the mountain. And he's like, with CodeSandbox, what I want to do is make the mountain small so that everybody can climb it. And I was like, that is so cool. And so that's how it used to be. It used to be people just doing the weirdest stuff and going, look. There's another great talk by a guy named Lucky, and he has this talk that I'll never forget. It's called how to crash a drone in nine ways with JavaScript. And it's literally him just controlling a drone with Web Bluetooth with JavaScript. And at some point, he has a banana with electrodes, and he connects them to his
Starting point is 01:09:49 computer. And he's steering the drone with a banana. And this is how it used to be. But then, you know, we learned that there's money and power to be made if you do this a lot. And there was some type of gold rush. And eventually what happened was DevRel rose, and a lot of people started basically being shills for their companies. Some would accuse me of this: hey, you work at DataStax, you talk about RAG.
Starting point is 01:10:16 Absolutely, but I definitely am not a shill. I try to give people a balanced take, often recommending others. But yeah, so it went from, I built a cool toy project, to companies seeing, oh wait, we could hire these people to say nice things about us. And then, you know, now, fortunately or unfortunately, there's just a lot of shilling happening in the education space. And so I like your use of the word education. I like to think that that's what I do. I highly value education, me teaching people, but also people teaching me. And I think there's, like, blood in the waters, so to speak, of shilling, which I don't really enjoy, but I think that's where we are today. How do I feel about it? It's okay. I mean, I don't really mind, as long as people are explicit. I start to get a little bit irritated
Starting point is 01:11:07 when there's people who, just through a really unique set of circumstances, quote unquote make it, and then start selling that: hey, look, I used to do this, and now I'm making six figures, and you should follow everything I say. And unfortunately, a lot of the people who do follow everything they say, one, will not make it, but two, don't have the critical thinking that is required to avoid the scam. And so that is probably something I'd criticize about that. But as long as people aren't being harmed, I'm like, I don't care, do whatever you want.
Starting point is 01:11:37 There's no need to be too gatekeepy about things. Yeah, for sure. That does make me long for the days of the wild conferences where it was just like, man, I built some weird stuff. Do you remember, there was this one talk by Ken Wheeler who just built a beat machine, an 808, in the browser with JavaScript. There's no company attached that's selling you beat machines. It's just like, look, I did this with Web Audio. Isn't this cool?
Starting point is 01:12:02 Yeah. Oh man, those were the days for sure. What do you think about video content? Do you like video content? Do you like making it? Do you think it does well? I guess, what do you think of it generally? Yeah. My background was design.
Starting point is 01:12:18 And so I have a thing for, like, beauty. Visual stuff, yeah. Beauty is in the eye of the beholder. So what does that mean? For me, I look at symmetry and the fine details and I'm just like, this is so cool. Like, you remember, I came up in the era of, let's make a glass bubble and look at the refraction.
Starting point is 01:12:34 There's this picture of a raindrop or a water droplet that was the basis of the Mac OS Aqua UI. I don't know if you've seen this, but a guy sketched it with a pencil, and he captured with high fidelity all the nuance of light interacting with a droplet. It was beautiful. And that became, you know, in Apple's earlier Mac OS iterations, you'd have the close, minimize, and maximize buttons, and they were just these beautiful 3D balls. So cool.
Starting point is 01:13:07 Anyway, so that's where I came up. I really care about precision, beauty, and aesthetics. And so for me, you ask if I like video? Absolutely. I mean, look at my camera. I don't know if it's full quality here, but this is a beautiful, you know, APS-C mirrorless A6700 with a nice little Sigma lens. I love it. I love the depth of field, the bokeh.
Starting point is 01:13:31 This whole room is tuned to make video fun. So for me, I love video. It's creative. I get to play. In fact, something that probably annoys a lot of my teammates at DataStax is I just make videos for everything. Literally, I did this today: there was a bug where we were overriding some state. We had a race condition, and I thought, I could open a GitHub issue and just write this up, but let's just record it. And so I did. And it was in beautiful 4K.
Starting point is 01:13:55 It was very nice. Yeah, no, so I love video. I'll summarize, and I apologize if I'm speaking too much here, but I'll summarize by saying: we all know that a picture is worth a thousand words, but I think a video is worth a thousand pictures. And I think it's an absolutely great medium. Yeah. And I think for any creators who want to get into
Starting point is 01:14:14 video, they just have to make it as easy as possible. Like, a lot of folks, if they have a desk and somewhere to set up a camera permanently, such that if they press the on button, they're ready to go, that helps a lot. It helps me a lot, for sure. Yeah, for sure. What about switching gears to writing? You know, you wrote this book. I guess, what was your experience writing? Did you enjoy that?
Starting point is 01:14:35 Will you do it again? How do you feel about writing? Yeah. So this book actually came out from a talk. So I was in Berlin, February of 2020. So just before the lockdown started. And there was a meetup and it was really bad weather. It was very rainy.
Starting point is 01:14:54 And so a lot of the speakers didn't come. And it looked like there was just not going to be anything at the meetup. And so I was there. The organizer was like, you speak at stuff, right? Can you like just do a talk for me? And I was like, about what? He's like, I don't care, man. Like we may not have a meetup. And there's people here and there's food. And I
Starting point is 01:15:07 said, fine. And so I did this talk where I built a React app live. It was all live coded. But I didn't import React. And so if you do this in JavaScript, you get an error. It says, cannot read property 'createElement' of undefined, because React.createElement is what makes your elements. And so I said, look, we're going to learn React today, but we're going to learn it by just making it. I'm not going to import it. I'll just write const React equals, and I'll define React here. And so I did that. It took like an hour, but we created a really nice bare-bones version of React, and I taught them the underlying mechanism. This then went on to YouTube and it became extremely popular. It was like 50k views, whatever. And then O'Reilly, I suspect, saw that and wrote me an email, like, hey, can you write this book?
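(For the curious, a rough sketch of the exercise he describes, invented here for illustration and not his actual talk code:)

```js
// Don't import React; define just enough of it to render something.
const React = {
  createElement(type, props, ...children) {
    // JSX like <h1 id="x">Hi</h1> compiles to React.createElement('h1', { id: 'x' }, 'Hi')
    return { type, props: props || {}, children };
  },
};

const ReactDOM = {
  render(element, container) {
    // Text nodes are just strings in this toy model.
    if (typeof element === 'string') {
      container.appendChild(document.createTextNode(element));
      return;
    }
    const node = document.createElement(element.type);
    for (const [key, value] of Object.entries(element.props)) {
      node.setAttribute(key, value);
    }
    for (const child of element.children) {
      ReactDOM.render(child, node);
    }
    container.appendChild(node);
  },
};

ReactDOM.render(
  React.createElement('h1', { id: 'title' }, 'Hello from scratch'),
  document.body
);
```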
Starting point is 01:15:48 And I was like, well, going back to my trouble saying no... So I wrote it. The experience was really great because they're really good at gentle, consistent accountability. It's so great. They never, like, pounded their fist into their open hand and said, we're coming for you if you don't finish. They were never harsh, but they did keep me accountable, and it was great. So I wrote it. What I especially liked was I could write it in Markdown. The entire book is Markdown. There's no AsciiDoc, there's no Word, there's no formatting. I just did my thing in Markdown and they mapped it perfectly. It was great. Would I do it again? Yeah, I'm actually working with them on an AI book. Oh, perfect. Why not? You know, I like it. I think it's more time-consuming
Starting point is 01:16:32 than video. I prefer video to writing. And I think writing is sort of in a dangerous place right now, given the large language models, right? Like, there's a lot of "delve" and there's a lot of words where you can just tell. But I think we as humans can now at least recognize LLM text versus human text. I wonder how this will go across generations, though. And I think writing may be in some trouble. Yeah. Interesting. Would you ever self-publish, or do you like the sort of O'Reilly experience? I think if I self-publish, I won't publish, because I just don't finish things. Yeah, for sure.
Starting point is 01:17:12 And O'Reilly does such a nice job, too. The book is obviously a nice presentation. It's easy to get it from places, yeah, for sure. You know, the self-publish question is really interesting, because I know, Alex, I know for a fact, like, there's no question: if I create and drop a course, like an AI course for web developers, I'll be a millionaire. I just know it. I'll ship seven figures in the first month, maybe the first week. There's a market and I'm good at it. But at the same time, I just don't want to.
Starting point is 01:17:45 It's so interesting. I genuinely have no desire whatsoever to do this. And so, again, I am also working on an AI course, with O'Reilly, but it's with O'Reilly because I just can't be trusted, because my heart's not in it. Yeah, for sure. I think your career is super interesting.
Starting point is 01:18:02 You mentioned a few interesting places you worked at, like Vercel, DataStax, and things like that. But also, you ran your own DevRel consultancy for a while. Can you walk me through what that was like, what you learned doing that? Yeah. Yeah, it was hard. So I'm based in Europe,
Starting point is 01:18:18 and it was definitely hard mode because of that. Partly also because I don't speak the language; my German is good, but it's not business good. What ultimately led to me closing it was there was just way too much bureaucracy and a lot of hidden fees that I didn't know I had to pay, but I had to pay. Literally, membership to some registry that costs a portion of your revenue, where all they give you is
Starting point is 01:18:45 invitations to networking events. And I'm like, but I work in DevRel. Like I don't need, like my whole company is networking events, you know? So it was, but I had to pay it. And at some point that plus the taxes, plus all of it just added up to way too much. To the point where I was going like, look, if I took a full-time job, I'd probably be earning something similar anyway. I learned a ton, though. I learned that people lie a lot on their resumes. On their resumes? Oh, wow. And during interviews.
Starting point is 01:19:16 I was shocked. And honestly, I shouldn't have... Was this because you were hiring people? Yeah. Okay, interesting. Contractors, not employees. But, and you know, some of them were like very well known on Twitter and such. And I was surprised.
Starting point is 01:19:37 And then, you know, I started talking to their friends, and they'd go, hey, I heard that person worked for you. What was that like? And I told them, I don't lie. And they were like, yeah, I kind of expected that, because I know them. And I was like, why didn't you warn me? And so that was one of the more profound lessons
Starting point is 01:19:52 was that people will embellish in a resume and in an interview, in multiple interviews. I had no idea. And also people get very comfortable and complacent when there's regular income. This is, because for me, I was the business owner. I was like, yo, I need to make sure this thing is sustainable. And so I'm working and grinding and like trying to make sure there's regular income so I can
Starting point is 01:20:15 pay everybody and myself. But my team just didn't have the same fire, because they're like, yeah, that's good, we have a good thing. And, you know, I'd been an employee my entire life until that point, and I never understood the founders I worked for until then. It is an entirely different beast,
Starting point is 01:20:35 and it will burn you out and it will require that you have no life for a time. That was my experience. I think in terms of business development, it wasn't difficult. We were, it was considered the fastest growing tech consulting startup in, I think, Germany or my state. I'm not sure.
Starting point is 01:20:54 I mean, because I'm at these conferences, I'm talking to great people, they have projects. I think if somebody is listening to this and wants to start a tech consulting business, especially if they're on Twitter and at conferences: do it. There's definitely a market. It's super unlikely that you'll go bankrupt, I think.
Starting point is 01:21:11 And this may be my privilege talking, I don't know, but I suspect so. BizDev was fine. It was just the bureaucracy and the hiring, the people. The people were so difficult to get right. Yeah, interesting. That's interesting to hear. That's a cool perspective. And yeah, thanks for sharing that with us.
Starting point is 01:21:28 Tejas, just thanks in general for coming on. It's been super interesting and I learned a lot for sure. And it's great to have you. If people want to find out more about you, where can they find you? They can find me on my website. That's tej.es or on X, formerly Twitter.
Starting point is 01:21:43 That's Tejas Kumar underscore. And at conferences worldwide. So if there's a conference near you, Tejas will probably be at it. Yeah, I'd love to meet you. Cool, awesome. Tejas, thanks for coming on. Really appreciate your work and having you here. Thanks a lot, Alex.
