The Changelog: Software Development, Open Source - A new direction for AI developer tooling (Friends)

Episode Date: October 10, 2025

Elixir creator José Valim is throwing his hat into the coding agent ring with Tidewave, a coding agent for full-stack web development. Tidewave runs in the browser alongside your app, but it's also deeply integrated into Rails and Phoenix. On this episode, José tells us all about it. Also: his agent flow, YOLO mode, an MCP hot take, and more.

Transcript
It's Changelog & Friends, a weekly talk show about MCP hot takes. Thanks, as always, to our partners at fly.io, the public cloud built for developers who ship. We love Fly. You might too. Learn more at fly.io. Okay, let's talk.

What's up, friends? I'm here with Kyle Galbraith, co-founder and CEO of Depot. Depot is the only build platform looking to make your builds as fast as possible. But Kyle, this is an issue, because GitHub Actions is the number one CI provider out there, but not everyone's a fan. Explain that. I think when you're thinking about GitHub Actions, it's really quite jarring how you can have such a wildly popular CI provider, and yet it's lacking some of the basic functionality or tools that you need to actually be able to debug your builds or deployments. And so back in June we essentially took a stab at that problem, in particular with Depot's GitHub Actions runners. What we've observed over time is that GitHub Actions, when it comes to actually debugging a build, is pretty much useless. The job logs in the GitHub Actions UI are pretty much where your dreams go to die. They're collapsed by default. They have no resource metrics. When jobs fail, you're essentially left playing detective, clicking each little dropdown on each step in your job to figure out, okay, where did this actually go wrong? And so what we set out to do with our own GitHub Actions observability is essentially build a real observability solution around GitHub Actions. Okay, so how does it work? All of the logs, by default, for a job that runs on a Depot GitHub Actions runner are uncollapsed. You can search them. You can detect if there have been out-of-memory errors. You can see all of the resource contention that was happening on the runner. So you can see your CPU metrics, your memory metrics, not just at the top-level runner level, but all the way down to the individual processes running on the machine. And so for us, this is our take on the first step forward of actually building a real observability solution around GitHub Actions, so that developers have real debugging tools to figure out what's going on in their builds. Okay, friends, you can learn more at depot.dev. Get a free trial, test it out instantly, make your builds faster. So cool. Again, depot.dev.
I got, like, the news... I got the update in the podcast that there was an Oxide event. And you were there? Was that a thing? Like, how often do you do that, where you go to the place? Yeah, this was a first for us, I guess, because that's like an internal conference for their company. And obviously, we are not internal to their company. So that's a first for us. But, you know, we've hit it off with Bryan Cantrill and with Steve Tuck, the co-founders. And they wanted us to experience the team and meet everybody. And we had been kind of ogling and gushing, you know, over how cool their server racks are for years. And of course, I don't have enough money to buy one of their racks, and neither does Adam. They're not going to make a homelab version, despite Adam's incessant cries for them to have an affordable, maybe a half rack. And so we've always wanted to see their hardware in real life. And so this opportunity presented itself. So we went, hung out, met a bunch of cool people, and saw kind of inside their company what they are up to, which was cool. It was different.
Nice, very nice. First time to Emeryville? You've been to Emeryville? Oakland area, in the Bay, José. I don't think so. You don't think so? Right across the street from Pixar, like the Pixar headquarters. I actually didn't even know Pixar was around that area. I didn't either, until I looked across the street and there was Pixar, you know. Isn't that the area in Oakland where some startups are starting to move? Is that the thing that you're saying, or am I getting things mixed up? I don't know. Honestly, I think so. But we're outsiders. You know, we get invited to the Valley from time to time. Yeah. And eventually we say yes. You know, it's a cool experience, I think. I think it depends on the different places we get invited to. But I think it's a chance to explore the world and meet some cool people, peel back the layers, tell some cool stories. So I favor the IRL. I think it's cool to do it a few times a year, or as often as it makes sense, some version of that. We usually go to All Things Open, but our schedule conflicted this year. What about you, José? You get out and see the people ever? Yes. I used to do a lot of that,
especially with Elixir, at the beginning: go to a bunch of different conferences and just talk about Elixir. And then, of course, at some point it gets very exhausting, and I ended up like, okay, I'm going to focus on the Elixir community from now on. And even with the Elixir community, there is the adjacent Erlang community as well, and that's enough to keep a person busy. But now with Tidewave, we support Phoenix and Rails, and we are working on Django and other frameworks, so I have started to kind of go back, for example, to Ruby conferences. So last week I was at EuRuKo, which is one of my favorite conferences.

I don't know if you... you were Ruby folks, weren't you? Yeah. Yeah. So I'm familiar with that. I haven't been to it. But I'm familiar with most of the Ruby confs. Yeah.

So, I don't know if I said this while we were live already, but for the listeners, I've talked about this in another place. What I really like about EuRuKo is that every year people say, look, I want to host it in my city. They do a three-minute, five-minute presentation, and the people attending the event vote on where it's going to be next year. Okay. And it's usually somebody with no experience organizing an event who now has to organize an event for 500 people, right? 600 people. And it's probably very daunting, but I think it keeps it fresh, and it keeps it always community-centric, right? Because it's always moving around. So yeah, I was at EuRuKo, and then the Elixir events... In two days I'm going to the GOTO conference in Copenhagen. So yeah, I'm kind of back in traveling mode for now. You like that, or do you just do it because you have to do it? I'm enjoying it right now, because with everything that is happening around AI, right, and code agents, nobody really knows where it's going. Some people tell you that it's going to go there, but nobody really knows, right? Like, I think the CEO of Anthropic made a prediction about 90% in six months. Six months have passed; it has not been 90% of code being written by code agents. So I am really enjoying this opportunity of talking to different people; you get a bunch of different takes and different ideas and things to explore. So it has been really fun, just going out and talking to people. But I'm sure that I'm going to do it enough that at some point, maybe in six months, I'll be like, I've had my fill. It's time to
hibernate again and go back to the Elixir conferences. But right now it has been really fun. I agree. I think it's fun to step back for a while and become a recluse and enjoy your local world, and then to come out, you know, peek your head out from underneath the rock and see the people again. There's something invigorating and exciting about it. But when you're just constantly on that track of travel, travel, travel, conference, conference, conference, it can tend to burn you out. So I think everyone needs to step back, but then also step out and see some people, because that's where the magic happens. Isn't it, Adam? I mean, that's where the real relationships actually form. I think so. You know, the IRL is really where it's at. I heard that somewhere, I liked it, then I experienced it, and I was like, you know, just give me more, please. Non-stop.

Well, trying to figure out where the AI thing is going... Claude 4.5 dropped today. Has anybody played with it yet? I have not. But, you know, better, stronger, faster. Still not writing all the code for us. I think I did use it today. But they said something like it can go 30 hours on a coding bender. I just thought, well, that's really good marketing, because I have no idea if it's true or not. But I was like, that's a great way to describe what your thing can do. Which is more than I can last. I don't know, Adam, how long you can bender. Or José. I'm sure you've been on some benders in your life, but 30 hours? Holy cow, man. Yeah. So that would be stronger. I'm trying to think of where it fits in the category of better, stronger, faster. Stronger, man. 30 hours straight. I think it doesn't lose context or something. I don't know. Did you read this? Maybe they are doing more things. So right now they do the auto-compaction of the chat. And context engineering is all the rage right now, right? Right. They do the automatic compaction of the conversation, which is summarizing it. But something they also do is that when the context is getting filled, they just prune the tool outputs from the beginning. So if there are some files or some searches or commands you ran at the very beginning of the conversation, they prune that, and that also allows them to go for longer without having to summarize stuff, because when you summarize, there's always a chance that you are losing some data.
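(A minimal sketch of that pruning idea; this is a guess at the mechanics, not Anthropic's actual code, and the message shape and token counter are assumptions:)

```js
// Sketch: keep the conversation intact, but blank out the oldest tool
// results once the context grows past its budget, instead of summarizing.
function pruneToolOutputs(messages, tokenBudget, countTokens) {
  let total = messages.reduce((sum, m) => sum + countTokens(m.content), 0);
  for (const msg of messages) { // oldest messages first
    if (total <= tokenBudget) break;
    if (msg.role === "tool") {
      total -= countTokens(msg.content);
      msg.content = "[tool output pruned]"; // the call itself stays in history
      total += countTokens(msg.content);
    }
  }
  return messages;
}
```

The appeal over summarizing is exactly what's described above: nothing gets paraphrased, so nothing gets subtly lost; the stalest tool output is simply dropped.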
And something that I do a lot is, you can actually have conversations with the agent about this stuff. So a lot of people say, oh, I'm using this agent and the agent doesn't tell me which tools it has available. But you can always just ask the agent which tools it has available. And then you can just say, look, invoke... So something that I do is invoke echo with, you know, "a" a hundred thousand times, three times, so I can force it to fill the context. And then: what do you see now on that tool that was invoked? And then it's like, oh, the tool output disappeared. It's, like, surprised at itself, that it now says this thing. And you can trick it into crashing very easily as well. You're like, oh, why do you have this tool? Wouldn't you think this other tool would be better? And then you just ask whether that tool would be better. It imagines that the tool exists, and it's like, yeah, it would be better, let me try calling it. But it obviously doesn't exist, and then the agent crashes. So I like having those meta conversations where it gets surprised or tripped up. Yeah, it's fun. It's almost like talking to a kid, you know? It's just very easy to pull the wool over a kid's eyes; they're just gullible, because they don't have the life experience that we do. And you can
have a lot of fun, as long as you, you know, keep it in good-natured fun and are not trying to actually trick a kid. But with your AI, who cares? It's a robot. Trick it all you want, jeez. You know, get it to do all kinds of stuff. Have you heard... we just learned this from Feross that people are actually using prompts in their malware now. So if you can get arbitrary code execution on someone's computer... For instance, this was in the case of Nx, which is like a monorepo command line tool: they hacked the Nx npm package, distributed some malware, and if you were running the Nx command and you were infected, in there was an actual prompt to ask Claude Code to do stuff for it. Instead of, you know, instead of coding it out. Yeah. And it was really kind of smart, because what they asked it to do is the things that are kind of fuzzy-finding for humans... for programs, I should say. Which is: find all the interesting files on this computer. Which, of course, you could have a list of where the interesting files are, and you could search for certain things. But Claude Code can just go do tool calls and read stuff and hand back a list of interesting hackable files, you know, like secrets and whatnot. Anyways, I just found that to be amusing. So even attackers are getting too lazy to write their own code. That's what they're saying. Exactly. Like, why? This is the promised land, isn't it? You know, I don't have to code, even when I'm hacking someone's computer. Yeah, I wonder why that was the best route. Was it because of their laziness, or their lack of desire to write that script, or just because they were trying to leverage, you know, a Claude Code-enabled developer's machine? Like, what do you think the true
psychology of that choice was? I think they're just thinking, this is the fastest way to the best result, you know? Just like most programmers are like, well, what's the fastest way to the best result? Well, I could write a program. And besides, I only have so much stuff I can shove in; I'm assuming the more stuff you put in, the more likely you are to be found. So maybe some compression is in there. But if I can just prompt something to scour your computer for interesting files, and it's pretty good at it, that's a lot faster than me having to write a program that scours your computer for interesting files. That's my guess. I don't know. José, why do you think somebody might do that? Maybe they're just showing off. Yeah, I don't know. They just want to trick a computer, you know? They want to trick an AI. Yeah. They're into nefarious deeds.

So, okay, you like to mess with them. How much value are you getting? Because a survey says that we're getting tons of value, but quantified research says that we're not. I don't know if you've read any of the research, but a lot of recent papers, well, at least more than one, have come out and said: developers think that they're more productive with AI coding tools, but it's actually slowing them down. What are your thoughts, José? Well, I have many thoughts on this. The first one is that, to nobody's surprise, we are really useless at estimating stuff. Of course. We've been proving that for years, haven't we? Yeah. So, of course, we are estimating things wrong. And I think people frame it with this exaggerated, oh my God, three times more productive, or even twice. For me, that's just kind of pointless, because if you're actually even a third more productive, 33%, that's kind of massive. That's huge. Yeah. Right. And then I think people fail to consider... there are other studies where, I don't remember the exact number, but it's like developers spend, say, 50% of the time coding, right? And if you're using agents for coding, how much more productive you're going to be depends on where you're using agents. If you're only using agents for coding, you can only optimize that 50%, and not all the other things.
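(That's Amdahl's law in disguise. With p the fraction of time spent coding and s the speedup agents give to that fraction, the overall speedup is:)

```latex
\text{speedup} = \frac{1}{(1 - p) + p/s},
\qquad p = 0.5,\; s \to \infty \;\Rightarrow\; \text{speedup} \to 2\times
```

Even an infinitely fast coding agent tops out at a 2x overall gain if coding is only half the job.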
So, all that said... and then the other thing is that a lot of people don't consider the time that they lose when something doesn't work, right? Everybody's happy: oh, I used the agent, it worked, I was super productive. But there are a lot of times where it's just not productive. And then you end up trying to coerce it into doing the right thing, and it doesn't do it, and then you quit. Right. And then you try it again, and then it works, and you completely forget about that bad experience. The bad experience is actually one of the reasons why I never liked the completion, like the AI completion suggestion: because I would read it, and if it's not what I want, it would always throw me out of my loop. That time where I read it and go, oh, damn, I lost my flow... like, how do you measure that? Right. If you're only measuring, oh, it was accepted two thirds of the time... but the one third was so disruptive to me that, you know... So, with all that said, I think that I get a benefit from it. You do? Right, I think. And there are my six caveats, but I still think I do. Citation needed, right? I joke that we could use the Wikipedia "citation needed"; it should be an HTML feature. We should just be able to put that everywhere, you know, like in conversations. After every sentence that I say? Yes. Yeah. Because the other thing is... It's Rigby. Is that Samsung's thing? Silicon Valley. Oh, gosh. Sorry. Oh, yeah. Continue. You know I'm not going to get that. You were hoping José was going to get it, weren't you? No, yeah. Yes. All right. Well, some people got it. Yeah, we can talk about Silicon Valley later. But yeah. So the... Yeah, you just did the AI completion. It just auto-completed the wrong thing.
Off the rails. Off the rails. He lost his flow. Okay, he's back. So, there are a couple of things that I do... I think you have to find where it works and where it doesn't work. And of course, that's going to change as these things improve. So, for example, I tried it a couple of times to help me work with the Elixir type system and stuff, and it doesn't work.
It's going to be useless; I'm not going to try again. And maybe in six months, maybe in a year, things change enough that it can help me with that kind of work. But I don't feel it's there. But for example, when working on Tidewave, because it supports other web frameworks, I often implement the feature in Elixir... Tell the people what Tidewave is real quick, because the three of us know, but maybe nobody else does. What's Tidewave? Yes.

So Tidewave is a coding agent for full-stack web applications. I'm going to summarize, and we can jump into it later, but the idea is to have a coding agent that is tightly integrated with your web framework. So we understand what is in the DOM and how that maps to a template. It can coordinate the browser, so it gives it a really strong verification loop: as you ask it to build features, it can verify that the features work. You can interact with the actual app page and ask for changes on the page, instead of asking for changes on the code. I have this whole idea that I think we should run coding agents on top of what we produce. So if I'm working on a library, what I produce is an API, docs... fine, run it in an editor. But if I'm building a web application, I want to run the coding agent in the actual browser. Because I want it to understand what I produce, right? And I want to be able to interact with what I produce, because if I can't do that, then we are doing boring translation work all the time. Like, looking at what happens on the screen, going to the editor, asking it to change things, right? And then the agent says, I'm done. You reload the page. There's an exception. You have to copy and paste the exception back to the agent. You don't want to do this boring stuff, right? And I say the data science folks were the first ones to notice that, because they were the first ones to put the coding agents inside notebooks. They're like, okay, let's run this thing inside a notebook, because if it understands my variables, if it understands my cells, they're going to be more productive. But nobody caught up to that trend, right? We kind of regressed: we first put it in the editor, and then we put it in the command line, right? We should be going up, right? So that's Tidewave. And I do think Tidewave can help you be more productive with AI, because allowing the agent to verify what it builds is going to make it so it builds better things, things that are guaranteed to work, and you're going to spend less time on that loop. So when I'm working on Tidewave... we support Phoenix, we support Rails, we are working on Django, Next.js, and a couple of others. I usually implement the feature in Elixir, tell the agent, hey, I implemented this feature in Elixir, and then I go to the Rails project and implement the same thing. There are a couple of things that I tell it, like, don't add tests, don't use
mocks, right? So there are some caveats in there, but... But you say don't add tests, and then you say don't use mocks. I mean, if it's not writing any tests, why would it use mocks? Sorry, don't add additional tests. Sorry. Other than the ones I wrote, because the Elixir PR is good. I wrote it, right? Right, it has tests in there. So it's copying those over. It has the proper tests, yeah. Because... I think my experience with coding agents for coding is way better than for testing. With tests (not in Elixir, but I'm doing a lot of Ruby and Python), it tends to use mocks a lot and just writes a bunch of redundant tests. So it's a whole separate discussion, but it's really good. Like, when I ask it, take this PR here, translate it to this repository... a lot of the time it's just perfect. It says it's done. It runs the tests, it runs the linter, and I can just push it, right? I send a PR, people review. So that's really good. So I think that's one of the things: you have to figure out where it works and where it doesn't, take notes of that, right, and find the loops and tools that make it work for you. And it's like any other tool. And I think AI has this particular problem where some people say, oh, it's just magical. It's kind of like a lottery, in some sense: some people go try AI, and because it's, you know, probabilistic, they get a bad experience, and they're like, oh, this sucks, and I'm never going to try it again. Because people come with the expectation that it's just going to work. And then some people, trying it for the first time, by just the randomness of it have a good first experience, and then they start investing in it and refining it, right? And that's the process: you do have
Starting point is 00:23:46 trying it for the first time by just the randomness of it they have a good first experience and then they start investing on it and refining it right and that's the process you do have like to figure out what is there and what isn't. And then the other thing that I like, I tell people to do, which works really well for me, is to I don't correct the agent. I just, like, if it does something wrong, or if it's like 70, 80% good, I just go and finish it. That's fine.
Starting point is 00:24:18 Well, you code it for it. It depends. So I do two things. So imagine that I ask it does this thing. And then I, I leave. I come back, like, oh, this sucks. I'm not asking like, oh, you're supposed to do this instead. Because often when it does something wrong, it's there in the context.
Starting point is 00:24:34 It has a really- Keep getting wrong. Yeah. And then when it fixes, it doesn't fix everything. So when it does something wrong, I usually go and like, I start a new chat. I just discard everything, right? No, nobody's going to be upset. I just discard everything.
Starting point is 00:24:49 I was like, okay, start again. But do this, this, and don't do that. I add a little bit more of context. And then if necessary, I start again. start fresh with additional little warnings or instructions yeah Adam you do the opposite don't you you never write the code you just keep telling it to do stuff uh yeah I don't really yeah I think by and large it's writing code I can't really ready any you know myself anyway so it's like it's you can do a better job that I'm going to do so he's a better programmer than both of us so he can
just fix things. We're just different people, you know. Yeah. So what are you using it for? CLI tools, really. I'm having fun with a Proxmox CLI where it instantiates a virtual machine with a given cloud-init image, and it's like a command line away, basically. It's cool. So I can spin up a new server immediately, essentially. I can package it as a server; I can share it with you as a Git repo. It's kind of cool. That, and I would say 7z arch... you know, 7z is the compression algorithm. So I was just working on a version of that as a CLI that's just cooler, basically, because 7z's existing command structure is kind of not a lot of fun. It's hard to remember; I always forget it. It's highly configurable. And so I wrote something that was just more fun. So does it wrap it and then call it underneath the hood with specific flags, or...? Essentially. It did that for a while, until... we essentially rebuilt it on something called lib7z, a wrapper... I think it's the sevenz-rust2 crate out there. Okay. And so it actually acts as a library around 7z, essentially, and then you can write a CLI layer on top of that, because it's a library. So that's where it's currently at right now. That's cool. So you don't have to actually shell out; you're actually re-implementing the functionality. Yeah, precisely. With a Rust crate. And you get a lot more data in that API as well. You get a lot more granularity around files and processes and progress, and you can control a lot of the UX around the CLI that way. We deal with a lot of large files and folders. Yeah. So I'm just sort of enamored by archiving them very well. Archiving to the best of my ability. Yeah.

Well, friends, it is time to let go of the old way of exploring your data. It's holding you back. But what exactly is the old way? Well, I'm here with Mark DePuy, co-founder and CEO of Fabi,
Starting point is 00:27:42 a collaborative analytics platform designed to help big explores like yourself. So, Mark, tell me about this old way. So the old way, Adam, if you're a product manager or a founder and you're trying to get insights from your data, you're wrestling with your Postgres instance or Snowflake or your spreadsheets. Or if you are, and you don't maybe even have the support of a data analyst or data scientist to help you with that word.
Starting point is 00:28:04 Or if you are, for example, a data scientist or engineer or analyst, you're wrestling with a bunch of different tools, local Jupyter notebooks, Google CoLab, or even your legacy BI to try to build these dashboards that someone may or may not go and look at. And in this new way that we're building at Babi, We are creating this all-in-one environment where product managers and founders can very quickly go and explore data regardless of where it is, right?
Starting point is 00:28:31 So it can be in a spreadsheet, it can be an air table, it can be in Postgres, you know, it's really easy to do everything from an ad hoc analysis to much more advanced data analysis if, again, you're more experienced. So with Python built in, you know, Python built in right there, NRII assistant, you can move very quickly through advanced data analysis. And a really cool part is that you can go from ad hoc analysis and data science to publishing these as interactive data apps and dashboards were better yet at delivering insights as automated workflows to meet your stakeholders where they are in, say, Slack or email or spreadsheet. So if this is something that you're experiencing, if you're a founder or product manager trying to get more from your data or for your data team today, you're just underwater and feel like you're wrestling with your legacy, you know, BI tools and notebooks, come check out in your way and come try out Fabi. There you go. Well, friends, if you're trying to get more insights from your data, stop resting with it, start exploring it the new way with Fabi.
Starting point is 00:29:27 Learn more and get started for free at Fabi.a.i. That's fabi.a-i.a-i. Again, fabi.a.i. I also use coding agents for things that I am not reviewing, particularly for like prototypes and that part has been really fun because if you're working on the product you have ideas of wait what could this project which directions we could go in the future but usually before you would like think about it put on some notes right and then maybe if you're lucky in two three months like somebody from the team can take a look at it
thing, right? And so, as I was saying, I have this idea that coding agents should run on top of the thing that we produce, right? We talked about Tidewave Web, which works for web applications. I talked about notebooks. Well, if I'm working on a game, right, I want to have Tidewave running in the game engine. If I am building a mobile app, it needs to know about the mobile device, simulators, and all that kind of stuff. So, I think for four or five weekends straight, what I would do during the weekend is come to the computer from time to time, see if the agent was working, and just have it build a different proof of concept of embedding Tidewave somewhere completely different. Like, oh, what would a Tidewave browser extension look like? Which capabilities do we get from this? And, you know, doing that... When I had to do these kinds of things for other products, like when we were doing that for Livebook, you know, it would take a really long time to validate all those things. And now I can very quickly explore something different, get the lessons learned, and provide a way better blueprint for the team to work on. Do you run it in YOLO mode, or whatever the equivalent is, where it's just doing whatever it wants to do, and you come back every once in a while? Yeah. Yeah, totally. So do you have it... have you considered a notifier, a text-me-when-you're-finished kind of a thing? Otherwise you've got to keep coming back. Are you done yet? No, it's not finished. Oh, it's been done for two and a half hours, but I was watching TV. Yeah, so in this case, because it's the weekend, I don't care, in the sense that I don't want to be interrupted either, right? It's not my priority. Gotcha. Yeah. So when you feel like it, you go over and check it. Yeah. Otherwise, I'm using the notifications, right? I use Zed a lot, and Tidewave, and they all have notifications. And then I'm kind of listening; I'm waiting for them. Oh, they don't push to your phone or anything?
I don't think Tidewave does. I don't think Zed does, no. Because then you don't have to wait and listen. You can be out on a walk or whatever and be like, oh, it's done. Yeah, but if I'm on a walk, am I going to give it its next task, you know? Yeah, right. So it's funny, because I talked to Chris McCord about this, right? And it's like, oh, maybe I'm out to get a coffee. And then I'm like, look, if I'm out to get a coffee, I'm out to get a coffee, you know? If it's not done, I don't care. I'm out, you know? Right. Yeah, same. But maybe you lose three hours of productivity, man. I mean, all you've got to do is tell it to keep going. And, you know, it's tradeoffs. I get it. It's the weekend; I like to unplug as well. But I don't do any of the stuff that you're talking about. I don't have anything coding for me over the weekend. So if I did, I'd at least want to, you know, be a good babysitter. Not a neglectful babysitter. But to each their own, I guess. So you're talking to Chris... it sounds like you and Chris... are you guys competitors now? I mean, doesn't Chris have phoenix.new, and isn't this like, there can be only one, José? Right? So...
Yeah, and that's why he's not coming on the show anymore. No, I'm kidding. So, no, we do talk a lot about those things, and we are still bouncing many ideas off each other. The way I think about this is that there's a very easy way to separate those things, which is that phoenix.new is remote. And... maybe we should go deeper into Tidewave, because there is a bunch of additional context here. As I was saying, Tidewave is a coding agent for full-stack web applications, but the thing is that it runs on your machine. So one of my ideas is that we're looking at, like, bolt.new, lovable.dev, v0.dev, right? And they have all those things where you can click around and ask it to make changes. But they want to kind of own your code. They want to be responsible for your code, and most of the time it's for front ends, or for React apps. And I'm like, I want that for my Phoenix app that I run on my machine. Right. So a lot of people are pushing, oh, AI and those app builders that run in the cloud, right? With Tidewave, you know, you are accessing localhost. So the way you install it is that you add the Tidewave package. Today it's for Phoenix or Rails, and in the future for Next.js, Django... You just install the package. After you install the package, you go to your application, localhost, whatever, 4000, and you go to /tidewave. And then the agent is running there, in the browser, with your web app running on the side. And now you can do all those things. You can go to the inspector, click it and say, hey, on top of this element, I want to add a chart of the most-listened podcasts in the last month, right? So you can be very UI-driven, and everything is running locally. So for the people who are fine with, look, I want to have phoenix.new be responsible for my code, for my deployment, I don't care about that, and I want that thing to do everything for me... then go use phoenix.new. I still think it also owns the getting-started experience. It is the best way of getting started with a Phoenix app, right? Just go put things in the prompt. It's going to build something for you that you can throw out. And for me, I'm like, look, okay, I have my own thing. I already have my own infrastructure, my own development cycle, and I want to incorporate all those tools into what I do every day. For me, it's like, you know, when there was that trend where everybody was saying, oh, you're all going to be developing on remote machines? And then there were those developer containers, and that never really happened, right? I remember we did the show, didn't we, Adam? We did the show with whoever it was... like GitHub Codespaces. Cloud development environments, essentially. Yeah. Yes, yes. Dev containers, yeah. And I mean, yeah, I know some people use it. People do do it, and you can use it, but it's not like everybody... because people said that everybody would use it, right? Like, why would you have a local machine? Right. So I see it the same way. I want those tools for my framework, running on my machine. Okay.
I am with you. Actually, when I heard Tidewave runs in the browser, I was like, another browser thing, José? They're all running in the browser. But actually, it's different than that, right? It's in the browser because that's where the output of your web app goes, but it's in your local browser, running against your local web server, with your local environment, helping you build cool stuff right there. Which is kind of how I develop now anyways. Whereas phoenix.new was making me go into the browser and have a remote browser session, which I always get excited about for the hour that we do the show, and then when I go back to my real life, I just don't want to do that. I want to be on my local machine. I always have. Maybe I always will. I'm getting old, so I'm getting stuck in my ways. So that makes me like Tidewave a little bit more than when I first thought, because one of my questions for you was going to be, why the browser, you know? But it's because I didn't understand. Yeah. And the thing is, we actually went through many possible designs over the last months. So let's talk a little bit about the browser design. We already had the Playwright MCP for some time. So somebody may be listening to this and say, well, I can use VS Code with Copilot and install the Playwright MCP. Recently, the Chrome DevTools MCP... Chrome, yeah, they released theirs. And I think yesterday, Cursor's browser came out. What's that? It's just controlling Chrome. It's like the Playwright or Puppeteer MCP, just built in, right? And the issue with those tools is that it is a separate browser session. It's not the one that you are developing in. So imagine, for example, that you are working on a project manager, and what you need to do is implement a feature for transferring a project between two organizations.
So in order to implement this feature, you need to create a user, create two organizations, probably make sure that the user is admin on both organizations, create the project, and then you can transfer it. And a lot of the times the MCP is going to get stuck just in this process. Like, a lot of the times the MCP cannot create an account, because creating an account requires sending something to an email that the MCP doesn't have. So now you start writing those back doors for tests. There's a big amount of work, right? And then, because we run in the browser, we are literally running in your browser session. When you're going to develop the feature, you are already logged into your development version. You go to the page already. And when you're going to validate that the feature works, you already have all that setup. Because it's running there in your session, everything that you do for development, the agent can do. And the agent is going to verify things in front of you, not in a separate session. And then you can actually have a back and forth. If you're using the MCP, imagine the agent's like, okay, let me test that it works, and the MCP with the separate browser is running, and then you see a bug right when the thing is testing. How are you going to debug that? Because it's a separate browser. How are you going to click things and say, hey, go around this page, maybe there is a bug here? With Tidewave, it is your browser. I think that's the most important thing. You can stop the testing, go with the Tidewave inspector, say there is a bug here, fix it. And we also go the next step, which is that we integrate with the web framework. When you inspect a DOM element, we know the DOM element and send it to the agent, but we also know where in the template, or which React component, that thing came from. And we send that to the agent, so you don't have to do the manual work of figuring that out and feeding it to the agent. When there is an error page, we detect the error pages of all the web frameworks we support and automatically feed that to the agent. So it's really meant to be: you, the agent, the browser, the web framework, in a shared context. Everybody can see what the others are doing, because otherwise it becomes your responsibility; you are the one getting information from all those places and passing it around.
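(Tidewave's actual mechanism isn't spelled out here, but one way to picture the DOM-to-template mapping: the framework stamps rendered elements with their source location, and the inspector reads it back. A purely illustrative sketch; the `data-source-location` attribute is made up:)

```js
// Hypothetical inspector hook: report the clicked element together with
// a source annotation the framework rendered into the markup.
document.addEventListener("click", (event) => {
  const el = event.target.closest("[data-source-location]");
  if (!el) return;
  // Hand the agent both the DOM node and where it came from in the template.
  console.log({
    tag: el.tagName.toLowerCase(),
    outerHTML: el.outerHTML.slice(0, 200),
    // e.g. "lib/my_app_web/live/podcast_live.html.heex:42" (made-up value)
    template: el.dataset.sourceLocation,
  });
});
```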
Sounds pretty cool, man. That sounds pretty cool. I mean, usually I'm the one passing a lot of that stuff around. The fact that it has that... I mean, that's something I've always wanted, I guess. If you're going back and forth like that, it's better to do it right there, real time. I haven't played with it to know the UX of that, really. Like, when you're fiddling with, let's say, a button, and maybe it's not working properly, what is the back and forth of the experience? Can you speak to it? Can you type to it? What are some of the interfaces you can think of? Right. So I think there are three ways that you interact with it. One is the usual chat prompt, right? With the difference that we know what page you're currently looking at, so you can talk to the page. In the sense that... for example, imagine that you just boot up your dev instance, your database is empty, and you go to a page that lists all the podcasts, like for Changelog, and you can say, oh, this page is empty, add some podcasts. It knows which page it is at, so it can find information from the controller or from the LiveView; it gives it an entry point. So that's the chat: it has the context of the page. The other one is the inspector. It's like the browser inspector: you can click it and then mouse over an element. We show the DOM element; we also show which template, which Phoenix template, it came from. And then you can click it to open your editor, or you can click it to ask the agent to do something. And the other way that we interact with it is when we detect that something goes wrong: we just show a pop-up, like, oh, do you want to fix it? Right. And then you can just click a button and have it fixed for you. So I think, as a human, those are the... I may be missing some, but those are the three. So it's a very classic chat experience with a few things on top, like the inspector and the error pop-up. But I think a big part of where we shine is in giving more tools to the agent, right? The agent can do everything that a coding agent can do, but it can also run JavaScript on the page. And that's how the agent can test what it implements.
So, for example, one of the coolest features that we used Tidewave to implement: if you go to tidewave.ai today, we have videos on the home page. And so I added the YouTube URLs... not YouTube, the URLs for the videos. I added the video tags, right? And then I wanted to make it so that, as you scroll through the page, the videos start to autoplay. So I asked Tidewave to implement this, which it can do. It's a straightforward feature. I can't do it, but I assume it's a straightforward feature. I didn't look at the code, but I'm sure it was pretty easy. Yeah. So it implemented the thing, right? It implemented the thing. And then, in order to test that it worked, it actually reloaded the page. Tidewave wrote JavaScript code to reload the page and scroll to the first video, and then it ran some JavaScript to validate that the first video was playing, but not the other two. Then it automatically scrolled a little bit more, so the second video started playing, and then it ran some JavaScript to make sure that the second video was playing and not the other two, right? And I think that's the important part. Because if the agent doesn't do that, there's a chance it gets it wrong, right? And then, if it gets it wrong, who is paying the price to fix it? It's you, because you are going to be the one who tests it, and then you have to go and tell it. I thought you were going to say your users, because you're going to push it out live. Also, could be. And then your users have to tell you if it's broken or not. Yeah. Yeah. How do you limit it to the viewport? I assume the scrolling is either simulated or it's real... is it only scanning what you see? It just runs JavaScript on the page. So, what's in the view? So it's looking at what you're seeing, essentially. Yes, yes. It's running in your browser. There are a lot of complexities in there, but this part is as straightforward as it can be. It can control the page. Because that's the thing: people are coming up with all those different APIs to have the agent... like, there's an MCP with 30 different commands to control the page. And I'm like, it knows JavaScript. It knows the DOM API. Just have it run things on the DOM. I don't know what the command is, but it knows what the command is to say, hey, scroll a little bit, right? It knows. So we had to intervene very little. One thing it can't do is resize the browser window; I think that's because browsers don't allow you to do it, for security concerns or something like that. So there are some things where we have to intervene and add extra capabilities. But it's just running things on the page.
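(For flavor, the autoplay self-check José describes could be as small as this; a sketch, with the selector, the delay, and an await-capable context, like the DevTools console, as assumptions:)

```js
// Sketch of an agent-written check: scroll to the second video and assert
// that it is the only one playing. Assumes at least three <video> elements.
const videos = [...document.querySelectorAll("video")];
videos[1].scrollIntoView({ block: "center" });
await new Promise((r) => setTimeout(r, 500)); // let scroll-triggered autoplay react

const playing = videos.map((v) => !v.paused && !v.ended);
console.assert(playing[1], "second video should be playing");
console.assert(!playing[0] && !playing[2], "the others should stay paused");
```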
Starting point is 00:47:14 and add, like, extra capabilities. But it's just running things on the page. Well, friends, you don't have to be an AI expert to build something great with it. The reality is AI is here. And for a lot of teams, that brings. uncertainty. And our friends at Miro recently surveyed over 8,000 knowledge workers, and while 76% believe AI can improve their role, most, more than half, still aren't sure when to use it. That is the exact gap that Mero is filling. And I've been using Miro from mapping out episode
Starting point is 00:48:05 ideas to building out an entire new thesis. It's become one of the things I use to build out a creative engine. And now, with Mero AI built in, it's even faster. We've turned brainstorms into structured plans, screenshots into wireframes, and sticky notes, chaos, into clarity, all on the same canvas. Now, you don't have to master prompts or add one more AI tool to your stack. The work you're already doing is the prompt. You can help your teams get great done with Miro. Check out Miro.com and find out how.
Starting point is 00:48:38 That is Miro.com. M-I-R-O dot com. Can it take a, I wouldn't say a fixed width, but a, a desktop design website and implemented. So it looks where you want to look on a desktop. And can you say, make this a progressively, what's it called? Not progressive web enhancement. Responsive. There we go.
Starting point is 00:49:06 Yeah. Can you make this responsive for these six viewports or something? Right. Not yet. I knew it. The risk... I knew it. I got you.
Starting point is 00:49:16 Because the reiss I think that I told you. Because the rest I'm not because it's his fault. Not because it's Jose's fault because I don't think these things can do that anymore. It's hilarious, though. I love it. This pursuit of rightness. Oh, I'll try. I'll actually try it.
Starting point is 00:49:32 Because I've been using, I've been doing a lot of front end lately and I'm not good at it anymore. I'm learning. All the new tools are fancy and they're hard to use. I can't figure clamp out. I mean, I've been using clamp. wrong for weeks now finally starting to get it to work and none of these tools can do it either so I play what I call a LLM Russian roulette so I take the same prompt and I'm usually I'm like hey can you do this thing in SVG or whatever like I'm trying to accomplish stuff that I don't think
Starting point is 00:50:03 is possible and I thought it should be possible it's the modern web you know and so I ask chat GPT I ask Claude I ask I'll even ask Grock if I get too angry. And then I'll ask Gemini. And they all give me different responses that are all wrong. None of them can do it. I want one that just tells me, actually, Jared, that's not a thing that you can do. You know, like, you can't do that with web technology.
Starting point is 00:50:26 They're not going to do that because they want to make me happy. But I know that, like, that kind of stuff, we're not there yet, man. I'm just doing way too much work in the browser as a human right now. Here's how I would try to implement it. Okay. And let me know if that's an approach that you tried, because if that's an approach that you tried, then my solution obviously is not going to work. I'll let you know, trust me. Yeah.
Starting point is 00:50:53 So I was talking about the resize. That's something that we identified recently. So we haven't implemented it. So it doesn't have currently the ability to resize, which means that it cannot validate responsive designs, right? As simple as that. But so I would try doing would be is add the feature to resize and the feature to take a screenshot of the page, which there are some other complications because the browsers don't allow you to do it for security reasons as well. We need to, I know how to solve it. It's just going to take, I'm just explaining why you're not going to have this feature tomorrow.
Starting point is 00:51:30 There's some work. And then have it look at the screenshots and see if you can see things are good or bad. How do you think about this approach? In my experience, their ability to look at screenshots and decipher things is really bad. It's not there. Okay. Like they have vision, but it's not precise enough, you know? And so I haven't tried that specifically.
Starting point is 00:51:55 I also don't think it's going to work. I would love you to try and put me wrong. I would love to be wrong. But in my experience, when you pass a screenshot or you say take a screenshot and then inspect the visual, 9 times out of 10, they're wrong. All of them. I wonder if we could, if we could use accessibility APIs.
Starting point is 00:52:18 Wait, what? This is Jose. The guy is such. I love the exchanges on this. He's such a problem solver that he like can't help himself right now. He's like, let's debug this thing. So you already, you already saw me like getting off track with the AI suggestion.
Starting point is 00:52:33 We saw it live. So this is also a real life nerd sniping. happening right here. Yeah, we're shaving a yak. So, yeah, what we're going to say? Accessibility APIs. If we could use accessibility APIs somehow to measure like size of elements and what is visible, what is not, but I maybe, maybe not.
Starting point is 00:52:55 Right. Yeah, I don't know about that. I just get angry and I just do it myself. Because it goes back to like to what I was saying, the sense that the, the, The way for us to eliminate the AI guessing is adding more verification tools. So if browser had had a way, if the browsers could tell me like, oh, the phones here are too small, these things are clipping. That's why I was thinking about accessibility APIs.
Starting point is 00:53:23 Because if the browser tells me that, then I can get that thing, which is going to be better than a screenshot and send it to the agent. That might actually work. Right. But I don't know. I don't know if this accessibility API exists, right? So that's why, that's why I am. Well, don't ask an outline, though.
Starting point is 00:53:38 I'll tell you that it does exist. That's right. They'll give you the code. Emphatically. I love when I have it, they produce SVG. And I'm trying to get like a tapered border and all this kind of stuff. And they're like, here you go. And then they like tell me all the reasons why it's going to look good.
Starting point is 00:53:51 And I put it in there. And I put it in there. And I'm like, dude, it looks like a bow tie. Like you just drew a bow tie. And it's so far off that I have to laugh because I'm, otherwise I'm just going to cry. And just be like, why am I even wasting my time with you guys? So there's certain things. things where they just have these inadequacies and they're all inadequate at this point in my
Starting point is 00:54:11 experience. I haven't done 4.5 yet. So maybe after this call, I'll go see if Claude can do this. But I don't know. I don't feel like I'm pushing the envelope. I feel like I'm a kind of an just an intrepid person trying to get something done and thinking that you can do things that maybe you just can't even do in the browser right now. But I think being able to develop out a simple I'm not going to say fixed width desktop styled website and say make this responsive
Starting point is 00:54:42 like that should just be a thing don't you think if you build that in the type web Jose I mean people are going to line up with their money I think so I mean because that's the thing like so I don't want to do that work
Starting point is 00:54:56 right so if I hope that I can do it yeah totally that's the thing like I can also do it but it's just slower and tedious and that's what that's what the promise is. We don't have to do this stuff anymore. And I'm not a good at it anymore. It's just, it's guess and check.
Starting point is 00:55:12 I got a guess and check. Oh, it's still too big. Now it's too little. All right, I'm done complaining. Adam, take us somewhere else. I'm just kidding. I'm airing my grievances. Well, you know, one thing I was going to go back to was,
Starting point is 00:55:23 Josie, I think one thing you were mentioning was how when you scroll, uh, tiewave. I, as you see these movies come in. I'm actually, you know, back, I think maybe 15 minutes, potentially, but you were describing this page here. And now that I've actually caught up and I'm scrolled it, maybe that's where we can go. It's like, what was
Starting point is 00:55:42 the aha moment here when you did this? Because you said you were kind of going back and forth. Did you not do any of this design yourself? Did you just sort of prompted? What was the experience like for getting this page to be like this? Yeah, no. For this page in, so for this page in particular, we were just doing the design of the
Starting point is 00:55:58 page and then we knew we wanted to add all the scrolling. And then we just asked it to do it. and then it did it, right? And I think what was surprising about that is because, I mean, it's obvious, but that's exactly how. The autoplay of the videos was key, right? AutoPlay video, but it wasn't the autoplay.
Starting point is 00:56:18 It was how it tested itself to know that it got the autoplay right. Yes. And that's exactly how we would test it. I mean, it's obvious that the way you test the auto play scrolling is by scrolling. If you scroll and you watch it, autoplay, and you make sure the other ones aren't, but it's just running JavaScript. But it's really nice to see. see it happening by itself, right?
Starting point is 00:56:37 And then it goes back to, to other stuff. Like, TideWave has access to everything. Another way that I like to phrase this is we imagine like you're working with somebody and somebody sends a request and then you open up like the work they did in the browser and then they're like, wait, this looks bad. And then you go back to the person like, did you look at it in the browser? Did you try it out? And then like, the person says like, no.
Starting point is 00:57:03 And then we'll be like, what? Like, you have to test things in the browser, right? And, like, I use the REPL all the time as well, right? It helps me develop a lot. But we are asking coding agents to develop without a proper browser, without a REPL. So Tidewave gives all those things as well, right? So, oh yeah, I was talking about the... you asked about the user tools, and I started talking about the agentic tools.
Starting point is 00:57:30 So one is coordinating the browser, but the other one is that we also give access to a REPL running inside the web application, because we use the REPL for development, so why are we not giving one to the agent? I would be a worse developer if I didn't have a REPL, right? And then we have MCPs for, like, oh, you can install an MCP to talk to Postgres. But I'm like, my web application already knows how to talk to the database. It already has all the credentials in there. Why are you asking me to configure a separate thing?
Starting point is 00:58:03 So a lot of the times it builds a feature, and then it tests the feature in the browser, and then it does a database query to make sure that the change also happened in the database. So that's kind of, yeah, we're going back 15 minutes, but that's closing the loop of, like, what are the tools that the agent has.
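As a hypothetical example of that database-side check, using Ecto, the data layer a Phoenix app would already have configured. The repo module and the `users` table are stand-ins, not anything Tidewave ships:

```elixir
defmodule ChangeCheck do
  import Ecto.Query

  # After the agent edits a user's email through the UI, confirm the row
  # actually changed, using the connection the app already has.
  def email_updated?(repo, user_id, expected_email) do
    query =
      from u in "users",
        where: u.id == ^user_id and u.email == ^expected_email,
        select: u.id

    repo.exists?(query)
  end
end

# e.g. ChangeCheck.email_updated?(MyApp.Repo, 42, "new@example.com")
```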
Starting point is 00:58:23 And the whole purpose is to make sure they are producing something that is really good, and I'm not going to waste my time telling it obvious things, like, oh, the video actually doesn't play. Oh, the change was not actually saved to the database. One thing you mentioned there was MCP servers. Have you, Jared, messed with MCP servers at all? I really haven't personally. Mostly just Figma's. Yeah. And I told you my experience of that, which was not, right. It's nobody's fault, except the state of the art is not quite where it's needed, but.
Starting point is 00:58:56 José, I imagine you're probably playing with them heavily. How exactly does that fit into your flow? Because from what I understand, it just adds more tooling to the context window, which is already kind of small. And so we're always battling that, you know, that auto-compression, or just having to refresh the entire chat whenever, I suppose. How do you work those kinds of tools into your workflows without, I guess, loading up the context? I actually have a hot take here. It's not an exclusive, it's not a unique hot take. But so to answer your question, which is going to kind of reveal the hot take... He's just teasing it. He was not going to reveal it. Is he just teasing
Starting point is 00:59:40 the hot take? Come set it up, José, and give us the hot take. So, like, almost all of our API is: write code. You can execute code in the context of the web application, you can execute code in the context of the web page. That's it. Because we are doing all this dance. Like, oh, I said about the database, right? Like, oh, I'm going to have an MCP for the database. No, my web application already knows how to talk to the database. Just use that. Oh, I want to have an MCP to talk to GitHub. And I'm like, well, I'm already logged in on GitHub in the browser. I already have the GitHub command line. Use that, right? Like, for coding agents, we are even going as far as adding
Starting point is 01:00:28 MCPs for documentation, and then I'm like, why am I going to a separate website to get documentation? You are a coding agent. The code is on your machine, and usually with the code you have documentation. Why don't you use the documentation that is already there on your machine, with the exact version that you are using? Because sometimes you go to the remote server, and then you get the documentation for Phoenix 1.8, but we are still on 1.7.
Starting point is 01:00:43 Right. So for me, the answer for, oh, there are too many, the context thing, is: I'm going to have just a small number of tools. And what those tools are going to do is that they can run code, right? And I'll let them do whatever they need to do. So trying to keep the set of tools minimal and powerful. And, like, this take, you know, that, oh, MCPs are too much, you probably just need code, right?
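A minimal sketch of that "few tools, all of them run code" shape, in Elixir. The tool name and the definition format are made up for illustration; `Code.eval_string/1` and `Code.fetch_docs/1` are real standard-library calls:

```elixir
defmodule EvalTool do
  # One generic tool instead of N special-purpose MCPs. The map shape here
  # is an illustrative tool definition, not any particular agent's schema.
  def definition do
    %{
      name: "project_eval",
      description: "Evaluate Elixir code in the context of the running app."
    }
  end

  # DANGER: evaluating arbitrary code is YOLO mode by construction;
  # only reasonable in a dev environment you trust the agent with.
  def call(code) when is_binary(code) do
    {result, _bindings} = Code.eval_string(code)
    {:ok, inspect(result)}
  rescue
    e -> {:error, Exception.message(e)}
  end
end
```

The same single tool covers the database case (say, a hypothetical `MyApp.Repo.aggregate("users", :count)`) and the local-docs case: `Code.fetch_docs(Enum)` reads the docs compiled into the exact version on disk, no remote lookup required.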
Starting point is 01:01:25 I'm not the first one to say it, but I also think, like, MCPs, the user experience, the developer experience around MCPs for coding agents is really poor. Like, I mean, to be fair, it's new. It's still evolving, right? It's probably six months old at this point. But we have issues where, so one of the MCPs that we're using, it was working for GPT-5, but not for Gemini, and then we fixed it for Gemini, and it broke for GPT-5.
Starting point is 01:02:00 Like, if the server disconnects, they cannot reconnect again. Like, there are all those sorts of annoying issues there. And then, do you know about the Figma dev mode thing? Mm-hmm. Mm-hmm. So there is an MCP in Figma dev mode. So you can run Figma on your machine. Like, there's a desktop client.
Starting point is 01:02:22 And then I can go to Figma, inspect an element, click, like, a component that I wanted to implement, right? And you know what the workflow is today? I have to go to Figma, I click on the component, and then I have to go to the agent and say, I have selected a component, please implement it. Like, when it's done... When it's done, you have to redo it yourself. Yeah, but that's my experience. Oh, good job. Not good. Yeah, not good. It's like, I already clicked the thing. Why do I have to go back and tell you that? And then they're separate tools, right? They're distinctly different tools. You're meant to have a protocol for those
Starting point is 01:03:07 things to communicate. I know you know the answer. I'm just saying it out loud. Whereas with Tidewave, it's all integrated. It's all integrated. And I actually want to hold off as much as possible on actually adding MCP support to Tidewave, because I think we will have better integrations if we do it by hand. So for example, when I do Figma for Tidewave, when you click on the Figma thing, we will know, and we'll just tell it, oh, you want to implement this? Oh yeah, just click the button, right? Like, you don't have to type anything.
Starting point is 01:03:44 Just click it. Now, is it going to do it right? Well, that's up to your model, right? Yeah, maybe we can give more... And Tidewave is just using whatever model you bring to it, basically. Yes. What we can do is that we can help it. Like, you'll be able to click something on Figma and then click something on Tidewave.
Starting point is 01:04:03 And then we'll be able to say, oh, you should implement this, and this is exactly where it is. So we can improve the experience there, right? But when we send all the information to the agent, that's going to be ultimately better. And the agent will be able to validate that some things look good, like it did with the video autoplay scrolling. So we are giving it more tools to verify that it did a better job than if it was just working blind.
Starting point is 01:04:29 But ultimately, yeah, right? And I think that's going to be true for a lot of things. So there is a tool called Conductor that, for example, added GitHub integration. And one of the things they do is that they know which Git branch you're using. So in their GitHub integration, they know the comments dropped in a PR for that branch,
Starting point is 01:04:56 and they automatically surface that in the UI. So you can ask the agent to solve a comment as somebody's commenting on GitHub. So those sorts of experiences, doing it through MCP, it's just, oh, you know, it's like, oh, get all the comments for me, right? So I really think that, for coding agents, I don't want to generalize this too much, but for a lot of the things... Like, MCPs don't allow you to push information, right? So that's what I'm complaining about. Like, GitHub should be able to push information through the MCP.
Starting point is 01:05:35 Like, oh, there's this comment. Oh, I clicked this on Figma. It doesn't support that. And we're not even talking about the security issues. So I feel like, yes. Yeah, like, I want to give you a good package with everything. At the point you're telling users, like, oh, just go and install those different MCPs, you kind of gave up on the developer experience, right?
Starting point is 01:05:59 Because it's not there yet. I would tend to agree, I think, with that. How hot was that? Was it way too much teasing for not too spicy, or? It's a lot of spice in there. A lot of spice. It's a variety of spices. There was some hedging around,
Starting point is 01:06:15 you know, I just feel like you could have dropped it a little hotter and then... Yeah. I could, yeah. Yeah. Kanye would have gone ghost, you know, gone ghost. And also, I just tend to agree. I think MCP servers seem to be, like, a builder-driven technology right now versus user-driven. Like, it was so quickly adopted by all the builders, and as users we're kind of like, were we asking for this, necessarily? And could you do it so that it was made, I would say, a bit more transparent perhaps,
Starting point is 01:06:41 or, like, maybe just user-friendly for us as end users. But man, I've never seen an API or specification or a protocol get built out across the entire tech industry so fast. And we're talking like 30 days. Well, I mean, from their first announcement
Starting point is 01:07:06 back in November, I believe, was when MCP was announced by the Anthropic team, like, less than a year ago. And nobody paid much attention to it for three months, and then all of a sudden in the spring, it's like everybody just started building MCP servers. Like, everybody. True, yeah. And then as end users we're kind of like, did we ask for this? Or, I don't know. You want to be first, I don't know, you don't want to be left out. I'm not sure why everybody just thought immediately, we've got to do this, but it was pretty interesting to behold. You know, I don't have a lot of context around MCP servers,
Starting point is 01:07:44 but I don't really use any of them. But when I think of them, it's more like a CLI tool that's on the system already, rather than, like, you know, pollute my context window with a tool that's an MCP server. Why not just have a tool on the system that you can use? It doesn't have to be an MCP. Like, instead of using the GitHub MCP server, you might use the GitHub CLI to access data from GitHub. Is that what you're saying? Right. Like, if I already have gh installed and it's already authenticated, why not just have it
Starting point is 01:08:14 just use the tool, versus some sort of MCP server that is just, like, in my context. Like, why does that have to be in my session and configured? It's also, uh, instrumentation of tooling. It's a lot of ceremony, you know, it's a lot. But even, you say, like, look, well, what if you don't have the GitHub command line tool? Have it write it, right? Like, it can write Elixir, it can write JavaScript, it can write Python to talk to the API.
Starting point is 01:08:47 Then, of course, I think the authentication part of the MCP is interesting, because if it had to write a tool, it would have to ask for your credentials somehow. So that part is good. But it feels like, in certain ways, that's probably all we needed: a way for the agents to ask, which we have all of and other things, and a way for the agents to ask for your permission
Starting point is 01:09:04 to talk to some API on your behalf, right? Because if it can write code, it could also do things like, oh, it can ask for information. It can get the raw data from GitHub,
Starting point is 01:09:22 then use whatever library to compute the information that you want, right? And give you a better result than trying to do it with the MCP, getting plain text, and then maybe doing something interesting with that, right?
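A sketch of that alternative in Elixir: shell out to the already-authenticated gh CLI, pull the raw JSON, and post-process it with a real language instead of passing plain text around. `gh api` is a real subcommand; the aggregation below is just an example of the "compute something with a library" point.

```elixir
defmodule GhComments do
  # Review comments per file on a PR, noisiest file first. Crashes on a
  # non-zero gh exit for brevity; a real tool would handle that.
  def comments_per_file(owner, repo, pr_number) do
    {json, 0} =
      System.cmd("gh", ["api", "repos/#{owner}/#{repo}/pulls/#{pr_number}/comments"])

    json
    |> Jason.decode!()
    |> Enum.frequencies_by(& &1["path"])
    |> Enum.sort_by(fn {_path, count} -> -count end)
  end
end
```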
Starting point is 01:09:39 Like, we have those things that are really good at coding, and we are sometimes dumbing things down to a text interface while they could write code. I like to think... one of the questions that I ask myself: people are talking a lot about, like, personal devices, right? We're going to have our personal devices that are AI-augmented and this kind of thing. And I say, like, that thing needs to know how to run code, right? Because how can you have some generic personal assistant that can do everything, and that thing cannot run code? Any assistant of mine's got to be
Starting point is 01:10:14 able to run code. That's right. You're telling me. Get out of here. It's like the first thing on the resume: can this person run code? Well, yeah, I tend to agree. I think MCP is an interesting phenomenon, and the most widely adopted-by-builders technology that I can think of in history. So there it is. It's there now.
Starting point is 01:10:47 But not necessarily... It didn't necessarily have to be there. And there you have it. Spicy. Spicy, José. What else? I mean, Tidewave, you're trying to make a business out of this thing? You're trying to make a living? What are you trying to do? Yes. So it is a paid product. We considered it a little bit, but then I realized, well, this is an AI thing.
Starting point is 01:11:04 It's a very rapidly changing landscape. So if we want to be able to keep up and feel invested in this, and continue improving it, and also support different kinds of frameworks, we need to find a sustainable way of doing that. So yeah, we'll see. The launch was pretty good. We got a lot of people excited, but it also pointed out, like, so today it's bring your own API key, and the feature that people ask for the most is Claude Code support. So being able to bring, like, Codex and Claude Code. And yeah, let's see. So right now... I think in the email I sent you folks, my product history,
Starting point is 01:11:58 my history of building products has been, like, cataloged by The Changelog. Yeah, pretty much. We're doing our job there, like the changelog of José's products, you know? Yes. Yes. It really is. Yeah, so there's Livebook, which is also running, right? And now Tidewave.
Starting point is 01:12:20 And what about Elixir, man? Is it done? Are you done with it, or still working on it? Still working on it. It still changes. So right now there's a good amount of Tidewave things happening; it's fresh. I think we are about five weeks since we launched. So we just launched it. And, you know, when you launch something like that, there's a lot of work, feedback, and prioritizing. I think it's
Starting point is 01:12:47 kind of, like, about half of my time on Elixir, half on Tidewave. But otherwise, most of my work is still going into the Elixir type system and Elixir work. The other thing that I want to do... Tidewave is an example, but going outside of Tidewave, one of the things that makes me excited about AI is that we can look at the tools and find ways to improve and build new developer tools. And I've been exploring some ideas around those areas. Like, well, so I was saying, like, tests. The tests that the coding agents write, right, I usually don't like them. They are redundant. And I think a lot of people don't pay attention to... or they use too many mocks.
Starting point is 01:13:38 A lot of people don't pay attention to code quality in tests. Right, a test is a test, right? So I'm trying to figure out ways of improving that. So, for example, when the agent is writing tests, can we measure coverage and guide the agent to write tests based on coverage, but also give information about, oh, those tests, they are redundant, they are pretty much checking the same lines of coverage, you can try to remove them.
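One way that redundancy check could work, as a sketch: treat each test's coverage as a set of covered lines and flag pairs whose overlap is nearly total. How the per-test coverage gets collected is left abstract here; the map-of-MapSets input shape is an assumption, not any existing tool's output:

```elixir
defmodule RedundantTests do
  # `coverage` maps test name => MapSet of {file, line} tuples it covered.
  # Returns pairs of tests whose overlap is at or above `threshold`.
  def redundant_pairs(coverage, threshold \\ 0.95) do
    tests = Map.to_list(coverage)

    for {{name_a, lines_a}, i} <- Enum.with_index(tests),
        {name_b, lines_b} <- Enum.drop(tests, i + 1),
        jaccard(lines_a, lines_b) >= threshold do
      {name_a, name_b}
    end
  end

  defp jaccard(a, b) do
    inter = a |> MapSet.intersection(b) |> MapSet.size()
    union = a |> MapSet.union(b) |> MapSet.size()
    if union == 0, do: 0.0, else: inter / union
  end
end
```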
Starting point is 01:14:04 And the cool thing is that we are thinking about those things because we want to automate the agent, but a lot of it translates to better developer tools, right? We release this for the agent, but developers can also use it. So I think a lot of the work that we are doing right now will feed back into better tools. Even when I'm working on Tidewave, a good amount of the work will eventually feed back into better tools for Elixir and the community too. Right on, man. Well, keep fighting the good fight. Always love talking to you, and always love hearing what you are working on. I am going to
Starting point is 01:14:41 give Tidewave a ride in earnest. I have a Rails app now, and, you know, we have an Elixir Phoenix app, so I can use it in both contexts, and I'll let you know what I think, give you some feedback. Let me know. Yeah, and right now it's either bring your own key, or you can use your GitHub Copilot integration, and then hopefully in about a month... What are the tools that you use today? So I use Claude Code. I have ChatGPT Pro, but I don't actually use Codex. I'm not sure if I get Codex with Pro, um,
Starting point is 01:15:28 I have Gemini CLI. Okay. I don't know. It's very confusing. Like, when you get Claude Code, do you also get tokens for the API? I don't think so, right? So you buy those separately. I'd rather not buy more of those. I'd rather... Is this why people want to bring their Claude Code subscription?
Starting point is 01:15:42 Exactly. Yeah, I would love to do that and not have another toll bridge, toll road, as Adam calls them. Toll booth. Toll booths. So that's my current setup. Adam, what about you? You got some Amp subscription maybe? About five bucks left in Amp.
Starting point is 01:15:55 it's uh I still can't hold it right it's always expensive for me it's really great though it's it's so cool how it works it's really one of the best but I haven't found a way to hold it in a way that isn't expensive and oh okay so cloud code primarily uh same I have an anthropic key but
Starting point is 01:16:14 only because of I think one thing had to have it and I think I got like the the trial balance they give you I'm still on that so right something past that But cloud code Augment code I like as well They're cool Amp I still like Amp
Starting point is 01:16:31 It's just, I haven't found a way to make it not expensive for me. I just... I don't know. But it is really, really, really good when it does its thing. Is there anything in particular that you like about it? It seems to be just...
Starting point is 01:16:43 It's got this Oracle So speaking to AMP code It's got an Oracle where it can go back And console it's kind of like Ultra Think now that I think about it Jared It's not quite that But it's a bit more where I'll go into a deep understanding of coding patterns
Starting point is 01:16:57 and like a learned behavior across, you know, let's just say like a Rust CLI ecosystem. Like, how do those work generally? What are good patterns? And it will come back and tell you stuff like that. So I find that it's research and its ability to execute in a, you know, hands-free, yellow environment is just, it's really good at that.
Starting point is 01:17:18 But it gets expensive if you don't, like, work with it and babysit it. Do you prompt for the Oracle, or does it automatically figure it out, like the plan mode in, like, Claude Code? Or does it automatically figure out, oh, now is the time for, like, some Oracle? Yeah. You know, I think it does it on its own desire, but you can also say, hey, in this exercise, go ahead and prompt the Oracle as well.
Starting point is 01:17:59 Tap them, get them involved. I don't know. It feels cool. It does it. And good results come, I guess. But you can either prompt it yourself, or it just kind of does it when it needs to. I am not an Amp expert by any means, but that's how I experienced it. Yeah.
Starting point is 01:18:15 So, wrapping it up, yeah, right now it's bring your own API key. And you cannot... Yeah, with Claude Code, your OpenAI subscription or Claude subscription does not give you an API token. And we cannot actually... because, you know, Claude Code is just using a Claude API, right? But using that API is actually not legal according to Anthropic's terms. So we decided to not do that. That's why we're working on the whole Claude Code, Codex integration kind of thing. And so, either bring your own key, but I really, at the end here,
Starting point is 01:18:57 I really would recommend giving the GitHub Copilot a try. It's confusing, because Microsoft calls everything Copilot, right? But there is a GitHub Copilot plan that gives you access to a bunch of different models. And it's actually a predictable plan, in the sense that... because the thing with paying for tokens is that it's very hard for you to predict how much it's going to be. And the GitHub Copilot subscription is per message, which at least improves the visibility a little bit. And it has a basic plan that's quite affordable. So that's
Starting point is 01:19:40 a good way to try it out for now, to get some feedback. And yeah, we are hopefully launching Claude Code... Zed released something called ACP. I don't know if you saw that news, the Agent Client Protocol. So you can talk to Codex, Claude Code, Gemini CLI, right? So we are building on top of that.
Starting point is 01:20:05 But it's a bunch of work, because you're running in the browser, right? And ACP is an IO protocol. So you can figure out all the hoops that we have to jump through to make those things talk to each other. But yeah, hopefully we'll be launching soon, alongside Django, Next.js, and so on. I wish Anthropic would just give you, when you get some sort of subscription, a token or a key that you can use against that subscription at the same pace that you use Claude Code, right? Like, I guess they're just subsidizing that
Starting point is 01:20:43 to death, and don't want to subsidize their API. Because I can pay 20 bucks a month, or whatever it is, and use the dog doo out of Claude Code, but I've got to pay 200 bucks or 500 bucks equivalent to use the API the same amount. I just made those numbers up, but you can see the discrepancy is there. Doesn't make sense to me. I guess they just want you using their CLI a lot. It's not even, I would say, about how much the cost is. It's the predictability, like, how predictable it is, right? Because you don't want to get dinged for making a bad prompt, you know? Like, I'm fine with 20 bucks, 40 bucks, 50 bucks a month,
Starting point is 01:21:25 but just because I use it a lot, don't give me a... I tell it to ultrathink, and it's like, well, that's $17 for that ultrathink, and it was still wrong. Like, can I get my money back, too? Are there returns? I know, right? Service degradation is a real problem for me, you know? But here's the thing.
Starting point is 01:21:45 I actually think pushing people towards using Claude Code more, or Codex, and building on top of those tools, like with ACP, is not actually a bad idea. Okay, what is that? Because, okay, let me tell you a story, a quick story. I know where you're going. So when we first implemented Tidewave, we focused on Anthropic and the Claude models.
Starting point is 01:22:11 And, like, if you go to the Claude Code prompt, it has things like, you should be concise, you know, don't use too many words, use few words. I think it even said at some point, a one-word answer is best. And of course, it doesn't listen to that, right? It finishes the feature, and it just dumps, like, four pages of text about the thing that it implemented that nobody ever reads, right?
Starting point is 01:22:37 So when we did our prompt, we tested with those things as well. It does improve a little bit, right? But it also says things like, don't write a code comment. And it always writes a code comment. So anyway, we wrote the prompt. And then when GPT-5 came out, we decided to give it a try and start supporting OpenAI. And it was very curious, because we would say, like, hey, implement this feature. It would do all those things.
Starting point is 01:23:01 And then at the end: done. And then we would ask something, and it would say: good. And then we realized that the prompt we had for Anthropic that was saying be concise, GPT-5 was actually listening to that prompt, and it was being concise. That's why it was just saying done, good. It was not doing any fluff or anything. And that's when you realize
Starting point is 01:23:23 that, if you're building a coding agent like I am, you actually have to build a prompt per model. And now GPT-5 Codex came out with its own prompt that's different from the GPT-5 one. And then, not only... so just doing that, fine-tuning the prompt per model, that's a pain. That's boring work. That's not something I want to do, right?
Starting point is 01:23:52 And then there's the other thing, which is the tools. At this point, those coding models, they are becoming so important for those companies that they're actually fine-tuning, like, how the model should send edits to a file. You know, they have fine-tuned the models for that. So when the GPT-5 Codex model came out, they also said, look, this model is best at sending this kind of diffs and edits over the wire. So now I have to implement the specific editing tools per model that I support. And then each of those models comes with their own context engineering techniques. So at that point, you know, if you're like me, you're building a coding agent, you want to be able to get that infrastructure and build on top, right?
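What that per-model burden looks like for an agent author, sketched as a config map. The prompt fragments and edit-tool atoms are invented for illustration; only the general shape, one profile per model, is the point:

```elixir
defmodule ModelProfile do
  # Per-model system prompt and preferred edit mechanism. Everything in
  # these profiles is a made-up example, not vendor documentation.
  @profiles %{
    "claude-sonnet-4-5" => %{
      system_prompt: "Be concise. Verify changes in the browser before reporting done.",
      # old/new string replacement edits (assumed preference)
      edit_tool: :string_replace
    },
    "gpt-5-codex" => %{
      system_prompt: "Summarize what you changed in one short paragraph.",
      # diff-style edits over the wire (assumed preference)
      edit_tool: :unified_diff
    }
  }

  def for_model(model), do: Map.fetch!(@profiles, model)
end
```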
Starting point is 01:24:49 And then it comes with the nice thing. So, going back to the hot take: if you are building your agentic tooling for coding, instead of doing the MCP, don't do the MCP. Build on top of ACP, and have control of the agent, and use all those things and extend that instead. And with the announcement of Claude Sonnet 4.5 today, yesterday, they actually recognized that. They renamed the Claude Code SDK to the Claude Agent SDK. They moved a couple of things around for it to be a better SDK for people to build on top of, right? Because I think there is a lot to gain from leveraging everything. Like, they are tightening, right,
Starting point is 01:25:37 the model for those tools, and we want to be able to leverage that. That makes sense. They're putting a lot of work in to take that model and make it an agent, and there's no reason why everybody else needs to do that work as well. How would you go about building on that right now? Like, where would you go? What's the starting point to building that right now? I don't know. What's the website? Oh, I don't know, zed.dev/acp or something? No, I would just search for Agent Client Protocol on Google and see.
Starting point is 01:26:12 Agentclientprotocol.com. All together, right? All spelled out, yep. All right. Of course, when you Google that, it will be your first hit. And I believe there is an SDK in TypeScript. I don't know other languages right now, but the... That's the only language that you know. There's a hot take. You heard it here first.
Starting point is 01:26:35 José only knows TypeScript. That's when you know it's getting late. I just... I just auto-completed you. So, yeah, but I can see it becoming more and more important, and we are going to see more SDKs. But the protocol is also relatively straightforward as well.
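For a feel of how straightforward, here is a sketch of the client side in Elixir: spawn the agent CLI as a subprocess and exchange JSON-RPC over its stdio. The executable name, the newline framing, and the `initialize` params are assumptions for illustration, not a spec dump:

```elixir
defmodule AcpClient do
  # Spawn an ACP-speaking agent (binary name assumed) and say hello.
  def start(agent_cmd \\ "claude-code-acp") do
    exe = System.find_executable(agent_cmd) || raise "#{agent_cmd} not found"
    port = Port.open({:spawn_executable, exe}, [:binary, :exit_status])

    send_msg(port, %{
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: %{protocolVersion: 1}
    })

    port
  end

  # Assuming newline-delimited JSON-RPC over stdin/stdout.
  defp send_msg(port, msg) do
    Port.command(port, Jason.encode!(msg) <> "\n")
  end
end
```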
Starting point is 01:27:05 because a lot of the I think Gemini CLI supported built in the CLI but when Zad released support for Cloud Code for example is because they have a rapper so I hope it grows to the point where
Starting point is 01:27:22 more like the CLIs they are coming with building support for it right and then I hope it grows to the point that like is it who which of the
Starting point is 01:27:35 big providers have their C-L-I version right now. It's Gemini, it's Open AI, and it's Anthropic, right? Like, Grox, they don't have theirs, I think. ZAI, they don't have theirs. So I actually hope, like, those other companies, they start providing those CLIs as well. With all those things we have been talking about in the sense, like, look, here are the optimized diffs.
Starting point is 01:28:01 Those are the things we improved for, so we can move to the point where we are all building on top and not reinventing that wheel. So I really hope it grows. More CLIs, give them to me. There you go. Thanks for hanging with us, José. It's always a pleasure, man.
Starting point is 01:28:20 A pleasure, yeah. All right, bye, friends. Ooh, synchronized. All right, that is your Changelog for this week. We hope you enjoyed Monday's news episode about exiting the feed, Vercel versus Cloudflare, and why over-engineering happens.
Starting point is 01:28:37 We hope you enjoyed Wednesday's interview with Evan You, and we hope you enjoyed this episode with José. Because, after all, we're here for your enjoyment. We also want you to learn and to keep up the easy way, and we want you to level up your own work and to feel connected to this worldwide community of hackers, but we'd love for you to do all those things while enjoying the process. Because, after all, the process, that's our life, isn't it?
Starting point is 01:28:59 If you do enjoy our work, please tell a friend, or three, or send us an email. editors at changelog.com. We absolutely love hearing from you all. Thanks once again to our partners at fly.io and to our sponsors of this episode, depot.dev, fabby.a.ai, and mero.com. Thanks also to the one, the only,
Starting point is 01:29:20 the mysterious, breakmaster cylinder. Have yourself a great weekend. Let someone else praise you and not your own mouth. And let's talk again real soon. Game line... Game line!
