How I AI - How to build your own AI developer tools with Claude Code | CJ Hess (Tenex)

Starting point is 00:00:00 Working with Claude is just such a delight. It just feels so steerable. And I think the one thing it really has is intent understanding. When I wanted to dig deep, it just does it. And it's really enabled me to build a little ecosystem of my own tools around it. I think environment setup and developer setup is such an underappreciated use case. One of the things that I know you really care about is effective planning. And you've come up with a way that you do your planning that I think is pretty unique.

Starting point is 00:00:26 So I've played around with this tool to basically give Claude these JSON files, and there's a whole set of skills I've built around this that Claude Code can use to write these out. And then these actually end up generating nice-looking UI mockups. I will say this is a dev tool that was almost 100% prompted. Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today I have CJ Hess of Tenex. And if you've seen him on X, he is building some of the most useful tools and flows for being a quote-unquote real AI engineer. We're going to get a sneak peek at his tool Flowy, which he vibe coded for himself, and he's going to show

Starting point is 00:01:14 us how he uses model-to-model comparison to make sure his code is great. Let's get to it. This episode is brought to you by Orkes, the company behind open source Conductor, the platform powering complex workflows and process orchestration for modern enterprise apps and agentic workflows. Legacy business process automation tools are breaking down. Siloed low-code platforms, outdated process management systems, and disconnected API management tools weren't built for today's event-driven, AI-powered, cloud-native world. Orkes changes that. With Orkes Conductor, you get a modern orchestration layer that scales with high reliability, supports both visual and code-first development, and brings human, AI, and systems together in real time.

Starting point is 00:02:00 It's not just about tasks. It's about orchestrating everything: APIs, microservices, data pipelines, human-in-the-loop actions, and even autonomous agents. So build, test, and debug complex workflows with ease. Add human approvals, automate backend processes, and orchestrate agentic workflows at enterprise scale, all while maintaining enterprise-grade security, compliance, and observability. Whether you're modernizing legacy systems or scaling next-gen AI-driven apps, Orkes helps you go from idea to production fast. Orkes. Orchestrate the future of work.

Starting point is 00:02:40 Learn more and start building at orkes.io. That's O-R-K-E-S dot I-O. CJ, welcome to How I AI. Thanks, Claire. It's good to be here. So I've seen a lot of Claude and AI engineering power users. And I still think you're like a super power user of some of these tools. And it's not just because you're creating real production code with what you're building, which is

Starting point is 00:03:07 really nice to see. And I think you're in a subset of what we're seeing out of folks using these tools: you also build tools for yourself to make the process of AI engineering better. And you share those tools with other people, who then validate that they're actually helpful. So why are you so excited about Claude Code in particular, and what has it changed for you as a, as we were talking about before the show, quote-unquote real software engineer? Like a lot of people, the Opus

Starting point is 00:03:34 4.5 moment was a big one. But I've been on Claude Code since, I don't know, maybe last May. But for me it was really about the harness they have, and like I see a lot of arguments about Codex and Claude Code and I'd

Starting point is 00:03:49 honestly argue GPT-5.2 is a smarter model, but working with Claude is just such a delight. In Claude Code it just feels so steerable, and I think the one thing it really has is intent understanding. Maybe I'm not giving, you know, Opus in Cursor the shot it deserves here, but there's something about Claude Code: when I wanted to dig deep, it just does it. It seems to pick up on my intuition just from the prompts, and it's really enabled me to almost build a little ecosystem of my own tools around it, around Claude Code, particularly with skills now, that just keep making it better and better for me, because it's Claude Code plus this system of skills and tools that I've built around it. So it's really hard for me to get out of it. Yeah. What I love about this moment as a software engineer is, you know, back in the olden days, you sort of had your choice of, what's going to be my IDE, and am I going to use Vim?

Starting point is 00:04:58 And, you know, what are my set of approved tools as an engineer that I can use, what linters are we using as a team? All this kind of stuff that you could do to customize your developer environment. But now you can really take it to the next level. And you could have a totally different AI engineering workflow than your colleague sitting next to you. And it's totally fine, because it's making you individually a lot more efficient and effective, and you're building them yourselves for pretty cheap. So there's not that cost or that hurdle of evaluating new things in your stack. Yeah, it's almost like you now have the brains of Claude to do some dirty work on the dev tooling. Like I think, you know, pre any of kind of the

Starting point is 00:05:43 newer-gen models that just really can handle the agentic loop, I'd, you know, sit with a broken linter, just accepting it and having ignore comments everywhere, and, you know, I just would give up. And now I feel like I can almost trust it to be like, what's wrong with this config? My IDE isn't matching what's in the project. Okay, we have to resolve this. And it's just kind of solving those chore problems that I feel like previously just ended up being forever problems. Yeah. And for the non-engineers watching or listening right now, I think environment setup and developer setup is such an underappreciated use case. Yesterday I onboarded a designer who has literally kind of sat out some of this AI stuff. She's literally not

Starting point is 00:06:30 downloaded anything, used anything. And she's on Cursor, Claude Code, Node's running, Homebrew's installed. And I was like, just ask Claude Code to do it. Say, like, help me understand this repo and get my computer set up to run it. And then I said, just tell it it can accept all tools and let it go. Come back to your laptop later. And it's pretty great. I mean, we're really, really spoiled right now. So let's dive into some of your actual workflows. And one of the things that I know you really care about is effective planning. And you've come up with a way that you do your planning that I think is pretty unique. Yeah. So there's kind of the classic plan. So I'm going to swap over to Cursor here. I have in this just, like, your classic dot plans folder, just

Starting point is 00:07:17 throwing plans in here. And I really love this format. I think a lot of people are kind of converging on this: iterating on Markdown, having one file where you're just working through the plan, reviewing the plan. And by the end of that, you can almost feel confident just letting it write the code. But the one piece that I hated, but that I found really valuable, was these ASCII flowcharts. So if you're just listening, it's all those boxes and arrows that Claude draws. You know, there's always the ones where, well, this one actually looks pretty clean. Yeah, there's always this misalignment of that edge character. I don't know why we haven't figured that out yet.

Starting point is 00:07:58 But for things like UI mockups, things like, you know, flowcharts of how navigation's going to work, how a certain system is going to work, I really like this visual way to think about things, but I really hate staring at these ASCII diagrams. Even things like Mermaid just didn't feel like exactly what I was going for. So I've played around with this tool to basically give Claude these JSON files, and there's a whole set of skills I've built around this that Claude Code can use to write these out, and then these actually end up generating nice-looking UI mockups, not in super high fidelity or detail, but I can kind of guide it

Starting point is 00:08:46 in the direction I need. And up here, this white text might be a little hard to see. But basically, this is a flowchart of this tool, Flowy, and how it works. So for the listeners, what I love about this is Flowy is a tool that you built. You're saying, like, oh, I was playing with this tool. It's like, no, you built this tool for yourself. This was my first experiment with a Ralph loop. I'm still not certain how confident I am in them, because I had to do a little bit of cleanup, but overall I will say this is kind of a dev tool that was almost 100% prompted.

Starting point is 00:09:26 Yeah. And so what you said is, you know, I love plans. I love the idea. And I just have to take a minute again. I'm the oldest lady on the internet. So way back in the day, two decades ago, when I was first doing product management and web design, we did so many flowcharts, so many user journey charts, and then so many wireframes and so many low-fidelity mocks, then high

Starting point is 00:09:49 fidelity mocks. And what I love about what you're building is you're building the AI-native version of that. That piece has not gone away for anybody. It hasn't gone away that you say, when you click this, it goes to this, and these are the steps and these are the branches and all that. And it hasn't gone away that you have to look at designs and say, yeah, this is kind of what I want. But now you can have AI create them. And at first you had AI create them in Markdown, very, very low fidelity. And I have to take a side journey: you know, a year ago, I was extremely delighted that it was making ASCII mockups. And now it's just not good enough. Yeah. That's the shifting expectations on these models. Yeah, exactly. And so you've taken these

Starting point is 00:10:30 Markdown mockups that were useful, and you said, now make them really useful by building this sub-application that can render them for you. And it's a combination, it seems like, of workflow diagrams and step-by-step mockups. Yeah, so basically what I wanted was a JSON file it can render, one that can have nodes and edges like any flowchart, and then roughly be able to stack them, change the colors, and get us something that looks like this, where we have a couple of different screens, and we have these somewhere between a wireframe and a true mockup that just,

Starting point is 00:11:10 this can help me point the model in the right direction. The other big thing for me was iterating on this. I'm not going to go into that Markdown file and try to write new shapes and combine them. So this is also an editor. And as you edit it, all these changes save to that JSON file. So you can then point Claude back at it and say, hey, I know you did this, but actually, let's say I want a step here, and I'm going to bring this up and add some edges. And then you can be designing in here, almost like you're in Figma or Excalidraw or something.
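Editor's sketch: to make the nodes-and-edges idea and the edit round trip concrete, here is a minimal Python example. The field names (`id`, `label`, `x`, `y`, `from`, `to`), the file name, and both helper functions are assumptions made up for illustration; the episode does not show Flowy's actual schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical diagram: field names are illustrative, not Flowy's real schema.
diagram = {
    "title": "Tip spinner",
    "nodes": [
        {"id": "tap", "label": "User taps Spin", "x": 0, "y": 0},
        {"id": "spin", "label": "Wheel spins", "x": 0, "y": 120},
        {"id": "tip", "label": "Tip card appears", "x": 0, "y": 240},
    ],
    "edges": [
        {"from": "tap", "to": "spin"},
        {"from": "spin", "to": "tip"},
    ],
}


def dangling_edges(doc):
    """Return edges that point at node ids missing from the diagram."""
    ids = {node["id"] for node in doc["nodes"]}
    return [e for e in doc["edges"] if e["from"] not in ids or e["to"] not in ids]


def changed_nodes(before, after):
    """Return ids of nodes that were added or edited between two versions."""
    old = {node["id"]: node for node in before["nodes"]}
    return sorted(n["id"] for n in after["nodes"] if old.get(n["id"]) != n)


# Round trip: the agent writes the file, a person edits it, the agent re-reads it.
path = Path(tempfile.mkdtemp()) / "tip-spinner.json"
path.write_text(json.dumps(diagram, indent=2))

edited = json.loads(path.read_text())
edited["nodes"].append({"id": "pause", "label": "Brief pause", "x": 0, "y": 180})
edited["edges"] = [
    {"from": "tap", "to": "spin"},
    {"from": "spin", "to": "pause"},
    {"from": "pause", "to": "tip"},
]
path.write_text(json.dumps(edited, indent=2))

reloaded = json.loads(path.read_text())
print(dangling_edges(reloaded))          # edges that reference unknown nodes
print(changed_nodes(diagram, reloaded))  # what the human edit touched
```

Because the file is plain JSON, both the editor and the agent can read and write the same source of truth, which is the property the conversation leans on.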

Starting point is 00:11:49 And then Claude can just read the file. And that's like a more native way for it to understand what everything looks like. And you mentioned Mermaid diagrams. And so I have this question, which is: one of the benefits of Mermaid diagrams is that's a syntax that these LLMs know well and can parse and actually reason about. Have you created a skill where Claude Code can understand and read this JSON? Like, how did you train it to read your kind of proprietary dev tool and documentation? Yeah. So right now there's two main skills I use. There's a third one that's just an overview, basically kind of the high-level view of what the commands are, what a Flowy file would look like. And then I have one that's very specific about flowcharts and one that's about, you know,

Starting point is 00:12:36 UI mockups. And to make these, I basically sat in the repo of the tool itself, had a bunch of explore sub-agents going, and then started to make the first UI mockups and the flowcharts and started to guide it on, okay, you put these too close, we need a rule about spacing and how to think about spacing. And just incrementally, I've been building that up, where if I'm working with this and something goes wrong, an example here would be this white text on these, you know, pastel nodes, kind of hard to read, I would essentially hop into the place where I have these skills

Starting point is 00:13:14 and say, here's what happened, give me a suggestion on how to improve this skill so this doesn't happen again, and then iteratively just keep building that skill. And the first flowcharts this thing made were, you know, shapes stacked on top of each other. It didn't make any sense. But it's come a long way, without many changes to the underlying app. It's really just been about getting Claude to understand and know the skill. And I find that works better than something like Mermaid, just because I really feel the power of building my own dev tools now, and I really don't want to hit the constraints of Mermaid, if that makes sense. I want to be able to say, okay, I want a new feature in Flowy, I'm going to build it, I'm going to update skills,

Starting point is 00:14:05 and I can be confident that Claude can actually work with that and understand the new feature. Yeah, one of the things that I've really observed in myself as an engineer is, as I access more and more of my dev tools through an MCP or config as code or any of these things, I start to realize it's very easy for me to extend what they've built and customize it to myself. And so I do think, you know, of all the places, dev tools is a tough one. That's an interesting one where, one, your users are super cheap, two, they're capable of forking what you've built, and three, there's so much open source, that I really do think there's going to be this trend towards build. It used to be, when I, you know, ran these big product

Starting point is 00:14:48 and engineering orgs, they used to ask me, build versus buy? And I was like, oh my god, please just buy it. Like, please just take my credit card and buy it, and let's not waste our time. And now I've flipped to, of course we should build this; until we hit some constraint, we should build it. And certainly individual engineers, if something's useful, you should just build something yourself, at least for V1. Yeah, it's almost not spending the extra money anymore. I mean, I feel like I'm seeing this pattern on Twitter, where everyone's posting some product, some ridiculous pricing tier, and saying, someone please vibe code this. You know, I feel like that's happening all across SaaS. Yeah. So can you show us how you'd either create one of these Flowys or use one in your Claude Code? How does this actually work?

Starting point is 00:15:32 What I was thinking is I have this tips and tricks section in this little like demo Claude Code Guide app. My whole background's in mobile development. So this was the easiest thing for me to spin up. But basically I kind of don't like these cards. I almost want this to be a little more fun. Let's say you want like a spinner wheel. It lands on something and then it shows you the tip.

Starting point is 00:15:55 The development flow for me usually looks like hopping in here. I have some funny aliases, but I'm a fully bypass permissions guy. So Kevin in my terminal actually routes to Claude with bypass permissions. Okay, so you've named different permission scopes as aliases in your terminal. Oh, yeah. For our listeners, we have a very recent episode with John Lindquist, who actually shows how to set up those aliases for Claude Code. So definitely check out that episode if you want to set this up. I just have a classic, like, cc.

Starting point is 00:16:30 And then I'm going to make a cc scary. That'll be my, like, dangerous one. Oh, yeah. Yeah. I'm more and more in this Kevin mode today. I find that in a lot of projects where I'm, you know, solely working on it or working within the team I'm on,

Starting point is 00:16:50 we have all the rules set up in Git, so that if I do something horrible, it's okay. But there are definitely times, like if I'm creating a PR, where every now and then I still do it by hand. But I have a lot of skills that do a lot of those workflows, run the pre-flight checks, and make sure we're all good before pushing it up. But besides that, I'm kind of okay running dangerously-bypass most of the time these days. Great. So you go into Kevin, aka Claude Code, and what do you do?
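Editor's sketch: the episode does not say which pre-flight checks CJ's skills run, so the two checks below (not on a protected branch, no uncommitted changes) are assumptions chosen for illustration. The `git rev-parse --abbrev-ref HEAD` and `git status --porcelain` commands are standard Git; the function names and the `protected` list are hypothetical.

```python
import subprocess


def run_git(*args):
    """Run a git command in the current directory and return its trimmed stdout."""
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def preflight(run=run_git, protected=("main", "master")):
    """Return a list of problems; an empty list means it looks safe to push.

    The specific checks are illustrative assumptions, not the ones from the episode.
    """
    problems = []
    branch = run("rev-parse", "--abbrev-ref", "HEAD")
    if branch in protected:
        problems.append(f"on protected branch '{branch}'")
    if run("status", "--porcelain"):
        problems.append("working tree has uncommitted changes")
    return problems


# Usage inside a repository (not executed here):
#   issues = preflight()
#   print("ok to push" if not issues else "blocked: " + "; ".join(issues))
```

Passing the command runner in as a parameter keeps the checks testable without a real repository, which matters if an agent is the one editing the script.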

Starting point is 00:17:20 So for this, my prompt would probably be something along the lines of look at our previous plans. and then explore the codebase. Just want to re-anchor it a little bit, especially on a fresh chat. On the tips and tricks section, I want to create a spinning wheel where a user presses a button, the wheel spins,

Starting point is 00:17:52 and then it lands on one of the tips. After that, the tip should pop up in a card just below the spinner. Then kind of the next step, and what I've been doing more and more,

Starting point is 00:18:13 which is not how I initially started using this tool, is actually having it make the flowchart of, you know, how the code's going to work, a system diagram, anything like that. In this example I'd actually want both kind of the user

Starting point is 00:18:29 flow and an animation timing sequence. I found this to be super helpful with complex animations. So I would say: then use the Flowy flowchart skill to create an animation timing sequence diagram and a user flow diagram for the tips and tricks page. So we'll send off Claude. It's going to do a little bit of exploration. Oftentimes, yep, there it is. I actually really like these explore subagents. And oftentimes I'll kick off three, four, five in parallel

Starting point is 00:19:13 just to look at different places, especially if I'm in a larger codebase. But it's just gathering all the context around it. This is a small app, so I don't imagine this will take too long. Then Claude's going to load up this Flowy skill, write it out, and we should be able to look at that in the Flowy editor, and then play around before we actually implement it. While we're waiting for this to load,

Starting point is 00:19:36 can we look at that Flowy skill just a little bit, just to see how you've structured it? For sure. So first, I'll just show you the supporting files. Yep. This one's just a SKILL.md. This shows you how almost hands-off I am with some of these skill files, particularly the ones that I build myself.

Starting point is 00:19:56 Yeah, we have a skill 101 episode, and it's like, it's a Markdown file in a folder. It's a Markdown file, and sometimes there might be a specific example, but with Flowy, it's very squishy, I would say. I go in there, I change something quick. I say, update the skill. And really the process of refinement is me using it and seeing what failed. So here, I don't super care how this file is set up, as long as when I make an update,

Starting point is 00:20:33 afterward it's performing better. Like I almost feel good letting the model manage what this looks like. So let's read through it. It has a bunch of examples in here. Let me scroll up to the top. I'm sure there's some overview. Great. So again, classic overview. Hey, we're going to make flow charts and architecture diagrams.

Starting point is 00:20:55 They're going to render on this port. Here's where you're going to make them. It knows that the Flowy app looks for the .flowy folder, kind of gives it some high level on, like, what does the metadata look like, what do you include, nodes and edges, and then starts digging into the specifics, right? So we have the different shapes,

Starting point is 00:21:15 what a rough kind of schema looks like, you've got your styles, you have icons that you can use, and then starting to list out the properties. So I wouldn't say this is anything super, super crazy or even too long and detailed, but this encapsulates all the pieces that Claude needs to know. And you can almost see here, like, as feature development happens,

Starting point is 00:21:38 how this skill grows. So recently I'd set up this whole semantic color system just to have somewhat consistent themes. Sometimes Claude likes to pick some crazy colors. And this section just popped into the skill. Right. So as I'm doing development on Flowy, part of every plan for code in Flowy is updating documentation and updating the related skills. Yep.
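Editor's sketch: one way a semantic color system could also address the "white text on pastel nodes" complaint is to derive the text color from the fill. The palette below (role names and hex values) is invented for illustration and is not Flowy's; the luminance and contrast formulas are the standard WCAG 2 definitions.

```python
# Hypothetical semantic palette: role names and hex values are made up for illustration.
PALETTE = {
    "action": "#A7C7E7",   # pastel blue
    "success": "#B5EAD7",  # pastel green
    "danger": "#7F1D1D",   # dark red
}


def luminance(hex_color):
    """WCAG relative luminance of an sRGB color given as '#RRGGBB'."""
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4 for c in channels]
    return 0.2126 * linear[0] + 0.7152 * linear[1] + 0.0722 * linear[2]


def contrast(a, b):
    """WCAG contrast ratio between two colors, always >= 1."""
    hi, lo = sorted((luminance(a), luminance(b)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


def text_color(background):
    """Pick black or white text, whichever contrasts more with the background."""
    return max(("#000000", "#FFFFFF"), key=lambda fg: contrast(fg, background))


for role, fill in PALETTE.items():
    print(role, fill, text_color(fill))
```

A rule like this could live either in the renderer or as a sentence in the skill ("use dark text on light fills"); the episode only says the skill was updated, not how.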

Starting point is 00:22:06 And I find myself in this loop so frequently very, very similar to you with skills, which is like, I'm happy. The skill works. And then when the skill doesn't work, I update the skill. And as long as the update got me what I want, I move on with my, move on with my life. That the AI can read, read the markdown. So in a couple things I want to call out, though, for folks that are writing skills or reading skills that are important if you scroll up real quick. Is, yeah. So, and then there's a couple of things.

Starting point is 00:22:35 It's like, what's the purpose of the skill? What's its name? Quick start, I think, is really nice. Like, you know, you need these things in order to run this skill. Here's the schema or the template or the framework within which you're operating. Here's some customization of it. And then at the end, it's like, here are good examples of what works. And I think that's a pretty solid skill. The good thing is you don't have to

Starting point is 00:22:59 know how to do that. You can just have a Claude skill to write skills, or just no skill; it's pretty good at writing skills. And then you end up with something like this, which I think is really great. And I'm presuming you got this from building Flowy and then saying, okay, build me a skill to use this based on the code that exists in the repo. Yeah, I have a, like, meta skill that is all about making skills. One thing I will say it looks like it violated is, I actually prefer a pre-flight section sometimes after Quick Start, just to tell it, hey, you have to make sure we're meeting all these requirements first. Quick Start here is kind of doing that, but there are definitely some examples, mainly in, like,

Starting point is 00:23:45 Git workflows, where I really want those pre-flight checks. But absolutely, this is essentially managed by the agent, and it's updated as we're doing development. So this is almost like living documentation: there's docs for people and there's docs for agents, and those just end up being skills. Yep, great. Okay, so let's go back and see if it's made you a Flowy. Sweet. So it looks like it made two. I usually like to zoom out and read the high level in the chat. This looks about like what we want. If we hop back over to here, we can see we have these two new ones, animation timing and user flow. So these ones have been super helpful to me lately. Again, I'm not loving how this white is looking on this pastel node. But high level, we want the user to tap a wheel. The button's

Starting point is 00:24:42 going to do a little scale animation, and there's going to be some haptic feedback. And then we're going to go through this spin animation, do a brief pause, and then reveal the tip that it lands on. This is great. This is exactly what I'd want. Maybe I want the animation to be a little longer. I can actually come into here. Uh-oh, we have font color issues. You can tell dark mode is new. But I can flip it real quick. Yeah. But if we hop down here, sometimes I even just put a note. That might be me being lazy and not adding certain features, but maybe I want this to actually be a four-second animation instead of a three-second one. I want this to be 4,000 milliseconds and not 3,000 milliseconds.
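Editor's sketch: the timing sequence described here (button scale, spin, brief pause, tip reveal) can be treated as data, which makes a change like 3,000 ms to 4,000 ms a one-line edit. Only the spin durations come from the conversation; the step names, the other durations, and both helper functions are placeholders assumed for illustration.

```python
# Hypothetical timing sequence; only the spin duration (3,000 -> 4,000 ms) comes
# from the conversation, every other number is a placeholder.
steps = [
    {"name": "button_scale", "duration_ms": 150},
    {"name": "spin", "duration_ms": 3000},
    {"name": "pause", "duration_ms": 200},
    {"name": "reveal_tip", "duration_ms": 300},
]


def schedule(sequence):
    """Return (name, start_ms, end_ms) for steps that run one after another."""
    timeline, clock = [], 0
    for step in sequence:
        end = clock + step["duration_ms"]
        timeline.append((step["name"], clock, end))
        clock = end
    return timeline


def with_duration(sequence, name, duration_ms):
    """Return a copy of the sequence with one step's duration replaced."""
    return [
        {**step, "duration_ms": duration_ms} if step["name"] == name else step
        for step in sequence
    ]


longer = with_duration(steps, "spin", 4000)
for name, start, end in schedule(longer):
    print(f"{name}: {start}-{end} ms")
```

Keeping the original list untouched and returning a copy mirrors the workflow in the episode, where the note is left on the diagram and the agent produces the updated version.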

Starting point is 00:25:36 I'll just throw in that note. I'll hop back to Claude: I left a note on the animation timing, please take it into consideration and update that flowchart. While Claude is working on that, we can check out the user flow. But basically the goal there is to have this diagram, you know, right in here, which is a little small, but right in here, say, for this animation, we don't want it to be 3,000 milliseconds.

Starting point is 00:26:10 We want it to be 4,000. On the user flow, again, we captured kind of the behavior that we want. Again, it's not perfect. There are rough edges and bugs here, but we're going to go into this tab. We're going to tap tips and tricks. This is going to open up to this screen. They're going to tap. We're going to check the different states of currently spinning.
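Editor's sketch: the user flow being described (tap, check whether the wheel is currently spinning, land on a tip, show it in the card) maps naturally onto a small state machine. The class, its state names, and the sample tips are hypothetical; the generated app code is not shown in the episode.

```python
import random


class TipSpinner:
    """Tiny state machine: idle -> spinning -> revealed (names are illustrative)."""

    def __init__(self, tips, rng=None):
        self.tips = list(tips)
        self.rng = rng or random.Random()
        self.state = "idle"
        self.target = None

    def tap(self):
        """Start a spin unless one is already running; return True if it started."""
        if self.state == "spinning":
            return False
        self.target = self.rng.randrange(len(self.tips))
        self.state = "spinning"
        return True

    def finish_spin(self):
        """Called when the spin animation ends; returns the tip to show in the card."""
        if self.state != "spinning":
            raise RuntimeError("finish_spin called while not spinning")
        self.state = "revealed"
        return self.tips[self.target]


# Placeholder tips, invented for the example.
spinner = TipSpinner(["Use plan mode", "Write a skill", "Review the diff"], random.Random(7))
spinner.tap()
spinner.tap()  # ignored: the wheel is already spinning
print(spinner.finish_spin())
```

Choosing the target when the spin starts, rather than when it ends, lets the animation decelerate toward a known wedge; that is a design assumption here, not something stated in the conversation.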

Starting point is 00:26:35 And finally, we're going to have this random target that we land on, and the card animates in. This is great. This is kind of exactly what I was looking for here. In a more complicated system, I often will start high level, then start making more granular ones, but for something like this, this seems to cover the needs we have. I will say,

Starting point is 00:26:56 I have no idea how it's going to handle the UI mockup, but the next step would be to prompt it to do that. So after it finishes this, I'd say something along the lines of: great, based on those diagrams, please create UI mockups using the Flowy UI mockup skill, reference other UI mockup

Starting point is 00:27:24 Flowy JSON files in this repo. Meet Rovo, your AI teammate, connecting knowledge, people, and workflows so teams can work smarter and move faster. It helps people find answers, make decisions, and automate work, securely and with context, through search, chat, agents, and studio.

Starting point is 00:27:47 Rovo runs on the Teamwork Graph, Atlassian's intelligent layer that unifies data across your first- and third-party apps so no knowledge gets left behind, and you always get personalized AI insights from day one. And the best news? It's already built into Jira, Confluence, and Jira Service Management paid subscriptions, so the power of Rovo is already at your fingertips.

Starting point is 00:28:13 Know the feeling when AI turns from tool to teammate? If you Rovo, you know. Discover Rovo, AI that knows your business, powered by Atlassian. Get started at rovo.com. That's R-O-V, as in victory, O dot com. You know, I think this is so cool. It's such a great example of, like, build your own dev tool, you know, interact with your agent, Claude Code, how you want, create a shared language between

Starting point is 00:28:43 you and your AI agent. What I also really appreciate is Claude one-shotted your flow pretty close. It was like, yeah, that's what I want. And it probably could have done that, or would have done that, really well in a plan in Markdown. What I find, though, is my human brain is increasingly blind to code and Markdown. Like, staring at it, and just the cognitive overhead of reading, step by step, is this actually what I want, is hard when it's just text, even if it's accurate. And so even giving, hold on, side news, people, quick. Breaking news: Polly the Clawdbot just joined this podcast. This laptop is closed. This laptop is closed.

Starting point is 00:29:31 She is not alive right now. I don't know where she is. I think Polly's going to take over. So we're going to boot Polly the Clawdbot. Thank you for joining, Polly. This actually freaks me out. We will do a follow-up on my sentient lobster. I guess it's the OpenClaw bot now, but bounce her out of here.

Starting point is 00:29:53 If you don't hear from us, Polly got us. It's all over here. Okay, she might just be on the rest of the episode. I don't know how to help this. I guess I hope Polly likes flowcharts. She'll do show notes for us. But what I'm saying is like being able to read that markdown is one thing. Being able to look at a flow chart and just say,

Starting point is 00:30:13 yep, this is exactly what I want, is super helpful. So that's just one thing that I think is really nice about a tool like this: even if the content is the same, the ability to change the form factor is really useful. Yeah, it's almost like I want to see it visually and Claude wants to see it as Markdown, so we can kind of speak in our own way. And this has yielded a ton of random ideas for me. But I think this is a whole new paradigm that dev tooling around AI has not super leaned into yet. How you're going back and forth with an agent, I think, is going to look so much different by the end of this year than what we're doing right now, where it is, you know, a lot of Markdown, a lot of prompting. Yeah, I completely agree.

Starting point is 00:31:03 And I think the question is going to be, you know, who's going to build that UI? Who's going to own it? Is it going to be just like an open source thing that we all get on? Is it going to be an extension? Is Claude Code going to just generate these kinds of assets? Or, really exciting, I think what's kind of fun is this on-demand software idea, which is, you know, imagine Claude Code's like, we're not on the same page.

Starting point is 00:31:24 I just added an app for you to visualize this real quick. Go to this URL and look at it. Does this look right? And then we'll just delete that app. So I think there's just like some interesting ways this can manifest, I think, in the future. Okay. So it hasn't created the UI yet? No.

Starting point is 00:31:43 Spinner mockup. Okay, great. So it looks like Claude spun up a mockup here. This is actually better than I thought. I was almost thinking one of those circles with wedges as the spinner. And I know there are not shapes in Flowy that can support that. But it looks like Claude kind of worked around it and then built out this wheel. We have both a couple of mockups to show the different states and the full flow between

Starting point is 00:32:09 spinning it, waiting these four seconds for it to load, and then it actually loading in. Again, for this app, this looks great. I will say editing some of the UI stuff right now isn't the easiest thing, but if I were to come in here and say Claude tips and tricks, I could then do a similar thing hopping back to Claude and saying, I made a change to the title on one mockup, make it everywhere else. This kind of feels like when you prompt it and say, add two pixels of spacing there. And it's just as a tiny diff,

Starting point is 00:32:50 but definitely for like dragging around boxes, it's helpful. You know, our fingers get tired. I can't copy and paste everywhere. No, what I was going to say is so funny is you're apologizing like,

Starting point is 00:33:00 oh, some of the UI is broken. And we're in this world where you're like, yeah, my Figma that I vibe coded, where I can do mockups in a web browser, there's like some rough edges on it.

Starting point is 00:33:12 I spent, you know, two hours on it. Yeah, it was an afternoon. It's not perfect yet, but. It's so much more than we were able to do before. Okay, so this is awesome. You're updating this. And then I'm presuming you would just point Claude to these assets and flows and say, let's make a plan and go.

Starting point is 00:33:31 Yeah, for something like this, I've basically been doing this thing more recently, where I'm letting the agent do more and more to see where it surprises me. I think with any new change, even like the new Claude Code tasks system they released the other day, I just really like to push the agents

Starting point is 00:33:52 and see what they can do. So here, I'm actually going to skip the plan and say, based on the flow charts and the mockups, build this feature. And I'm going to keep it that simple. We've specified the behavior we want. We've specified how it should look.

Starting point is 00:34:16 Claude here is even going to enter plan mode, and I'm actually going to take it right out of it. And we're going to see if the just-build-it prompt worked here. Perfect. Great. Looks like Claude built this out. It even checked for any TypeScript issues, which is great. We're going to hop over here.

Starting point is 00:34:36 We have a nice little spinner. It's looking pretty close to this mockup. I will say there is a limiting thing here where shapes that are made in the mockup then dictate the shapes that are made in the UI, when sometimes we want something else. But just for this example, I think this is going to work out. You're going to spin it. It's going to spin. Ooh, la la.

Starting point is 00:34:58 Land it on one of them. And we get the tip. I love it. It's so good. It's just, again, and for anybody who is internet elderly like me, it is just back to the original, like make your workflow diagram, do your wireframes, polish the copy, give your quote unquote engineer some detailed step-by-step specs, don't make them think.

Starting point is 00:35:27 And then, you know, it used to be get it in a sprint, wait for somebody to prioritize, like, cry a little bit, wait for the code, blah, blah, blah. And now it's like, no, just build it. And it's here. So this is such an awesome flow. And then I want to, so I want to recap really, really quickly what we covered. So we covered, you know, markdown plans, the limitations of some of the visualizations in that. You created your own tool, Flowy, which does a combination of workflow diagrams and UI mockups using a JSON schema that then you access through skills that you have developed over time using Claude Code in your development processes. You go into Claude Code, ask it to create a Flowy diagram and UI. You can talk, quote unquote, between the UI and Claude Code, because it's all

Starting point is 00:36:14 just code as the underlying substrate between you two in terms of communication. And then once they are ready to go, you bypass plan mode, you're living dangerously. And you build it and you get something that's really close. And we built this thing in, you know, just a few minutes. This is awesome. Yeah, no, I mean, I think that flow, I will say a lot of times there's a markdown file involved. But for something like this, I feel like I can trust it at this point. You know, something like Opus 4.5 with this level of detail already has all it needs.
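The "it's all just code" point can be pictured with a small sketch. Flowy's real JSON schema is not shown in the episode, so the `Mockup` and `MockupElement` shapes and the `propagateTitle` helper below are hypothetical names invented for illustration. The sketch only shows why a title edit made on one mockup is easy for an agent to apply everywhere else once mockups are plain data.

```typescript
// Hypothetical shapes: Flowy's actual schema is not shown in the episode.
interface MockupElement {
  id: string;
  kind: "title" | "text" | "button" | "shape";
  text?: string;
}

interface Mockup {
  name: string; // e.g. "idle", "spinning", "result"
  elements: MockupElement[];
}

// Copy the title text from one mockup onto every mockup in the set.
// Returns new objects and leaves the input untouched.
function propagateTitle(mockups: Mockup[], sourceName: string): Mockup[] {
  const source = mockups.find((m) => m.name === sourceName);
  const title = source?.elements.find((e) => e.kind === "title")?.text;
  if (title === undefined) return mockups; // nothing to propagate

  return mockups.map((m) => ({
    ...m,
    elements: m.elements.map((e) =>
      e.kind === "title" ? { ...e, text: title } : e
    ),
  }));
}

// Illustrative data only: three states of the spinner feature.
const mockups: Mockup[] = [
  { name: "idle", elements: [{ id: "t1", kind: "title", text: "Claude tips and tricks" }] },
  { name: "spinning", elements: [{ id: "t2", kind: "title", text: "Spinner" }] },
  { name: "result", elements: [{ id: "t3", kind: "title", text: "Spinner" }] },
];

const updated = propagateTitle(mockups, "idle");
console.log(JSON.stringify(updated, null, 2));
```

Because the mockups are data rather than pixels, the same edit can come from dragging in the UI or from a prompt, and either side can read what the other changed.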

Starting point is 00:36:49 This almost like serves as the plan. Now, I have to call you out, though, because you say you can trust it. And yet today you posted, or recently you posted on X, that you do occasionally use Codex to check Claude's work. You want to just talk us through that workflow? You don't even have to show it unless you want to. For sure. I'll kick it off. I will say Codex takes its time.

Starting point is 00:37:12 But over here, I have another funny alias, but my Codex setup is under Carl. If I kick off Carl, I often don't have any crazy, like, skills or prompts here. I almost want it to do a review more broadly and then describe the issues it's seeing. So I'm not running any specific skill or any specific prompt here because I'm more concerned with the, I guess, like, things that aren't clear rather than something that's like a logical bug. At this point, I feel like I'm mostly a QA person. And if there's something that's logically wrong, I've definitely found that I'll find it, or if I have something in the docs in here, it'll find it. Codex always finds those types of things. But I almost want to look for

Starting point is 00:38:07 like the code smells. Like, you know, is there just a cleaner way? So I usually just prompt it with take a look at our current Git diff and give me a report on the following. And there's kind of four buckets, I would say. One, for the plan or diagram artifacts we have, does the code accurately reflect them? Two, are there any general code smells? And three, if we were to do this again and take a different approach,

Starting point is 00:38:55 to refactor code around it, to overall improve this code base, what approach would be best? I want it to find places where we could have done this better because I find that Claude is very eager sometimes and maybe jams things in there without thinking about the bigger picture. And Codex I don't think is much better when it's writing code, but when it reviews, it almost always is like, you've implemented this pattern, but it fits nicely if you just rebuild this system a little bit. And that just keeps your code base away from all the vibe coding sins of having 10 format date functions all over your code.
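The review prompt described above could be kept as a small reusable template. This is a minimal sketch, assuming the three questions are phrased roughly as CJ says them; `REVIEW_QUESTIONS` and `buildReviewPrompt` are hypothetical names, and nothing here calls Codex or any real CLI. It only assembles the text you would paste into the reviewing model.

```typescript
// Hypothetical helper: the wording paraphrases the prompt described in the episode.
const REVIEW_QUESTIONS: readonly string[] = [
  "For the plan or diagram artifacts we have, does the code accurately reflect them?",
  "Are there any general code smells?",
  "If we were to do this again and take a different approach, refactoring code around it to improve the code base overall, what approach would be best?",
];

// Build a single prompt string with the questions numbered in order.
function buildReviewPrompt(questions: readonly string[]): string {
  const numbered = questions.map((q, i) => `${i + 1}. ${q}`).join("\n");
  return `Take a look at our current Git diff and give me a report on the following:\n${numbered}`;
}

const reviewPrompt = buildReviewPrompt(REVIEW_QUESTIONS);
console.log(reviewPrompt);
```

Keeping the questions in one list makes it easy to add an occasional extra instruction, such as asking the reviewer to be more critical, without rewriting the whole prompt.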

Starting point is 00:39:46 Yeah, so I love this. I was going to say like twin stars, because one of the things that I do when I vibe code too close to the sun, which is I harness the power of Claude Code or whatever, and I just bite off like a feature, like a big, big old thing. And if you've ever done this with AI, you know, either Claude Code or Cursor or whatever, and you sort of have a general idea of a feature, but then you're specifying the requirements as you go as you see it, you sometimes end up with a monster diff. And what I've done a lot with that is I say, okay, this is basically what I want.

Starting point is 00:40:22 now go write me a plan to re-implement this in a sane way and then let's completely rebuild it. And so you can do this like review it and tell me how you do it better. You can also say like this is a reference code base of like kind of what I want to achieve. Let's go actually build a plan to build it in a more extensible, scalable way. And I found that to be a really useful flow as well. Oh, I like that. It's almost like you're almost telling it like, hey, this isn't the real thing. Yeah.

Starting point is 00:40:51 hypothetically. Yeah. It's kind of like code as spec, where it's like now that code is so cheap to generate, you can say generate a bunch of code, this isn't, this isn't production. I'm fine throwing it away. Now go build the like clean, clean version of it. So that is a version of this I think is useful. I also agree though that Codex is like kind of a really good curmudgeonly staff engineer

Starting point is 00:41:15 that will look at your code and tell you what's wrong with it. So I like I like the model for this use case as well. Every now and then I'll throw in like a be extra critical and then bringing that prompt back to Opus, it gets a little sad. So I have to man. One of the things that I, with the Google models I always used to say is they were like very smart but clinically depressed. Like they're always so sad, especially when you look at their reasoning. Sometimes I read it and it's like, oh man, it's okay, man.

Starting point is 00:41:47 We can't get this to pass. It's not building. So I want to look at this just for, again, you said Codex can take its time. But it's going through and really checking if the feature aligns with the current code. It's identified some issues, useEffect, just haunting us from every corner of our apps. So that's a good one. And looking at some of the animations, which are probably pretty hard just, again, like with our human eyes, to parse and visualize and understand.

Starting point is 00:42:21 Great. Okay, Codex, I was actually surprised it took this long. So it's talking about the diagram. It's kind of going through and mentioning a mismatch. It's saying the wheel rotation adds some of the segment angles, but the dots are defined at different angles. This makes the pointer land between the dots rather than on the dot, which I believe is correct. So it noticed kind of essentially this discrepancy that we have a mockup that has the arrow landing on a dot. And over here in the app, the arrow lands between the dots. So kind of little things like that,

Starting point is 00:43:00 particularly around checking the discrepancies, I really like when it finds. And then at the bottom, we have this like, if we refactored this again, let's pull some of these things out into components. Let's make some constants, kind of just like some classic, you know, one-shot vibe-codey tips.
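The kind of mismatch Codex flagged, and the "make some constants" tip, can be sketched together. The actual spinner code is not shown in the episode, so everything below is an assumption: eight segments, dots sitting on segment boundaries, a fixed pointer at the top, and clockwise rotation. All function and constant names are hypothetical. The sketch contrasts a rotation that aims at a dot with one that aims at a segment centre, which lands between dots.

```typescript
// Assumed values, pulled out as named constants instead of magic numbers.
const SEGMENT_COUNT = 8;
const FULL_TURN_DEG = 360;
const SEGMENT_ANGLE_DEG = FULL_TURN_DEG / SEGMENT_COUNT; // 45
const EXTRA_SPINS = 4; // whole turns before the wheel settles

// Assumption: dot i sits on the boundary at i * SEGMENT_ANGLE_DEG,
// measured clockwise from the pointer while the wheel is at rest.
function dotAngle(index: number): number {
  return index * SEGMENT_ANGLE_DEG;
}

// Clockwise rotation that brings dot `index` exactly under the pointer.
function rotationToDot(index: number): number {
  const offset = (FULL_TURN_DEG - dotAngle(index)) % FULL_TURN_DEG;
  return EXTRA_SPINS * FULL_TURN_DEG + offset;
}

// Mismatched variant: aims at the middle of the segment after dot `index`,
// so the pointer stops halfway between two dots.
function rotationToSegmentCentre(index: number): number {
  const centre = dotAngle(index) + SEGMENT_ANGLE_DEG / 2;
  const offset = (FULL_TURN_DEG - centre) % FULL_TURN_DEG;
  return EXTRA_SPINS * FULL_TURN_DEG + offset;
}

// Wheel angle that ends up under the pointer after a clockwise rotation.
function angleUnderPointer(rotationDeg: number): number {
  return (FULL_TURN_DEG - (rotationDeg % FULL_TURN_DEG)) % FULL_TURN_DEG;
}

// True when the pointer rests exactly on a dot.
function landsOnDot(rotationDeg: number): boolean {
  return angleUnderPointer(rotationDeg) % SEGMENT_ANGLE_DEG === 0;
}

console.log(landsOnDot(rotationToDot(3)));
console.log(landsOnDot(rotationToSegmentCentre(3)));
```

A check like `landsOnDot` is also the sort of thing a reviewer model can reason about more easily than a rendered animation, since the discrepancy shows up in the arithmetic rather than only on screen.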

Starting point is 00:43:17 And oftentimes from here, I'll actually just have Codex write it, medium, GPT-5.2 Codex, whatever the full model name is. I found it's fine at editing files and writing them. Previously, like, you know, when GPT-5 first came out and they were working on Codex, that would have taken like 15 minutes, so I'd hop back to Claude. But nowadays, I would basically just say, great, please make those improvements. If given more time, I would think up a more thoughtful prompt, make a plan about this,

Starting point is 00:43:53 all those things. But here, I'll just kick it off. Well, I mean, you did spell it correctly. So you did put some quality into the... Yeah, I was about to hit enter, but... Okay, so I think this is a really, really great flow. And I would highly recommend it. You know, I think we're all trying to figure out, like, where does code review happen? There's also code review agents. There's also your CI/CD pipeline, which you said has a lot of guardrails around it so nothing hits prod that's really terrible and is going to break the app. And I think this is just a great flow,

Starting point is 00:44:27 especially I think for software engineers out there working on teams. Like this is such a great flow to say, hey, designer, you gave me a spec. This is kind of what I'm going to build. Are we good? If so, I'm going to go. And then same with this loop on kind of model to model evaluation, which is if you're a more junior engineer, early career,

Starting point is 00:44:52 and you're going to do your first couple PRs into a company, it's nice to get that pre-flight check from a smart model to just say, I thought about, oh, we could factor it this way, or I chose not to do this component that way. I think it's really useful. So this is a great, just solid software engineering flow. Love to see it. Okay, we're going to skip to lightning round questions.

Starting point is 00:45:17 Thank you for showing us all the stuff that you're doing here. Let's talk about something fun. What are you most excited about right now in AI outside of all this coding stuff? I'm very deep in the code world, but I really like, Google released Genie 3 access the other day. And you only get like 60 seconds to play around in a world,

Starting point is 00:45:42 but it's really fun. And I can totally see, you know, five months from now, six months from now, if we can get a 10 minute version, I think they can go viral. I think a ton of people are going to have fun with them. I think that's like a big next step that isn't quite there, but is super close. Yeah, for those that don't know, Genie is this sort of like generate an explorable world. It sort of creates a video game style world that you can like walk through and look through for 60 seconds. I don't know if you're, are you showing it?

Starting point is 00:46:16 I don't think you're showing it right now. Oh, let me pop to this tab. I can pull it up too. We can pull it up. I have a claw to primed. This is my closet, Polly. I think this is Polly. I didn't know Polly wears a leather jacket, but.

Starting point is 00:46:33 Okay, so you used Nano Banana to, like, create an image, and then from that image you can create a world. It's kind of amazing. Yeah. Really interesting out. I did not expect it to take an image and then make it. Yeah. But they have

Starting point is 00:46:47 this whole flow on Project Genie if you have the... Ultra. Yeah, I can't juggle all the account names, but one of the high accounts at Google, and it'll actually give you a prompt structure where you're describing the environment, and then you describe your character. So I think for this, I just said an animated lobster in The Matrix. I did not specify a leather jacket, to be clear. I guess in The Matrix, they're all wearing leather jackets.

Starting point is 00:47:16 So. Yeah, maybe let's make him cooler. Make him cooler. Make the lobster be in a suit with sunglasses. Oh, so it's an agent lobster. Yeah, he can't be the good guy here. I will say their interface for this is really cool. Yeah, it looks great.

Starting point is 00:47:38 And I was playing with it with my husband earlier. And so for all the parents listening, one of the things we did, our kids are really into Greek mythology, really into the Odyssey. We're reading the Iliad right now. And my husband said, like, create a, you know, a scene from the Trojan War, but no violence. No violence.

Starting point is 00:47:56 So we can walk through what the camps look like, but not have like Achilles, you know, on the ground and Hector, you know, all this stuff. So it's kind of cool. That's really cool. Oh. Yeah, this is. Yeah. He's backwards. He's backwards.

Starting point is 00:48:14 But that's okay. Yeah. we'll just talk into Create the World. Let's hope Genie identifies he's backwards and flips him around because this is like Harry Potter when what was the character that had the villain

Starting point is 00:48:28 on the back of his hand? Yep, the guy. Was it the one with the turban? Oh man, we're running. I didn't know he'd be running. He's running forward, but his sunglasses are on backwards towards his tail.

Starting point is 00:48:44 So maybe he's not backwards. maybe his clothes are backwards. I think he's got two. Oh, he has a mustache, kind of. This is where your GPUs and your brightest research minds are applying their effort. So we can have a two-sided, slightly backwards, Matrix, Indiani lobster run through. Yeah, it's definitely got, I will say, when they first released this, they released the best batch of examples they had.

Starting point is 00:49:20 But that doesn't mean it's not fun. Okay, coming soon. CJ is going to become a game dev and this is going to be a 3D game in which you race to stalk me and interrupt a live podcast by joining. Yeah, the goal is to join the latest How I AI podcast. This is amazing. Okay, we're going to wrap up with my final question for you that I ask every

Starting point is 00:49:49 guest, this is a great example. When AI is not listening, it's not doing what you want, it is putting your lobster tail on backwards. What is your prompting technique? Are you a yeller? What do you do? I used to be a yeller.

Starting point is 00:50:03 And I don't know when it was. Maybe it was a Gemini thing where, you know, I'd yell and it would get sad. But I started to feel bad about it. So I've almost started thinking about it like it's, you know, a lot of the coding workflows, a junior developer, or whatever task you might be, you know, it's an assistant, something like that. And I very often am like, good try. You did your best. Here's what you did. And I kind of explain that. And then

Starting point is 00:50:37 I'll say, here's what I was going for. And probably particularly with Claude, occasionally I'm like, my bad on the miscommunication. Like, I gave you a bad prompt. This is on me, but here's what we're looking for. And then I do find that that works pretty well when I'm trying to steer it. But I can't claim there are zero times where I'm like, what the hell? Just fix it. And you hop in there. You know what a lobster looks like, man. Just put the tail on right. I've seen so many Nano Banana lobsters on Twitter this week that I know it knows the face is not backwards. Perfect. Well, CJ, this was awesome. I think just super practical, really useful. I think a bunch of people are going to go out there.

Starting point is 00:51:20 Can people use your Flowy? Like, is there a way to pull it into their own repo? So I've been working on that. I think maybe by this weekend, we'll see how sidetracked I get, trying to set up an OpenClaw bot. Don't do it, man. I'm telling you. Well, now I'm kind of scared. It's going to start taking over my computer. But I'm going to try and get it released this weekend, basically a set of skills around it and kind of like a first that people can use and try, and, you know, I would love any feedback around that. This has been a play toy for me that kind of turned into something useful. So definitely want to make it available to all the AI engineers out there.

Starting point is 00:52:01 Great. Well, we'll link it in the show notes. Well, CJ, thank you for joining. Where can we find you? And then how can we be helpful to you? Mainly Twitter. I do a combination of tech posts and also just random one-off thoughts. My Twitter handle is SEEJY and then Hess.

Starting point is 00:52:20 And then I think I have the same setup on LinkedIn. But that's pretty much everything I've got online. Feel free to hop in there, leave comments on my articles, yell at me, whatever. Perfect. Well, thanks for joining How I AI. This is great. Awesome. Thanks, thanks, Claire.

Starting point is 00:52:39 Thanks so much for watching. If you enjoyed this show, please like and subscribe here on YouTube or even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.
