The Good Tech Companies - How to Build an AI Agent That Actually Handles Boring Tasks for You

Starting point is 00:00:00 This audio is presented by HackerNoon, where anyone can learn anything about any technology. How to build an AI agent that actually handles boring tasks for you, by Bright Data. Ah, AI agents, the hottest trend in tech right now. Everyone's hyped about them being the future of work. After all, they can do it all and will automate most tasks to give us more time, right? Well, sort of. The reality? Most agents get blocked by websites or get lost while trying to execute tasks. To actually make one that works, you need a best-in-class tech stack.

Starting point is 00:00:33 Only the right combination of tools can turn an AI agent into a real task automation machine. Follow this tutorial and learn how to craft an AI agent that can truly automate tasks for you. Why most AI agents don't deliver. The dream of having AI automate tasks for us is exactly why AI agents were invented in the first place. It's why "agentic AI" became a trend, and why the hype is still sky-high. Imagine a world where all the tedious, repetitive stuff gets handled by AI so we can save time. Sounds perfect, right? That way, we could focus on what really matters: stacking V-Bucks in Fortnite or grinding runes in Elden Ring. Jokes aside, if you've ever played around with an

Starting point is 00:01:14 AI agent like OpenAI Operator or tried building one yourself, you already know the sad truth. AI agents rarely live up to expectations. These are some of the main reasons AI agents flop. They can't interact with websites or desktop apps like a real human would. The LLMs powering them can be unpredictable, giving different results on the same input. Even when they do use a browser, anti-bot techniques like CAPTCHAs stop them cold. Unlike humans, AI agents often lack common sense reasoning and struggle to adapt when faced with situations beyond their programming. The problem isn't the idea of AI agents. Instead, it's the tech stack you use to build them. So let's stop wasting time and figure out how to build an AI agent that can actually

Starting point is 00:01:58 automate browser tasks for you. Make an AI agent automate the stuff you hate doing: step-by-step tutorial. In this chapter, you'll be walked through building an AI agent that can handle one of the most boring, yet critical tasks out there: job hunting. The resulting AI agent will be smart enough to: 1. Visit Google. 2. Discover job platforms. 3. Browse listings based on your desired positions and preferences. 4. Extract interesting jobs. 5. Export them into a clean JSON file. And if you want to take it further, you'll also find resources on how to feed it your CV so the agent can learn your profile and automatically apply to the best matches, all without you

Starting point is 00:02:38 lifting a finger. Important: this is just an example. As shown near the end of this guide, the same agent can be adapted to almost any browser-based workflow by simply changing the task description. Let's dive in! Prerequisites. To follow along with this tutorial, make sure you have: An LLM API key. We'll use Gemini, since it's basically free to use via API, but OpenAI, Anthropic, Ollama, Groq, and others work as well. A Bright Data account with the Browser API enabled. Don't worry about setup yet, as you'll be guided through it in this tutorial. Python 3.11 or later installed locally. To speed things up, we'll also assume you already have a Python project set up with a virtual environment in place.
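
Before moving on, the prerequisites above can be sanity-checked with a short script. This is a minimal sketch and not part of the original tutorial: the environment variable name GEMINI_API_KEY and the helper check_prerequisites are assumptions chosen for illustration, so rename them to match your own setup.

```python
import os
import sys
from typing import Mapping, Sequence

# Assumption: the Gemini key is exposed under this name; adjust to your setup.
API_KEY_VAR = "GEMINI_API_KEY"
# Minimum Python version stated in the prerequisites.
MIN_PYTHON = (3, 11)


def check_prerequisites(env: Mapping[str, str], version: Sequence[int]) -> list[str]:
    """Return a list of human-readable problems (empty when everything is fine)."""
    problems = []
    # Compare only (major, minor) against the required minimum.
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )
    # A missing or blank key means the LLM cannot be called later on.
    if not env.get(API_KEY_VAR, "").strip():
        problems.append(f"environment variable {API_KEY_VAR} is not set")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(os.environ, sys.version_info)
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Prerequisites look fine.")
```

Returning a list of problems rather than raising on the first one lets you see everything that is missing in a single run.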

Starting point is 00:03:24 Step number one. Install Browser Use. As mentioned earlier, most AI agents flop because they hit the wall of tech limitations. The models alone just aren't enough. So what's one of the best tools to build AI agents that can indeed do stuff inside a browser? Browser Use. Never heard of it? No worries. Catch up with this video or take a look at its official docs.

Starting point is 00:03:50 [Embedded YouTube video.] First things first, activate your venv and install the package from PyPI. Under the hood, this library runs on Playwright, so you'll also need to grab the Chromium binaries it depends on. To do so, run the Playwright install command for Chromium. Boom! You're now set up with a browser automation agentic AI powerhouse. Step number two. Integrate the LLM. AI agents won't do much without AI. Shocker, right? So your agent needs a language model to properly think. Browser Use supports a long list of LLM providers, but we'll focus on Gemini, the one highlighted on the official Browser Use GitHub page. Why Gemini? Because it's one of the few

Starting point is 00:04:38 LLMs with API access and generous rate limits that make it fundamentally free to play with. Grab your Gemini API key and store it in a file in your project folder. Next, create a file, which will contain the AI agent definition logic. Start by reading the environment variables from the file holding your key, then define your LLM integration. Amazing! You've got your AI engine ready. Time to define and build the rest of your agent's logic. Step number three. Describe the browser-based task to automate. How you describe the task to your agent is everything. The LLM you configured in Browser Use only works as well as your instructions, so spend time crafting a prompt that's clear and detailed, but not overly complicated. This is the most important step in

Starting point is 00:05:23 your implementation. Thus, check out guides on prompt design and follow the Browser Use best practices to maximize results. You might need a few rounds of trial and error. Since this is just an example, let's keep it simple and describe the browser job hunting task like this. As you can see, you're giving your agent a lot of freedom, which is totally fine considering how capable and flexible Browser Use is. Tip: in a real-world setup, you should read preferences from a configuration file and inject them into your prompt. This makes your agent customizable for different searches. Think varying job titles, locations, required skills, company preferences, remote versus on-site, and more. For a similar approach, read our guide on

Starting point is 00:06:08 building a LinkedIn job hunting AI assistant. Step number four. Define and run the agent. Use Browser Use to spin up an AI agent controlled by your configured LLM that can tackle the task you defined earlier. Fire your agent like this. Perfect. Now all that's left is to grab the output from your AI agent and export it to JSON or any format you need. Step number five. Export the output to JSON. Grab the output from your agent, which should be a clean JSON list of jobs, and dump it to a file. Here we go. Mission complete. Boring task handler agent at your service. Step number six. Address the agent limitations. Browser Use is incredible, but not magical. Unfortunately, if you try to run your browser-based handler AI

Starting point is 00:06:55 agent now, it'll probably get blocked. That may occur because of a Google reCAPTCHA (see how to automate reCAPTCHA solving). If it somehow bypasses that, there's still the Indeed human verification page powered by Cloudflare. These failures are especially common if you run the script on a server or in headless mode, which, let's be honest, is exactly what you want. No one wants a machine tied up for minutes while it handles a task. So yeah, all these steps just to build an AI agent that fails, just like all the others. Was that a waste of time?
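
For reference, the prompt-building tip from step three and the JSON export from step five can be sketched without committing to a specific agent API. This is a minimal illustration under stated assumptions: the preference keys, the helper names build_task and export_jobs, and the output file name jobs.json are all hypothetical, and the agent call itself is left out because this audio version does not include the original code.

```python
import json
from pathlib import Path


def build_task(prefs: dict) -> str:
    """Inject job preferences (hypothetical keys) into a task description."""
    skills = ", ".join(prefs.get("skills", [])) or "any"
    return (
        "Go to Google and find a popular job platform. "
        f"Search for '{prefs['title']}' positions in {prefs['location']} "
        f"({prefs.get('work_mode', 'any work mode')}). "
        f"Prefer listings mentioning these skills: {skills}. "
        "Return the interesting jobs as a JSON list of objects."
    )


def export_jobs(raw_output: str, path: Path) -> int:
    """Parse the agent's final text output and dump it to a JSON file.

    Returns the number of jobs written. Raises ValueError if the output
    is valid JSON but not a list, and json.JSONDecodeError if it is not
    valid JSON at all.
    """
    jobs = json.loads(raw_output)
    if not isinstance(jobs, list):
        raise ValueError("expected a JSON list of jobs")
    path.write_text(json.dumps(jobs, indent=2, ensure_ascii=False), encoding="utf-8")
    return len(jobs)


if __name__ == "__main__":
    # In a real setup these values would come from a configuration file.
    prefs = {
        "title": "Software Engineer",
        "location": "New York",
        "work_mode": "remote",
        "skills": ["Python", "SQL"],
    }
    print(build_task(prefs))
    # After the agent has run, something like the following would apply:
    # export_jobs(agent_output_text, Path("jobs.json"))
```

Validating that the parsed output is a list before writing it catches the common case where the model wraps the jobs in extra text or a different structure.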

Starting point is 00:07:28 Nope. The tutorial isn't over yet; there's still the most important step, the one that actually makes this whole thing work. Step number seven. Integrate Agent Browser. Your agent fails because the sites it interacts with can detect it as an automated bot. How does that happen? Tons of reasons, including: Browser fingerprinting: the browser session created by default in Playwright is super generic and doesn't look like a real user. Rate limiters: your agent ends up making too many requests in a short time (classic for automation, not humans), which triggers suspicion instantly. IP reputation: The more

Starting point is 00:08:05 automation scripts you run from your IP, the more solutions like Cloudflare flag you as a potential bot, increasing the chances of a CAPTCHA or other verification. So, what's the solution? A browser that: Runs human-like sessions, mimicking real user behavior. Can solve CAPTCHAs automatically if they appear. Integrates with a proxy network with millions of rotating IPs to avoid rate limits. Runs in the cloud for infinite scalability. Integrates seamlessly with AI.

Starting point is 00:08:33 Is this a dream? Nope. It exists, and it's called Agent Browser, aka Browser API. [Embedded YouTube video.] Follow the official Agent Browser integration guide and you'll end up on a page like this. Copy your connection URL, highlighted in red, and add it to your file like so. Then, read it in and define the object that instructs Browser Use to connect to the remote browser. Next, pass the object to your agent. Your AI agent will now execute tasks in remote Agent Browser instances,

Starting point is 00:09:13 while no longer being blocked or interrupted. What a clutch! Put it all together. Your final file should combine the snippets from the steps above. Test it by running it. As you can see from the execution GIF that Browser Use can generate (perfect for debugging), the AI agent can now access Google, then Indeed, and filter jobs using the required criteria (posted in the last 24 hours). The result will be a file in your project folder. This file contains all the job data extracted from Indeed, ready for you to apply for. Wow! In around 40 lines of code, you just built an AI agent that can automate

Starting point is 00:09:48 virtually any browser task for you. Want some ideas? Hang tight for a few more minutes and check out the next chapter. If you want to level up, you can even integrate it with logic to read your CV and apply for positions automatically, as shown in the official Browser Use example on GitHub. Thanks to Bright Data's Agent Browser integration in Browser Use, you can now craft an unstoppable AI agent that handles all the boring tasks that drain your time and energy. The AI agent revolution is now! Examples of boring tasks you can automate with this agent. Want some ideas for tasks and chores this AI agent can handle? Check these out. Find and schedule flights: let the AI search for flights, compare options, and even book tickets

Starting point is 00:10:31 based on your preferences. Extract weather data for multiple cities: get real-time weather info for all the cities you're traveling to, so you're always prepared. Schedule calls for you: rely on Calendly or a similar tool, and the AI will arrange meetings according to your availability. Track Amazon product prices and buy low: monitor product prices and automatically purchase items when they hit your target price. Collect news headlines: gather and summarize the latest news from multiple sources, so you don't miss anything important. Buy groceries for you: provide a shopping list, and the AI will automatically purchase your groceries online, saving you time. Want more ideas? Discover other AI agent use cases

Starting point is 00:11:16 and scenarios. Final thoughts. Now you know how to build an AI agent that tackles boring, repetitive, dull, and time-consuming browser tasks for you. That wouldn't be possible without Browser Use, one of the coolest AI agent libraries out there, but the real game changer is Bright Data's Agent Browser, which gives your AI unstoppable, agent-ready cloud browser instances. At Bright Data, our mission is simple: make AI accessible for everyone, everywhere, even for automated users. Until next time, stay bold, and keep building the future of AI with creativity. Thank you for listening to this HackerNoon story, read by artificial intelligence. Visit hackernoon.com to read, write, learn and publish.

