The Good Tech Companies - What is Agentic Testing?
Episode Date: March 25, 2026. This story was originally published on HackerNoon at: https://hackernoon.com/what-is-agentic-testing. If your test suite breaks every time a button moves, a div changes, or, gods forbid, an A/B test runs, is it really testing anything? In this post, we'll break down what AI testing is and how QA agents work under the hood. Check more stories related to programming at: https://hackernoon.com/c/programming. This story was written by: @qatech.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
What is agentic testing? by QA.tech. Riddle me this: if your test suite breaks every time a button moves,
a div changes, or, gods forbid, an A/B test runs, is it really testing anything? If you had to pause,
chances are your team, like most engineering teams, is spending far too much time fixing and maintaining
script-based tests that break with every UI change. Luckily, with AI in the workflow,
release cycles wind down, UIs never break, and everything is fine, right? Right. In this post,
we'll break down what AI testing is and how QA agents work under the hood, so you can decide if
it's actually any help for your team. What traditional test automation does?
Test automation tools like Playwright or Selenium follow a set of step-by-step directions.
It goes something like: go to this URL, find the element with this specific CSS selector,
click it, assert this text appears. It all works great, as long as your product never changes.
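To make the brittleness concrete, here is a minimal, self-contained Python sketch. No browser is involved: the DOM is simulated as a dict and the selectors are hypothetical, but the failure mode is the same one a real Playwright or Selenium script hits when a class name is renamed:

```python
# Illustrative sketch (not any vendor's code): a scripted "test" that looks up
# elements by exact selector against a simulated DOM.

def find_by_selector(dom: dict, selector: str) -> str:
    """Return the element's text, or raise if the selector is missing."""
    if selector not in dom:
        raise LookupError(f"selector not found: {selector}")
    return dom[selector]

def scripted_cart_test(dom: dict) -> bool:
    # Step-by-step directions, exactly as written: any rename breaks the run.
    label = find_by_selector(dom, "button.add-to-cart")
    assert label == "Add to cart"
    return True

# The test passes against the DOM it was written for...
dom_v1 = {"button.add-to-cart": "Add to cart"}
assert scripted_cart_test(dom_v1)

# ...and fails the moment a class name changes, even though the button
# still exists and still says "Add to cart".
dom_v2 = {"button.cart-add": "Add to cart"}
try:
    scripted_cart_test(dom_v2)
    broke = False
except LookupError:
    broke = True
assert broke  # the "test failure" has nothing to do with a real bug
```

The feature still works in version 2; only the instructions went stale. That gap is the selector treadmill.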
But there's something called the selector treadmill: scripts break not just because the
UI changes, but because the instructions go stale. Teams report spending 30% to 40% of their
dev time maintaining existing tests rather than finding and fixing real bugs or building new
features. AI code generation has only increased the friction of maintaining good tests. Tools like Cursor,
Claude Code, and Copilot are helping companies ship code faster than ever. But more output also means more
UI changes per sprint, more code refactors, more components being rewritten. Each and every one of those
brings a risk of your testing workflow breaking. According to a Forrester TEI (Total Economic Impact) study on automation
platforms, high-performing teams without a proper QA solution were experiencing around 20 bugs per
sprint reaching production within a two-week cycle. As code volume continues to grow, this number increases
exponentially, and neither manual testing nor classic test automation is capable of catching up.
The unfortunate reality is that traditional test automation does deliver excellent return on
investment (ROI) when things are stable. The same Forrester study reports a 209% ROI over three
years for one of these platforms. But that assumes a pre-AI level of development stability that
doesn't exist anymore. Instead of helping, scripted tests quickly become liabilities. They start slowing you down,
because keeping them up to date becomes a job in itself.
Enter agentic testing.
What is agentic testing?
Simply put, agentic testing focuses on the AI achieving a given goal.
You don't tell the tool how to test (click this, then assert that); you tell it what to verify.
Here's an example statement.
Make sure the user can successfully add an M-size hoodie to the cart and complete checkout with Google Pay.
The AI agent is tasked with carrying out that goal.
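The difference in how the test is even written can be sketched in a few lines of Python. This is an illustrative model, not QA.tech's API; the step strings and field names are made up:

```python
# A scripted test is a list of imperative, UI-coupled steps;
# an agentic test is just a declarative goal.
from dataclasses import dataclass

@dataclass
class ScriptedTest:
    steps: list[str]  # *how* to test: every step can go stale

@dataclass
class AgenticTest:
    goal: str  # *what* to verify: the agent derives the steps at run time

checkout_scripted = ScriptedTest(steps=[
    "goto /store",
    "click '#hoodie-m'",           # hypothetical selector
    "click 'button.add-to-cart'",  # hypothetical selector
    "click '#google-pay'",         # hypothetical selector
    "assert text 'Order confirmed'",
])

checkout_agentic = AgenticTest(
    goal="Make sure the user can add an M-size hoodie to the cart "
         "and complete checkout with Google Pay."
)

# The scripted test hard-codes five UI-coupled steps; the agentic test encodes none.
assert len(checkout_scripted.steps) == 5
assert "Google Pay" in checkout_agentic.goal
```

When a selector in the scripted version is renamed, the test breaks; the goal string stays valid as long as the product behavior it describes does.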
With agentic testing, the QA agent receives a goal from the user and then figures out how to complete it inside the system. It navigates across
web, desktop, or mobile apps and interacts with elements. Then, it checks whether the goal was actually
achieved.
The ReAct pattern
Most agentic systems, including the one we've built at QA.tech, follow the ReAct pattern: Observe → Decide → Act → Evaluate.
Observe: the QA agent first looks at the current state of the page, both the DOM and the visual layout.
Decide (or think): it reasons about the goal: "I need to find the add-to-cart button. I see a blue button with a cart icon."
Act: it acts like a user would.
Evaluate: it checks whether the action worked and decides what to do next, then the loop repeats.
Here, we're talking about an autonomous system that understands the structural and visual hierarchy of a web app and recreates a path that a user would take.
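The Observe → Decide → Act → Evaluate loop can be sketched in Python. This is a toy model, not QA.tech's implementation: the page is a dict, and the decide step is a simple rule standing in for the model's reasoning:

```python
# Toy ReAct-style loop over a simulated page.

def run_react_loop(page: dict, goal_item: str, max_steps: int = 5) -> bool:
    for _ in range(max_steps):
        # Observe: look at the current state (stand-in for DOM + screenshot).
        visible_buttons = page["buttons"]
        # Decide: pick the action that moves toward the goal.
        target = next((b for b in visible_buttons if goal_item in b), None)
        if target is None:
            return False  # nothing useful left to do
        # Act: interact the way a user would.
        page["cart"].append(goal_item)
        page["buttons"].remove(target)
        # Evaluate: did the action achieve the goal? If so stop, else loop.
        if goal_item in page["cart"]:
            return True
    return False

page = {"buttons": ["add hoodie to cart", "open menu"], "cart": []}
assert run_react_loop(page, "hoodie") is True
assert page["cart"] == ["hoodie"]
```

The key property is that nothing in the loop names a selector: the decide step matches against whatever is currently visible, so a renamed or moved button only changes the observation, not the test.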
Memory layer
QA.tech creates a structural understanding of your application before running a single test.
Our agents crawl your website or app to map all the pages, flows, interactive elements,
and relationships between elements into a knowledge graph.
Let's use our map analogy once again.
Think about the difference between a tourist wandering a city and a local who knows every street by heart.
Both can go from A to B, but because of their familiarity with the city, the local knows where
to take shortcuts and where the dead ends are.
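As a rough illustration of what such a knowledge graph buys you, here is a sketch (page names invented, not taken from any real app) where a crawled adjacency map lets the agent plan a direct route, the way the local would, instead of wandering the site like the tourist:

```python
# The crawl output modeled as a plain adjacency map: pages and the
# flows between them.
from collections import deque

site_graph = {
    "home": ["search", "login"],
    "search": ["results"],
    "results": ["listing"],
    "listing": ["booking"],
    "booking": ["confirmation"],
    "login": ["home"],
    "confirmation": [],
}

def shortest_path(graph: dict, start: str, target: str):
    """Breadth-first search over the map: the 'local who knows the streets'."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # dead end: no route exists

assert shortest_path(site_graph, "home", "confirmation") == [
    "home", "search", "results", "listing", "booking", "confirmation"
]
```

With the map built once up front, every later test run starts from "I already know where booking lives" rather than rediscovering the site from scratch.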
How agentic testing works in practice.
Before you write a single test, the agent crawls your app.
QA.tech calls this epistemic foraging, which is basically agents autonomously exploring your
app to map and understand the user flows and the UI elements.
Intent: you write a simple, natural prompt to tell the agent what the end goal is, such as:
verify that the user can successfully search, browse, view, and book a property.
Discovery: the agent then loads the application using the structure it has already learned during
the crawl.
Now that it has a map of the pages and elements, it can start looking for the homepage,
the browse-properties feature, booking buttons, and every other required option the way a human
user would.
Execution flow: the agent proceeds to complete the test run step by step. You can watch
the full process through a session recording. If something unexpected happens, like an unwanted
suggestion popup, the agent sees it, realizes it's an obstacle to the goal, and closes it.
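That obstacle-handling behavior can be sketched as a tiny Python model. The overlay and button names are made up; a real agent would reason over the rendered page rather than a dict:

```python
# Obstacle handling during the execution flow: if an unexpected overlay
# blocks the path, dismiss it first, then continue toward the goal.

def step_toward_goal(page: dict, goal_button: str) -> str:
    if page.get("overlay"):
        page["overlay"] = None      # an obstacle, not part of the goal: close it
        return "dismissed overlay"
    if goal_button in page["buttons"]:
        return f"clicked {goal_button}"
    return "stuck"

page = {"overlay": "newsletter popup", "buttons": ["book now"]}
actions = [step_toward_goal(page, "book now") for _ in range(2)]
assert actions == ["dismissed overlay", "clicked book now"]
```

A scripted test would fail on the first step, because the popup was never in its instructions; the agent treats it as a detour and keeps going.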
Assertion: once the flow is completed, the agent evaluates pass or fail based on whether
the goal has been met.
Now, compare this to the equivalent Playwright script. You have to go to the URL,
locate the property search input using a specific selector, enter a location, trigger the search,
wait for the results page to load, click on a property listing, and then find and press the booking
button. Finally, you need to check if the confirmation page or message shows up. It all works as long as
nothing changes. The moment a CAPTCHA is added, a test ID is renamed, or the booking process is split
across additional pages, you're back in maintenance mode. What makes agentic test automation different
from AI-assisted testing? Honestly, a lot of tools marketed as AI test automation are really just
wrappers generating scripts for obsolete tech like Playwright. True, they get you higher test coverage
faster, but they are brittle and unreliable in the same way. We think using agentic automation
this way is just applying AI to the wrong paradigm. The outcome of using AI wrappers is worse when compared
to AI agents that act independently. When evaluating AI testing tools, always look for these five
markers. 1. Goal-driven. Tests are focused on outcomes rather than the implementation process and how you can
get there. 2. Perceptual. The agent views your application just like a real user would, visually plus via
HTML. It doesn't rely on selectors created by an individual to reference an element. This is why your
agentic tests won't break due to UI modifications. 3. Adaptive. This is your agent's ability to self-heal.
If you move the position of a submit button or add an additional step as an A/B test, the agent will be
able to find its way to complete the goal, even though the original elements moved or the path changed.
4. Self-evaluating. In agentic testing, the agent determines pass or fail based on whether
the stated goal was achieved. Tests stay aligned with user intent even as
the codebase evolves underneath.
5. Continuously learning.
The more interaction the agent has with your application, the better it becomes at recognizing
happy path scenarios and what is considered normal task performance for specific user interface
components.
When agentic testing is, and isn't, the right fit.
It'd be easy to oversell this,
so let me be straight. I don't think you should throw out every single script written for
Playwright you own tomorrow. There are some areas where this approach is a really strong fit.
End-to-end (E2E) user flows: anything that involves onboarding, checkout, managing accounts,
and all CRUD activities. Regression suites: continuously changing UIs that release faster than you
can manually test. Fast-moving UIs: validation of new releases on time. Complex products: products where
the best way to validate the experience was manual testing, but for obvious reasons, it doesn't happen
fast enough. However, at the moment, it's a less-than-ideal solution for highly interactive apps,
like Notion. WebGL games or WebGL-based UI are also hard for agents to test.
UI that is highly dynamic from session to session. The reality is that most teams we talk to use
a hybrid approach. They rely on scripts for small details and let AI agents handle the broad
and complex user flows. That being said, we've seen companies adopting Agentic QA gain up to
529% ROI with a three-month payback.
Wrapping up: agentic testing represents a completely different approach from the traditional
testing framework, with a one-to-one relationship between your desired goals and your actual
results. There are no brittle tests in between that could collapse when your development team
releases a product at lightning speed. If your team is spending more time maintaining
test infrastructure than finding real bugs, agentic test automation can help you close that gap.
Want to go deeper? Here are some useful materials.
From manual to autonomous QA: a step-by-step transition guide.
Is QA automation worth it? The real ROI of intelligent testing.
Book a demo with QA.tech and see how our agents can validate your critical flows for your next release.
Thank you for listening to this Hackernoon story, read by artificial intelligence.
Visit hackernoon.com to read, write, learn and publish.
