The Good Tech Companies - How to Build an n8n Automation to Read Kibana Logs and Analyze Them With an LLM

Episode Date: December 22, 2025

This story was originally published on HackerNoon at: https://hackernoon.com/how-to-build-an-n8n-automation-to-read-kibana-logs-and-analyze-them-with-an-llm. How we built an n8n automation that reads Kibana logs, analyzes them with an LLM, and returns human-readable incident summaries in Slack. Check more stories related to futurism at: https://hackernoon.com/c/futurism. You can also check exclusive content about #automation, #devops, #incident-response, #ai, #n8n, #kibana, #llm, #good-company, and more. This story was written by: @indrivetech. Learn more about this writer by checking @indrivetech's about page, and for more stories, please visit hackernoon.com. Built an n8n automation that pulls user logs from Kibana, analyzes them with an LLM, and posts a clear incident summary to Slack, cutting analysis time from 15 minutes to ~2.

Transcript
Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. How to build an n8n automation to read Kibana logs and analyze them with an LLM, by Indrive Tech. For our small incident support team, just six specialists, two metrics are absolutely critical: speed of detection and accuracy of diagnosis. We receive cases from millions of users across 48 countries where our business operates. When a user case comes in, one of the core tools we rely on, alongside others, is Kibana. Due to the multifactor nature of our system, microservices, local regulatory specifics, A/B experiments, and more, we often end up searching for a needle in a haystack by analyzing user-related events in logs. Even for an experienced specialist, this takes
Starting point is 00:00:47 time in order to understand what actually happened: separate noise from real issues, connect technical events with the real user experience. On average, this kind of analysis takes at least 15 minutes per case, and in ambiguous situations it can take much longer. I wanted logs to read like a story, not like a raw event dump. And yes, this is not a replacement for an engineer, but rather a thinking accelerator. The idea: I built an automation in n8n that combines several tools, Slack, Kibana (via Elasticsearch), and an LLM (large language model), and turned it into a simple, practical workflow for support specialists. How it works: one, a dedicated Slack channel is created. Two, a specialist sends a user UID into the channel, just a number. Three, the automation captures the UID and sends a request
Starting point is 00:01:39 to Kibana (via Elasticsearch) using predefined filters. Four, all user activity logs for the last six hours are fetched. Five, if no logs are found, a clear "no activity found" message is posted in the Slack thread. Six, if logs exist, they are processed: empty entries are removed, duplicates are eliminated, data is structured and normalized, and everything is bundled into a single dataset. Seven, the full log package is sent to an LLM together with a custom prompt tailored to our team's needs. Eight, the LLM analyzes the events and returns a human-readable summary, up to 600 characters. Nine, the response is posted back into the original Slack thread about two minutes after the request.
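In code terms, the branching described above looks roughly like the following TypeScript sketch; the helper functions stand in for the Kibana/Elasticsearch, LLM, and Slack steps and are assumptions, not the actual n8n nodes.

```typescript
// Minimal sketch of the workflow's branch logic (assumed helper signatures,
// not the real n8n nodes).
type LogEvent = { timestamp: string; message: string };

interface Deps {
  fetchLogs: (uid: number, hours: number) => Promise<LogEvent[]>; // Kibana/Elasticsearch step
  summarize: (events: LogEvent[]) => Promise<string>;             // LLM step, <= 600 characters
  postReply: (text: string) => Promise<void>;                     // Slack thread reply
}

async function handleCase(uid: number, deps: Deps): Promise<void> {
  // All user activity for the last six hours.
  const events = await deps.fetchLogs(uid, 6);

  // No logs: post a clear "no activity found" message in the thread.
  if (events.length === 0) {
    await deps.postReply(`No activity found for UID ${uid} in the last 6 hours.`);
    return;
  }

  // Logs found: they are cleaned elsewhere in the pipeline, summarized by the
  // LLM, and the summary is posted back into the original Slack thread.
  const summary = await deps.summarize(events);
  await deps.postReply(summary);
}
```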
Starting point is 00:02:29 The initial development of this pipeline took about 30 hours, a large portion of which went into properly configuring credentials, especially for Slack. I'll cover that later. We expect that with active usage, this automation will save the team up to 60 hours per month. What does the LLM actually answer? The response is always structured and answers very specific questions: what errors were detected, a summary of positive (normal) events, how the errors could have affected the user workflow, when exactly the issues occurred (timestamps), and what actions the specialist should take next. As a result, we don't just see "HTTP 400 on endpoint X" but real context: what the user was doing, where they encountered a problem, and how critical it was. And yes, no sensitive data is ever sent to the LLM.
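As a rough illustration, the points the summary must cover can be pictured as a small structure like the one below; the field names are hypothetical, since the real answer is short free text that follows the team's template.

```typescript
// Hypothetical shape of the structured answer; the actual response is plain
// text (up to 600 characters) covering the same points.
interface IncidentSummary {
  errorsDetected: string[]; // what errors were found, tied to concrete endpoints
  normalActivity: string;   // brief summary of positive (normal) events
  userImpact: string;       // how the errors could have affected the user workflow
  timestamps: string[];     // when exactly the issues occurred
  nextSteps: string[];      // what the specialist should do next
}
```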
Starting point is 00:03:14 Core goals of the automation. 1. Humanizing log reading. Kibana is a powerful tool, but reading logs with your eyes is tiring and cognitively expensive, especially when events are spread over time and across multiple services. I wanted the output to look like a clear explanation, not a technical dump. 2. Reducing analysis time. Before automation: 15+ minutes per user case. After automation: 1 to 2 minutes (send UID → get summary). This is especially important during peak load or mass incidents. 3. Enabling deeper analysis. The automation doesn't just save time.
Starting point is 00:03:50 It allows us to detect systemic issues faster, identify recurring error patterns across users, improve specialist skills by highlighting new problem areas, and better understand how the application behaves in real-world usage. Ultimately, this approach significantly reduces developer time spent investigating issues described only through user experience. Each case comes with a structured analysis backed by concrete log events. Who is this tool for? Primarily, support specialists, engineers working with user-facing incidents, and teams that need to
Starting point is 00:04:23 quickly understand what went wrong, without immediately diving into Kibana. Is this the final version? No, this is a living tool currently in a quiet testing phase, with a full rollout planned for early 2026. During testing, prompts are refined, different LLMs and model versions are compared, and filtering logic and response templates are improved. Even now, the automation already fulfills its main goal: making logs understandable and saving time. Pipeline structure. Trigger: the pipeline starts from an event in a dedicated Slack channel via a Slack Trigger with the event type
Starting point is 00:04:58 "New message posted in channel". Input: a user UID sent as plain text. Data preparation: the message data is extracted and transformed using Set nodes into the required JSON format; the upper branch carries the UID as a number, the lower branch carries the thread context (channel ID and thread timestamp). Log retrieval: the UID is passed to Kibana (Elasticsearch: Get Many) using the configured index and predefined filters, and logs are searched within a defined time window.
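For illustration, the retrieval step could be expressed as a direct Elasticsearch search like the sketch below; the index pattern, user-ID field, and API-key auth are assumptions, and in the workflow this is handled by the Elasticsearch node.

```typescript
// Hypothetical direct query for the "log retrieval" step; the real index name,
// user-ID field, and auth come from the Kibana/Elasticsearch configuration.
async function fetchUserLogs(
  esUrl: string, // e.g. "https://elasticsearch.internal:9200" (assumed)
  apiKey: string,
  uid: number,
): Promise<Array<Record<string, unknown>>> {
  const body = {
    size: 1000, // higher values often caused gateway timeouts
    _source: ["@timestamp", "message", "url.path", "http.response.status_code"],
    sort: [{ "@timestamp": "asc" }],
    query: {
      bool: {
        filter: [
          { term: { "user.id": uid } },                    // hypothetical user-ID field
          { range: { "@timestamp": { gte: "now-6h" } } },  // defined time window
        ],
      },
    },
  };

  const res = await fetch(`${esUrl}/user-activity-*/_search`, { // assumed index pattern
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `ApiKey ${apiKey}` },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Elasticsearch search failed: ${res.status}`);
  const json = await res.json();
  return json.hits.hits.map((hit: { _source: Record<string, unknown> }) => hit._source);
}
```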
Starting point is 00:05:40 Conditional logic: an If node checks whether any events were found. No events: the data is merged with the thread context and a predefined no-events message is posted to Slack. Events found: logs are aggregated, minimized, and normalized, and the data is sent to the LLM for analysis. Response delivery: the LLM output is merged with the Slack thread context and posted as a reply in the original thread. Node details. Slack Trigger: requires pre-configured Slack credentials and the channel ID, available via "open channel details" in Slack. Set nodes: used to extract and normalize input data; UID → parsed from the message text as a number, thread data → original message timestamp.
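Expressed as plain code rather than Set nodes, that preparation step amounts to roughly the following; the payload fields mirror Slack's message event, and the validation is an added assumption.

```typescript
// Rough equivalent of the two Set branches: extract the UID from the Slack
// message text and keep the thread context for the final reply.
interface SlackMessageEvent {
  text: string;        // the message body, expected to be just a number
  channel: string;     // channel ID
  ts: string;          // message timestamp, used as the thread timestamp
  thread_ts?: string;  // present if the message was already posted in a thread
}

function prepareInput(event: SlackMessageEvent) {
  const uid = Number(event.text.trim());
  if (!Number.isInteger(uid)) {
    throw new Error(`Expected a numeric UID, got: "${event.text}"`);
  }
  return {
    uid,                                      // upper branch: UID as a number
    thread: {                                 // lower branch: thread context
      channel: event.channel,
      threadTs: event.thread_ts ?? event.ts,  // reply goes into the original thread
    },
  };
}
```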
Starting point is 00:06:14 Elasticsearch node: requires Kibana credentials and the index ID, found in Index Management. Key settings: 1,000 items (higher values often caused gateway timeouts), a query with a time range filter, limited source fields, and the last 1,000 events. Code node (minimizer): prepares logs for LLM analysis. It normalizes fields (time, content), masks potential PII (phone, email) even if not present, as extra protection, truncates long values, removes empty fields, sorts events by time, de-duplicates similar events, computes lightweight statistics (HTTP codes, endpoints), and builds a compact prompt with the top 500 aggregated events and a strict length limit. This is critical to avoid sending large, token-expensive payloads to the LLM.
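A condensed sketch of what such a minimizer could do, with assumed field names and masking rules; the real Code node also assembles the final prompt text.

```typescript
// Illustrative minimizer: normalize, mask, de-duplicate, and cap the events
// before they are handed to the LLM. Field names and regexes are assumptions.
interface RawEvent { "@timestamp"?: string; message?: string }

function minimizeLogs(events: RawEvent[], limit = 500) {
  const seen = new Set<string>();
  const httpStats: Record<string, number> = {};
  const lines: string[] = [];

  const sorted = events
    .filter((e) => e["@timestamp"] && e.message) // drop empty entries
    .sort((a, b) => String(a["@timestamp"]).localeCompare(String(b["@timestamp"]))); // sort by time

  for (const e of sorted) {
    const text = String(e.message)
      .replace(/\+?\d[\d\s()-]{8,}\d/g, "[phone]")     // mask phone-like values
      .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[email]") // mask email-like values
      .slice(0, 300);                                  // truncate long values

    if (seen.has(text)) continue; // de-duplicate similar events
    seen.add(text);

    const code = text.match(/\b[1-5]\d{2}\b/)?.[0];    // lightweight HTTP-code statistics
    if (code) httpStats[code] = (httpStats[code] ?? 0) + 1;

    lines.push(`${e["@timestamp"]} ${text}`);
    if (lines.length >= limit) break;                  // keep only the top aggregated events
  }

  return { lines, httpStats };
}
```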
Starting point is 00:07:07 OpenAI node (Message a Model): requires OpenAI credentials and model selection, currently GPT-4.1 mini during testing. The prompt is designed from the perspective of a second-line technical support specialist: first classify the user (driver, passenger, courier); focus on technical errors, and if no errors exist, analyze business state (documents, bans, profile status); follow a strict response template with character limits; tie conclusions to concrete endpoints and timestamps; and separate technical analysis from user-facing workflow impact. This structure turns raw logs into clear, actionable insights.
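A hedged sketch of that call against the OpenAI chat completions API; the system prompt below only paraphrases the rules listed above, not the team's actual prompt, and the model name reflects the one mentioned during testing.

```typescript
// Illustrative LLM call: the wording of the system prompt and the payload
// format are assumptions based on the description above.
async function summarizeLogs(
  apiKey: string,
  lines: string[],
  httpStats: Record<string, number>,
): Promise<string> {
  const system =
    "You are a second-line technical support specialist. " +
    "First classify the user (driver, passenger, courier). Focus on technical errors; " +
    "if there are none, analyze business state (documents, bans, profile status). " +
    "Tie conclusions to concrete endpoints and timestamps, separate technical analysis " +
    "from user-facing impact, and answer in at most 600 characters.";

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({
      model: "gpt-4.1-mini", // model currently used during testing
      messages: [
        { role: "system", content: system },
        { role: "user", content: `HTTP code stats: ${JSON.stringify(httpStats)}\n\n${lines.join("\n")}` },
      ],
    }),
  });
  if (!res.ok) throw new Error(`OpenAI request failed: ${res.status}`);
  const json = await res.json();
  return json.choices[0].message.content as string;
}
```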
Starting point is 00:07:43 Example output available below. Final thoughts. This automation doesn't replace engineers; it helps them think faster. By turning raw logs into short, structured narratives, the team reduces cognitive load and speeds up incident analysis without losing context. With tools like n8n and modern LLMs, even small teams can build practical, human-friendly observability layers. The key isn't more data, it's making systems explain themselves. Thank you for listening to this HackerNoon story, read by artificial intelligence.
Starting point is 00:08:15 Visit hackernoon.com to read, write, learn and publish.
