The Good Tech Companies - How to Build an n8n Automation to Read Kibana Logs and Analyze Them With an LLM
Episode Date: December 22, 2025. This story was originally published on HackerNoon at: https://hackernoon.com/how-to-build-an-n8n-automation-to-read-kibana-logs-and-analyze-them-with-an-llm. How we built an n8n automation that reads Kibana logs, analyzes them with an LLM, and returns human-readable incident summaries in Slack. Check more stories related to futurism at: https://hackernoon.com/c/futurism. You can also check exclusive content about #automation, #devops, #incident-response, #ai, #n8n, #kibana, #llm, #good-company, and more. This story was written by: @indrivetech. Learn more about this writer by checking @indrivetech's about page, and for more stories, please visit hackernoon.com. Built an n8n automation that pulls user logs from Kibana, analyzes them with an LLM, and posts a clear incident summary to Slack, cutting analysis time from 15 minutes to ~2 minutes.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
How to Build an n8n Automation to Read Kibana Logs and Analyze Them With an LLM, by inDrive.Tech.
For our small incident support team, just six specialists, two metrics are absolutely critical:
speed of detection and accuracy of diagnosis. We receive cases from millions of users across
48 countries where our business operates. When a user case comes in, one of the core tools we rely on,
among others, is Kibana. Due to the multifactor nature of our system (microservices, local
regulatory specifics, A/B experiments, and more), we often end up searching for a needle
in a haystack by analyzing user-related events in logs. Even for an experienced specialist, this takes
time: to understand what actually happened, separate noise from real issues, and connect
technical events with the real user experience. On average, this kind of analysis takes at least
15 minutes per case and in ambiguous situations, it can take much longer. I wanted logs to read like a
story, not like a raw event dump. And yes, this is not a replacement for an engineer, but rather a
thinking accelerator. The idea: I built an automation in n8n that combines several tools, Slack,
Kibana (via Elasticsearch), and an LLM (large language model), and turned it into a simple, practical workflow
for support specialists. How it works:
1. A dedicated Slack channel is created.
2. A specialist sends a user UID into the channel, just a number.
3. The automation captures the UID and sends a request to Elasticsearch/Kibana using predefined filters.
4. All user activity logs for the last six hours are fetched.
5. If no logs are found, a clear "no activity found" message is posted in the Slack thread.
6. If logs exist, they are processed: empty entries are removed, duplicates are eliminated, data is structured and normalized, and everything is bundled into a single dataset.
7. The full log package is sent to an LLM together with a custom prompt tailored to our team's needs.
8. The LLM analyzes the events and returns a human-readable summary, up to 600 characters.
9. The response is posted back into the original Slack thread, about two minutes after the request.
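As a rough sketch of this flow outside n8n, in TypeScript, it might look like the following; every helper here (fetchUserLogs, minimizeLogs, summarizeWithLlm, postThreadReply) is a hypothetical stand-in for the corresponding workflow node, not the team's actual code.

```typescript
// Hypothetical end-to-end sketch; in n8n each step is a separate node
// (Slack Trigger, Set, Elasticsearch, If, Code, OpenAI, Slack).
type LogEvent = Record<string, unknown>;

// Stand-ins for the workflow nodes (not real APIs).
declare function fetchUserLogs(uid: number, window: { hours: number }): Promise<LogEvent[]>;
declare function minimizeLogs(logs: LogEvent[]): string;
declare function summarizeWithLlm(dataset: string): Promise<string>;
declare function postThreadReply(channel: string, threadTs: string, text: string): Promise<void>;

async function handleUidMessage(uid: number, channel: string, threadTs: string): Promise<void> {
  // Steps 3-4: fetch the user's activity logs for the last six hours.
  const logs = await fetchUserLogs(uid, { hours: 6 });

  // Step 5: no activity found -> post a clear message into the thread and stop.
  if (logs.length === 0) {
    await postThreadReply(channel, threadTs, `No activity found for UID ${uid}.`);
    return;
  }

  // Step 6: clean up, de-duplicate, and bundle everything into a single dataset.
  const dataset = minimizeLogs(logs);

  // Steps 7-8: send the dataset with the custom prompt and get a short summary.
  const summary = await summarizeWithLlm(dataset); // human-readable, up to 600 characters

  // Step 9: reply in the original Slack thread.
  await postThreadReply(channel, threadTs, summary);
}
```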
The initial development of this pipeline took about 30 hours, a large portion of which went into properly configuring
credentials, especially for Slack; I'll cover that later. We expect that with active usage, this
automation will save the team up to 60 hours per month. What does the LLM actually answer? The response
is always structured and answers very specific questions: what errors were detected, a summary of positive
(normal) events, how the errors could have affected the user workflow, when exactly the issues
occurred (timestamps), and what actions the specialist should take next. As a result, we don't just
see HTTP 400 on endpoint X, but real context, what the user was doing, where they encountered a
problem, and how critical it was. And yes, no sensitive data is ever sent to the LLM.
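To make the shape of that answer concrete, here is an illustrative model of the information the summary carries; the field names and sample values are assumptions, not the team's real template (the actual output is a short text of up to 600 characters).

```typescript
// Illustrative model of the information in a summary; field names and sample
// values are assumptions, and the real output is plain text (<= 600 chars).
interface IncidentSummary {
  errorsDetected: string[];  // what errors were detected
  normalEvents: string;      // summary of positive (normal) events
  workflowImpact: string;    // how the errors could have affected the user workflow
  timestamps: string[];      // when exactly the issues occurred
  nextActions: string[];     // what the specialist should do next
}

// Hypothetical example of the context added beyond "HTTP 400 on endpoint X".
const example: IncidentSummary = {
  errorsDetected: ["HTTP 400 on endpoint X while the user was submitting an order"],
  normalEvents: "Login, profile load, and search behaved normally.",
  workflowImpact: "The user could not complete the order flow.",
  timestamps: ["2025-12-01T10:42:13Z"],
  nextActions: ["Check validation rules on endpoint X", "Ask the user to retry"],
};
```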
Core goals of the automation.
1. Humanizing log reading. Kibana is a powerful tool,
but reading logs with your eyes is tiring and cognitively expensive,
especially when events are spread over time and across multiple services.
I wanted the output to look like a clear explanation, not a technical dump.
2. Reducing analysis time. Before automation: 15+ minutes per user case.
After automation: 1 to 2 minutes, send a UID → get a summary.
This is especially important during peak load or mass incidents.
3. Enabling deeper analysis. The automation doesn't just save time.
It allows us to: detect systemic issues faster,
identify recurring error patterns across users,
improve specialist skills by highlighting new problem areas,
and better understand how the application behaves in real-world usage.
Ultimately, this approach significantly reduces developer time spent investigating issues described only through user experience.
Each case comes with a structured analysis backed by concrete log events.
Who is this tool for?
Primarily: support specialists, engineers working with user-facing incidents, and teams that need to
quickly understand what went wrong without immediately diving into Kibana.
Is this the final version?
No, this is a living tool currently in a quiet testing phase, with a full rollout planned
for early 2026.
During testing, prompts are refined, different LLMs and model versions are compared,
and filtering logic and response templates are improved. Even now, the automation
already fulfills its main goal: making logs understandable and saving time. Pipeline structure.
Trigger: the pipeline starts from an event in a dedicated Slack channel (Slack Trigger, event type:
new message posted in channel). Input: a user UID sent as plain text. Data preparation:
the message data is extracted and transformed using Set nodes into the required JSON format.
Upper branch: the UID as a number; lower branch: the thread context (channel ID and thread timestamp).
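In the workflow this is handled by Set nodes; as a plain TypeScript sketch (using standard Slack message-event fields such as text, channel, and ts), the same preparation could look like this.

```typescript
// Sketch of the two preparation branches: parse the UID and capture thread context.
// text, channel, and ts are standard fields of a Slack message event.
interface SlackMessageEvent {
  text: string;     // the raw message, expected to contain only the UID
  channel: string;  // channel ID, needed to reply in the right place
  ts: string;       // message timestamp, reused as thread_ts for the reply
}

function prepareInput(event: SlackMessageEvent) {
  // Upper branch: the UID as a number.
  const uid = Number.parseInt(event.text.trim(), 10);
  if (Number.isNaN(uid)) {
    throw new Error(`Message does not contain a numeric UID: "${event.text}"`);
  }

  // Lower branch: thread context (channel ID and thread timestamp).
  return { uid, threadContext: { channelId: event.channel, threadTs: event.ts } };
}
```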
Log retrieval: the UID is passed to Kibana (via the Elasticsearch Get Many operation) using the configured index,
and logs are searched within a defined time window. Conditional logic: an If node checks whether
any events were found. No events:
Data is merged with the thread context.
A predefined no events message is posted to Slack.
Events found: logs are aggregated, minimized, and normalized.
Data is sent to the LLM for analysis.
Response delivery: the LLM output is merged with the Slack thread context and posted as a reply in the original thread.
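The workflow uses n8n's Slack node for this step; for reference, a minimal standalone version of the postThreadReply helper from the earlier sketch, built on Slack's chat.postMessage Web API with thread_ts, might look like this (the bot-token environment variable is an assumption).

```typescript
// Minimal standalone reply helper using Slack's chat.postMessage with thread_ts.
// SLACK_BOT_TOKEN is an assumed environment variable, not part of the original setup.
async function postThreadReply(channel: string, threadTs: string, text: string): Promise<void> {
  const res = await fetch("https://slack.com/api/chat.postMessage", {
    method: "POST",
    headers: {
      "Content-Type": "application/json; charset=utf-8",
      Authorization: `Bearer ${process.env.SLACK_BOT_TOKEN}`,
    },
    body: JSON.stringify({ channel, thread_ts: threadTs, text }),
  });
  const data = (await res.json()) as { ok: boolean; error?: string };
  if (!data.ok) {
    throw new Error(`Slack API error: ${data.error}`);
  }
}
```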
Node Details. Slack Trigger: requires pre-configured Slack credentials and the channel ID,
available via Open channel details in Slack.
Set nodes: used to extract and normalize input data.
UID → parsed from the message text as a number;
thread data → the original message timestamp.
Elasticsearch node: requires Kibana credentials and the index ID, found in Index Management settings.
Key settings: 1,000 items (higher values often caused gateway timeouts); query: time range, filter by UID, limited source fields, last 1,000 events.
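The exact query depends on the index mapping; assuming the UID is stored in a field like user_id and timestamps in @timestamp (both assumptions), the request behind the Get Many operation could be sketched as follows.

```typescript
// Illustrative Elasticsearch request body; user_id and @timestamp are assumed
// field names, and the limits mirror the settings described above.
const uid = 123456; // hypothetical UID taken from the Slack message

const searchBody = {
  size: 1000,                       // higher values often caused gateway timeouts
  sort: [{ "@timestamp": "desc" }], // keep the last 1,000 events
  _source: ["@timestamp", "message", "endpoint", "http_code"], // limited source fields
  query: {
    bool: {
      filter: [
        { term: { user_id: uid } },                                 // filter by UID
        { range: { "@timestamp": { gte: "now-6h", lte: "now" } } }, // defined time window
      ],
    },
  },
};
```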
Code node (minimizer): prepares logs for LLM analysis. It normalizes fields (time, content),
masks potential PII (phone, email) even if none is present, as extra protection, truncates long values,
removes empty fields, sorts events by time, de-duplicates similar events,
computes lightweight statistics (HTTP codes, endpoints), and builds a compact prompt
with the top 500 aggregated events and a strict length limit. This is critical to
avoid sending large, token-expensive payloads to the LLM.
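A trimmed-down sketch of such a minimizer in plain TypeScript might look like this; field names, masking patterns, and limits are illustrative rather than the production configuration.

```typescript
// Simplified sketch of the minimizer: normalize, mask, de-duplicate, and
// aggregate log events before sending them to the LLM.
interface RawEvent {
  "@timestamp"?: string;
  message?: string;
  endpoint?: string;
  http_code?: number;
}

interface MinimizedEvent {
  time: string;
  content: string;
  endpoint?: string;
  httpCode?: number;
}

const MAX_VALUE_LENGTH = 300; // illustrative truncation limit
const TOP_EVENTS = 500;       // keep only the top aggregated events

function maskPii(text: string): string {
  // Mask email-like and phone-like substrings even if none are expected,
  // as an extra layer of protection.
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "<email>")
    .replace(/\+?\d[\d\s()-]{8,}\d/g, "<phone>");
}

function minimizeLogs(events: RawEvent[]): string {
  const cleaned: MinimizedEvent[] = events
    .filter((e) => e.message && e["@timestamp"]) // drop empty entries
    .map((e) => ({
      time: e["@timestamp"]!,
      content: maskPii(e.message!).slice(0, MAX_VALUE_LENGTH), // truncate long values
      endpoint: e.endpoint,
      httpCode: e.http_code,
    }))
    .sort((a, b) => a.time.localeCompare(b.time)); // sort events by time

  // De-duplicate similar events by content + endpoint.
  const seen = new Set<string>();
  const unique = cleaned.filter((e) => {
    const key = `${e.content}|${e.endpoint ?? ""}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });

  // Lightweight statistics: HTTP code counts per endpoint.
  const stats = new Map<string, number>();
  for (const e of unique) {
    if (e.httpCode) {
      const key = `${e.endpoint ?? "unknown"} ${e.httpCode}`;
      stats.set(key, (stats.get(key) ?? 0) + 1);
    }
  }

  // Bundle the top events and stats into one compact payload for the prompt.
  return JSON.stringify({
    stats: Object.fromEntries(stats),
    events: unique.slice(0, TOP_EVENTS),
  });
}
```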
OpenAI node (Message a Model): requires OpenAI credentials and model selection, currently GPT-4.1 mini during testing.
The prompt is designed from the perspective of a second-line technical support specialist:
First, classify the user (driver, passenger, courier) and focus on technical errors.
If no errors exist, analyze the business state: documents, bans, profile status.
Follow a strict response template with character limits.
Tie conclusions to concrete endpoints and timestamps.
Separate technical analysis from user-facing workflow impact.
This structure turns raw logs into clear, actionable insights.
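The team's actual prompt isn't published, but a prompt built along the lines described above could be sketched like this.

```typescript
// Illustrative system prompt following the structure described above;
// the wording is a sketch, not the team's actual prompt.
const SYSTEM_PROMPT = `
You are a second-line technical support specialist.
1. First, classify the user: driver, passenger, or courier.
2. Focus on technical errors found in the logs.
3. If there are no errors, analyze the business state: documents, bans, profile status.
4. Follow this response template, at most 600 characters in total:
   Errors: ...
   Normal events: ...
   Impact on the user's workflow: ...
   When it happened (timestamps): ...
   Recommended next steps: ...
5. Tie every conclusion to concrete endpoints and timestamps.
6. Keep technical analysis separate from the user-facing workflow impact.
`.trim();

// The minimized log payload is passed as the user message alongside this prompt.
function buildMessages(minimizedLogs: string) {
  return [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: `User activity logs (last 6 hours):\n${minimizedLogs}` },
  ];
}
```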
Example output available below.
Final thoughts.
This automation doesn't replace engineers, it helps them think faster.
By turning raw logs into short, structured narratives,
the team reduces cognitive load and speeds up incident analysis without losing context.
With tools like n8n and modern LLMs, even small teams can build practical, human-friendly
observability layers. The key isn't more data, it's making systems explain themselves.
Thank you for listening to this Hackernoon story, read by artificial intelligence.
Visit hackernoon.com to read, write, learn and publish.
