The Good Tech Companies - Tool Calling for Local AI Agents in C#
Episode Date: October 22, 2025. This story was originally published on HackerNoon at: https://hackernoon.com/tool-calling-for-local-ai-agents-in-c. LM-Kit .NET SDK now supports tool calling for building AI agents in C#. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #ai, #dotnet, #llm, #csharp, #local-ai, #local-ai-agents-calling-tool, #tool-calling, #good-company, and more. This story was written by: @lcarrere. Learn more about this writer by checking @lcarrere's about page, and for more stories, please visit hackernoon.com. LM-Kit .NET SDK now supports tool calling for building AI agents in C#. Create on-device agents that discover, invoke, and chain tools with structured JSON schemas, safety policies, and human-in-the-loop controls, all running locally with full privacy. Works with thousands of local models from Mistral, Llama, Qwen, Granite, GPT-OSS, and more. Supports all tool calling modes: simple function, multiple function, parallel function, and parallel multiple function. No cloud dependencies, no API costs, complete control over your agent workflows.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Tool Calling for Local AI Agents in C#, by Loïc Carrère. Tools are a fundamental part of agentic AI. While language models excel at understanding and generating text, tools extend their abilities by letting them interact with the real world: searching the web for current information, executing code for calculations, accessing databases, reading files, or connecting to external services through APIs. Think of tools as the hands and eyes of an AI agent. They transform a conversational system into an agent that can accomplish tasks by bridging the gap between reasoning and action. When an agent needs to check the weather, analyze a spreadsheet, or send an email, it invokes the appropriate tool, receives the result, and incorporates that information into its response. This moves AI beyond pure text generation toward practical, real-world problem-solving. Interested in how agents retain and use context over time? Explore our deep dive on agent memory.

Why local agents have been hard. Building AI agents that can actually do things locally has been surprisingly hard. You need models that understand when and how to call external functions; privacy, without sending data to the cloud; a runtime that can parse tool calls, validate arguments, and inject results; model-specific flows, because each model has different tool-calling formats and interaction patterns, requiring custom logic for interception, result injection, and action ordering; safety controls to prevent infinite loops and runaway costs; and clear observability, so you know what your agent is doing. Until now, most agentic frameworks forced a choice: powerful cloud-based agents with latency and privacy concerns, or limited local models without proper tool support. Today, that changes.

Why tool-calling
changes everything. With LM-Kit's new tool-calling capabilities, your local agents can: Ground answers in real data. No more hallucinated weather forecasts or exchange rates; agents fetch actual API responses and cite sources. Chain complex workflows. For example: check the weather, convert the temperature to the user's preferred units, then suggest activities, all in one conversational turn. Maintain full privacy. Everything runs on-device; your users' queries, tool arguments, and results never leave their machines. Stay deterministic and safe. Typed schemas, validated inputs, policy controls, and approval hooks prevent agents from going rogue. Scale with your domain. Add business APIs, internal databases, or external MCP catalogs as tools; the model learns to use them from descriptions and schemas alone.
What's new at a glance. State-of-the-art tool calling, right in chatbot flows: models decide when to call tools, pass structured JSON arguments, and use results to answer users accurately. Dedicated flow support across model families like Mistral, GPT-OSS, Qwen, Granite, Llama, and more, all via one runtime. Three ways to add tools: implement ITool, annotate methods with [LMFunction], or import catalogs from MCP servers. A unified API that runs local SLMs with per-turn policy guardrails, and events for human-in-the-loop control and observability at every stage. All function calling modes supported: simple function, multiple function, parallel function, and parallel multiple function; choose strict sequencing or safe parallelism. Model-aware tool call flow: modern SLMs emit structured tool calls; LM-Kit parses the calls, routes them to your tools, and feeds results back with correlation and clear result types for a reliable inference path.

How it works. Getting started: here's a complete working example in under 20 lines.
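The example itself did not survive the audio transcription. As a rough, non-authoritative sketch of the flow it describes — every type and member name below is an assumption, not the real LM-Kit API; consult the SDK documentation for the actual snippet:

```csharp
// PSEUDOCODE -- illustrative only; identifiers are assumptions, not LM-Kit's API.
// 1. Load a local, tool-capable GGUF model from the catalog or a custom URI.
//      var model = new LM("<model card or URI>");
// 2. Start a multi-turn conversation bound to that model.
//      var chat = new MultiTurnConversation(model);
// 3. Register tools (ITool implementations, [LMFunction] methods,
//    or an imported MCP catalog) and set a per-turn tool-call policy.
//      chat.Tools.Register(new CurrencyTool());
// 4. Submit a prompt; the model decides when to invoke tools and
//    grounds its answer in their JSON results.
//      var answer = chat.Submit("What's the weather in Tokyo?");
```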
The model catalog includes GPT-OSS and many other model families, and lets you pull a model by its named card. You can also check that a model supports tool calling before you rely on tools. See the model catalog documentation for details.
Try it now. GitHub sample: a production-ready console sample. The sample demonstrates multi-turn chat with tool calling (currency, weather, unit conversion), per-turn policies, progress feedback, and special commands. Jump to "Create a multi-turn chatbot with tools in .NET applications".

Three ways to add tools.
1. Implement ITool. Full control; best when you need clear contracts and custom validation. This snippet demonstrates implementing the interface so an LLM can call your tool directly: it declares the tool contract, parses JSON arguments, runs your logic, and returns structured JSON to the model. Why use ITool? Complete control over validation, async execution, error handling, and result formatting.
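The original code block was not captured in the transcript. The following is a minimal, self-contained sketch of the pattern being described; the exact member names of LM-Kit's ITool interface are an assumption here, so treat the interface shape as illustrative:

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical shape of the tool contract. The real LM-Kit ITool
// interface may declare different members; check the SDK docs.
public interface ITool
{
    string Name { get; }
    string Description { get; }
    string ParametersSchema { get; }            // JSON Schema for the arguments
    Task<string> InvokeAsync(string jsonArgs);  // returns structured JSON
}

// Example tool: offline length conversion (miles -> kilometers).
public sealed class UnitConversionTool : ITool
{
    public string Name => "unit_conversion";
    public string Description => "Converts a length in miles to kilometers.";
    public string ParametersSchema =>
        "{\"type\":\"object\",\"properties\":{\"miles\":{\"type\":\"number\"}},\"required\":[\"miles\"]}";

    public Task<string> InvokeAsync(string jsonArgs)
    {
        // Validate the JSON arguments the model produced before running logic.
        using var doc = JsonDocument.Parse(jsonArgs);
        if (!doc.RootElement.TryGetProperty("miles", out var miles))
            return Task.FromResult("{\"error\":\"missing required argument 'miles'\"}");

        double km = miles.GetDouble() * 1.609344;
        // Return structured JSON so the model can ground its answer in it.
        return Task.FromResult(JsonSerializer.Serialize(new { kilometers = km }));
    }
}
```

An agent runtime surfaces Name, Description, and ParametersSchema to the model, then routes the model's JSON arguments into InvokeAsync and feeds the JSON result back into the conversation.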
2. Annotate methods with [LMFunction]. Best for rapid prototyping and simple synchronous tools. What it does: add the attribute to public instance methods; LM-Kit discovers them and exposes each as a tool, generating a JSON schema from the method parameters. How it's wired: reflect over the class to bind the annotated methods, then register the resulting tools. Why use [LMFunction]? Less boilerplate: the binder generates schemas from parameter types and registers everything
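To make the discovery step concrete, here is a standalone sketch of what an attribute binder does. The [LMFunction] attribute and binder below are simplified stand-ins written for illustration, not LM-Kit's actual implementation:

```csharp
using System;
using System.Linq;
using System.Reflection;

// Simplified stand-in for the SDK's attribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class LMFunctionAttribute : Attribute
{
    public string Description { get; }
    public LMFunctionAttribute(string description) => Description = description;
}

public class WeatherTools
{
    [LMFunction("Gets the current temperature in Celsius for a city.")]
    public double GetTemperature(string city) => 21.5; // demo stub
}

public static class Binder
{
    // Maps a CLR parameter type onto a JSON Schema type name.
    static string JsonType(Type t) =>
        t == typeof(string) ? "string" :
        t == typeof(bool) ? "boolean" :
        t == typeof(int) || t == typeof(long) ? "integer" : "number";

    // Discovers annotated public methods and emits a minimal schema per tool.
    public static string DescribeTools(object instance) =>
        string.Join("\n", instance.GetType().GetMethods()
            .Where(m => m.GetCustomAttribute<LMFunctionAttribute>() != null)
            .Select(m => m.Name + ": {\"type\":\"object\",\"properties\":{" +
                string.Join(",", m.GetParameters().Select(p =>
                    "\"" + p.Name + "\":{\"type\":\"" + JsonType(p.ParameterType) + "\"}")) +
                "}}"));
}
```

The point of the attribute style is exactly this: the schema is derived from the method signature, so annotating a method is all the wiring a simple tool needs.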
in one line.

3. Import MCP catalogs (external services). Best for connecting to third-party tool ecosystems via the Model Context Protocol. What it does: uses the MCP client to establish a JSON-RPC session with an MCP server, fetch its tool catalog, and adapt those tools so your agent can call them. How it's wired: create the client, optionally set a bearer token, then import the catalog; LM-Kit manages retries and session persistence. Why use MCP? Instant access to curated tool catalogs. The server handles execution over JSON-RPC, and LM-Kit validates schemas locally. See the MCP client documentation.
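Under the hood, MCP is JSON-RPC 2.0: the client calls the standard tools/list method to fetch a server's catalog and tools/call to invoke a tool. A minimal sketch of those two messages (transport, auth, and session handling omitted — LM-Kit's MCP client wraps all of that for you):

```csharp
using System;
using System.Text.Json;

// The Model Context Protocol speaks JSON-RPC 2.0; "tools/list" and
// "tools/call" are the standard methods for fetching and invoking tools.
public static class McpMessages
{
    public static string ToolsListRequest(int id) =>
        JsonSerializer.Serialize(new
        {
            jsonrpc = "2.0",
            id,
            method = "tools/list"
        });

    public static string ToolsCallRequest(int id, string toolName, object args) =>
        JsonSerializer.Serialize(new
        {
            jsonrpc = "2.0",
            id,
            method = "tools/call",
            @params = new { name = toolName, arguments = args }
        });
}
```

Once the catalog arrives, each advertised tool (name, description, input schema) can be adapted into the same local tool registry your ITool and [LMFunction] tools live in.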
Execution modes that match your workflow. Choose the right policy for each conversational turn. Simple function: one tool, one answer. Example: "What is the weather in Tokyo?" Calls the weather tool once and answers. Multiple function: chain tools sequentially. Example: "Convert 75 degrees Fahrenheit to Celsius, then tell me if I need a jacket." 1. Calls the converter and gets 23.9 degrees Celsius. 2. Calls the weather tool and gets conditions. 3. Synthesizes the answer: "It is about 24 degrees Celsius and sunny; a light jacket should be fine." Parallel function: execute multiple tools concurrently. Example: "Compare the weather in Paris, London, and Berlin." Calls all three simultaneously, waits for all results, compares, and answers. Only enable this if your tools are idempotent and thread-safe. Parallel multiple function: combine chaining and parallelism. Example: "Check the weather in three cities, convert all temps to Fahrenheit, and recommend which to visit." 1. Parallel: fetches weather for three cities. 2. Parallel: converts all temperatures. 3. Sequential: recommends based on the results. See the tool call policy documentation for all options.
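The difference between the sequential and parallel modes comes down to how tool calls are awaited. The sketch below is a standalone illustration of that scheduling distinction, not LM-Kit's internal scheduler:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ToolScheduler
{
    // Multiple function mode: strict sequencing -- each call may depend
    // on the previous result, so we await one call at a time.
    public static async Task<List<string>> RunSequential(
        IEnumerable<Func<Task<string>>> calls)
    {
        var results = new List<string>();
        foreach (var call in calls)
            results.Add(await call());
        return results;
    }

    // Parallel function mode: independent calls run concurrently.
    // Only safe when every tool involved is idempotent and thread-safe.
    public static Task<string[]> RunParallel(
        IEnumerable<Func<Task<string>>> calls) =>
        Task.WhenAll(calls.Select(c => c()));
}
```

Fetching the weather for Paris, London, and Berlin is a natural fit for RunParallel, because the three calls are independent; a convert-then-check chain must go through RunSequential.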
Defaults are conservative: parallel off, max calls capped.

Safety, control, and observability. Policy controls: configure safe defaults and per-turn limits; see the tool call policy documentation. Human in the loop: review, approve, or block tool execution. Hooks fire before tool invocation, after tool invocation, before token sampling, and on memory recall. Structured data flow: every call flows through a typed pipeline for reproducibility and clear logs, with stable correlation between incoming calls and outgoing results, and explicit success or error states.
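As a standalone illustration of the loop-prevention and approval pattern described above — the property and hook names here are assumptions made for this sketch, not LM-Kit's real API:

```csharp
using System;
using System.Threading.Tasks;

// Illustrative policy wrapper. LM-Kit exposes equivalent controls via its
// tool-call policy and pre/post-invocation events, but the names here are
// assumptions made for this sketch.
public sealed class ToolCallPolicy
{
    public int MaxCallsPerTurn { get; init; } = 3;            // loop prevention
    public Func<string, string, bool> Approve { get; init; }  // (tool, args) -> allow?
        = (_, _) => true;

    private int _calls;
    public void ResetTurn() => _calls = 0;

    public async Task<string> InvokeGuarded(
        string toolName, string jsonArgs, Func<Task<string>> invoke)
    {
        if (++_calls > MaxCallsPerTurn)
            return "{\"error\":\"max tool calls exceeded for this turn\"}";
        if (!Approve(toolName, jsonArgs))                      // human in the loop
            return "{\"error\":\"tool call rejected by reviewer\"}";
        return await invoke();
    }
}
```

Because rejections come back as structured JSON errors rather than exceptions, the model can see why a call was refused and adjust, which is what makes human-in-the-loop review workable mid-conversation.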
Try it: the multi-turn chat sample. Create a multi-turn chatbot with tools in .NET applications. Purpose: demonstrates LM-Kit .NET's agentic tool calling. During a conversation, the model can decide to call one or multiple tools to fetch data or run computations, pass JSON arguments that match each tool's schema, and use each tool's JSON result to produce a grounded reply, all while preserving full multi-turn context. Tools implement the tool interface and are managed by a registry; per-turn behavior is shaped via policies.

Why tools in chatbots? Reliable, source-backed answers: weather, FX, conversions, business APIs. Agentic chaining: call several tools in one turn and combine the results. Determinism and safety: typed schemas, clear failure modes, policy control. Extensibility: implement tools for your domain logic and keep the code auditable. Efficiency: offload math and lookups to tools, keeping the model focused on reasoning. Target audience: product and platform teams, DevOps and internal tools, B2B apps, educators and demos. Problems solved: actionable answers, deterministic conversions and quotes, multi-turn memory, easy extensibility.

The sample app lets you choose a local model or a custom URI, registers three tools (currency, weather, unit conversion), runs a multi-turn chat where the model decides when to call tools, and prints generation stats: tokens, stop reason, speed, context usage.
Key features: tool calling via JSON arguments, full dialogue memory, progress feedback (download and load bars), special commands, and multiple tool calls per turn and across turns.

Built-in tools (tool name, purpose, online, notes): Currency: ECB rates via Frankfurter, latest or historical, plus an optional trend; online, no API key; business-day, rounding, and date support. Weather: Open-Meteo current weather plus an optional short hourly forecast; online, no API key; geocoding plus metric units. Unit conversion: offline SI conversions (length, mass, temperature, speed, etc.); offline; temperature is nonlinear; can list supported units. Tools implement the tool interface, declare a JSON schema, and return JSON. Extend with your own tools.
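The "temperature is nonlinear" note means temperature conversions need an offset, not just a scale factor, so they cannot be composed like other unit conversions. A quick self-contained illustration:

```csharp
using System;

public static class UnitMath
{
    // Linear units convert with a scale factor alone:
    // doubling the input doubles the output.
    public static double MilesToKm(double mi) => mi * 1.609344;

    // Temperature conversion needs an offset too, making it affine rather
    // than linear: FahrenheitToCelsius(2 * x) != 2 * FahrenheitToCelsius(x).
    public static double FahrenheitToCelsius(double f) => (f - 32.0) * 5.0 / 9.0;
}
```

This is why a conversion tool has to special-case temperature instead of treating every unit pair as a single multiplier.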
Use unique, stable, lowercase names. Supported models (pick per your hardware): Mistral Nemo 2407 12.2B, around 7.7 gigabytes of VRAM. Meta Llama 3.1 8B, around 6 gigabytes. Google Gemma 3 4B medium, around 4 gigabytes. Microsoft Phi-4 Mini 3.8B, around 3.3 gigabytes. Alibaba Qwen 3 8B, around 5.6 gigabytes. Microsoft Phi-4 14.7B, around 11 gigabytes. IBM Granite 4 7B, around 6 gigabytes. OpenAI GPT-OSS 20B, around 16 gigabytes. Or provide a custom model URI (GGUF).
Commands: clear the conversation, continue the last assistant message, or regenerate the answer for the last user input. Example prompts: "Convert 125 United States dollars to euros and show a seven-day trend." "Weather in Toulouse, next six hours, metric." "Convert 65 miles per hour to kilometers per hour." "List pressure units." "Now 75.5 degrees Fahrenheit to degrees Celsius, then two kilometers to miles."

Behavior and policies, quick reference. Tool selection policy: by default the sample lets the model decide; you can require, forbid, or force a specific tool per turn. Multiple tool calls: the sample supports several tool invocations per turn, and outputs are injected back into the context. Schemas matter: precise plus concise descriptions improve argument construction. Networking: currency and weather require internet; unit conversion is offline. Errors: clear exceptions for invalid inputs, units, dates, and locations.

Getting started. Prerequisites: .NET Framework 4.6.2 or .NET 6.0. Download and run, then pick a model or paste a custom URI. Chat naturally; the assistant will call one or multiple tools as needed. Use the special commands anytime.
Project link: GitHub repository, a complete example covering all three integration paths.

Why go local? LM-Kit versus cloud agent frameworks: zero API costs, no per-token charges, run unlimited conversations; complete privacy, user data never leaves the device, GDPR and HIPAA friendly; sub-100 ms latency, local inference eliminates network round trips entirely; works offline, agents function without internet connectivity; no rate limits, scale to millions of requests without throttling; full control, own the stack, no vendor lock-in or API deprecations.

Versus basic prompt engineering: type-safe schemas, JSON Schema validation catches bad arguments before execution; deterministic results, clear success and error states rather than fragile regex parsing; parallel execution, run multiple tools concurrently when safe; full observability, structured events at every stage, not log archaeology; testable contracts, mock tools, inject results, replay conversations; error boundaries, graceful failures with retry logic and fallbacks.

Versus manual function calling: the model decides, the agent autonomously picks tools and arguments, no brittle if-else chains; auto chaining, multiple tool calls per turn with results fed back automatically; 90% less boilerplate, register tools once, not per model or per prompt; built-in safety, loop prevention, max-call limits, and approval hooks out of the box; model-agnostic API, the same code works across Mistral, Llama, Qwen, Granite, and GPT-OSS; progressive enhancement, add tools without refactoring
conversation logic.

Performance and limitations. Performance expectations: tool invocation overhead is around 2 to 5 milliseconds per call (parsing plus validation); network tools take 50 to 500 milliseconds depending on the API; local tools, less than 1 millisecond. Model inference remains the primary latency factor. Requirements: models must support tool calling (check the catalog); network-dependent tools require internet connectivity; parallel execution requires thread-safe, idempotent tools; recommended GPU memory is 6 to 16 gigabytes of VRAM depending on model size. Known limitations: tool selection quality depends on clear descriptions and schemas; complex nested objects in arguments may confuse smaller models; very long tool chains (more than 10 calls) may exceed context windows.

Ready to build?
1. Clone the sample. 2. Pick your integration approach: need full control? Implement ITool. Prototyping quickly? Use [LMFunction]. Using external catalogs? Use MCP. 3. Add your domain logic: replace the demo tools with your APIs, databases, or business logic. 4. Set policies that fit your use case, from simple lookups to complex workflows with approval hooks. 5. Ship agents that actually work: on-device, private, reliable, observable. Start building agentic workflows that respect user privacy, run anywhere, and stay under your control.

Thank you for listening to this Hacker Noon story, read by artificial intelligence. Visit hackernoon.com to read, write, learn and publish.
