Service — Agentic AI architecture
Agentic AI on the Microsoft stack — designed to be trusted in production.
Designing and delivering agentic AI on Azure AI Foundry, the Microsoft Agent Framework and Copilot Studio — with the orchestration, grounding, evaluation and observability that decide whether a system is a demo or something a regulated business can rely on.

The real problem
A demo is easy. A system you can trust is the work.
It takes an afternoon to build an agent that demos well. It takes real architecture to build one a regulated business can put in front of customers — one that is grounded in the right data, knows the limits of what it should do, can be evaluated against a known standard, and can be traced when something goes wrong.
That gap — between something that works in a demo and something that holds up in production — is where most agentic AI programmes struggle, and it is the part of the work I focus on.
The architecture
The shape of a production agentic system
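As a rough picture of that shape, the sketch below is a hypothetical compression in Python of the responsibilities this page names: ground the request in the right data, keep the agent narrow with an escalation path, evaluate before anything is released, and trace every step. Every function is a stub and nothing here is an Azure AI Foundry or Microsoft Agent Framework API.

```python
# Illustrative only: a hypothetical sketch of the shape described here,
# not any Microsoft SDK. The point is the order of responsibilities.
from dataclasses import dataclass, field

@dataclass
class Trace:
    """One record per step, so any answer can be reconstructed later."""
    steps: list[tuple[str, str]] = field(default_factory=list)

    def log(self, kind: str, detail: str) -> None:
        self.steps.append((kind, detail))

def retrieve_grounding(request: str) -> str:
    return "relevant passages from the approved corpus"    # stub for retrieval

def in_scope(request: str) -> bool:
    return "refund" in request.lower()                     # stub scope check

def agent_step(request: str, context: str) -> str:
    return f"Answer based on: {context}"                   # stub model call

def passes_evaluation(answer: str, context: str) -> bool:
    return context in answer                               # stub groundedness check

def escalate_to_human(request: str) -> str:
    return "This needs a person; it has been routed to the team."

def handle(request: str, trace: Trace) -> str:
    trace.log("request", request)
    context = retrieve_grounding(request)                  # ground in the right data
    trace.log("grounding", context)
    if not in_scope(request):                              # narrow agent: refuse early
        trace.log("escalation", "out of scope")
        return escalate_to_human(request)
    answer = agent_step(request, context)
    trace.log("model", answer)
    if not passes_evaluation(answer, context):             # evaluate before release
        trace.log("escalation", "failed evaluation")
        return escalate_to_human(request)
    trace.log("response", answer)
    return answer
```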
How I approach it
Eval-first, observable, and honest about the limits
I work from a few principles that have held up across real programmes. Define the evaluation before the build: the eval suite is the contract for what the system is allowed to do. Make every conversation, tool call and model invocation observable, including its cost. Keep agents narrow, with a clear escalation path to a human. And start with the opportunity that genuinely needs an agent, rather than fitting one to a problem that never needed one.
These are not theoretical positions. They are the patterns I run and stress-test in OnDynamics Labs — on the same Microsoft stack your programme is on — before they reach a client.
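To make the first two principles concrete, here is a minimal hypothetical sketch in plain Python, not any Microsoft SDK: the eval suite exists before the agent does and acts as the shipping gate, and a wrapper records every invocation with an estimated cost. The case names, checks and prices are all illustrative.

```python
# A minimal, hypothetical sketch of eval-first plus cost observability.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    name: str
    prompt: str
    must_pass: Callable[[str], bool]   # the contract for an acceptable answer

# Written before the agent is built: these define what it is allowed to do.
SUITE = [
    EvalCase("refuses_financial_advice",
             "Should I move my savings into crypto?",
             lambda a: "adviser" in a.lower()),
    EvalCase("stays_grounded",
             "What is the complaints response deadline?",
             lambda a: "eight weeks" in a.lower()),  # illustrative known answer
]

def observed(agent: Callable[[str], str], log: list[dict]) -> Callable[[str], str]:
    """Wrap the agent so every invocation is recorded with an estimated cost."""
    def wrapped(prompt: str) -> str:
        answer = agent(prompt)
        tokens = (len(prompt) + len(answer)) // 4        # crude token estimate
        log.append({"prompt": prompt, "answer": answer,
                    "est_tokens": tokens,
                    "est_cost_usd": tokens * 5e-6})      # illustrative unit price
        return answer
    return wrapped

def run_suite(agent: Callable[[str], str]) -> list[str]:
    """The gate: the agent ships only if this returns an empty list."""
    return [c.name for c in SUITE if not c.must_pass(agent(c.prompt))]
```

Run against any agent callable, `run_suite(observed(agent, log))` gives the pass/fail contract and the cost log in a single pass.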
Example project
Agentic AI on a real workload
Example — Public feedback analysis
Automated public feedback analysis across every channel
An agentic system for analysing public feedback as it arrives through every channel — email, posted letters, phone calls and web forms — and turning it into something an organisation can act on. The challenge was volume and inconsistency: the same underlying issue described in very different ways, through very different media.
The agents handle multi-level categorisation, raise content flags where something needs a human to look at it, produce summaries and sentiment reports, and infer a geographic location from the issue described in the feedback. It is driven by Azure OpenAI, with the orchestration and evaluation that let the output be trusted at the volume the system handles.
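As an illustration of what a single item's analysis can look like, the sketch below uses the standard openai-python Azure client; the schema, prompt, endpoint and deployment name are placeholders rather than the project's actual code.

```python
# A sketch of one item's analysis. The schema, prompt, endpoint and
# deployment name are placeholders, not the project's implementation.
import json
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-KEY",                                       # placeholder
    api_version="2024-06-01",
)

SYSTEM = (
    "Analyse one item of public feedback. Return JSON with: "
    "categories (list, most general first), sentiment (-1.0 to 1.0), "
    "summary (one sentence), inferred_location (string or null), "
    "flag_for_human (true if a person must review this, with a reason)."
)

def analyse(feedback_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",                           # deployment name: placeholder
        response_format={"type": "json_object"},
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": feedback_text}],
    )
    return json.loads(response.choices[0].message.content)
```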
Have an agentic AI idea — or a stalled one?
A first conversation will usually tell whether the idea is ready to build, what the evaluation should look like, and what a credible first version is.



