Service — Agentic AI architecture
Agentic AI on the Microsoft stack — designed to be trusted in production.
Designing and delivering agentic AI on Azure AI Foundry, the Microsoft Agent Framework and Copilot Studio, with the orchestration, grounding, evaluation and observability that decide whether a system holds up in production.

The real problem
A demo is easy. A system you can trust is not.
It takes an afternoon to build an agent that demos well. It takes real architecture to build one a regulated business can put in front of customers: one that is grounded in the right data, knows the limits of what it should do, can be evaluated against a known standard, and can be traced when something goes wrong.
That gap, between something that works in a controlled run and something that holds up in production, is where most agentic AI programmes struggle, and it is the part I focus on.
The architecture
The shape of a production agentic system
How I approach it
Eval-first, observable, and clear about the limits
I work from a few principles that have held up across real programmes. Define the evaluation before the build, so the eval suite becomes the contract for what the system is allowed to do. Make every conversation, tool call and model invocation observable, including its cost. Keep agents narrow, with a clear escalation path to a human. And start with the opportunity that genuinely needs an agent, rather than fitting one to a problem that does not.
These are not theoretical positions. They are the patterns I run and stress-test in OnDynamics Labs, on the same Microsoft stack your programme is on, before they reach a client.
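As a rough illustration of these principles, the sketch below writes the eval cases first, records every model call with a cost, and has a narrow agent escalate anything outside its remit. All names here (`EvalCase`, `Trace`, `narrow_agent`, `run_suite`), the refund scenario and the cost figure are assumptions made up for the example; it does not use Azure AI Foundry or the Microsoft Agent Framework, and a real agent would call a model rather than match a keyword.

```python
from dataclasses import dataclass, field

ESCALATE = "escalate_to_human"


@dataclass
class EvalCase:
    prompt: str
    expected: str  # the outcome the agent is required to produce


@dataclass
class Trace:
    events: list = field(default_factory=list)

    def record(self, kind: str, detail: str, cost_usd: float = 0.0) -> None:
        # Every invocation is logged together with its cost.
        self.events.append({"kind": kind, "detail": detail, "cost_usd": cost_usd})

    def total_cost(self) -> float:
        return sum(event["cost_usd"] for event in self.events)


def narrow_agent(prompt: str, trace: Trace) -> str:
    """Hypothetical agent that handles refund questions and escalates the rest."""
    trace.record("model_call", prompt, cost_usd=0.002)  # illustrative cost only
    if "refund" in prompt.lower():
        return "refund_policy"
    return ESCALATE


def run_suite(agent, cases, trace, threshold=1.0):
    """Return (gate_passed, pass_rate) for the agent against the eval cases."""
    passed = sum(agent(case.prompt, trace) == case.expected for case in cases)
    pass_rate = passed / len(cases)
    return pass_rate >= threshold, pass_rate


# The eval suite is written before the agent and acts as its contract.
CASES = [
    EvalCase("How do I get a refund?", "refund_policy"),
    EvalCase("Can you give me legal advice?", ESCALATE),
]

trace = Trace()
gate_passed, pass_rate = run_suite(narrow_agent, CASES, trace)
print(gate_passed, pass_rate, len(trace.events))
```

In a CI setting, the boolean returned by `run_suite` is the kind of value a pipeline step could fail on, so a change that breaks the contract does not ship.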
Example project
Agentic AI on a real workload
Example — Public feedback analysis
Automated public feedback analysis across every channel
An agentic system for analysing public feedback as it arrives through every channel — email, posted letters, phone calls and web forms — and turning it into something an organisation can act on. The challenge was volume and inconsistency: the same underlying issue described in very different ways, through very different media.
The agents handle multi-level categorisation, raise content flags where something needs a human to look at it, produce summaries and sentiment reports, and infer a geographic location from the issue described in the feedback. The system is driven by Azure OpenAI, with the orchestration and evaluation that let the output be trusted at the volume it runs at.
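To make the shape of that output concrete, here is a minimal sketch of two of those steps: two-level categorisation and a content flag that routes an item to a human. It is not the project's implementation. The taxonomy, the flag terms and the names `FeedbackResult` and `analyse` are hypothetical, and simple keyword matching stands in for the model call that a real system would make.

```python
from dataclasses import dataclass

# Hypothetical two-level taxonomy: keyword -> (top level, sub level).
TAXONOMY = {
    "pothole": ("Highways", "Road surface"),
    "bin": ("Waste", "Missed collection"),
}

# Hypothetical terms that should always put a human in the loop.
FLAG_TERMS = ("injured", "unsafe", "threat")

UNCATEGORISED = ("Uncategorised", "Uncategorised")


@dataclass
class FeedbackResult:
    channel: str
    category: tuple
    needs_human: bool


def analyse(channel: str, text: str) -> FeedbackResult:
    """Categorise one piece of feedback and decide whether a human must review it."""
    lowered = text.lower()
    category = UNCATEGORISED
    for keyword, path in TAXONOMY.items():
        if keyword in lowered:
            category = path
            break
    # Flag on sensitive content, and also when the item could not be categorised.
    flagged = any(term in lowered for term in FLAG_TERMS)
    needs_human = flagged or category == UNCATEGORISED
    return FeedbackResult(channel, category, needs_human)


letter = analyse("posted letter", "There is a pothole outside and someone was injured.")
form = analyse("web form", "My bin was not collected this week.")
call = analyse("phone call", "Thank you for your help.")
print(letter, form, call, sep="\n")
```

The point of the sketch is the result shape: every item, whatever channel it arrived through, ends up with the same category path and an explicit flag, which is what makes the output something an evaluation suite can check.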
Engagement shapes
Three ways to bring an agentic AI programme to production
Most engagements are fractional — I sit inside your team for the months it takes to ship. A short discovery is the usual entry point.
2–3 weeks · fixed scope
Agentic AI Readiness Audit
A diagnostic of your current AI estate and the target you're aiming at. Lands as a board-presentable report.
Walk out with
- a current-state architecture map
- an eval-gap analysis against a production-readiness rubric
- a prioritised roadmap
2 days a week · 3-month minimum
Fractional Principal AI Architect
I sit inside your team and hold the architecture while you ship.
Walk out with
- design reviews on every release
- an eval discipline embedded in CI
- the architecture documentation your team will keep using after I leave
6–8 weeks · project-shaped
Eval & Observability Bring-up
Stand up the evaluation suite, tracing, and content safety as a CI-gated discipline.
Walk out with
- a working eval harness
- observability you can show the board
- the runbook your team uses on every change
Have an agentic AI idea — or a stalled one?
A first conversation will usually establish whether the idea is ready to build, what the evaluation should look like, and what a credible first version is.



