OnDynamics Labs — Project EVE
A self-evolving multi-agent platform, built in the open.
EVE is the laboratory where I build, break and rebuild agentic AI patterns on the Microsoft Agent Framework. The architecture that ends up on an enterprise programme is the one I stress-tested here first.

Why a side project
The fastest way to lose credibility is to talk about agentic AI without building it.
Project EVE is a self-evolving multi-agent platform built on the Microsoft Agent Framework, Copilot Studio and Azure AI Foundry: the same stack an enterprise programme is on, or about to be on. I write code on it regularly and deploy new agents at the cadence Microsoft ships preview features.
The patterns that work in EVE end up in client architectures. The ones that break here break before they can cost a regulated enterprise a quarter.
I would rather a pattern fail on my own platform on a Sunday than on a client’s in production.
What EVE is
A laboratory for the hard parts of agentic AI
Multi-agent orchestration
EVE runs a population of specialist agents under a supervisor. Each agent has a narrow tool surface and a clear escalation contract; the supervisor routes work, arbitrates between agents, and surfaces every decision back to a human-readable trace.
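A minimal sketch of that shape, assuming nothing about EVE's real API: every name here (`Agent`, `Supervisor`, the routing and escalation predicates) is illustrative. Each worker exposes only its narrow tool surface plus an escalation predicate; the supervisor routes, arbitrates, and appends every decision to a readable trace.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A specialist worker with a deliberately narrow tool surface."""
    name: str
    tools: dict[str, Callable[[str], str]]   # the only tools it may call
    can_handle: Callable[[str], bool]        # routing predicate
    escalates_on: Callable[[str], bool]      # escalation contract

@dataclass
class Supervisor:
    agents: list[Agent]
    trace: list[str] = field(default_factory=list)  # human-readable decision log

    def route(self, task: str) -> str:
        for agent in self.agents:
            if agent.can_handle(task):
                self.trace.append(f"route: {task!r} -> {agent.name}")
                if agent.escalates_on(task):
                    self.trace.append(f"escalate: {agent.name} declined {task!r}")
                    return "escalated-to-human"
                tool_name = next(iter(agent.tools))   # single-tool worker, for brevity
                return agent.tools[tool_name](task)
        self.trace.append(f"no-route: {task!r} -> human")
        return "escalated-to-human"
```

The point of the shape is that every branch, including the refusals, leaves a line in the trace; nothing is routed silently.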
Self-evolution
EVE reads its own logs, identifies its own failure modes, and proposes changes to its own prompts and tools — each change tested against an eval suite before it is accepted. It is research-grade work, deliberately kept out of regulated client systems, but it is the front edge of where these systems are heading.
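The core of that loop is a gate, not a mutation. A hedged sketch of one iteration, with `propose_fix` as a naive stand-in for the model-driven proposal step (the real step, the log format and the eval suite are all EVE internals this does not claim to reproduce):

```python
from typing import Callable

def propose_fix(prompt: str, failure_logs: list[str]) -> str:
    """Stand-in for the model-driven proposal step: fold each distinct
    failure mode seen in the logs into the prompt as a guard clause."""
    modes = sorted({line.split(":", 1)[0] for line in failure_logs})
    return prompt + "".join(f"\nAvoid failure mode: {m}." for m in modes)

def evolve_once(prompt: str,
                failure_logs: list[str],
                eval_suite: list[Callable[[str], bool]]) -> str:
    """One turn of the self-evolution loop: propose, then gate.
    The candidate is accepted only if every eval passes; otherwise
    the current prompt is kept unchanged."""
    candidate = propose_fix(prompt, failure_logs)
    if all(check(candidate) for check in eval_suite):
        return candidate   # change accepted
    return prompt          # change rejected; nothing ships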
Eval-first development
Every behaviour is asserted by an eval before it ships. The eval suite is the contract — when the suite breaks, the build breaks. The same discipline a 2020-era codebase applied to unit tests, EVE applies to agent behaviour.
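In practice that discipline looks like ordinary test functions over agent behaviour. An illustrative sketch (the agent stub and both eval cases are invented for the example, not EVE's real contract):

```python
def run_agent(task: str) -> str:
    """Stub for the agent under test; pretend policy: billing escalates."""
    if "invoice" in task:
        return "ESCALATE"
    return "handled"

# Each eval asserts one contracted behaviour. If any assertion fails,
# the suite fails and the build fails with it.
def eval_billing_escalates():
    assert run_agent("find invoice 7") == "ESCALATE"

def eval_smalltalk_is_handled():
    assert run_agent("hello there") == "handled"

def run_suite() -> bool:
    for check in (eval_billing_escalates, eval_smalltalk_is_handled):
        check()   # any AssertionError propagates and breaks the build
    return True
```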
Observability and cost telemetry
End-to-end traces of every conversation, tool call and model invocation — with cost attributed per agent, per task and per outcome. The dashboards built for EVE are the dashboards I build for clients.
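The attribution itself reduces to folding a stream of trace spans by whichever dimension you care about. A minimal sketch under assumed names (`Span`, `cost_by` and the pricing field are illustrative, not EVE's schema):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    """One traced model invocation or tool call."""
    agent: str
    task: str
    tokens: int
    usd_per_1k_tokens: float

    @property
    def cost(self) -> float:
        return self.tokens / 1000 * self.usd_per_1k_tokens

def cost_by(spans: list[Span], dimension: str) -> dict[str, float]:
    """Attribute total cost per agent or per task from the raw trace."""
    totals: dict[str, float] = defaultdict(float)
    for span in spans:
        totals[getattr(span, dimension)] += span.cost
    return dict(totals)
```

The same fold, pointed at a different field, gives cost per agent, per task or per outcome from one trace.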
The stack
What EVE is built on
EVE is deliberately Microsoft-native and infrastructure-as-code, so the patterns transfer cleanly to a client estate. Six layers, each chosen so the lessons carry across.
Why it matters to your programme
Patterns, not generic advice
When I recommend a supervisor-worker pattern for your contact centre, it is not because I read a blog post about it. It is because I have run it, broken it, fixed it, and have the telemetry on what happens when one of the workers misbehaves at two in the morning.
Where an engagement allows it, I can share EVE code as a reference — how an MCP server is wired, how an eval is structured, how the orchestrator is made observable. And because Microsoft ships preview features for the Agent Framework on a weekly cadence, by the time you ask whether a new capability is production-ready, I have usually already tried it.
Status
Where EVE is right now
Active development
EVE is under continuous development, with new agents and patterns added as the Microsoft Agent Framework moves.
Stable core
Multi-agent orchestration, the eval suite and observability are stable. The self-evolution loop is research-grade.
Shown on request
Not a public product. I demonstrate EVE directly to clients exploring agentic AI on the Microsoft stack.
See EVE running
A short walkthrough of a real multi-agent conversation, the eval suite that asserts its behaviour, and the cost telemetry behind it. You bring the hard question.