AI Engineering & Agent Orchestration

Agentic Architecture

Harness Engineering: The Outer System That Makes Agents Reliable

Building a good harness is what separates a good agentic implementation from a great one.

02 Apr

eval driven development

Ship Prompts Like Software: Regression Testing for LLMs

Because "it seemed fine when I tested it" is not a deployment strategy. Part 4 of 4: Evaluation-Driven Development for LLM Systems

26 Feb

evals

Four Ways to Grade an LLM (Without Going Broke)

Your evaluation technique should match the question you're asking, not your ambition.

25 Feb

eval driven development

Your Golden Dataset Is Worth More Than Your Prompts

Most teams spend weeks perfecting prompts and minutes on evaluation data. That's backwards. Part 2 of 4: Evaluation-Driven Development for LLM Systems

24 Feb

evals

Build LLM Evals You Can Trust

If five correct responses are enough to ship an LLM feature, what are you actually measuring: quality, or luck? Part 1 of 4: Evaluation-Driven Development for LLM Systems

23 Feb

Agentic Architecture

Temporal + LangGraph: A Two-Layer Architecture for Multi-Agent Coordination

Using Temporal and LangGraph together for multi-agent systems in production handles retries, state persistence, and failure recovery.

14 Jan

RAG

RAG at Scale: What It Takes To Serve 10,000 Queries A Day

24 Nov

Agentic Architecture

Understanding Generative UI

A layered walkthrough of the Generative UI paper everyone is talking about

20 Nov

Agentic Architecture

Eliza Redux: A Real-Time Voice AI Crisis Support Agent

I built a crisis support voice AI agent in roughly 90 minutes at a voice AI hackathon and won.

27 Oct

RAG

Rethinking RAG: Meta’s REFRAG

For the past few years, retrieval-augmented generation (RAG) has been the workhorse architecture for grounded LLM applications. You retrieve relevant

15 Oct

AI Agents

MCP + A2A: The Protocols Making AI Agents Actually Work Together

I presented this at the AI Engineer meetup in London. It is a short, practical overview of why agent interoperability

09 Oct

Agentic Architecture

AI Agent Use Case Evaluation: From Risk Assessment to Implementation

When I first started evaluating Agentforce implementations, I made the classic engineer's mistake: jumping straight into technical capabilities

21 Jan