Anup Jadhav

London

I'm an AI and Software Engineer with 20+ years of technology leadership experience, currently focusing on Generative AI and Agentic AI architectures. I specialize in developing Agentic AI systems that solve complex business problems.

Claude Code

Who Still Understands the Code?

AI coding agents make you dramatically faster. The cost they carry is quieter: a slow erosion of how well you understand the software you are shipping. Here is how I have come to think about that trade, and how I try to stay on the right side of it.

24 Jun

AI Engineering

Designing teams for an agentic world

AI coding agents are changing the economics of software development and the shape of engineering organisations. Here is how leaders should rethink build-versus-buy decisions, talent, team structure, platform strategy, and AI governance.

21 Jun

Agent Memory

The Frontier of Agent Memory: From Recall to Experience

Part 3 of a 3-part series about AI agent memory architecture.

03 Jun

Speculative Decoding

Trading cheap guesses for expensive forward passes

02 Jun

Query, Key, Values

How to think about Q, K, and V vectors in the Attention layer of a Large Language Model

01 Jun

How Modern Agent Memory Architectures Work

Part 2 of a 3-part series about AI agent memory architecture.

31 May

Agent Memory

Why Context Is Not Enough

Part 1 of a 3-part series about AI agent memory architecture.

29 May

On Durable Objects, Orleans, and prior art for the agentic web

Zak Knill wrote a sharp post this week arguing that LLMs are exposing a gap in our standard cloud-native

15 May

TIL: Ads in AI chatbots are not just a UX problem

TIL from a paper on ads in AI chatbots that putting adverts inside an AI assistant is not the same

11 May

Agentic Architecture

Welcome to Middle Loop Engineering

Where engineering rigour goes now that AI writes the code

11 May

GPUs

How fast does it serve? Throughput, latency, and picking the right GPU

Part 2 of 2 on inference engineering for AI engineers.

07 May

LLM Scaling

Fitting LLMs on Self-Hosted GPUs

How much VRAM does your LLM need, and which GPU should you actually rent? A free calculator covering DeepSeek, Llama, and Mixtral on H100, B200, and A100.

04 May

Claude Code

The Harness Is the Product

Where does product quality live in an LLM-based system? A leaked source and a detailed postmortem, both from Anthropic in the last four weeks, make the answer unusually concrete.

24 Apr

How "Thinking" Models Actually Work

Lilian Weng's "Why We Think" is a survey of test-time compute and chain-of-thought reasoning. Here's what I pulled out of it.

14 Apr

Agentic Architecture

Harness Engineering: The Outer System That Makes Agents Reliable

Building a good harness is what separates a good agentic implementation from a great one.

02 Apr

We’re Being Too Loose With the Term “World Model”

I think we are still too loose with the phrase “world model”.

31 Mar

TIL: Quantisation

Quantisation is really a precision-allocation problem.

28 Mar

Claude Code

Write Skills Like Workstations, Not Prompts

Claude Code skills work best when you treat them as workstations, not prompts: folders with scripts, gotchas, templates, and progressive disclosure that manage the agent's attention budget at runtime.

18 Mar