GPUs
How fast does it serve? Throughput, latency, and picking the right GPU
Part 2 of 2 on inference engineering for AI engineers.
I'm Anup. I'm an AI and Software Engineer building production Agentic AI and Generative AI systems. I work on RAG pipelines, multi-agent architectures, and multi-cloud deployments.
My newsletter, The AI Engineering Brief, covers practical AI engineering for people shipping real systems. X/Twitter/Bluesky: @anup.