Hello!
Writing
Notes on full-stack engineering, AI in production, and shipping reliable software. Newest first.
Uncover the vulnerabilities and biases in current AI agent benchmarks and learn practical Python strategies to build more robust, secure, and trustworthy LLM evaluation frameworks.