Hello!
Writing
Notes on full-stack engineering, AI in production, and shipping reliable software, newest first.
Explore the practical implications of token usage differences between Claude Opus 4.6 and 4.7. Learn to measure and optimize LLM token consumption in Python for cost-effective AI applications.
Uncover the vulnerabilities and biases in current AI agent benchmarks and learn practical Python strategies to build more robust, secure, and trustworthy LLM evaluation frameworks.