Writing

Articles

Notes on full-stack engineering, AI in production, and shipping reliable software.

Featured

ai llm benchmarking

We Broke Top AI Agent Benchmarks. Here's How to Build Robust LLM Evaluations with Python.

Uncover the vulnerabilities and biases in current AI agent benchmarks and learn practical Python strategies to build more robust, secure, and trustworthy LLM evaluation frameworks.

Apr 2026 · 1 min read
