A pragmatic guide to LLM evals for devs
open.substack.com
TIL about evals, the LLM counterpart to traditional unit and integration tests. Since running LLMs for evaluation in CI pipelines isn’t cheap, it’s good to prioritize test scenarios by the most common buckets of real-world user issues.
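
One way to picture that prioritization is a rough Python sketch that counts reported issues per bucket, keeps the most frequent buckets, and caps the number of eval cases run in CI. The bucket names, the `bucket` field, and both function names are made up for illustration and are not from the linked guide.

```python
from collections import Counter


def prioritize_buckets(issue_labels, top_k=3):
    """Return the top_k most frequent issue buckets, most common first."""
    return [bucket for bucket, _ in Counter(issue_labels).most_common(top_k)]


def select_eval_cases(cases, issue_labels, max_cases, top_k=3):
    """Pick eval cases from the top buckets, capped at max_cases.

    `cases` is a list of dicts with a hypothetical "bucket" key.
    Cases from more frequent buckets come first; the cap keeps the
    CI run (and its LLM cost) bounded.
    """
    top = prioritize_buckets(issue_labels, top_k)
    rank = {bucket: i for i, bucket in enumerate(top)}
    eligible = [c for c in cases if c["bucket"] in rank]
    # sorted() is stable, so cases keep their original order within a bucket.
    eligible = sorted(eligible, key=lambda c: rank[c["bucket"]])
    return eligible[:max_cases]


# Hypothetical labels, e.g. from triaged user reports.
issue_labels = (
    ["hallucinated_citation"] * 5
    + ["wrong_language"] * 3
    + ["formatting"] * 2
    + ["too_verbose"] * 1
)

cases = [
    {"id": "c1", "bucket": "formatting"},
    {"id": "c2", "bucket": "wrong_language"},
    {"id": "c3", "bucket": "hallucinated_citation"},
    {"id": "c4", "bucket": "hallucinated_citation"},
    {"id": "c5", "bucket": "too_verbose"},
    {"id": "c6", "bucket": "wrong_language"},
]

selected = select_eval_cases(cases, issue_labels, max_cases=3, top_k=2)
print([c["id"] for c in selected])
```

The cap (`max_cases`) is the knob that trades coverage for CI cost; the long tail of rare buckets could still run on a slower schedule outside the main pipeline.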