LLM evaluation
Autorubric: New Framework Standardizes LLM Evaluation Methods
Researchers introduce Autorubric, a unified framework for systematic, rubric-based evaluation of large language models, aimed at the inconsistent assessment methods currently used across AI systems.