LLM evaluation
Noise-Response Calibration: New Protocol Fixes LLM Judge Bias
Researchers introduce a causal intervention protocol that calibrates LLM judges by measuring how they respond to noise perturbations, addressing systematic biases in automated evaluation.