LLM Research
DeliberationBench: When Multiple AI Voices Hurt Performance
A new benchmark for multi-LLM collaboration finds that having more AI models deliberate does not always improve results. The research identifies when consensus helps and when it hurts.