LLM
Error-Localized Policy Optimization: A New Approach to LLM Tool R
New research introduces ELPO, a training method that teaches LLMs to learn from irrecoverable errors in tool-integrated reasoning chains, improving agent capabilities.
LLM Research
New research shows that requiring LLMs to think step-by-step before responding can backfire in conversational settings, making AI agents appear cold and disengaged to users.
AI Safety
A developer built OntoGuard, an ontology-based firewall for AI agents that uses semantic web technologies such as OWL and SHACL to validate agent actions against predefined rules, offering a new approach to AI safety.
AI Agents
New research proposes a comprehensive framework for empirically evaluating LLM-based agentic AI systems in healthcare, establishing seven key dimensions for systematic assessment.
AI Agents
New research introduces MARS, a modular agent with reflective search capabilities designed to automate AI research tasks through intelligent decomposition and self-correction.
LLM Safety
New research examines how persuasive content propagates through multi-agent LLM systems, revealing critical insights for AI safety and synthetic influence detection.
AI Agents
New research introduces synthetic semantic information gain rewards to optimize when AI agents should retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.
AI Agents
A deep dive into engineering production-ready AI agents for healthcare, covering system architecture, MLOps pipelines, safety guardrails, and governance frameworks for high-stakes deployments.
LLM Reliability
New research achieves enterprise-grade 99.99966% reliability in LLM systems through consensus-driven decomposed execution, bringing Six Sigma quality standards to AI agents.
AI Agents
New research presents a framework for building capable small language model agents using synthetic tasks, simulated environments, and structured rubric-based rewards—democratizing agentic AI development.
AI Agents
Learn how to implement short-term, long-term, and episodic memory systems in AI agents, enabling persistent context and improved reasoning capabilities across sessions.
AI Agents
Understanding when to use shallow tool-calling, ReAct reasoning loops, or deep multi-agent systems is crucial for building effective AI applications. Here's how to choose.