LLM Agents
How Memory Architecture Shapes LLM Agent Performance
New research examines how different memory architectures affect LLM agent capabilities, offering insights into designing more effective AI systems.
LLM Agents
New research introduces PABU, a framework that helps LLM agents track their progress and update beliefs more efficiently, reducing computational waste in multi-step reasoning tasks.
Agentic AI
A comprehensive guide to evaluating AI agents covering benchmarks, testing frameworks, and metrics for measuring autonomous system performance in real-world applications.
LLM Agents
New research introduces Assumptions-to-Actions (A2A), a framework that tracks LLM reasoning uncertainties to enable more robust planning and failure recovery in embodied AI agents.
LLM Agents
New research introduces Agent-Omit, a reinforcement learning framework that trains LLM agents to selectively omit unnecessary reasoning steps and observations, dramatically improving computational efficiency.
LLM Agents
New research introduces AgentArk, a framework that transfers multi-agent intelligence into single LLM agents, potentially revolutionizing how complex AI systems are deployed efficiently.
AI Research
New benchmark evaluates how well AI agents can simulate human research participants, raising important questions about synthetic behavior, authenticity detection, and the future of AI-human interaction studies.
LLM Agents
Researchers develop a system that can identify where LLM-based planners go wrong and automatically correct mistakes, improving AI agent reliability for complex tasks.
LLM Agents
New research reveals systematic failures in how large language models approach multi-step planning, with implications for AI agents in content generation and autonomous systems.
LLM Agents
New research introduces a counterfactual generation framework that helps LLM-based autonomous systems reason about alternative intents, improving decision-making reliability in control applications.
LLM Agents
Researchers introduce a framework and accompanying methods for automated structural testing of LLM-based agents, addressing critical reliability challenges in agentic AI systems through systematic evaluation.
LLM Agents
New research explores how reinforcement learning training affects LLM agent generalization across domains, introducing the concept of a 'generalization tax' and strategies for minimizing performance degradation.