LLM Agents
Diagnosing Tool Failures in Multi-Agent LLM Systems
New research introduces a systematic framework for identifying why LLM agents fail to invoke tools correctly, addressing a critical reliability gap in multi-agent AI systems.
LLM Agents
Researchers introduce a determinism-faithfulness assurance harness for tool-using LLM agents, enabling reliable replay testing to catch unpredictable AI behavior in critical applications.
LLM Agents
New research introduces Aeon, a memory management system combining neural and symbolic approaches to help LLM agents maintain coherent reasoning across extended task sequences.
Agentic AI
A new comprehensive survey systematically categorizes agentic AI architectures and evaluation frameworks, offering taxonomies of large language model agents and foundational insights for autonomous AI systems.
LLM Agents
Researchers propose a constrained-topology planning approach for LLM agents that improves reliability in automated feature engineering, addressing key challenges in ML pipeline automation.
LLM Agents
Researchers introduce Task2Quiz, a systematic paradigm for evaluating what LLM agents actually know about their operating environments, revealing critical gaps in agent world models.
LLM Agents
New research presents SimpleMem, an efficient memory architecture enabling LLM agents to maintain persistent context across extended interactions without traditional retrieval overhead.
LLM Agents
Researchers challenge the assumption that LLM agents work reliably when given well-specified APIs, revealing how real-world API complexity degrades agent performance.
AI Safety
New research explores how LLM-powered agents may develop biases against humans based on perceived belief systems, revealing critical vulnerabilities in autonomous AI decision-making.
LLM Agents
New research uses multi-agent LLM systems simulating venture capitalists to evaluate startups, achieving notable predictive accuracy through collective roleplay-based reasoning.
LLM Agents
New research introduces GenEnv, a framework where LLM agents and environment simulators co-evolve through difficulty-aligned training, enabling more robust agent capabilities.
LLM Agents
New research introduces ABBEL, an architecture that constrains LLM agents to act through explicit belief states expressed in natural language, improving interpretability and decision-making in complex environments.