LLM Research
Knowledge Model Prompting Boosts LLM Planning Performance
New research introduces Knowledge Model Prompting, a technique that enhances LLM reasoning on complex planning tasks by structuring domain knowledge representation.
LLM Agents
New research introduces AgentArk, a framework that transfers multi-agent intelligence into a single LLM agent, potentially making complex AI systems far cheaper to deploy.
LLM Research
New research introduces Accordion-Thinking, a self-regulated approach that compresses reasoning steps dynamically to improve LLM efficiency while maintaining readable chain-of-thought outputs.
LLM Efficiency
New research proposes dynamic precision routing to optimize computational resources across multi-step LLM interactions, balancing quality and efficiency through adaptive quantization strategies.
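The blurb above only names the idea; as a rough illustration, "precision routing" can be pictured as choosing a quantization bit-width per step based on how hard that step looks. The function names, thresholds, and bit-widths below are invented for this sketch, not taken from the paper.

```python
# Toy sketch of dynamic precision routing: map an estimated
# difficulty score for each step of a multi-step interaction to a
# quantization bit-width, spending full precision only where it
# matters. All names and thresholds are illustrative.

def route_precision(difficulty: float) -> int:
    """Map a difficulty estimate in [0, 1] to a bit-width."""
    if difficulty < 0.3:
        return 4    # easy step: aggressive 4-bit quantization
    if difficulty < 0.7:
        return 8    # moderate step: 8-bit
    return 16       # hard step: near-full precision

def plan_budget(difficulties: list[float]) -> list[int]:
    """Assign a bit-width to each interaction step."""
    return [route_precision(d) for d in difficulties]

print(plan_budget([0.1, 0.5, 0.9]))  # [4, 8, 16]
```

The real system would also need a difficulty estimator and quantized model variants to route between; this sketch shows only the routing policy itself.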
AI Agents
New research introduces MARS, a modular agent with reflective search capabilities designed to automate AI research tasks through intelligent decomposition and self-correction.
LLM Interpretability
New research presents evidence that LLM self-explanations can help predict model behavior, offering a positive case for faithfulness in AI interpretability.
LLM Evaluation
New research proposes PeerRank, a system where LLMs evaluate each other through web-grounded peer review with built-in bias controls, potentially transforming how we benchmark AI models.
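A generic peer-review aggregation with one simple bias control can be sketched as follows; this is not PeerRank's actual protocol (which is web-grounded and more elaborate), just an illustration of scoring peers while excluding self-reviews and normalizing each reviewer's scale.

```python
# Toy sketch of LLM peer ranking: each model scores every other
# model's answers, self-scores are excluded, and each reviewer's
# scores are z-normalized so a generous or harsh reviewer cannot
# skew the ranking. Names and scores are made up for illustration.
from statistics import mean, pstdev

def peer_rank(scores: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """scores[reviewer][candidate] -> raw score; returns ranked (candidate, avg)."""
    normalized: dict[str, list[float]] = {}
    for reviewer, given in scores.items():
        others = {c: s for c, s in given.items() if c != reviewer}  # drop self-review
        mu, sigma = mean(others.values()), pstdev(others.values()) or 1.0
        for cand, s in others.items():
            normalized.setdefault(cand, []).append((s - mu) / sigma)
    ranking = [(c, mean(v)) for c, v in normalized.items()]
    return sorted(ranking, key=lambda kv: kv[1], reverse=True)

votes = {
    "model_a": {"model_a": 10, "model_b": 7, "model_c": 5},
    "model_b": {"model_a": 8, "model_b": 9, "model_c": 4},
    "model_c": {"model_a": 6, "model_b": 5, "model_c": 9},
}
print(peer_rank(votes))  # model_a ranked first despite inflated self-scores elsewhere
```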
AI Agents
New research introduces synthetic semantic information gain rewards to optimize when AI agents should retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.
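The core decision this line describes, retrieving only when it is likely to pay off, can be illustrated with a minimal uncertainty gate: skip retrieval when the model's answer distribution is already low-entropy. The threshold and probabilities below are invented; the paper's reward construction is not reproduced here.

```python
# Toy sketch of gating retrieval on expected information gain:
# retrieve external knowledge only when the answer distribution is
# uncertain enough (high entropy) that new evidence could help.
# Threshold and example probabilities are illustrative.
import math

def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def should_retrieve(answer_probs: list[float], threshold: float = 1.0) -> bool:
    """Skip the retrieval call on confident steps to save cost."""
    return entropy(answer_probs) > threshold

print(should_retrieve([0.9, 0.05, 0.05]))  # confident -> False
print(should_retrieve([0.4, 0.3, 0.3]))    # uncertain -> True
```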
LLM Research
New research introduces DAJ, a data-reweighting approach for LLM judges that improves test-time scaling in code generation by better identifying correct solutions.
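For intuition, the judge-based selection step in test-time scaling can be sketched as a reliability-weighted vote over candidate solutions. Note the hedge: DAJ's reweighting reportedly operates on the judge's data, whereas this toy sketch weights judge verdicts directly; all names, weights, and scores are invented.

```python
# Toy sketch of reliability-weighted judging for best-of-n code
# generation: several judge passes score each candidate solution,
# and each pass is weighted by its estimated reliability when
# picking the final answer. Illustrative only; not DAJ's method.
def pick_best(candidates: list[str],
              judge_scores: list[list[float]],
              weights: list[float]) -> str:
    """judge_scores[j][i] is judge j's score for candidate i."""
    totals = [
        sum(w * scores[i] for w, scores in zip(weights, judge_scores))
        for i in range(len(candidates))
    ]
    return candidates[max(range(len(candidates)), key=totals.__getitem__)]

best = pick_best(
    ["sol_a", "sol_b"],
    judge_scores=[[0.2, 0.9], [0.8, 0.4]],  # two judges, two candidates
    weights=[0.7, 0.3],                     # first judge deemed more reliable
)
print(best)  # sol_b
```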
LLM Evaluation
New research reveals smaller language models can outperform large LLMs at evaluation tasks through semantic capacity asymmetry, challenging the dominant LLM-as-a-Judge paradigm.
AI Agents
New research presents a framework for building capable small language model agents using synthetic tasks, simulated environments, and structured rubric-based rewards, democratizing agentic AI development.
LLM Security
Researchers discover that simulating intoxicated speech patterns can bypass AI safety guardrails. The 'In Vino Veritas' attack reveals fundamental weaknesses in how LLMs handle linguistic degradation.