Liquid AI
Liquid AI's LFM2.5-350M: Big Performance, Tiny Model
Liquid AI releases a 350M parameter model trained on 28 trillion tokens with scaled reinforcement learning, challenging assumptions about what compact models can achieve.
LLM evaluation
Researchers introduce REAL, a regression-aware reinforcement learning framework that trains LLM judges to produce more accurate evaluations by optimizing for numerical accuracy rather than treating scoring as a classification task.
AI Safety
New research examines whether safety guardrails in large language models remain intact when agents are optimized for helpfulness through reinforcement learning.
LLM
New research combines reinforcement learning with knowledge distillation to improve how smaller language models learn complex reasoning from larger teacher models.
LLM Agents
New research introduces Tool-R0, a framework enabling LLM agents to autonomously learn tool usage through self-evolution, eliminating the need for curated training datasets while achieving state-of-the-art performance.
AI Agents
New research introduces MIRA, a framework that integrates memory architectures with reinforcement learning while minimizing expensive LLM calls, advancing efficient autonomous agent design.
Agentic AI
New research proposes proxy state-based evaluation for multi-turn tool-calling LLM agents, addressing the challenge of scalable reward verification in complex agentic workflows.
LLM
New research introduces ELPO, a training method that teaches LLMs to learn from irrecoverable errors in tool-integrated reasoning chains, improving agent capabilities.
LLM Agents
New research introduces Agent-Omit, a reinforcement learning framework that trains LLM agents to selectively omit unnecessary reasoning steps and observations, dramatically improving computational efficiency.
LLM Infrastructure
New research introduces PROTEUS, a reinforcement learning framework that uses Lagrangian optimization to route requests across multiple LLMs while meeting strict service-level agreements.
LLM Research
New research introduces a method that preserves correct reasoning steps while penalizing errors, improving LLM performance through finer-grained credit assignment in reinforcement learning.
Voice AI
New research demonstrates how reinforcement learning from AI feedback can optimize spoken dialogue systems using multiple LLM evaluators, reducing dependency on costly human annotations.