LLM Interpretability
LLM Self-Explanations Can Predict Model Behavior, Study Finds
New research presents evidence that LLM self-explanations can help predict model behavior, offering a positive case for faithfulness in AI interpretability.
LLM Evaluation
New research proposes PeerRank, a system where LLMs evaluate each other through web-grounded peer review with built-in bias controls, potentially transforming how we benchmark AI models.
AI Agents
New research introduces synthetic semantic information gain rewards to optimize when AI agents should retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.
LLM Research
New research introduces DAJ, a data-reweighting approach for LLM judges that improves test-time scaling in code generation by better identifying correct solutions.
LLM Evaluation
New research reveals smaller language models can outperform large LLMs at evaluation tasks through semantic capacity asymmetry, challenging the dominant LLM-as-a-Judge paradigm.
AI Agents
New research presents a framework for building capable small language model agents using synthetic tasks, simulated environments, and structured rubric-based rewards, democratizing agentic AI development.
LLM Security
Researchers discover that simulating intoxicated speech patterns can bypass AI safety guardrails. The 'In Vino Veritas' attack reveals fundamental weaknesses in how LLMs handle linguistic degradation.
LoRA
Learn how Low-Rank Adaptation lets you customize billion-parameter AI models on standard laptops, the same technique powering custom deepfakes and AI video generation.
LLM Reasoning
New research reveals that even frontier AI models like GPT-4 and Claude struggle with basic reasoning puzzles, exposing fundamental limitations in how large language models process logic.
AI Benchmarks
New benchmark evaluates whether frontier AI models can perform PhD-level scientific research tasks, revealing significant gaps between current capabilities and expert human performance.
AI Safety
New arXiv research investigates how varying levels of information access affect LLM monitors' ability to detect sabotage, with implications for AI safety and oversight systems.
LLM Research
New research introduces test-time policy evolution to scale LLM reasoning without additional training, enabling models to dynamically improve their problem-solving strategies during inference.