LLM Detection
New Variation-Based Framework Advances LLM Text Detection
Researchers propose a variation-based approach to distinguish AI-generated text from human writing, analyzing how language models respond differently to perturbations.
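The blurb does not spell out the paper's exact statistic, but variation-based detection is commonly scored by how much a model's likelihood drops under small perturbations (machine text tends to sit near a local likelihood maximum). A minimal sketch, with toy stand-ins for the language model and the perturbation function:

```python
import random

def variation_score(text, lm_logprob, perturb, n=20):
    """Mean drop in model log-probability under n random perturbations.
    Model-generated text typically loses more likelihood when perturbed
    than human-written text, so a larger score suggests AI origin."""
    base = lm_logprob(text)
    return sum(base - lm_logprob(perturb(text)) for _ in range(n)) / n

# Toy stand-ins, purely illustrative: a "model" that prefers short words,
# and a perturbation that swaps one word for a longer one.
def toy_logprob(text):
    words = text.split()
    return -sum(len(w) for w in words) / len(words)

def toy_perturb(text):
    words = text.split()
    words[random.randrange(len(words))] = "approximately"
    return " ".join(words)
```

In practice `lm_logprob` would query a real scoring model and `perturb` would use a mask-filling model; the threshold separating human from machine text is calibrated on held-out data.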
LLM Agents
New research examines how different memory architectures affect LLM agent capabilities, offering insights into designing more effective AI systems.
LLM Evaluation
New research introduces a reference-free evaluation framework using multiple independent LLMs to assess AI outputs with better human alignment than single-judge approaches.
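The aggregation rule in the paper isn't specified in this summary, but the core idea of a reference-free multi-judge panel can be sketched with a simple robust aggregate; the judge callables here are illustrative placeholders:

```python
from statistics import median

def panel_score(output, judges):
    """Each independent judge maps an output to a score in [0, 1].
    Taking the median verdict is robust to any single judge's bias
    or failure, unlike relying on one LLM judge."""
    return median(j(output) for j in judges)
```

A richer panel might weight judges by calibration against human ratings or use majority vote on pairwise preferences rather than scalar scores.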
LLM Agents
New research introduces PABU, a framework that helps LLM agents track their progress and update beliefs more efficiently, reducing computational waste in multi-step reasoning tasks.
LLM Evaluation
New research uncovers systematic shortcuts in LLM-based evaluation systems, revealing how AI judges may rely on superficial patterns rather than genuine quality assessment.
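One common way to surface such shortcuts (a general diagnostic, not necessarily the paper's own method) is to perturb a superficial attribute of an output while holding its substance fixed and measure the judge's score shift:

```python
def shortcut_sensitivity(judge, output, surface_edit):
    """Score change when only a superficial attribute changes.
    `surface_edit` alters form (length, formatting, politeness)
    without changing substance; a large shift flags a shortcut."""
    return judge(surface_edit(output)) - judge(output)

# Toy illustration: a length-biased "judge" that rewards padded answers.
length_judge = lambda text: min(1.0, len(text) / 200)
pad = lambda text: text + " Indeed, to elaborate further, " * 5
```

A well-behaved judge should show near-zero sensitivity to such content-preserving edits.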
LLM Watermarking
New research introduces ArcMark, a multi-bit watermarking method for LLMs using optimal transport theory to embed verifiable information in AI-generated text while preserving output quality.
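ArcMark's optimal-transport machinery is beyond a blurb-sized sketch, but the basic mechanics of multi-bit watermarking can be illustrated with a simpler keyed-partition scheme (all function names and the scheme itself are illustrative, not the paper's method): each generation step prefers candidate tokens whose keyed hash parity matches the current message bit, and extraction recovers the bits by majority vote.

```python
import hashlib

def parity(token, key):
    """Keyed pseudo-random bit per token: a toy vocabulary partition."""
    h = hashlib.sha256(f"{key}:{token}".encode()).hexdigest()
    return int(h, 16) & 1

def embed(candidates_per_step, bits, key):
    """At each step, pick a candidate token whose parity matches the
    message bit for that position; fall back to the top candidate if
    none matches, so output quality degrades gracefully."""
    out = []
    for i, cands in enumerate(candidates_per_step):
        want = bits[i % len(bits)]
        out.append(next((t for t in cands if parity(t, key) == want), cands[0]))
    return out

def extract(tokens, nbits, key):
    """Majority-vote each message bit from the token parities."""
    votes = [[0, 0] for _ in range(nbits)]
    for i, t in enumerate(tokens):
        votes[i % nbits][parity(t, key)] += 1
    return [int(v[1] > v[0]) for v in votes]
```

Schemes like ArcMark aim to do this while provably minimizing distortion of the model's output distribution, which is where the optimal-transport formulation comes in.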
AI Research
A new benchmark suite evaluates how well AI agents can perform frontier research tasks, measuring capabilities from literature review to hypothesis generation and experimental design.
LLM Evaluation
Researchers propose rethinking how evaluation rubrics are generated for LLM judges and reward models, addressing critical challenges in assessing open-ended AI outputs.
AI Research
New arXiv research challenges the widely held belief that AI capabilities grow exponentially, presenting alternative mathematical models that could reshape how we predict and plan for AI advancement.
LLM Agents
New research introduces AgentArk, a framework that distills multi-agent intelligence into a single LLM agent, aiming to deliver the capability of a multi-agent system at single-agent deployment cost.
Prompt Engineering
New research applies Generative Flow Networks to automatic prompt optimization, offering a novel approach to improving AI system outputs through learned prompt engineering strategies.
LLM Efficiency
New research proposes dynamic precision routing to optimize computational resources across multi-step LLM interactions, balancing quality and efficiency through adaptive quantization strategies.
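The routing policy itself isn't detailed in this summary; a minimal sketch of the idea, with an assumed difficulty estimator and illustrative model callables, routes each step of an interaction to a quantized model unless estimated difficulty crosses a threshold:

```python
def route_step(prompt, difficulty, low_precision, full_precision, tau=0.5):
    """Adaptive precision routing: serve easy steps with cheap quantized
    inference (e.g. int4) and escalate hard steps to full precision.
    The difficulty estimator, models, and threshold tau are illustrative."""
    model = full_precision if difficulty(prompt) >= tau else low_precision
    return model(prompt)
```

A learned estimator (e.g. a small classifier over the prompt, or uncertainty from the cheap model's own logits) would replace the threshold heuristic in a real system.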