Machine Learning - SkrewAI (Page 8)

LLM Interpretability

LLM Self-Explanations Can Predict Model Behavior, Study Finds

New research presents evidence that LLM self-explanations can help predict model behavior, offering a positive case for faithfulness in AI interpretability.

LLM Evaluation

PeerRank: A New Framework for Autonomous LLM Evaluation

New research proposes PeerRank, a system where LLMs evaluate each other through web-grounded peer review with built-in bias controls, potentially transforming how we benchmark AI models.

AI Agents

Semantic Information Gain Rewards for Smarter AI Retrieval

New research introduces synthetic semantic information gain rewards to optimize when AI agents should retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.

LLM Research

DAJ: Data-Reweighted LLM Judges Improve Code Generation

New research introduces DAJ, a data-reweighting approach for LLM judges that improves test-time scaling in code generation by better identifying correct solutions.

LLM Evaluation

Representation-as-a-Judge: Small Models Beat LLMs at Evaluation

New research reveals smaller language models can outperform large LLMs at evaluation tasks through semantic capacity asymmetry, challenging the dominant LLM-as-a-Judge paradigm.

AI Agents

Training Small AI Agents with Synthetic Worlds and Rubric Rewards

New research presents a framework for building capable small language model agents using synthetic tasks, simulated environments, and structured rubric-based rewards—democratizing agentic AI development.

LLM Security

Drunk Prompts: Novel Jailbreak Method Exposes LLM Safety Gaps

Researchers discover that simulating intoxicated speech patterns can bypass AI safety guardrails. The 'In Vino Veritas' attack reveals fundamental weaknesses in how LLMs handle linguistic degradation.

LoRA

LoRA Fine-Tuning: Run Massive AI Models on Consumer Hardware

Learn how Low-Rank Adaptation lets you customize billion-parameter AI models on standard laptops—the same technique powering custom deepfakes and AI video generation.

LLM Reasoning

Why Advanced AI Models Still Fail at Simple Logic Puzzles

New research reveals that even frontier AI models like GPT-4 and Claude struggle with basic reasoning puzzles, exposing fundamental limitations in how large language models process logic.

AI Benchmarks

FrontierScience Benchmark Tests AI on Expert Science Tasks

New benchmark evaluates whether frontier AI models can perform PhD-level scientific research tasks, revealing significant gaps between current capabilities and expert human performance.

AI Safety

Research Explores How Information Access Shapes AI Sabotage Detection

New arXiv research investigates how varying levels of information access affect LLM monitors' ability to detect sabotage, with implications for AI safety and oversight systems.

LLM Research

Policy of Thoughts: Evolving LLM Reasoning at Test Time

New research introduces test-time policy evolution to scale LLM reasoning without additional training, enabling models to dynamically improve their problem-solving strategies during inference.