LLM Research
Knowledge Model Prompting Boosts LLM Planning Performance
New research introduces Knowledge Model Prompting, a technique that enhances LLM reasoning on complex planning tasks by structuring domain knowledge representation.
LLM Agents
New research introduces AgentArk, a framework that transfers multi-agent intelligence into a single LLM agent, potentially making complex AI systems far cheaper to deploy.
LLM Research
New research introduces Accordion-Thinking, a self-regulated approach that compresses reasoning steps dynamically to improve LLM efficiency while maintaining readable chain-of-thought outputs.
LLM Efficiency
New research proposes dynamic precision routing to optimize computational resources across multi-step LLM interactions, balancing quality and efficiency through adaptive quantization strategies.
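The blurb above only names the idea; as a rough illustration, "precision routing" can be pictured as choosing a quantization bit-width per step based on how hard that step looks. The function names, thresholds, and bit-widths below are invented for this sketch, not taken from the paper.

```python
# Toy sketch of dynamic precision routing: map an estimated
# difficulty score for each step of a multi-step interaction to a
# quantization bit-width, spending full precision only where it
# matters. All names and thresholds are illustrative.

def route_precision(difficulty: float) -> int:
    """Map a difficulty estimate in [0, 1] to a bit-width."""
    if difficulty < 0.3:
        return 4    # easy step: aggressive 4-bit quantization
    if difficulty < 0.7:
        return 8    # moderate step: 8-bit
    return 16       # hard step: near-full precision

def plan_budget(difficulties: list[float]) -> list[int]:
    """Assign a bit-width to each interaction step."""
    return [route_precision(d) for d in difficulties]

print(plan_budget([0.1, 0.5, 0.9]))  # [4, 8, 16]
```

The real system would also need a difficulty estimator and quantized model variants to route between; this sketch shows only the routing policy itself.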
AI Agents
New research introduces MARS, a modular agent with reflective search capabilities designed to automate AI research tasks through intelligent decomposition and self-correction.
LLM Interpretability
New research presents evidence that LLM self-explanations can help predict model behavior, offering a positive case for faithfulness in AI interpretability.
LLM Evaluation
New research proposes PeerRank, a system where LLMs evaluate each other through web-grounded peer review with built-in bias controls, potentially transforming how we benchmark AI models.
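A generic peer-review aggregation with one simple bias control can be sketched as follows; this is not PeerRank's actual protocol (which is web-grounded and more elaborate), just an illustration of scoring peers while excluding self-reviews and normalizing each reviewer's scale.

```python
# Toy sketch of LLM peer ranking: each model scores every other
# model's answers, self-scores are excluded, and each reviewer's
# scores are z-normalized so a generous or harsh reviewer cannot
# skew the ranking. Names and scores are made up for illustration.
from statistics import mean, pstdev

def peer_rank(scores: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """scores[reviewer][candidate] -> raw score; returns ranked (candidate, avg)."""
    normalized: dict[str, list[float]] = {}
    for reviewer, given in scores.items():
        others = {c: s for c, s in given.items() if c != reviewer}  # drop self-review
        mu, sigma = mean(others.values()), pstdev(others.values()) or 1.0
        for cand, s in others.items():
            normalized.setdefault(cand, []).append((s - mu) / sigma)
    ranking = [(c, mean(v)) for c, v in normalized.items()]
    return sorted(ranking, key=lambda kv: kv[1], reverse=True)

votes = {
    "model_a": {"model_a": 10, "model_b": 7, "model_c": 5},
    "model_b": {"model_a": 8, "model_b": 9, "model_c": 4},
    "model_c": {"model_a": 6, "model_b": 5, "model_c": 9},
}
print(peer_rank(votes))  # model_a ranked first despite inflated self-scores elsewhere
```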
AI Agents
New research introduces synthetic semantic information gain rewards to optimize when AI agents should retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.
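The core decision this line describes, retrieving only when it is likely to pay off, can be illustrated with a minimal uncertainty gate: skip retrieval when the model's answer distribution is already low-entropy. The threshold and probabilities below are invented; the paper's reward construction is not reproduced here.

```python
# Toy sketch of gating retrieval on expected information gain:
# retrieve external knowledge only when the answer distribution is
# uncertain enough (high entropy) that new evidence could help.
# Threshold and example probabilities are illustrative.
import math

def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def should_retrieve(answer_probs: list[float], threshold: float = 1.0) -> bool:
    """Skip the retrieval call on confident steps to save cost."""
    return entropy(answer_probs) > threshold

print(should_retrieve([0.9, 0.05, 0.05]))  # confident -> False
print(should_retrieve([0.4, 0.3, 0.3]))    # uncertain -> True
```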
LLM Research
New research introduces DAJ, a data-reweighting approach for LLM judges that improves test-time scaling in code generation by better identifying correct solutions.
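For intuition, the judge-based selection step in test-time scaling can be sketched as a reliability-weighted vote over candidate solutions. Note the hedge: DAJ's reweighting reportedly operates on the judge's data, whereas this toy sketch weights judge verdicts directly; all names, weights, and scores are invented.

```python
# Toy sketch of reliability-weighted judging for best-of-n code
# generation: several judge passes score each candidate solution,
# and each pass is weighted by its estimated reliability when
# picking the final answer. Illustrative only; not DAJ's method.
def pick_best(candidates: list[str],
              judge_scores: list[list[float]],
              weights: list[float]) -> str:
    """judge_scores[j][i] is judge j's score for candidate i."""
    totals = [
        sum(w * scores[i] for w, scores in zip(weights, judge_scores))
        for i in range(len(candidates))
    ]
    return candidates[max(range(len(candidates)), key=totals.__getitem__)]

best = pick_best(
    ["sol_a", "sol_b"],
    judge_scores=[[0.2, 0.9], [0.8, 0.4]],  # two judges, two candidates
    weights=[0.7, 0.3],                     # first judge deemed more reliable
)
print(best)  # sol_b
```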
LLM Evaluation
New research reveals smaller language models can outperform large LLMs at evaluation tasks through semantic capacity asymmetry, challenging the dominant LLM-as-a-Judge paradigm.
AI Agents
New research presents a framework for building capable small language model agents using synthetic tasks, simulated environments, and structured rubric-based rewards, democratizing agentic AI development.
LLM Security
Researchers discover that simulating intoxicated speech patterns can bypass AI safety guardrails. The 'In Vino Veritas' attack reveals fundamental weaknesses in how LLMs handle linguistic degradation.