LLM Alignment
Study Questions Role of Diversity in LLM Moral Alignment
New research examines whether diversity in training data actually improves moral reasoning in LLMs when using reinforcement learning with verifiable rewards (RLVR), challenging common assumptions about alignment approaches.
LLM Alignment
Researchers introduce GRADE, a technique that replaces traditional policy gradient methods with direct backpropagation for aligning large language models, potentially offering more efficient training.
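To make the distinction concrete: classical policy-gradient methods estimate the gradient of expected reward from sampled actions via the score function (REINFORCE), whereas a differentiable reward lets you backpropagate through it exactly. The sketch below is a generic toy illustration of that contrast on a three-action softmax policy, not GRADE's actual algorithm; the payoff values and function names are invented for the example.

```python
import math
import random

random.seed(0)

# Toy setup: a softmax policy over three actions with fixed, known payoffs.
# (Illustrative values only -- not from the GRADE paper.)
PAYOFFS = [1.0, 2.0, 0.5]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def direct_gradient(logits):
    """Exact gradient of expected reward w.r.t. the logits.

    This is what backpropagation computes when the reward is a
    differentiable function of the policy: d E[r] / d logit_i
    = p_i * (payoff_i - E[r]).
    """
    p = softmax(logits)
    expected = sum(pi * r for pi, r in zip(p, PAYOFFS))
    return [pi * (r - expected) for pi, r in zip(p, PAYOFFS)]

def reinforce_gradient(logits, n_samples=200_000):
    """Score-function (policy-gradient) estimate of the same quantity.

    Averages r(a) * grad log pi(a) over sampled actions, where
    grad log pi(a) w.r.t. logit_i is (1[a == i] - p_i).
    """
    p = softmax(logits)
    grad = [0.0] * len(p)
    for _ in range(n_samples):
        a = random.choices(range(len(p)), weights=p)[0]
        r = PAYOFFS[a]
        for i in range(len(p)):
            grad[i] += r * ((1.0 if i == a else 0.0) - p[i])
    return [g / n_samples for g in grad]

logits = [0.1, -0.2, 0.3]
g_exact = direct_gradient(logits)
g_sampled = reinforce_gradient(logits)
# The sampled estimate converges to the exact gradient, but only the
# direct version is noise-free -- the efficiency argument for replacing
# policy gradients with backpropagation when the reward permits it.
```

The catch, and the reason policy gradients exist at all, is that a sampled reward signal is usually not differentiable in the policy parameters; methods in this vein must make the reward pathway differentiable before direct backpropagation applies.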
LLM Alignment
Researchers introduce ECLIPTICA, a framework using Contrastive Instruction-Tuned Alignment (CITA) to enable dynamic switching between aligned and unaligned LLM behaviors for safety research.