LLM Safety
Global Subspace Projection: A New Approach to LLM Detoxification
Researchers propose a novel technique for removing toxic behaviors from large language models by projecting out malicious representations in the model's latent space.
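The general idea behind subspace projection (this is an illustrative sketch, not the paper's actual method or code): given directions in the latent space associated with toxic behavior, subtract their span from each hidden state.

```python
import numpy as np

def project_out(h, V):
    """Remove the subspace spanned by V's columns from vector h.

    h' = h - V (V^T V)^{-1} V^T h, computed via the pseudoinverse.
    """
    P = V @ np.linalg.pinv(V)  # projector onto span(V)
    return h - P @ h

# Hypothetical toy data: an 8-dim hidden state, a 2-dim "toxic" subspace.
rng = np.random.default_rng(0)
V = rng.normal(size=(8, 2))
h = rng.normal(size=8)
h_clean = project_out(h, V)
# h_clean has no remaining component in span(V):
print(np.allclose(V.T @ h_clean, 0))  # True
```

The cleaned state keeps everything orthogonal to the removed subspace, which is why the approach can suppress a targeted behavior while leaving other capabilities intact.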
LLM Alignment
Researchers introduce ECLIPTICA, a framework using Contrastive Instruction-Tuned Alignment (CITA) to enable dynamic switching between aligned and unaligned LLM behaviors for safety research.
Transformer Architecture
A deep dive into the transformer architecture that powers everything from ChatGPT to AI video generators. Understanding attention mechanisms and why this design revolutionized machine learning.
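The attention mechanism at the heart of the transformer can be sketched in a few lines (shapes and names here are illustrative, not from any particular model):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # weighted sum of value vectors

# Hypothetical toy shapes: 4 token positions, 16-dim head.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 16))
K = rng.normal(size=(4, 16))
V = rng.normal(size=(4, 16))
out = attention(Q, K, V)
print(out.shape)  # (4, 16)
```

Each output position is a data-dependent mixture of all value vectors, which is what lets the architecture model long-range dependencies in one step.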
AI Agents
AI agents often fail after several steps due to error compounding and context degradation. Deep Agents architecture introduces new mechanisms to maintain coherence across extended task execution.
deepfake detection
HONOR will showcase AI-powered deepfake detection technology at MWC 2025, marking a significant push to bring synthetic media authentication directly to consumer smartphones.
deepfake regulation
The UK government is preparing to enforce legislation targeting companies that provide tools for creating AI deepfakes, marking a significant regulatory shift in synthetic media governance.
deepfakes
The Deepfake Summit debuts as the inaugural Prism Project event, bringing together fraud prevention and identity verification leaders to address AI-driven synthetic media threats.
deepfake detection
A new World Economic Forum-backed report details how synthetic media threatens Know Your Customer (KYC) verification systems, highlighting the urgent need for enhanced deepfake detection in financial identity processes.
LLM Security
Researchers reveal how large language models can be manipulated with fabricated evidence, raising critical questions about AI reliability and the spread of misinformation through synthetic content.
LLM Quantization
New quantization method FLRQ achieves up to 2.5x faster compression of large language models while maintaining accuracy through flexible low-rank matrix approximation techniques.
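The underlying idea of low-rank compression (a generic sketch of the technique, not the FLRQ algorithm itself): approximate a weight matrix with two thin factors via truncated SVD, storing far fewer parameters.

```python
import numpy as np

def low_rank_approx(W, r):
    """Best rank-r approximation of W as A @ B, via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]  # (m, r): left factor absorbs singular values
    B = Vt[:r, :]         # (r, n): right factor
    return A, B

# Hypothetical 64x64 weight matrix compressed to rank 8.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
A, B = low_rank_approx(W, r=8)
print(W.size, A.size + B.size)  # 4096 1024 -> 4x fewer parameters
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

Real methods combine this with quantizing the factors to low bit widths; the speed and accuracy figures in the summary come from the paper, not this sketch.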
deepfakes
The UK government pressures Elon Musk's X platform to address AI-generated deepfakes created by Grok chatbot, marking escalating regulatory scrutiny of synthetic media on social platforms.
deepfake detection
New UNITE framework combines facial, audio, and temporal analysis for comprehensive deepfake detection, moving beyond single-modality approaches that struggle with advanced synthetic media.
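Multi-modal detectors typically combine per-modality scores rather than trusting one signal; a minimal late-fusion sketch (the weights, score values, and function are hypothetical, not UNITE's actual architecture):

```python
def fuse_scores(scores, weights):
    """Weighted average of per-modality fake-probability scores."""
    total = sum(weights.values())
    return sum(scores[m] * w for m, w in weights.items()) / total

# Hypothetical detector outputs for one video clip.
scores = {"face": 0.91, "audio": 0.40, "temporal": 0.75}
weights = {"face": 0.5, "audio": 0.2, "temporal": 0.3}
print(round(fuse_scores(scores, weights), 3))  # 0.76
```

Fusing modalities means a fake that defeats one detector (say, a clean voice clone) can still be flagged by facial or temporal inconsistencies.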