How Adversarial Attacks Circumvent LLM Safety Systems
Researchers detail how prompt injection, jailbreaking, and gradient-based attacks systematically defeat the layered safety mechanisms designed to keep large language models aligned and secure.
As deepfake cyberattacks grow more sophisticated, security experts argue that traditional perimeter defenses are insufficient. A trust-centric approach may offer better protection against AI-generated threats.
Startup Positron secures $230M Series B to develop alternative AI chips, potentially reshaping the hardware landscape that powers video generation and synthetic media systems.
New research proposes PeerRank, a system where LLMs evaluate each other through web-grounded peer review with built-in bias controls, potentially transforming how we benchmark AI models.
An AI security company's hiring process became a real-world test of deepfake detection when a synthetic candidate attempted to infiltrate the company through video interviews.
Pindrop integrates its real-time fraud and deepfake defense technology with NICE's CXone and CX AI platform, bringing voice authentication to enterprise contact centers.
Researchers reveal how imperceptible visual perturbations embedded in images can hijack vision-language models, bypassing safety filters and manipulating AI outputs without human detection.
Avast expands its security suite with Deepfake Guard, a real-time AI detection tool designed to protect consumers from synthetic media threats during video calls and content viewing.
New research argues that AI systems claiming to be human-centric must demonstrate a measurable ability to understand humans, proposing frameworks for defining and testing these requirements.
New research examines how persuasive content propagates through multi-agent LLM systems, revealing critical insights for AI safety and synthetic influence detection.
New research introduces synthetic rewards based on semantic information gain to optimize when AI agents retrieve external knowledge, improving reasoning efficiency without sacrificing accuracy.
A deep dive into engineering production-ready AI agents for healthcare, covering system architecture, MLOps pipelines, safety guardrails, and governance frameworks for high-stakes deployments.