AI Security - SkrewAI (Page 2)

LLM safety

Selective Geometry Control: A New Approach to LLM Safety

New research proposes geometric methods to make LLM safety alignment more robust, with potential benefits for AI systems that moderate synthetic media and deepfake content.

synthetic data

New Attack Methods Target Multi-Table Synthetic Data Privacy

Researchers unveil new membership inference attack techniques for multi-table synthetic data, exposing privacy vulnerabilities in relational database anonymization systems.

open-source AI

Chinese AI Models Dominate Open-Source as Western Labs Retreat

Over 175,000 unprotected systems run Chinese AI models as Western labs shift away from open-source, raising security and geopolitical questions for the synthetic media ecosystem.

AI Security

MultiKrum: Defending Distributed AI Training from Byzantine Attacks

New research on MultiKrum explores optimal robustness definitions for Byzantine machine learning, critical for securing distributed AI training against adversarial participants.

AI Security

Adversarial AI Explanations: How Attackers Exploit Trust

New research reveals how adversarial attacks can manipulate AI explanation systems to mislead human decision-makers, with critical implications for content authenticity verification.

deepfakes

AI Security Firm Catches Deepfake Job Applicant in Interview

An AI security company's hiring process became a real-world test of deepfake detection when a synthetic candidate attempted to infiltrate through video interviews.

AI Security

Visual Prompt Injection: How Hidden Images Hack AI Systems

Researchers reveal how imperceptible visual perturbations embedded in images can hijack vision-language models, bypassing safety filters and manipulating AI outputs without human detection.

deepfakes

AI Crime Evolves: Deepfakes, Jailbreaks, and Malware Surge

Criminal exploitation of AI has matured rapidly, with deepfake fraud, sophisticated jailbreak techniques, and AI-generated malware becoming mainstream threats to digital security.

AI Security

Privacy Attacks Target Graph Diffusion Models: New Security Risks

New research reveals three classes of inference attacks against graph generative diffusion models, exposing membership inference, property inference, and data reconstruction vulnerabilities in AI generation systems.

AI Security

Open Framework Detects Attack Patterns in Multi-Agent AI Systems

New research introduces an open framework for training security models that detect temporal attack patterns in multi-agent AI workflows through trace-based analysis.

AI Security

Adversarial Attacks on LLM Resume Screeners Reveal AI Security Gaps

New research exposes how adversarial techniques can manipulate LLM-based resume screening systems, revealing fundamental security vulnerabilities in specialized AI applications.

AI Security

ArcGen: Cross-Architecture Backdoor Detection for Neural Networks

New research introduces ArcGen, a framework that generalizes neural backdoor detection across diverse model architectures without retraining, addressing critical AI security vulnerabilities.