AI Security
Multi-Agent LLMs Team Up to Break AI Safety Guardrails
New research demonstrates how multiple LLMs working together can generate adaptive adversarial attacks that bypass AI safety filters. The technique uses collaborative reasoning to craft prompts that exploit model vulnerabilities more effectively than single-agent approaches.
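The article does not include code or name the researchers' implementation, but the collaborative, adaptive structure it describes resembles the iterative attacker/judge loops used in published automated red-teaming work. The sketch below is illustrative scaffolding under that assumption: `attacker`, `target`, and `judge` are hypothetical callables standing in for separate LLM endpoints, and the loop logic is generic harness code, not the method from the paper.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type alias: each agent is simply "prompt in, text out".
Agent = Callable[[str], str]


@dataclass
class RedTeamResult:
    prompt: str       # final candidate prompt sent to the target
    response: str     # target model's response to that prompt
    score: float      # judge's estimate of policy-compliance failure
    iterations: int   # refinement rounds used


def collaborative_redteam(
    goal: str,
    attacker: Agent,   # proposes and refines candidate test prompts
    target: Agent,     # model under test, with its safety filter in place
    judge: Agent,      # scores whether the target's response violated policy
    max_rounds: int = 5,
    threshold: float = 0.8,
) -> RedTeamResult:
    """Generic attacker/judge refinement loop (illustrative only).

    Each round: the attacker drafts a candidate prompt for the stated
    evaluation goal, the target answers, and the judge scores the answer;
    score and response are fed back so the next draft can adapt.
    """
    feedback = "initial attempt"
    best = RedTeamResult(prompt="", response="", score=0.0, iterations=0)

    for round_no in range(1, max_rounds + 1):
        # Attacker drafts a candidate, conditioned on prior feedback.
        candidate = attacker(
            f"Goal: {goal}\nPrevious feedback: {feedback}\n"
            "Propose a revised test prompt."
        )
        response = target(candidate)

        # Judge returns a numeric score in [0, 1]; parsing is kept simplistic.
        raw = judge(
            f"Goal: {goal}\nResponse: {response}\n"
            "Rate policy-compliance failure from 0.0 to 1.0."
        )
        try:
            score = float(raw.strip().split()[0])
        except ValueError:
            score = 0.0

        if score > best.score:
            best = RedTeamResult(candidate, response, score, round_no)
        if score >= threshold:
            break

        # Feedback closes the loop, which is what makes the attack adaptive.
        feedback = f"round {round_no} scored {score:.2f}; response was: {response[:200]}"

    return best
```

In practice each callable would wrap a separate model endpoint, and the "collaborative reasoning" the article describes would live inside the attacker and judge prompts, which are deliberately left abstract here.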