02 Apr 2026 22:45

Mitigating prompt injection attacks with a layered defense strategy

Posted by Adam Gavish, Google GenAI Security TeamWith the rapid adoption of generative AI, a new wave of threats is emerging across the industry with the aim of manipulating the AI systems themselves. One such emerging attack vector is indirect prompt injections. Unlike direct prompt injections, where an attacker directly inputs malicious commands into a prompt, indirect prompt injections involve hidden malicious instructions within external data sources. These may include emails, documents, or calendar invites that instruct AI to exfiltrate user data or execute other rogue actions. As more governments, businesses, and individuals adopt generative AI to get more done, this subtle yet potentially potent attack becomes increasingly pertinent across the industry, demanding immediate attention and robust security measures.At Google, our teams have a longstanding precedent of investing in a defense-in-depth strategy, including robust evaluation, threat analysis, AI security best practices, AI red-teaming, adversarial training, and model hardening for generative AI tools. This approach enables safer adoption of Gemini in Google Workspace and the Gemini app (we refer to both in this blog as “Gemini” for simplicity). Below we describe our prompt injection mitigation product strategy based on extensive research, development, and deployment of improved security mitigations.A layered security approachGoogle has taken a layered security approach introducing security measures designed for each stage of the prompt lifecycle. From Gemini 2.5 model hardening, to purpose-built machine learning (ML) models detecting malicious instructions, to system-level safeguards, we are meaningfully elevating the difficulty, expense, and complexity faced by an attacker. This approach compels adversa...

Original source: http://security.googleblog.com/2025/06/mitigating-prompt-injection-attacks.html
Publisher: Google Security Blog