Prompt Injection Has an Enterprise-Grade Response — Finally
Since large language models gained web access, prompt injection has sat at the top of every serious enterprise security team's concern list. The attack class is straightforward in concept and deeply difficult to solve in practice: hide malicious instructions inside content the model will process — a webpage, a document, a calendar invite — and cause it to act in ways the user never intended. That might mean leaking data, impersonating a user, or taking autonomous actions on connected systems. OpenAI's Lockdown Mode is the most direct response from a frontier lab to date.
Rolling out to ChatGPT Business self-serve accounts and eligible personal accounts across the Free, Go, Plus, and Pro tiers, Lockdown Mode is opt-in and explicitly not meant for general use. OpenAI's own framing is precise: "Lockdown Mode is not intended for everyone. It is designed for people and organizations that handle sensitive data and want stricter protection from data exfiltration risks related to prompt injection." The company has drawn a clear line between the product experience it wants to protect for most users and the hardened posture it now offers for the most sensitive environments.
What Gets Disabled — and What Stays
Activating Lockdown Mode cuts four capabilities: live web browsing is replaced by access to cached content only; retrieval and display of images sourced from the web is blocked; deep research mode, which conducts multi-step autonomous searches, is suspended; and agent mode, which allows ChatGPT to take actions on connected services, is disabled. Each of these represents a channel through which a successful injection could reach out and exfiltrate data or trigger downstream effects.
What stays active matters just as much: image generation, which operates on internal compute and does not pull from the web, and standard conversational interactions based on the model's training data and any content the user directly provides. For organisations that primarily use ChatGPT for drafting, analysis, and knowledge retrieval over internal uploads rather than live web access, the trade-off may be nearly invisible in day-to-day use.
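For teams documenting the trade-off internally, the split described above can be written down as a simple capability table. The following is a minimal sketch for illustration only: the `Capability` names and the `allowed_capabilities` helper are hypothetical, invented here to mirror the article's list, and do not correspond to any OpenAI API or admin setting.

```python
from enum import Enum


class Capability(Enum):
    """Hypothetical labels for the capabilities discussed in the article."""
    LIVE_WEB_BROWSING = "live_web_browsing"
    CACHED_WEB_CONTENT = "cached_web_content"
    WEB_IMAGES = "web_images"
    DEEP_RESEARCH = "deep_research"
    AGENT_MODE = "agent_mode"
    IMAGE_GENERATION = "image_generation"
    CONVERSATION = "conversation"


# The four capabilities the article lists as cut under Lockdown Mode.
LOCKDOWN_DISABLED = frozenset({
    Capability.LIVE_WEB_BROWSING,
    Capability.WEB_IMAGES,
    Capability.DEEP_RESEARCH,
    Capability.AGENT_MODE,
})


def allowed_capabilities(lockdown: bool) -> frozenset:
    """Return the capabilities still available for a given mode."""
    all_caps = frozenset(Capability)
    return all_caps - LOCKDOWN_DISABLED if lockdown else all_caps


if __name__ == "__main__":
    for cap in sorted(allowed_capabilities(lockdown=True), key=lambda c: c.value):
        print(cap.value)
```

A table like this is mainly useful as a checklist: if a team's daily workflows only touch the capabilities that survive, Lockdown Mode is likely a low-friction choice for them.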
Elevated Risk Labels: A Parallel Signal
Alongside Lockdown Mode, OpenAI is introducing Elevated Risk Labels — a companion feature that flags content the model identifies as potentially containing injection attempts without automatically blocking it. This gives security-conscious users a middle path: visibility into suspicious content without the full capability restrictions of Lockdown Mode. For teams that need the web-enabled functionality but want a heads-up when something looks off, the labels offer a meaningful additional signal.
The combination of the two features reflects a layered approach to the problem. Lockdown Mode is the hard control for environments where the cost of a successful injection is simply too high to accept. Elevated Risk Labels are the softer signal for everyone else — a persistent reminder that the threat exists even when it is not being actively blocked. Together they represent OpenAI's first real acknowledgment that its platform requires a security architecture, not just a safety policy.
Honest About the Limits
What stands out about OpenAI's announcement is its candour about what Lockdown Mode cannot do. The company states plainly that even with the feature enabled, "ChatGPT could still be vulnerable to prompt injections — which could, for example, appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response." The stated goal is reduction, not elimination: lowering the likelihood that a successful injection leads to sensitive data being shared.
That honesty matters more than it might seem. Security teams have spent years dealing with vendors who oversell protection and underdeliver on transparency. OpenAI's framing positions Lockdown Mode correctly: as one layer in a defence-in-depth strategy, not a replacement for data classification, access controls, output monitoring, and employee training. The most dangerous scenario is not an imperfect security feature; it is a security feature that causes organisations to stand down other controls on the assumption that the problem is solved.
The Broader Shift in Enterprise AI Security
Lockdown Mode arrives in a landscape where prompt injection has moved from proof-of-concept to documented exploitation. As we noted in our coverage of the Google Gemini voice assistant prompt injection attack — where malicious instructions delivered through phone notifications were used to manipulate the assistant — the attack surface expands every time a model gains a new capability or integration. OpenAI's approach of disabling the highest-risk channels, rather than attempting to detect every possible malicious instruction, reflects a pragmatic assessment of where the state of the art actually stands.
We are at an inflection point in how AI platforms think about security. For most of the past three years, security has been an afterthought, bolted on after the capability ships. Lockdown Mode signals a shift: enterprise security is now a first-class product requirement. Every major AI provider will need to follow with equivalent or stronger controls, or accept that regulated-industry customers will route around them to more hardened alternatives.
What Security Teams Should Do Now
For security and IT leaders managing ChatGPT deployments, the immediate action is evaluation. Which user populations handle data sensitive enough to warrant Lockdown Mode? Where are the boundaries between teams that need agent and research capabilities and those that do not? The rollout to self-serve Business accounts means activation is available now — there is no reason to wait for a formal review cycle before running pilots in the highest-risk environments.
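The evaluation questions above can be turned into a first-pass triage of user populations. This is a rough sketch under stated assumptions: the `Team` fields, the three outcome labels, and the example teams are all hypothetical, and a real assessment would rest on an organisation's own data classification rather than two booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    """Hypothetical description of a user population."""
    name: str
    handles_sensitive_data: bool
    needs_agent_or_research: bool


def triage(team: Team) -> str:
    """Suggest a first-pass posture for a team (illustrative labels only)."""
    if not team.handles_sensitive_data:
        return "standard"
    if team.needs_agent_or_research:
        # Sensitive data, but the team relies on capabilities that
        # Lockdown Mode disables: needs a human decision.
        return "review"
    return "pilot_lockdown"


if __name__ == "__main__":
    teams = [
        Team("legal", handles_sensitive_data=True, needs_agent_or_research=False),
        Team("market-research", handles_sensitive_data=True, needs_agent_or_research=True),
        Team("marketing", handles_sensitive_data=False, needs_agent_or_research=True),
    ]
    for t in teams:
        print(f"{t.name}: {triage(t)}")
```

The "review" bucket is the one that deserves attention: those are the teams where the boundary between capability and exposure has to be drawn deliberately.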
Longer term, we expect Lockdown Mode to become a standard line item in AI deployment governance frameworks — the equivalent of data loss prevention policies in the pre-AI era. Organisations that establish the governance muscle now, including the policies, user training, and monitoring infrastructure to make controls like this effective, will be substantially better positioned when regulators begin mandating equivalent protections. That moment is closer than most security leaders currently assume.



