SecurityBrief Australia - Technology news for CISOs & cybersecurity decision-makers
Flux result 05469706 4bde 42de be79 376351dd4b3e

OpenAI launches safety bug bounty for AI abuse risks

Fri, 27th Mar 2026

OpenAI has launched a public Safety Bug Bounty programme focused on identifying AI abuse and safety risks across its products.

The programme sits alongside OpenAI's existing Security Bug Bounty and expands the range of issues researchers can report. It covers problems that may not qualify as traditional security vulnerabilities but could still create meaningful abuse or safety risks.

Submissions will be reviewed by OpenAI's Safety and Security Bug Bounty teams, and reports may be moved between the safety and security tracks depending on the issue and which team owns it.

Scope areas

The programme targets AI-specific scenarios in three broad areas: agentic risks, exposure of proprietary information, and account or platform integrity.

Within agentic risks, OpenAI highlighted third-party prompt injection and data exfiltration. These cases involve attacker-controlled text that can hijack a user's agent, including Browser, ChatGPT Agent, and similar products, to trigger a harmful action or reveal sensitive information.

To qualify, the behaviour must be reproducible at least 50% of the time. Valid reports may also include cases in which an agentic OpenAI product performs a disallowed action on OpenAI's own website at scale, or takes another potentially harmful action where the report shows plausible, material harm.

Testing related to model context protocol, or MCP, risks must comply with third-party terms of service under the programme rules.

A second category covers OpenAI proprietary information. This includes model outputs that reveal proprietary information related to reasoning, as well as vulnerabilities that expose other proprietary company information.

The third category focuses on account and platform integrity. It covers vulnerabilities affecting anti-automation controls, account trust signals, and enforcement of restrictions such as suspensions or bans.

OpenAI distinguished those issues from broader access control problems. Cases that let users reach features, data, or functions beyond their authorised permissions should instead be reported through the existing Security Bug Bounty programme.

Out of scope

General content-policy bypasses without a clear safety or abuse impact are not covered by the new safety scheme. Examples include jailbreaks that only cause a model to use rude language or return information that is already easy to find through search engines.

Jailbreaks are generally outside the public programme's scope, although OpenAI runs private bug bounty campaigns on some specific harm types. It cited biorisk content issues in ChatGPT Agent and GPT-5 as examples covered by those private efforts.

Even so, the company left some room for discretion. Issues outside the listed categories may still be eligible on a case-by-case basis if they create direct paths to user harm and have clear, discrete remediation steps.

The move reflects a broader shift in how AI companies define product risk as their systems take on more autonomous behaviour. Traditional bug bounty schemes usually focus on software flaws such as code execution, authentication failures, or data exposure. AI products can also create risks through model behaviour, prompt handling, and interactions with external tools or websites.

Agentic systems have become a particular focus for researchers because they can browse, take actions, and process untrusted content from third parties. That creates openings for prompt injection attacks, in which hidden or malicious instructions are embedded in web pages, files, or other material and then acted on by an AI system.

By creating a separate public reporting channel for those issues, OpenAI is signalling that safety failures linked to model behaviour and misuse pathways should be handled through formal processes similar to those used for conventional cybersecurity flaws. The programme also suggests the company expects greater scrutiny of products that can act on behalf of users.

Participation is open to researchers who apply through the Safety Bug Bounty programme. OpenAI said it wants to work with researchers, ethical hackers, and the wider safety and security community to identify and address harmful issues across its systems.