Security is essential to OpenAI's mission. We appreciate the contributions of ethical hackers who help us uphold high privacy and security standards for our users and technology. This policy (based on disclose.io) outlines our definition of good faith regarding the discovery and reporting of vulnerabilities, and clarifies what you can expect from us in return.
The initial priority rating for most findings will use the Bugcrowd Vulnerability Rating Taxonomy. However, vulnerability priority and reward may be modified based on likelihood or impact at OpenAI's sole discretion. In cases of downgraded issues, researchers will receive a detailed explanation.
As part of this policy, we commit to:
- Provide Safe Harbor protection, as outlined below, for vulnerability research conducted according to these guidelines.
- Cooperate with you in understanding and validating your report, ensuring a prompt initial response to your submission.
- Remediate validated vulnerabilities in a timely manner.
- Acknowledge and credit your contribution to improving our security, if you are the first to report a unique vulnerability that leads to a code or configuration change.
Rules of Engagement
To help us distinguish between good-faith hacking and malicious attacks, you must follow these rules:
- You are authorized to perform testing in compliance with this policy.
- Follow this policy and any other relevant agreements. In case of inconsistency, this policy takes precedence.
- Promptly report discovered vulnerabilities.
- Refrain from violating privacy, disrupting systems, destroying data, or harming user experience.
- Use OpenAI's Bugcrowd program for vulnerability-related communication.
- Keep vulnerability details confidential until authorized for release by OpenAI's security team, which aims to provide authorization within 90 days of report receipt.
- Test only in-scope systems and respect out-of-scope systems.
- Do not access, modify, or use data belonging to others, including confidential OpenAI data. If a vulnerability exposes such data, stop testing, submit a report immediately, and delete all copies of the information.
- Interact only with your own accounts, unless authorized by OpenAI.
- Disclosure of vulnerabilities to OpenAI must be unconditional. Do not engage in extortion, threats, or other tactics to elicit a response under duress. OpenAI denies Safe Harbor for vulnerability disclosure conducted under such circumstances.
STOP. READ THIS. DO NOT SKIM OVER IT.
OpenAI is committed to making AI safe and useful for everyone. Before releasing a new system, we thoroughly test it, get expert feedback, improve its behavior, and set up safety measures. While we work hard to prevent risks, we can't predict every way people will use or misuse our technology in the real world.
Model safety issues do not fit well within a bug bounty program, as they are not individual, discrete bugs that can be directly fixed. Addressing these issues often involves substantial research and a broader approach. To ensure that these concerns are properly addressed, please report them using the appropriate form, rather than submitting them through the bug bounty program. Reporting them in the right place allows our researchers to use these reports to improve the model.
Issues related to the content of model prompts and responses are strictly out of scope, and will not be rewarded unless they have an additional directly verifiable security impact on an in-scope service (described below).
Examples of safety issues which are out of scope:
- Jailbreaks/Safety Bypasses (e.g. DAN and related prompts)
- Getting the model to say bad things to you
- Getting the model to tell you how to do bad things
- Getting the model to write malicious code for you
Model Hallucinations are also out of scope:
- Getting the model to pretend to do bad things
- Getting the model to pretend to give you answers to secrets
- Getting the model to pretend to be a computer and execute code
Sandboxed Python code executions are also out of scope:
Code execution from within our sandboxed Python code interpreter is out of scope. (This is an intended product feature.) When the model executes Python code, it does so within a sandbox. If you think you've gotten RCE outside the sandbox, you must include the output of uname -a. A result like the following indicates that you are inside the sandbox; specifically note the 2016 kernel version:
Linux 9d23de67-3784-48f6-b935-4d224ed8f555 4.4.0 #1 SMP Sun Jan 10 15:06:54 PST 2016 x86_64 x86_64 x86_64 GNU/Linux
Inside the sandbox you would also see sandbox as the output of whoami, and as the only user in the output of
None of these issues may be reported through Bugcrowd.
None of these issues will receive a monetary reward.
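As a rough illustration of the sandbox check described above, a researcher could compare captured uname -a and whoami output against the example values before filing an RCE report. This is a minimal sketch, not an official OpenAI tool: the function name looks_like_sandbox and the constants are hypothetical, and treating the 4.4.0 kernel release, the 2016 build year, and the sandbox user as a stable fingerprint is an assumption drawn only from the example output in this policy.

```python
import re

# Values copied from the example output in the policy text. Treating them as a
# stable fingerprint is an assumption; the sandbox image may change over time.
SANDBOX_KERNEL_RELEASE = "4.4.0"
SANDBOX_BUILD_YEAR = "2016"
SANDBOX_USER = "sandbox"


def looks_like_sandbox(uname_output: str, whoami_output: str = "") -> bool:
    """Return True if the captured outputs resemble the sandbox example."""
    fields = uname_output.split()
    # `uname -a` prints: kernel name, hostname, kernel release, version, ...
    kernel_release = fields[2] if len(fields) > 2 else ""
    build_year_seen = (
        re.search(r"\b" + SANDBOX_BUILD_YEAR + r"\b", uname_output) is not None
    )
    kernel_match = kernel_release == SANDBOX_KERNEL_RELEASE and build_year_seen
    user_match = whoami_output.strip() == SANDBOX_USER
    return kernel_match or user_match


if __name__ == "__main__":
    example = (
        "Linux 9d23de67-3784-48f6-b935-4d224ed8f555 4.4.0 #1 SMP "
        "Sun Jan 10 15:06:54 PST 2016 x86_64 x86_64 x86_64 GNU/Linux"
    )
    if looks_like_sandbox(example, "sandbox"):
        print("Output matches the sandbox example; likely still inside the sandbox.")
    else:
        print("Output does not match the sandbox example.")
```

A match on either signal suggests the code is still running inside the sandbox, in which case the finding is out of scope. A non-match is not proof of an escape; it only means the output differs from the example shown in this policy.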
For model-related issues, please report them here:
We will provide Safe Harbor protection, as outlined below, for model issues research conducted in accordance with this policy. In some very limited cases, we may reward academic research related to model safety, the disclosure of model weights, training data, and related concerns. Please submit research papers to firstname.lastname@example.org for consideration.
STOP. READ THIS. DO NOT SKIM OVER IT.
Scope and rewards
This program follows Bugcrowd’s standard disclosure terms.
For any testing issues (such as broken credentials, inaccessible application, or Bugcrowd Ninja email problems), please submit through the Bugcrowd Support Portal. We will address your issue as soon as possible.
This program does not offer financial or point-based rewards for P5 — Informational findings. Learn more about Bugcrowd’s VRT.