OpenAI has updated its safety policies and established a new "Safety Advisory Group" that will sit above the technical teams and make recommendations to leadership and the board about the risks of new models the company develops.
The new safety measures, laid out in a "Preparedness Framework" published on OpenAI's blog, describe the company's processes to track, assess, predict, and protect against catastrophic risks posed by increasingly powerful models.
The main objective of the update is to provide a clear path to identify, analyze, and decide what to do about the "catastrophic" risks inherent in the models OpenAI is developing. The company defines "catastrophic risk" as any risk that could result in hundreds of billions of dollars in economic damage or lead to the serious harm or death of many individuals.
OpenAI emphasizes that decisions to push ahead with or slow down AI development must be "guided by science and based on facts." The company promises to evaluate its AI models and their effects frequently, and to "push them to their limits" to identify possible risks.
To assess the safety of its models over time, OpenAI will use "scorecards" that rate each model's risk across four tracked categories: cybersecurity; chemical, biological, radiological, and nuclear (CBRN) threats; persuasion; and model autonomy. Only models rated "medium" risk or lower can be deployed to the public, and only those that remain below "critical" risk can continue to be developed.
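Conceptually, that gating rule amounts to a threshold check on the worst category rating in a scorecard. The sketch below is purely illustrative, assuming a simple max-over-categories aggregation; the class and function names are invented here and are not OpenAI's actual implementation. It only encodes the two rules described above: deploy at "medium" or lower, keep developing while below "critical."

```python
from dataclasses import dataclass
from enum import IntEnum

class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

@dataclass
class Scorecard:
    # Hypothetical scorecard: one rating per tracked category.
    cybersecurity: Risk
    cbrn: Risk            # chemical, biological, radiological, nuclear
    persuasion: Risk
    model_autonomy: Risk

    def overall(self) -> Risk:
        # Assumed aggregation: gate on the worst (highest) category rating.
        return max(self.cybersecurity, self.cbrn,
                   self.persuasion, self.model_autonomy)

def can_deploy(card: Scorecard) -> bool:
    # Deployment allowed only when the overall rating is "medium" or lower.
    return card.overall() <= Risk.MEDIUM

def can_continue_development(card: Scorecard) -> bool:
    # Further development allowed only while the rating stays below "critical".
    return card.overall() < Risk.CRITICAL

# Example: a model rated "high" on persuasion may still be developed, but not deployed.
card = Scorecard(Risk.LOW, Risk.MEDIUM, Risk.HIGH, Risk.LOW)
print(can_deploy(card), can_continue_development(card))  # False True
```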
The evaluations themselves will be carried out by a dedicated technical team called Preparedness, led by Aleksander Madry, director of MIT's Center for Deployable Machine Learning; the new advisory group will then review its analyses of models under development. The Preparedness team's mission is to assess, evaluate, and probe AI models to protect against what OpenAI describes as "catastrophic risks."
This move by OpenAI is a significant step towards reinforcing the safety, security, and trustworthiness of AI technology. It also highlights the company's commitment to advancing meaningful and effective AI governance, both in the US and around the world.