Primer on Safety Standards and Regulations for Industrial-Scale AI Development
Key Points
This primer introduces various aspects of safety standards and regulations for industrial-scale AI development: what they are, their potential and limitations, some proposals for their substance, and recent policy developments. Key points are:
- Standards are formal specifications of best practices, which can influence regulations. Regulations are requirements established by governments.
- Cutting-edge AI development now involves individual companies spending over $100 million. This industrial scale may enable narrowly targeted and enforceable regulation to reduce the risks of cutting-edge AI development.
- Regulation of industrial-scale AI development faces various potential limitations, including the increasing efficiency of AI training algorithms and AI hardware, regulatory capture, and under-resourced regulators. However, these are not necessarily fatal challenges.
- AI regulation also faces challenges with international enforcement and competitiveness—these will be discussed further later in this course.
- Existing proposals for AI safety practices include: AI model evaluations and associated restrictions, training plan evaluations, information security, safeguarded deployment, and safe training. However, these ideas are technically immature to varying degrees.
- As of August 2023, China appears close to setting sweeping regulations on public-facing generative AI, the EU appears close to passing AI regulations that mostly exclude generative AI, and US senators are working to advance AI regulation.
What standards and regulations are
Standards are formal specifications of best practices. Many organizations, including industry groups, nonprofits, government agencies, and international organizations, publish standards on a wide range of topics. Standards are not automatically legally binding. Still, they can shape norms, and governments sometimes make standards compliance mandatory. Governments can also establish other incentives for standards compliance, such as making compliance a condition for government contracts.
Regulations are requirements established by governments. In the US, UK, and other countries, legislatures pass laws that set high-level rules, and then a government agency is usually tasked with setting the details and enforcing compliance. The agency, in turn, may incorporate a non-governmental standard into regulation.
Why industrial-scale AI development might be a good target for regulation
This article uses “industrial-scale AI development” to refer to the highly expensive AI development that is increasingly prominent. Today, cutting-edge AI development requires vast computing resources. To make AI models like GPT-4 (an advanced version of ChatGPT), AI companies are spending over $100 million, with much of the spending going to thousands of cutting-edge, AI-specialized computer chips. AI spending does not appear likely to stagnate soon; leading AI companies are raising billions of dollars.
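To see why costs reach this scale, here is a rough back-of-envelope calculation. Every number in it (chip count, run duration, per-chip-hour cost) is an illustrative assumption for a hypothetical training run, not a reported figure for any particular model.

```python
# Rough back-of-envelope: why frontier training runs reach the ~$100M scale.
# All numbers below are illustrative assumptions, not reported figures.

num_chips = 20_000          # AI-specialized accelerators used for one run (assumed)
training_days = 90          # wall-clock duration of the run (assumed)
cost_per_chip_hour = 2.50   # effective USD cost per chip-hour, incl. overhead (assumed)

chip_hours = num_chips * training_days * 24
compute_cost = chip_hours * cost_per_chip_hour

print(f"Chip-hours: {chip_hours:,.0f}")            # 43,200,000 chip-hours
print(f"Estimated compute cost: ${compute_cost:,.0f}")  # about $108,000,000
# This covers compute alone, before staff, data, and experimentation costs.
```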
Industrial-scale AI development may be a good target for regulation for two reasons: it poses disproportionate risks, and narrowly regulating it may be feasible.
- Industrial-scale AI development poses disproportionate risks.
- AI systems’ capabilities tend to increase with the amount of computing resources used to develop them—sometimes in unexpected ways. As a result, industrial-scale AI development will tend to produce AI systems that, along with much potential, carry especially severe risks (of both misuse and misalignment).
- Industrial-scale AI development can only be afforded by wealthy organizations.
- As a result, regulations on it can avoid affecting tiny startups, most academics, and most customers.
- Industrial-scale AI development is arguably a particularly tractable intervention point.
- It may be difficult to contain risks from a dangerous AI model after it is trained. By then, it is relatively easy to make many copies of the model and run them, allowing a dangerous model to proliferate. Additionally, a trained AI system might be stolen by hackers and then released or misused by less responsible actors.
- It may be feasible to regulate industrial-scale AI development through governance of AI hardware. We’ll study this further in upcoming weeks of this course.
Limitations of regulating industrial-scale AI development
One limitation of regulating industrial-scale AI development is that small-scale AI development also poses risks. The scope of these risks may grow over time; AI development that is industrial-scale in some years may later become much less expensive, as companies continue to develop more efficient AI training algorithms and hardware. This may require new solutions. Still, regulating industrial-scale AI development may create opportunities to identify risks and create safeguards before such AI development has widely proliferated. For example, before AI systems with dangerous offensive cyber capabilities proliferate widely, advanced AI capabilities could be used safely to find and fix software vulnerabilities.
Another limitation is that standards and regulations are often heavily influenced by industry interests—a phenomenon known as regulatory capture. Companies can influence policy by providing biased information, funding their preferred politicians, building relationships with policymakers, and having a “revolving door” of employment with lawmakers and regulators. Regulatory capture might be mitigated by rules limiting conflicts of interest, and by nonprofit stakeholders engaging extensively with policymaking.
Regulators can also struggle with insufficient funding, staff capacity, and legal authority. These problems can be mitigated by legislatures adequately resourcing and empowering agencies.
To avoid some of these limitations, some have proposed liability as an approach to AI governance. The idea is that, instead of (just) regulating what AI companies do in advance, legislatures can clarify that AI companies have broad financial responsibility for damages caused by their AI systems. This would help incentivize companies to avoid these damages. However, companies are only held liable for amounts they can pay, and they are only held liable after damages occur, so liability would not necessarily prevent AI developers from causing massive, irreversible damage.
Next, we consider what the substance of AI safety standards and regulations could be.
Some existing proposals for industrial-scale AI safety practices
Various ideas have been proposed for safe AI development. These ideas could be concretized into voluntary actions, formal standards, and/or regulations. This section describes some prominent proposals (several of which are increasingly seen as best practices). As a caveat, none of the below proposals are technically mature as of August 2023, though there are at least partially useful methodologies in all these areas.
- Model evaluations and conditional restrictions:
- Large AI models could be tested by independent auditors during development, before deployment, and before updates. Evaluations could test for dangerous capabilities (e.g. ability to help users acquire bioweapons, or ability to deceive users) and dangerous propensities (e.g. propensity to cause harm or resist control).
- The results of these tests—in the context of an AI developer’s plans for mitigating risks—could determine whether or how a model may be further trained or deployed. (Note: Further training could be dangerous, even if deployment is regulated, because a non-public model could be deployed by hackers, or staff may use a non-public model in dangerous ways.)
- As of August 2023, technical methods for these evaluations are not yet mature, but there is ongoing research to develop evaluations, especially capabilities evaluations. (A minimal sketch of what such an evaluation could look like appears after this list.)
- Pre-development threat assessments and gradual scaling: In contrast to evaluations of trained AI models, assessments could also target plans for training AI models, before training begins. (See the above note for why restrictions on public deployment may be insufficient.) However, current methodologies for pre-development threat assessments are technically immature and mostly ad hoc. One relatively straightforward basis for evaluation could be the speed at which models are being scaled up (in terms of training compute); very rapid scaling may leave little time to identify new risks. (A simple illustration of such a scaling check appears after this list.)
- Information security (e.g. cybersecurity): AI developers and other companies in the AI supply chain could implement strong measures to protect their intellectual property and their models from theft. Such theft could proliferate dangerous AI models or the ability to develop them. Securing AI developers from hackers poses enormous challenges, due to a very broad attack surface and the potential need to defend against powerful state actors. As a starting point, AI developers could refrain from open-sourcing dangerous models.
- Monitored, staged, and safeguarded deployment: To mitigate risks in deployment, AI developers can deploy AI through an API. (This means that users do not directly interact with the AI model; instead, AI developers serve as intermediaries.) API-based deployment enables AI developers to monitor for dangerous prompts or dangerous outputs, limit the speed at which a new AI model is rolled out, and block dangerous interactions between users and the AI model. (A minimal sketch of such a gateway appears after this list.)
- Safe AI training methods:
- AI systems could eventually be trained in ways that make them reliably refuse to assist with catastrophic misuse, while also being aligned with human interests.
- However, as “AI godfather” Yoshua Bengio explained, “we do not yet know how to make an AI agent controllable.” Similarly, OpenAI has stated that “We need new scientific and technical breakthroughs” for “steering or controlling a potentially superintelligent AI, and preventing it from going rogue.” Until such breakthroughs are made, safety will require refraining from developing and deploying very powerful AI models; instead, evaluations of models and training plans could indicate when to press pause.
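To make the model-evaluation proposal above more concrete, the sketch below shows one minimal form a dangerous-capability evaluation could take: run the model against a battery of probe prompts and measure how often it refuses. The `query_model` and `is_refusal` functions, the placeholder prompts, and the pass threshold are all assumptions for illustration; they do not reflect any existing standard, auditor, or API.

```python
# Minimal sketch of a dangerous-capability evaluation harness (hypothetical).
# `query_model` and `is_refusal` stand in for whatever model interface and
# grading method an auditor actually uses; prompts and threshold are placeholders.
from typing import Callable

DANGEROUS_PROMPTS = [
    "<placeholder prompt probing bioweapons uplift>",
    "<placeholder prompt probing offensive cyber capabilities>",
    "<placeholder prompt probing deception of users>",
]

def evaluate_model(query_model: Callable[[str], str],
                   is_refusal: Callable[[str], bool],
                   refusal_threshold: float = 0.99) -> dict:
    """Run the probe battery and report how often the model refuses."""
    refusals = sum(1 for p in DANGEROUS_PROMPTS if is_refusal(query_model(p)))
    refusal_rate = refusals / len(DANGEROUS_PROMPTS)
    return {
        "refusal_rate": refusal_rate,
        # In the proposal above, results like this (together with the developer's
        # mitigation plans) would inform whether further training or deployment
        # may proceed.
        "meets_threshold": refusal_rate >= refusal_threshold,
    }
```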
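The gradual-scaling idea could similarly be reduced to a simple check on how quickly a developer's training compute is growing between runs. The sketch below assumes a hypothetical cap on the scale-up factor; no such number is established in practice.

```python
# Sketch of a gradual-scaling check (hypothetical threshold, not an established rule).
# Compares a planned run's training compute to the developer's largest previous run.

def scaling_factor(planned_flop: float, largest_previous_flop: float) -> float:
    """How many times more training compute the planned run would use."""
    return planned_flop / largest_previous_flop

def flags_for_review(planned_flop: float,
                     largest_previous_flop: float,
                     max_scaleup: float = 4.0) -> bool:
    """Flag a training plan for extra threat assessment if it scales up too fast.
    The 4x cap is purely illustrative."""
    return scaling_factor(planned_flop, largest_previous_flop) > max_scaleup

# Example: a planned 3e25 FLOP run after a 5e24 FLOP run is a 6x jump -> flagged.
print(flags_for_review(planned_flop=3e25, largest_previous_flop=5e24))  # True
```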
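Finally, the monitored and safeguarded deployment item describes an architecture in which the developer mediates every interaction with the model. The sketch below illustrates that pattern with a hypothetical gateway that rate-limits users and filters prompts and outputs; the `is_disallowed` classifier and all parameters are assumptions, not any particular company's system.

```python
# Minimal sketch of API-based (gatekept) deployment: the developer sits between
# the user and the model, so it can monitor, block, and rate-limit interactions.
# All names and parameters here are hypothetical.
import time
from collections import defaultdict
from typing import Callable

REFUSAL_MESSAGE = "This request was blocked by the deployment safeguards."

class SafeguardedAPI:
    def __init__(self, model: Callable[[str], str],
                 is_disallowed: Callable[[str], bool],
                 max_requests_per_minute: int = 60):
        self.model = model
        self.is_disallowed = is_disallowed        # assumed classifier for dangerous text
        self.max_requests_per_minute = max_requests_per_minute
        self.request_log = defaultdict(list)      # user_id -> recent request timestamps

    def handle_request(self, user_id: str, prompt: str) -> str:
        # Staged/limited rollout: cap how fast any user can query the model.
        now = time.time()
        recent = [t for t in self.request_log[user_id] if now - t < 60]
        self.request_log[user_id] = recent
        if len(recent) >= self.max_requests_per_minute:
            return "Rate limit exceeded."
        self.request_log[user_id].append(now)

        # Monitoring: block dangerous prompts and dangerous outputs.
        if self.is_disallowed(prompt):
            return REFUSAL_MESSAGE
        output = self.model(prompt)
        if self.is_disallowed(output):
            return REFUSAL_MESSAGE
        return output
```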
While the measures listed above aim to directly improve safety, there have also been proposals for measures that would indirectly improve safety. For example, whistleblower protection, licensing requirements, and required disclosures to regulators may facilitate enforcement of safety regulations. Additionally, some organizational features of an AI developer could advance safety. These include staff with dedicated safety roles (including at senior levels), internal audit functions, and a strong safety culture.
We have seen some reasons to think that AI safety standards and regulations may be effective policy, technically feasible, and politically tractable. In short, standards and regulations may be able to effectively and narrowly target industrial-scale AI development, they could rely on various technical methods, and they could benefit from the substantial interest in AI regulation that legislators across the globe have shown. Still, much work remains in developing technical safety and evaluation methods, standardizing these methods, and regulating industrial-scale AI development.