#9: The World is Getting Worried + Corporate Structures + Regulatory Challenges in the US and China
Welcome to Navigating AI Risks, where we explore how to govern the risks posed by transformative artificial intelligence.
In this 9th edition, you’ll learn about the UN Security Council’s discussions on AI, Anthropic’s new corporate governance structure, China’s new rules for generative AI, OpenAI’s first major US regulatory challenge, and ‘instrumental convergence’.
In the Loop
AI Risk at the UN Security Council: Countries Are Getting Worried
For the first time ever, the United Nations’ highest institution, the Security Council, dedicated an entire session to the risks (and benefits) of artificial intelligence. The UN Secretary-General, UK Foreign Secretary, Anthropic’s Jack Clark, and Chinese Professor Zeng Yi made remarks warning about the challenges of AI, including extinction, for international stability and calling for coordinated action. After these opening remarks, representatives from various nations took the stand (including China, Russia, Japan, and the United States).
Concerns over the use of AI in nuclear command-and-control, lethal autonomous weapons, threats from non-state actors, disinformation, and the need for international governance dominated discussions. China said it supported the UN’s role in addressing these challenges and wants to ensure all countries, not only developed ones, have a say in the evolution of this burgeoning international regime around AI (Roxana Radu, lecturer at Oxford University, wrote a helpful overview of different countries’ interventions). Chinese scientist Zeng Yi notably mentioned AI extinction risks: “In the long term, we haven’t given superintelligence any practical reasons why they should protect humans”, showing that such concerns are crossing borders.
One concrete deliverable will be the creation of a multistakeholder High-Level Advisory Body for Artificial Intelligence, tasked with coming up with options for global AI governance arrangements. Aside from this, it’s unclear what global institutions will emerge.
The UN Secretary-General mentioned three potential models: the International Civil Aviation Organization (ICAO), the International Atomic Energy Agency (IAEA), and the Intergovernmental Panel on Climate Change (IPCC). These organizations all have very different designs, although the ICAO and IAEA have roughly similar functions, that of setting and monitoring safety standards in their respective industries. A recent paper on ‘International Institutions for Advanced AI’ helps better understand what the different options could look like.
A note: the UN Secretary-General has no autonomous authority. New rules or institutions, even those created under UN auspices, have to be ratified by each individual state. Still, he’s an influential voice, and international organizations have an important role in 'orchestrating' compromises.
Even an institution like the Intergovernmental Panel on Climate Change, though it has no power to set or enforce regulations, could be helpful. The IPCC has played a big role in fostering coordinated international efforts to rein in climate change. It would also be much easier to create, since it doesn’t require the significant delegation of powers that institutions like the IAEA or ICAO do. It’s worth noting that while it hasn’t really taken on that role, the initial mission of the Global Partnership on AI (GPAI) as envisioned by France, which launched the organization with Canada, was to be the IPCC for AI.
An important deadline will be next year’s UN-sponsored ‘Summit for the Future’, designed to strengthen global governance for present and future generations, and perhaps an opportunity to take steps towards new AI governance arrangements.
Why Corporate Governance Matters for AI
An in-depth (must-read) article delving into the workings of Anthropic, one of the world’s leading AI labs, reveals that the company is changing its corporate structure. Its board members will be nominated by people with no financial stake in the company, and thus may be driven by motives other than profit.
Why does this matter? Well, because everyone’s worried that if AI labs race with each other, everyone loses in terms of safety. Companies concerned with gaining a short-term competitive advantage to make huge profits might be less likely to take the time necessary to assess the risks of their latest AI model (according to GPT-4’s system card, OpenAI spent “six months on safety research, risk assessment, and iteration” before releasing the model).
Anthropic, already a ‘Public Benefit Corporation’, will soon be governed by what they call a ‘Long-Term Benefit Trust’:
“The trust will hold a special class of stock in Anthropic that cannot be sold and does not pay dividends, meaning there is no clear way to profit on it. The Long-Term Benefit Trust will ultimately have the right to elect, and remove, three of Anthropic’s five corporate directors, giving the trust long-run, majority control over the company. [...] The trust’s initial trustees were chosen by “Anthropic’s board and some observers” [...] But in the future, the trustees will choose their own successors, and Anthropic executives cannot veto their choices. [...]”
This novel corporate structure is one among others in the AI industry. OpenAI’s for-profit has a ‘capped-profit’ model whereby investors can “only” get a 100x return on their initial investment, beyond which profits are passed to the parent, non-profit organization (OpenAI Inc.). OpenAI’s charter, published 3 years after the company’s inception, also contains a so-called "stop-and-assist" clause (discussed in our last edition):
We are concerned about late-stage AGI development becoming a competitive race without time for adequate safety precautions. Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project.
Interestingly, Dario Amodei, who helped draft this clause at OpenAI, did not use it again at Anthropic, the company he co-founded and now leads as CEO.
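The capped-profit mechanism is simple enough to sketch numerically. The function below is an illustration with made-up numbers, not OpenAI’s actual contractual terms (which are more complex and not fully public); only the 100x cap figure comes from the description above.

```python
def capped_profit_split(investment, gross_value, cap=100.0):
    """Split investment proceeds under a profit cap.

    The investor keeps returns up to `cap` times their investment;
    anything beyond that flows to the parent non-profit.
    """
    investor_max = investment * cap
    investor_share = min(gross_value, investor_max)
    nonprofit_share = max(0.0, gross_value - investor_max)
    return investor_share, nonprofit_share

# Hypothetical example: a $1M investment whose stake becomes worth $250M.
# The investor is capped at $100M (100x); the remaining $150M goes to the non-profit.
investor, nonprofit = capped_profit_split(1_000_000, 250_000_000)
```

The point of such a structure is that beyond a certain threshold, shareholders no longer benefit from squeezing out additional profit, weakening the incentive to cut corners on safety for marginal returns.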
While corporate governance is no substitute for legal requirements, and carries the risk of being gamed, such mechanisms seem helpful on the whole because they reduce incentives to prioritize profits over safety. In the future, safety-conscious regulation could mandate the adoption of certain useful safety procedures and structures, as is common in other fields (such as the financial industry's audit committees or Institutional Biosafety Committees for recombinant DNA research).
China’s New AI Rules: Social Stability or Economic Competitiveness?
On July 13, China’s Cyberspace Administration and 6 other regulators jointly released provisional rules for publicly-available generative AI systems (such as ChatGPT). That includes systems that generate text, audio, video, and visual content. The rules will apply starting from August 15 and are, for now, the world’s most stringent; they effectively require Chinese companies to solve technical problems no one knows the answer to, both on the alignment side and on the ethics side.
Despite that, the rules are still significantly softer than those contained in an earlier draft released for public comments in April 2023. That’s a signal that Beijing wants to balance the twin goals of social stability and economic development. According to Matt Sheehan, who closely follows China’s AI ecosystem and regulations at the think-tank Carnegie Endowment, there was a lot of pushback “on the [more stringent] draft from Chinese academics, policy analysts and companies”.
Providers of publicly-available generative AI services will have to submit detailed security assessments to regulators, with fewer obligations for those working on business-to-business applications. Whereas the first version essentially required companies to have perfect datasets for training AI systems (fully unbiased and not obtained by infringing copyright or privacy laws) and generate outputs that perfectly followed “core socialist values”, the new rules only require ‘effective measures’ to these ends. Still, the list of prohibited content is a long one; generative AI systems must not create content that
“incites subversion of state power and the overthrow of the socialist system, endangers national security and interests, damages the image of the country, incites secession from the country, undermines national unity and social stability, promotes terrorism, extremism, national hatred and ethnic discrimination, violence, obscenity and pornography”, or “false and harmful information”
Because we still don’t know how to reliably control the outputs of generative AI systems, this wide range of requirements points to one of three outcomes: (i) a regulation that won’t be enforced, (ii) the brain death of China’s generative AI ecosystem, or (iii) the discovery by Chinese Big Tech of new techniques for AI safety that help ensure systems do what their operators want them to do (unlikely to happen in one month, before the provisional rules enter into force). In any case, these regulations are far more stringent than anything the US or even the EU is currently considering.
Large generative text AI systems remain unavailable to the public in China: OpenAI’s and Google’s models are blocked in the country, and even the AI chatbots of Chinese Big Tech companies are in “trial mode or for business-use only”.
OpenAI’s First Real Regulatory Challenge
The US Federal Trade Commission has launched an investigation against OpenAI through a 20-page letter asking for detailed records and information about the company’s models and processes. Why? For two main reasons: a leak of personal data following a bug last March (a bug that also led to the service being temporarily suspended in Italy; NAIR #1), and because ChatGPT’s inaccuracy sometimes leads the model to make defamatory statements. As the Washington Post reports, this “marks the company’s greatest U.S. regulatory threat to date.”
This should not come as a surprise; the FTC has publicly flexed its regulatory muscle several times over the past months, for example through a New York Times op-ed by its Chair Lina Khan warning that the regulator would “vigorously enforce the laws we are charged with administering”. Among the information the FTC is requesting from OpenAI:
Identify all third parties that you have allowed use of or access to your Large Language Models via API
Describe in detail the data you have used to train or otherwise develop [your] Large Language Model [...], including [...] how you obtained the data [and] all sources of the data.
Describe in detail the policies and procedures that the company follows in order to assess risk and safety before the company releases a Large Language Model.
Describe in detail the process of “reinforcement learning through human feedback”. Include [...] examples of questions and answers [and] an explanation of the rating system or method of evaluation for the Large Language Model’s response.
This is but a small fraction of the information requested by the FTC. All this seems to show that the regulator expects OpenAI to have the relevant procedures and policies in place. Otherwise, it may fall foul of the consumer protection laws enforced by the FTC. If OpenAI is indeed breaking the law, it may face a variety of penalties, including fines or being forced to delete some of its algorithms. CEO Sam Altman said in a tweet that he will “of course” cooperate with the agency.
The US deepens cooperation with India on semiconductors, AI, and strategic trade.
An executive order, expected in August, will restrict American companies’ investments in China’s AI, chips, and quantum computing industries. The restrictions apply only to new investments and will likely not enter into force until 2024.
Biden Administration officials, including Commerce Secretary Gina Raimondo and National Security Advisor Jake Sullivan, met with semiconductor company executives from Intel, Qualcomm, and Nvidia to discuss supply chains and export controls vis-à-vis China.
The European Commission adopted an “adequacy decision”, essentially saying that the United States ensures a sufficiently adequate level of data protection for the personal data of European citizens to be sent to US companies.
The European Commission, Parliament, and Council of Ministers are negotiating the final provisions of the AI Act. The obligations of high-risk AI system providers and users, as well as provisions on the role of technical standards and on the organizations tasked with carrying out conformity assessment procedures of newly released AI systems, have all been ironed out.
China’s central government’s crackdown on large tech companies has cost the industry more than $1.1 trillion since it started in late 2020.
The Ministry of Industry and Information Technology announced that it’s planning policies to boost the country’s national computing infrastructure, notably to “drive breakthroughs in generative AI”.
Despite a series of export controls on semiconductors and semiconductor manufacturing equipment, Chinese companies continue to have access to “restricted chips through a combination of smuggling and renting through the cloud”.
After announcing that his new company X.AI would build a new large language model, Elon Musk detailed its AI safety plans, saying it would focus on building “curious” and “maximally truth-seeking AI”. Some experts are skeptical; Carina Prunkl from Oxford’s Institute for Ethics in AI said “I certainly wouldn’t bet on an advanced AI system finding humans too interesting to annihilate”.
A study finds that tweets generated by OpenAI’s GPT-3 are more believable than those written by humans; the AI model can produce “accurate information that is easier to understand” but also “more compelling disinformation” than humans.
OpenAI reportedly isn’t releasing the image-interpretation capabilities of its GPT-4 models over concerns about what it would say about people’s faces.
Facebook and Anthropic released Llama 2 (partly open-sourced) and Claude 2, respectively, updates to their flagship large language models. Both reportedly went through extensive evaluation and red-teaming phases before they were deployed.
For the first time, OpenAI’s ChatGPT lost users, with a reduction in global traffic of ~10%.
Explainer: Instrumental Convergence
As discussed in our last edition, AI systems that pursue autonomous goals tend to be more performant than those that don’t. So, for this explainer, let's assume that AI agents are designed to achieve some goals.
Understanding the nature of AI goals is an essential starting point. They fall into two categories: final goals and instrumental goals. Final goals are the objectives that the AI ultimately aims to achieve. One beneficial final goal might be “Limit the increase in Earth's average temperature to 1.5 degrees Celsius above pre-industrial levels.” Instrumental goals, on the other hand, are pursued purely because they help attain final goals. In our example, the AI system would need to develop several instrumental goals ("Optimize energy consumption in cities," "Design more efficient renewable energy sources," or "Propose and evaluate policies for reducing greenhouse gas emissions").
Enter the Instrumental Convergence Thesis. It posits that across a broad spectrum of final goals and circumstances, there exist several instrumental goals that an AI agent might need to fulfill to increase the chances of realizing its final goal. As an AI's intelligence increases, so does its ability to recognize and pursue these instrumental goals.
Let's have a look at the most important convergent instrumental goals described by Nick Bostrom (director of the Future of Humanity Institute at Oxford University) in his book “Superintelligence: Paths, Dangers, Strategies”:
1. Self-Preservation: “If an agent’s final goals concern the future, then in many scenarios there will be future actions it could perform to increase the probability of achieving its goals. This creates an instrumental reason for the agent to try to be around in the future.” Note that this does not mean that the agent places an intrinsic value on its survival; rather, it places an instrumental value on it.
2. Goal-Preservation: An agent is more likely to achieve its goals in the future if it preserves them than if they get modified. This implies that, to the extent an agent is able to prevent its current goals from being modified, it has an instrumental reason to do so.
3. Cognitive Enhancement: “Improvements in rationality and intelligence will tend to improve an agent’s decision-making, rendering the agent more likely to achieve its final goals. One would therefore expect cognitive enhancement to emerge as an instrumental goal for a wide variety of intelligent agents. For similar reasons, agents will tend to instrumentally value many kinds of information.”
4. Technological Improvement: To increase efficiency in achieving set tasks, an AI might try to constantly improve its level of technology. This could mean finding better ways of running its operations through more efficient algorithms or developing better engineering technology to more effectively transform inputs into valued outputs.
5. Resource Acquisition: Intelligent agents with mature technologies, like humans, can repurpose basic resources like time, space, matter, and energy towards almost any goal. They could use additional computational resources to run at greater speeds or durations, build additional infrastructure, or build back-ups and defenses for enhanced security. As such, acquiring a lot of resources often emerges as a necessary instrumental goal.
While these shared instrumental goals offer a valuable framework for navigating AI behaviors, we must remember that the specific strategies a superintelligent AI may implement can be both intricate and unforeseeable. We may assume that if anything goes wrong with our AI, we could simply shut it down. But the aim of self-preservation could lead the AI to resist such actions. We might contemplate altering its goals to better align them with ours, but the concept of goal-preservation makes this much harder. The remaining three convergent instrumental goals (cognitive enhancement, technological improvement, and resource acquisition) hint at the AI's potential rapid ascent to unprecedented levels of power.
Superintelligence: Instrumental convergence (Nick Bostrom - 2014)
Optimal Policies Tend To Seek Power (Alex Turner, Logan Smith, Rohin Shah et al. - NeurIPS 2021)
What We’re Reading
Frontier AI Regulation: Managing Emerging Risks to Public Safety (Anderljung et al.), lays out three building blocks for the regulation of frontier models: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models.
International Institutions for Advanced AI (Ho et al.), identifies a set of governance functions that could be performed at an international level to address the challenges posed by advanced AI, ranging from supporting access to frontier AI systems to setting international safety standards.
Risk assessment at AGI companies (Koessler & Schuett), an overview of popular risk assessment techniques from other safety-critical industries (finance, aviation, nuclear, and biolabs).
AI Safety and the Age of Dislightenment (Jeremy Howard), argues that proposals for stringent AI model licensing and surveillance will likely be ineffective or counterproductive, concentrating power in unsustainable ways.
The Trajectory of China’s Industrial Policies (UC Institute on Global Conflict and Cooperation), on how Chinese industrial & tech policy shifted from a rationale of economic development to national security.
Compute at Scale - A broad investigation into the data center industry (Pilz & Heim), an overview of the important actors, business models, main inputs, and typical locations of data centers.
Does the world need an arms control treaty for AI? (CyberScoop), on the lessons that the International Atomic Energy Agency holds for global AI governance, and the specific challenges posed by AI.
What Should the Global Summit on AI Safety Try to Accomplish? (Ben Garfinkel & Lennart Heim), on six potential valuable outcomes of the AI Safety Summit organized in December 2023 by the UK.
China’s AI Regulations and How They Get Made (Matt Sheehan), on how China makes AI governance policy, including how Chinese academics, bureaucrats, and journalists shaped China’s new AI regulations.
That’s a wrap for this 9th edition. You can share it using this link. Thanks a lot for reading us!
— Siméon, Henry, & Charles.
Be at the forefront of AI governance. Subscribe below and never miss an edition.
The information provided by OpenAI to the FTC won’t be made public
Including India, Japan, South Korea, Singapore and the Philippines