@Zach Stein-Perlman, great work on this. I would be interested in you brainstorming some questions that have to do with the labs' stances toward (government) AI policy interventions.
After a quick 5 min brainstorm, here are some examples of things that seem relevant:
I imagine there's a lot more in this general category of "labs and how they are interacting with governments and how they are contributing to broader AI policy efforts", and I'd be excited to see AI Lab Watch (or just you) dive into this more.
I can imagine this growing into the default reference that people use when talking about whether labs are behaving responsibly.
I hope that this resource is used as a measure of relative responsibleness, and this doesn't get mixed up with absolute responsibleness. My understanding is that the resource essentially says, "here are some things that would be good– let's see how the labs compare on each dimension." The resource is not saying "if a lab gets a score above X% on each metric, then we are quite confident that the lab will not cause an existential catastrophe."
Moreover, my understanding is that the resource is not taking a position on whether or not it is "responsible"– in some absolute sense– for a lab to be scaling toward AGI in our current world. I see the resource as saying "conditional on a lab scaling toward AGI, are they doing so in a way that is relatively more/less responsible compared to the others that are scaling toward AGI."
This might be a pedantic point, but I think it's an important one to emphasize– a lab can score in 1st place and still present a risk to humanity that reasonable people would still deem unacceptable & irresponsible (or put differently, a lab can score in 1st place and still produce a catastrophe).
Agree with lots of this– a few misc thoughts [hastily written]:
I agree that more research on this could be useful. But I think it would be most valuable to focus less on "is X in the Overton Window" and more on "is X written/explained well and does it seem to have clear implications for the target stakeholders?"
I'm not sure who you've spoken to, but at least among the AI policy people who I talk to regularly (which admittedly is a subset of people who I think are doing the most thoughtful/serious work), I think nearly all of them have thought about ways in which regulation + regulatory capture could be net negative. At least to the point of being able to name the relatively "easy" ways (e.g., governments being worse at alignment than companies).
I continue to think people should be forming alliances with those who share similar policy objectives, rather than simply those who belong in the "I believe xrisk is a big deal" camp. I've seen many instances in which the "everyone who believes xrisk is a big deal belongs to the same camp" mentality has been used to dissuade people from communicating their beliefs, communicating with policymakers, brainstorming ideas that involve coordination with other groups in the world, disagreeing with the mainline views held by a few AIS leaders, etc.
The cultural pressures against policy advocacy have been so strong that it's not surprising to see folks say things like "perhaps our groups are no longer natural allies" now that some of the xrisk-concerned people are beginning to say things like "perhaps the government should have more of a say in how AGI development goes than in status quo, where the government has played ~0 role and ~all decisions have been made by private companies."
Perhaps there's a multiverse out there in which the AGI community ended up attracting govt natsec folks instead of Bay Area libertarians, and the cultural pressures are flipped. Perhaps in that world, the default cultural incentives pushed people heavily toward brainstorming ways that markets and companies could contribute meaningfully to the AGI discourse, and the default position for the "AI risk is a big deal" camp was "well obviously the government should be able to decide what happens and it would be ridiculous to get companies involved– don't be unilateralist by going and telling VCs about this stuff."
I bring up this (admittedly kinda weird) hypothetical to point out just how skewed the status quo is. One might generally be wary of government overinvolvement in regulating emerging technologies yet still recognize that some degree of regulation is useful, and that position would likely still push them to be in the "we need more regulation than we currently have" camp.
As a final note, I'll point out to readers less familiar with the AI policy world that serious people are proposing lots of regulation that is in between "status quo with virtually no regulation" and "full-on pause." Some of my personal favorite examples include: emergency preparedness (akin to the OPPR), licensing (see Romney), reporting requirements, mandatory technical standards enforced via regulators, and public-private partnerships.
I'm interested in writing out somewhat detailed intelligence explosion scenarios. The goal would be to investigate what kinds of tools the US government would have to detect and intervene in the early stages of an intelligence explosion.
If you know anyone who has thought about these kinds of questions, whether from the AI community or from the US government perspective, please feel free to reach out via LessWrong.
To what extent would the organization be factoring in transformative AI timelines? It seems to me like the kinds of questions one would prioritize in a "normal period" look very different than the kinds of questions that one would prioritize if they place non-trivial probability on "AI may kill everyone in <10 years" or "AI may become better than humans on nearly all cognitive tasks in <10 years."
I ask partly because I personally would be more excited about a version of this that wasn't ignoring AGI timelines, but I think a version of this that's not ignoring AGI timelines would probably be quite different from the intellectual spirit/tradition of FHI.
More generally, perhaps it would be good for you to describe some ways in which you expect this to be different than FHI. I think calling it the FHI of the West, the explicit statement that it would have the intellectual tradition of FHI, and the announcement right when FHI dissolves might make it seem like "I want to copy FHI" as opposed to "OK obviously I don't want to copy it entirely I just want to draw on some of its excellent intellectual/cultural components." If your vision is the latter, I'd find it helpful to see a list of things that you expect to be similar/different.
I would strongly suggest considering hires who would be based in DC (or who would hop between DC and Berkeley). In my experience, being in DC (or being familiar with DC & having a network in DC) is extremely valuable for being able to shape policy discussions, know what kinds of research questions matter, know what kinds of things policymakers are paying attention to, etc.
I would go as far as to say something like "in 6 months, if MIRI's technical governance team has not achieved very much, one of my top 3 reasons for why MIRI failed would be that they did not engage enough with DC people/US policy people. As a result, they focused too much on questions that Bay Area people are interested in and too little on questions that Congressional offices and executive branch agencies are interested in. And relatedly, they didn't get enough feedback from DC people. And relatedly, even the good ideas they had didn't get communicated frequently enough or fast enough to relevant policymakers. And relatedly... etc etc."
I do understand this trades off against everyone being in the same place, which is a significant factor, but I think the cost is worth it.
I do think evaporative cooling is a concern, especially if everyone (or a very significant number of people) left. But I think on the margin more people should be leaving to work in govt.
I also suspect that a lot of systemic incentives will keep a greater-than-optimal proportion of safety-conscious people at labs as opposed to governments (labs pay more, labs are faster and have less bureaucracy, lab people are much more informed about AI, labs are more "cool/fun/fast-paced", lots of govt jobs force you to move locations, etc.)
I also think it depends on the specific lab– EG in light of the recent OpenAI departures, I suspect there's a stronger case for staying at OpenAI right now than for DeepMind or Anthropic.
I think now is a good time for people at labs to seriously consider quitting & getting involved in government/policy efforts.
I don't think everyone should leave labs (obviously). But I would probably hit a button that does something like "everyone at a lab governance team and many technical researchers spend at least 2 hours thinking/writing about alternative options they have & very seriously consider leaving."
My impression is that lab governance is much less tractable (lab folks have already thought a lot more about AGI) and less promising (competitive pressures are dominating) than government-focused work.
I think governments still remain unsure about what to do, and there's a lot of potential for folks like Daniel K to have a meaningful role in shaping policy, helping natsec folks understand specific threat models, and raising awareness about the specific kinds of things governments need to do in order to mitigate risks.
There may be specific opportunities at labs that are very high-impact, but I think if someone at a lab is "not really sure if what they're doing is making a big difference", I would probably hit a button that allocates them toward government work or government-focused comms work.
Right now, I think one of the most credible ways for a lab to show its commitment to safety is through its engagement with governments.
I didn’t mean to imply that a lab should automatically be considered “bad” if its public advocacy and its private advocacy differ.
However, when assessing how “responsible” various actors are, I think investigating questions relating to their public comms, engagement with government, policy proposals, lobbying efforts, etc would be valuable.
If lab A had slightly better internal governance but lab B had better effects on “government governance”, I would say that lab B is more “responsible” on net.