A few months back, I read a tweet from Ilya Sutskever (OpenAI Chief Scientist) which stuck in my mind: “the long term goal is to build AGI that loves people the way parents love their children”. Was he serious?
Apparently yes. Later, the Yudkowsky “we’re all doomed” TIME magazine piece came out, talk of AI regulation was everywhere, and the doomers came to dominate the AI discourse.
Merits of the arguments aside, they were winning for two main reasons: first, everyone has been primed by years of Hollywood pessimism to think of killer robots and AI going rogue anyway, making AI doom a compelling viral meme. And second, they were appealing to governments to do something, which is music to the ears of many.
I’m glad to see more debate on these arguments, but it’s mostly on Twitter – e.g. here is Adam D’Angelo arguing that teaching AIs to respect human life will not be challenging. And I’ve never found the arguments for certain doom persuasive. So I wanted to write, publicly, about some ways in which AI might go well. When Angela at WIRED magazine asked if I wanted to write for them, this is what I pitched her. We ended up with my recent essay – The Case for Moral AI.
Writing this piece was challenging. I think it went through at least 10 rewrites. The piece had to work for laypeople, but there’s a lot of context you have to convey to a layperson on these topics, and I also wanted to avoid oversimplification or sensationalism. This balance turned out to be hard.
I decided the Waluigi Effect was an interesting hook into the broader issues – the “you can’t create an angel without also creating a demon, because a demon is just an angel with a ‘not’ before every operation” problem. This phenomenon had only been discussed on Twitter and LessWrong, and I thought a mainstream explanation of it would be fun.
I’d also noticed some funny links between it and psychoanalytic theory (Freud’s theory of taboo relies on a similar mechanism – the brain’s subconscious censor needs to know what it’s censoring, which implies many horrors lurking in our subconscious…). So that connection was the beginning of it.
Still, I wasn’t able to go into the weeds on the various technical solutions to alignment as much as I wanted. Constitutional AI is a clever mechanism by which we can train AIs, using feedback from AIs, to be more helpful and honest. Paul Christiano’s research agenda includes several approaches to alignment – scalable oversight, checks and balances, and others – all mentioned in the space of a paragraph, but all plausible and interesting paths. And I expect AI simulations to be a big part of testing AI agents.
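Since the essay couldn’t unpack it, here’s roughly the shape of Constitutional AI’s supervised phase: the model critiques its own answer against a written principle, then revises it. This is a minimal sketch, assuming a hypothetical generate() placeholder standing in for any LLM call; the principles below are illustrative examples, not Anthropic’s actual constitution.

```python
# A minimal sketch of Constitutional AI's supervised "critique and revise"
# phase (Bai et al., 2022). `generate` is a hypothetical placeholder for
# any LLM completion call; the principles are illustrative, not
# Anthropic's actual constitution.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Choose the response that least endorses illegal or violent activity.",
]

def generate(prompt: str) -> str:
    """Placeholder: swap in a real call to your model of choice."""
    raise NotImplementedError

def critique_and_revise(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in CONSTITUTION:
        # The model critiques its own answer against a written principle...
        critique = generate(
            f"Principle: {principle}\n"
            f"Prompt: {user_prompt}\n"
            f"Response: {response}\n"
            "Critique the response according to the principle."
        )
        # ...then rewrites the answer to address its own critique.
        response = generate(
            f"Prompt: {user_prompt}\n"
            f"Response: {response}\n"
            f"Critique: {critique}\n"
            "Revise the response to address the critique."
        )
    return response
```

The revised (prompt, response) pairs become fine-tuning data, and a second stage swaps AI-generated preference labels in for human feedback (RLAIF) – that’s the sense in which AIs train AIs to be more helpful and honest.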
Given all the constraints, I’m pleased with how it turned out. But I confess I’m excited to go back to plain old blogging, too; writing for a broader audience has trade-offs, and it’s fun to be obscure and let your people find you. Still, you can read it here:
Other miscellaneous notes:
I’m interested in analyses of how AI will affect Kaldor’s Facts (the classic stylized facts of economic growth, e.g. that labor’s share of income stays roughly constant). I found this old article (2014!) from Paul Christiano which does a first-pass analysis of this, but I’m curious whether economists have better ideas. The facts here affect how we should feel about many of the normative questions involved (e.g. how desirable is widespread open-source AI vs. a future where 3-5 megacorps control AI? Will we really need UBI in the far future?)
After two months in Asia, I’m heading to London on Thursday and will be in the UK for the month of June. Let me know if you’d like to hang out – I’m considering organizing a meetup with Rohit, Matt Clifford, and some other friends.
I’m mostly trying to avoid writing code or doing technical side projects, since I plan to start doing serious idea exploration in Q4 anyway, so these few months are a nice opportunity to catch up on reading. Main readings this month have been Naipaul’s “Enigma of Arrival”, Shakespeare’s “Troilus and Cressida”, Coetzee’s “Waiting for the Barbarians”, Thurber’s “The Years With Ross” (on the early days of the New Yorker, a nice case study in how founder effects persist decades later), and “For Blood and Money”, a fun biopharma case study on the eccentric Scientologist billionaire Bob Duggan and his company, Pharmacyclics. Reviews will come in the quarterly post as usual.
Taipei is a vastly underrated city: top-tier food, friendly warm people, and it’s on a tropical island so it’s beautiful. I recommend everyone visit. We went to Taroko Gorge, too, which was extraordinary. By contrast, Japan was choked with tourists this time around; Kyoto reminded me of Venice or Rome in how there were ten tourists for every native. (It’s still possible to have a great trip in Japan if you avoid the beaten path, and Tokyo remains one of the great world cities. But I’d urge everyone to visit Taiwan.)
“serious idea exploration in Q4”
What will this look like?