Some brief thoughts on the Yudkowsky / Wolfram chat
Video here: https://www.youtube.com/watch?v=xjH2B_sE_RQ

I'm reminded of when I started writing blog posts, thinking: "Ah, I'll write about this topic, and then I'll never have to make this argument about it again! I'll just have people read this." Naive youth, I soon discovered. And I see the same with almost everyone EY has ever had a discussion with: they haven't read the Sequences, and it's really painful to see EY forced to try to compress parts of them into the space of a few sentences. They really are needed as a ground-level common pool of information, arguments, and counter-arguments that were already made, some with conclusions reached, decades ago.
(In an attempt to preserve my naive dream: I don't care if people don't read anything I've written (none of it's particularly good or original), but I don't really want to have a serious discussion about AGI with people who haven't at least read some of the SL4 archives. (I myself was just a lurker.) They're invariably going to raise points that were brought up way back then and resolved one way or another. I see this again and again with people EY talks to. People say "what about..." and there's an answer; it was already thought of and addressed decades ago.)
If I said to Wolfram, "Let's start with you telling me your conclusions, then we'll back up and examine the arguments for any I don't particularly agree with," I feel like he'd respond with something like: "I don't have conclusions." I wouldn't take this as literally true, and maybe I'm being unfair to him here, so a weaker claim: I get the sense he doesn't want to preemptively back himself into a corner or stick his neck out (he makes a similar point in the video about doing things in real-time) without some work in private to convince himself first. I get that feeling, and it's not even necessarily bad. Nevertheless, having no conclusions is perhaps the most annoying conclusion to deal with in a real-time discussion. I get the sense that he just likes to talk about ideas, go down whatever fun rabbit holes he finds, and delight in tying them back to his long-standing mathematical research. But concluding anything? Saying "This is the answer, we don't need to revisit it"? I think he'd resist such thinking. Again, such attitudes are the most annoying; it's like arguing with shifting sand. Nothing is stable, nothing sticks, and sooner or later you'll find yourself turned right back to a point previously visited.
(Robin Hanson to me embodies the perfect opposite of this kind of thinking: he builds models based on the best information he can get, and believes in them/their outputs. Grabby aliens are real and are coming -- I find his model convincing, I agree with that conclusion, and I'm now even less interested in discussing what-if sorts of alternative alien questions that I had fun with as a 9-year-old looking up at the stars. Fermi paradox, resolved. Let's move on. I'll still enjoy fun alien stories like Farscape, I just won't take anything resembling them as serious predictions about the universe.)
Wolfram did say at one point during the talk: "I never play games." This is crazy to me, not just because I'm a gamer, but ok, whatever. I have a trivial theory that those who play video games are more accepting of possible doom scenarios. The reason is that video games have "AI" that is obviously and necessarily "agenty". It may not often be very good AI, but it's very clear that the AI actors are their own agents with goals (usually to defeat the player, but sometimes to help them) and sets of actions they can take to achieve those goals. Sometimes those actions are surprising, but even when they're very statically limited, a player can still perceive the goals, and the actions that would achieve them if the player did nothing in response. Even in single-player games, where the meta-goal of the designer is to have the player win and so the enemies are constrained to not be maximally efficient at stopping them, if the player does nothing about an enemy, they will never win, not even by infinite chance. If someone's only experience with "AI" is as a sort of tool -- like using ChatGPT for some limited tasks, or using a game engine to analyze games (there's not much fun in playing against it because it destroys you) -- I fully understand how it can be hard to visualize how these tools could become more dangerous agents. In short, maybe play a few video games, man, or program some, to get a sense of how purpose and goals are programmed directly.
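To make that concrete, here's a trivially small sketch in Python (the one-dimensional world, positions, and tick loop are all made up for illustration) of the kind of agent loop I mean: the enemy has a goal, picks whatever action serves it each tick, and if the player just stands there, the goal gets achieved.

    # A toy "agenty" enemy: it has a goal (reach the player) and each tick it
    # picks whichever action closes the gap. All numbers here are made up.
    player_pos = 0   # the player idles and does nothing
    enemy_pos = 10

    def choose_action(enemy, player):
        """Pick the move that best serves the goal of reaching the player."""
        if enemy > player:
            return -1   # step left
        if enemy < player:
            return +1   # step right
        return 0        # goal achieved: attack / explode / whatever

    tick = 0
    while enemy_pos != player_pos:
        enemy_pos += choose_action(enemy_pos, player_pos)
        tick += 1
    print(f"Enemy reached the idle player after {tick} ticks.")

Crude as this is, it's unmistakably pursuing something, and that's the intuition I think gamers absorb for free.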
(Assuming AI doesn't kill us all, I'm looking forward to seeing what sorts of behaviors come from giving the reins to something as capable as current LLMs. Consider a Minecraft Creeper, whose goal is to run up to you and explode, avoiding water and cats. Imagine you could turn its behavior tree into something with LLM capabilities. You could program something like: once per time step, here's some sensory data -- maybe seeing the player, or hearing their footsteps, or nothing at all, though those things were present in previous time steps and the system remembers that. Here are output control actions to move yourself around, if you choose to; you can always remain still. Try to approach the player, but do it as sneakily as possible. That "as sneakily as possible" is a vague task that we game programmers have to crystallize into some approximation, so we write code to calculate things like visibility amount and feed that into our A* pathfinding code so that, e.g., our Creeper will avoid short paths that cross well-lit open spaces and prefer longer paths that aren't well-lit. Instead of that, how would an LLM do it? I'd love to see it. You could also instruct it at a high level to seek "humorous" opportunities for destruction, like timing the attack for when the player is in their farming area, so that while you might not take them out, you might take out a farm animal and cause some funny annoyance. Similarly with some fancy building. A lot of unexpected humor comes up in games through serendipity, and that's great; I think it'd be even better if it came up a bit more frequently, though, and you need some amount of intent to pull that off.)
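For contrast, here's a minimal sketch of that traditional "sneaky pathfinding" approach -- the grid, the light values, and the penalty weight are all invented for illustration, not taken from any real game: an A* search where stepping onto a lit tile costs extra, so the planner prefers a longer, darker route.

    import heapq

    # Toy light map: 0 = dark tile, 1 = well-lit tile. The map and the penalty
    # weight are made up purely for illustration.
    GRID = [
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 1, 0, 0],
    ]
    LIGHT_PENALTY = 5.0  # how much the creeper "hates" stepping onto a lit tile

    def neighbors(pos):
        r, c = pos
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]):
                yield (nr, nc)

    def heuristic(a, b):
        # Manhattan distance; admissible because every step costs at least 1.
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def sneaky_path(start, goal):
        """A* where each step costs 1 plus a penalty proportional to light level."""
        frontier = [(heuristic(start, goal), 0.0, start, [start])]
        best_cost = {start: 0.0}
        while frontier:
            _, cost, pos, path = heapq.heappop(frontier)
            if pos == goal:
                return path
            for nxt in neighbors(pos):
                step = 1.0 + LIGHT_PENALTY * GRID[nxt[0]][nxt[1]]
                new_cost = cost + step
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    heapq.heappush(frontier,
                                   (new_cost + heuristic(nxt, goal), new_cost, nxt, path + [nxt]))
        return None

    # The straight shot up the lit middle column is shorter, but the planner
    # prefers the longer route hugging the dark edge of the map.
    print(sneaky_path((4, 2), (0, 2)))

The LLM version imagined above would replace all of that hand-tuned cost shaping with a high-level instruction like "approach sneakily" and let the model decide, each time step, what "sneaky" means given what it currently senses.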
As for EY, there are arguments I'd rather see him make than the ones he's brought out for years, and I dislike his style of over-indulging in details and caveats (whether they're asked for or not) -- like when he stumbled over his words a bit talking about two hydrogen atoms and one oxygen atom, when he could have made the point by just saying H2O, or just "the atoms". He could also ask one of his GFs to maybe do some face grooming; the eyebrows are a bit much. But perhaps that's intentional. I don't discount people based on their appearance, but that doesn't mean I don't notice bad appearances and have to consciously de-bias myself against such signals.
Posted on 2024-11-13 by Jach
Tags: artificial intelligence
Permalink: https://www.thejach.com/view/id/437