# Notes from Probability Theory Chapter 1 continued

Last time I covered some notes on the first section of the first chapter; today we'll go a little further.

### 1.2 - Analogies with physical theories

A quote that directly precedes this section, given here in its expanded form:
"A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs; and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies between analogies."
--Stefan Banach

There are some interesting analogies between probability theory and physical theories that Jaynes mentions. The first is how reality is complicated; there is so much stuff out there that even our best theories cannot handle it all. Where does reality get its computation power? We need some of that! So our theories generally start small and look at little pieces of things, and when they work out they're expanded into larger theories that look at larger things. We're still not sure if we'll eventually get a Theory of Everything or not, but at least history seems to indicate that we'll get a Theory of Very Nearly Approximately Everything at some point.

As physical models get bigger they also get more complicated; so too with models founded on probability theory. If you've read Feynman's QED, he elegantly expresses the very simple rules by which individual photons and other particles behave. The problem comes when you try to reason about billions of them all over the place.

There's another analogy the two share: the peculiar fact that our theories often have trouble with things "familiar" to us. Jaynes gives the example of the difference in the ultraviolet spectra of iron and nickel, which can be explained in exhaustive mathematical detail even though the existence of those spectra is unknown to the vast majority of humans; meanwhile, something as familiar and ordinary as the growth of grass pushes our theories to their limits, and in some cases they're utterly useless. This peculiarity should put a prior constraint on our models: we shouldn't expect too much out of them, and we should be prepared to change and update them.

Another analogy is that advances frequently lead to consequences of great practical value, but it's somewhat unpredictable when, or whether, this will happen. I've heard rumors of a new, cheap way to do a full intuitive calculation, and while it may be wrapped up in useless-sounding papers like "here's how we can do a database join faster", the eventual consequences could be great. Jaynes gives two examples of scientific discoveries: first, Röntgen's discovery of X-rays, which led to many new forms of medical diagnosis; second, Maxwell's discovery of "another term in the equation for curl H", which led to near-instant communication around the Earth.

### 1.3 - The thinking computer

From the legendary John von Neumann: "You insist that there is something a machine cannot do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!"

The problem with fully general AI isn't AI itself or slow machines or anything like that; it's that we're lacking a key piece, or pieces, of knowledge needed to build it right. One piece of knowledge we lack is what exactly "thinking" consists of. Probability theory helps us answer many of those questions, because it's also the study of common sense.

Probability theory has also led to the creation of very powerful software programs whose abilities humans cannot match on their own. Humans may be able to reason about a couple of competing hypotheses at once, as long as they're really different, but for the problems some software is made for, like determining the relative plausibilities of 100 different hypotheses competing to explain 10,000 separate observations and pieces of evidence? Good luck doing that with pen and paper! But similarly, good luck if all you had were a computer and no probability theory. In the policeman example from before, what determines whether the policeman's suspicion of a crime should rise a lot or a little when he sees a broken window? Probability theory will tell us.
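To make that concrete, here's a minimal sketch (my own, not from the book) of what "determining relative plausibilities" looks like mechanically: a Bayesian update over many hypotheses, done in log space so 10,000 multiplied likelihoods don't underflow. The hypothesis count, observation count, and random stand-in likelihoods are all made up purely for illustration.

```python
import numpy as np

# A toy sketch (not from the book): weighing many competing hypotheses
# against many pieces of evidence with Bayes' rule. Every number here is
# a made-up stand-in purely for illustration.

rng = np.random.default_rng(0)

n_hypotheses = 100
n_observations = 10_000

# Prior plausibilities: no reason yet to favor any hypothesis over another.
log_posterior = np.full(n_hypotheses, np.log(1.0 / n_hypotheses))

# Each hypothesis assigns some probability to each observation coming out
# "yes"; random values stand in for whatever a real model would predict.
p_yes = rng.uniform(0.01, 0.99, size=(n_hypotheses, n_observations))

# Simulated evidence: a stream of yes/no observations.
evidence = rng.integers(0, 2, size=n_observations)

# Bayesian updating: multiply in the likelihood of each observation
# (added in log space so 10,000 tiny factors don't underflow).
for i, obs in enumerate(evidence):
    p = p_yes[:, i]
    log_posterior += np.log(p if obs else 1.0 - p)

# Renormalize so the relative plausibilities sum to 1.
log_posterior -= np.logaddexp.reduce(log_posterior)
posterior = np.exp(log_posterior)

best = int(np.argmax(posterior))
print(f"Most plausible hypothesis: #{best}, posterior {posterior[best]:.3g}")
```

The point isn't the toy numbers; it's that the same two steps, multiply in each observation's likelihood and renormalize, carry you from the policeman's single broken window all the way up to thousands of observations without the procedure changing.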

The notion of a thinking machine is also useful for developing the theory. So far we've been talking about "human common sense" and things like that, but humans are weird creatures, prone to craziness and outbursts that don't match the common sense of others around them. So instead of asking for something that perfectly matches human common sense, let's build a machine that can do useful plausible reasoning by following a set of rules, and let's make sure these rules express a sort of idealized common sense that respected, educated, [otherfeelgoodwordwehumansusehere] humans can agree to when they're not under emotional distress.

### 1.4 - Introducing the Robot

Probability theory is about describing the actions of a robot brain we design according to some rules. These rules will be proposed from considered desiderata; that is, traits that are desirable in a working human brain, traits such that a respected, educated, rational individual, on discovering they were violating one of them, would wish to revise their thinking.

Our premises are truly arbitrary assumptions (sort of). We can make the robot do whatever we like with whatever rules we like. But this robot should have a purpose: it should provide "useful plausible reasoning". So when we make a set of rules, we need to test the robot, see whether its reasoning seems comparable to ours, and decide whether it might be a good candidate for problems we can't do ourselves. When the rules work, it's an accomplishment of the theory as a whole, not of any particular premise.