Marvin Minsky – I Sat Next To Him Once! :-)

Marvin Minsky was a giant of the 20th century – not just of his own field – and especially for me.

He is credited, along with John McCarthy and one or two others, with having invented the term Artificial Intelligence in 1956 at that famous Dartmouth conference. (It seems the term was actually coined by 1955 – a suitable year for its birth, I’d say 🙂 .) Like other pioneers in the cognitive sciences (Freud: sex in eels; Pavlov: the basic mechanics of dog dribbling or something), Minsky did some early experimental work on animals – lobsters in his case. Minsky was bright, inventive and combative, and had a lot of interesting ideas in cog. sci. (I was particularly impressed with his The Society of Mind, published in 1986), but he also invented the confocal scanning microscope. Funnily enough, at a taxonomy conference I attended once, two of his main involvements pervaded the proceedings: the confocal microscope and the notorious perceptron. Minsky and the perceptron: neither could be quite itself without the other!

The perceptron is a simplified model of a nerve cell, with inputs to it from other neurons. If enough inputs are urging our neuron to fire, and not too many are inhibiting it, then it fires and sends outputs on to other neurons. Minsky (and his close associate Papert) didn’t like the perceptron. But surely, if you had enough such units arranged next to each other and in layers, couldn’t that do whatever a brain could do? Like many people back in the 1970s, my instinct was that it could, and my face betrayed this when Professor Sutherland informed us in our Perception tutorial that there was a problem with the perceptron’s abilities. In those days I didn’t argue with professors, especially those who weighed about 20 stone and had a half-empty bottle of Beefeater gin on their desk, mid-morning, with no lid or glass in sight. But he was pretty perceptive himself, though exacting, and took care to do a good job of explaining that perceptrons used in a single layer cannot do certain things: they can only learn linearly separable classifications, so they can’t fence off enclosed regions of the input space. That means a perceptron with just two inputs, one representing east-west position and one representing north-south, could only ever be taught to say whether a point was on one side of a line, never that it was inside some kind of enclosed space.
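The limitation is easy to see in a few lines of code. Here is my own minimal sketch (nothing from the original episode – the tasks, grid and numbers are purely illustrative): Rosenblatt’s learning rule happily masters a one-side-of-a-line question, but must fail on an “inside the box” question, because no single straight line separates inside from outside.

```python
import itertools

def train_perceptron(samples, epochs=200, lr=0.1):
    """Rosenblatt's perceptron learning rule on 2-D points.

    samples: list of ((x, y), label) with label in {0, 1}.
    Returns weights (w1, w2, bias). Converges only when the two
    classes can be separated by a single straight line.
    """
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x, y), label in samples:
            out = 1 if w1 * x + w2 * y + b > 0 else 0
            err = label - out          # +1, 0 or -1
            w1 += lr * err * x
            w2 += lr * err * y
            b += lr * err
    return w1, w2, b

def predict(weights, point):
    w1, w2, b = weights
    x, y = point
    return 1 if w1 * x + w2 * y + b > 0 else 0

grid = list(itertools.product((-2, -1, 1, 2), repeat=2))

# Separable task: "is the point east of the line x = 0?" -- learnable.
east_west = [((x, y), 1 if x > 0 else 0) for x, y in grid]
w_line = train_perceptron(east_west)
line_errors = sum(predict(w_line, p) != lab for p, lab in east_west)

# Enclosed-region task: "is the point inside the box |x| < 1.5, |y| < 1.5?"
# The positives are surrounded on all sides by negatives, so no single
# line (and hence no single perceptron) can ever get them all right.
in_box = [((x, y), 1 if abs(x) < 1.5 and abs(y) < 1.5 else 0) for x, y in grid]
w_box = train_perceptron(in_box)
box_errors = sum(predict(w_box, p) != lab for p, lab in in_box)
```

On the east-west task `line_errors` comes out as 0; on the box task `box_errors` is always at least 1, however long you train.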

Why people had to get so hot under the collar about this was never clear to me, since it was well known that if you stacked perceptrons in multiple layers, the outputs from one layer feeding into the units of the next, you could perfectly well judge whether a point was in an enclosed area (in 2-D space with two inputs, 3-D with three, or higher-dimensional spaces for more complex real-life problems that needn’t have anything to do with physical space). But for some reason, Minsky and Papert had this amazing ding-dong battle with the champion and inventor of the perceptron, Frank Rosenblatt.
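The stacking fix can also be shown in miniature (again just my sketch, with a hand-wired box rather than anything learned): four ordinary single-line perceptron units, one per side of a square, feed a fifth unit that fires only when all four agree – and together they fence off exactly the enclosed region that no single unit could.

```python
def step(z):
    """The perceptron's all-or-nothing firing rule."""
    return 1 if z > 0 else 0

def inside_box_net(x, y):
    """Two layers of perceptron-style threshold units that fence off
    the square |x| < 1.5, |y| < 1.5.

    Layer 1: four units, each an ordinary one-line perceptron
    guarding one side of the box.
    Layer 2: one unit that fires only when all four layer-1 units
    fire together, i.e. a threshold AND.
    """
    h = [
        step( x + 1.5),   # east of the western fence  (x > -1.5)
        step(-x + 1.5),   # west of the eastern fence  (x <  1.5)
        step( y + 1.5),   # north of the southern fence (y > -1.5)
        step(-y + 1.5),   # south of the northern fence (y <  1.5)
    ]
    # AND of four binary inputs: sum must exceed 3.5, i.e. be all four.
    return step(sum(h) - 3.5)
```

So `inside_box_net(0, 0)` and `inside_box_net(1, -1)` fire, while `inside_box_net(2, 0)` and `inside_box_net(-2, 2)` stay silent – an enclosed region from nothing but stacked perceptrons.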

Years later Minsky admitted it was “probably overkill” on his part. Well, plain “kill” was bad enough: shortly after that episode Rosenblatt died, and it became almost impossible to get funding for any project that involved neural nets (a perceptron being a unit in a neural net, of course). In the UK this was superimposed on the negative effects of the Lighthill Report, which directed that funding be cut back because it saw no prospect of AI becoming a reality, or even useful, in the near future (there doesn’t seem to have been a similar report on the prospects of the CERN experiments, but then some sciences take PR more seriously than others).

This damage to the AI effort, largely self-administered, is somewhat reminiscent of the zombification of palaeontology, but since AI isn’t packed with pompous ignorant blockheads the way palaeontology is, good work on neural nets did actually continue. In 1986 Rumelhart and McClelland edited their landmark two-volume Parallel Distributed Processing, reclaiming the territory while carefully avoiding the term “neural nets” wherever possible, even though that was what it was all about; it included famous, influential work by Hinton and Sejnowski. (1986, ’87 and ’88 were good years for publishing. In ’86: Society of Mind, Parallel Distributed Processing, The Dinosaur Heresies; ’87: my Idea For A Mind paper 🙂 , suspected of being influenced by PDP and SM, though I’d actually read neither by then; ’88: Predatory Dinosaurs of the World, Science as a Process, Reconstructing the Past.)

The atmosphere in the conference chambers that hosted the perceptron rows was reportedly legendary. Long after, in about 2002, I attended an AI conference in Birmingham (not the Wham Bam Alabam one! – the “real”, though admittedly smaller one) which both Minsky and neural nets hero Geoffrey Hinton attended. I was a bit late, since I’d had to drive round and round the campus finding where the meeting actually was, with no notices posted anywhere and nowhere convenient to park and check the map. By the time I got in, most of the seats were taken, but people seemed to have steered clear of the one next to Minsky, so I sat there. This time there seemed to have been some kind of maths challenge between Minsky and Hinton, and Minsky had his head down, busily trying to do some complicated sums. All the while, speaker after speaker gave talk after talk unavoidably mentioning the major influence Minsky had had on first this area of AI and then that, with Minsky himself paying no attention to the repeated mentions of his name. He also paid no attention to an American lady in the middle of the room who didn’t seem much tuned in to the AI but was keen on cheerfully implying that she, at least, considered herself a good friend of Minsky’s, while constantly rustling sweet wrappers. That was the first day I actually met Stan Franklin, a great chap, and very nice to me, whose time I feel I’ve terribly wasted. Stan was someone else Minsky never paid much attention to; a big mistake if you ask me, not to mention an insult. Oh yeah – also there, in the back row, was that chap Push Singh. Now whatever happened to him, I wonder? Yup – suicide: becoming a standard AI fate, viz. Turing, and including, I suspect, Michie!
But anyway… there were echoes of the bad old days in the interchanges between Hinton and Minsky: calm, patient and slightly long-suffering from Hinton; somehow his manner and very long suit trousers reminded me of some representation of a good young father in a 1950s film. He had studied under Christopher Longuet-Higgins (as Higgs the particle physicist had earlier, when he and C L-H were both chemical physicists), so he’d had experience of dealing with big, demanding personalities. (C L-H seemed to bring his associates good luck! Freeman Dyson was at school with him.) Funnily enough Minsky paid no attention to me either, until I asked him “What would you say was the definition of science?” He turned slowly to me, regarding me with some suspicion, and said fairly patiently, and apparently not for the first time: “I don’t know, but I know when I see someone doing it.” I’ve never considered that a good answer. It doesn’t distinguish between animals snuffling about, non-scientists snuffling about, and people (or agents) explicitly seeking the models that best explain the observations… and to ignore the Popperian view without saying why, or even acknowledging it… 😦

But I think I’ve worked out to my own satisfaction what drove Minsky to wage the perceptron war. At that time it was thought, wrongly, that you had to choose between the Symbolic Approach to AI (which was all about solid blocky concepts, LISP programming and languages), or on the other hand the Connectionist Approach – i.e. neural nets. In fact you do need lumpy concepts with subtle characteristics, interacting in complex ways with different kinds and categories of other concepts, as Minsky realised; but what he didn’t realise was that you can, and must, create (i.e. learn) those blobby concepts by neural net methods – more subtle and complex methods than we currently have – in the same way that strands of candy-floss (cotton candy) are accumulated round the stick to form a blob. He didn’t see the perceptron as the simplest first example of an endless sequence of instantiations of the neural net idea advancing into the future… and he couldn’t see this partly because of groupism – he was a symbolist – and partly because of a kind of arrogance that mathematicians (coughLighthillcough) often have, of thinking they’ve summed up everything about a concept and its future possibilities because they’ve demonstrated some isolated point concerning it. Paradoxically, Minsky was one of the first to investigate neural nets experimentally (his “SNARC” – the Stochastic Neural Analog Reinforcement Calculator), but he was hampered by trying it before programmable computers were readily available. I think that because he couldn’t do it, it annoyed him to think that someone else could. But just because you can’t do something with a soldering iron, it doesn’t mean it can’t be done. There is a picture of the electronic box Rosenblatt constructed to investigate his perceptron instantiation of the neural net, and to my eye it looks like he was able to use wires that plug in and out of sockets to make connections.
Just a simple technical advance like that can perhaps make all the difference!

When he was good he was very, very good, and when he was bad he was horrid; but the concepts described in The Society of Mind would make a good list of features to be accounted for by an ambitious AI implementation.

What pleased me most about him was something he said in his main conference talk about John Searle’s “Chinese Room” argument. Searle imagines a room where slips of paper with Chinese writing on them are passed in through one letter box, and people inside the room busily follow various rules which eventually produce appropriate Chinese replies, passed out of the other side of the room; since nobody inside understands a word of Chinese, Searle claims there’s nothing in the model that actually understands Chinese. In Minsky’s view, and mine, that amounts to claiming that within any process that does something there has to be some sub-component that actually does the whole job.

I had wondered whether his death would be reported on the BBC radio news. If it was, I missed it, and if it wasn’t, I’m a bit peeved. [Ah yes – he died on Jan 24th; there was a nice feature on BBC Radio 4’s Last Word on Fri 12th Feb. A very notable student of his, Patrick Winston, gave a glowing talk on him, though he missed out the controversy 🙂 .]
