Pavlov’s AI – What did it mean?

So recently I gave a talk at the H+ Summit in Los Angeles. However, I got the impression that the talk, which was about the fundamentals of Artificial General Intelligence (something I decided to call ‘foundations of AGI’) was not fully understood. I apologize to anyone in the audience who didn’t quite ‘get’ it, as the blame must fall upon the speaker in such instances. Although, in my defense, I had only 15 minutes to describe a series of arguments and philosophical threads that I had been musing over for a good few months 🙂

If you haven’t seen the talk, and would like to watch it, here it is:

However, this article is written as a standalone resources, so don’t worry if you haven’t seen the talk.

What I would like to do is start exploring some of those issues on this blog. So, here is my attempt to describe the first of the points that I set out to try and explore in the talk. I’ve used a slightly modified argument, to try and complement the talk for those who have already seen it.


Pavlov’s AI:
What do superintelligences really want?

S. Gildert November 2010

(Photo © Thomas Saur)


Humans are pretty intelligent. Most people would not argue with this. We spend a large majority of our lives trying to become MORE intelligent. Some of us spend nearly three decades of our lives in school, learning about the world. We also strive to work together in groups, as nations, and as a species, to better tackle the problems that face us.

Fairly recently in the history of man, we have developed tools, industrial machines, and lately computer systems to help us in our pursuit of this goal. Some particular humans (specifically some transhumanists) believe that their purpose in life is to try and become better than human. In practice this usually means striving to live longer, to become more intelligent, healthier, more aware and more connected with others. The use of technology plays a key role in this ideology.

A second track of transhumanism is to facilitate and support improvement of machines in parallel to improvements in human quality of life. Many people argue that we have also already built complex computer programs which show a glimmer of autonomous intelligence, and that in the future we will be able to create computer programs that are equal to, or have a much greater level of intelligence than humans. Such an intelligent system will be able to self-improve, just as we humans identify gaps in our knowledge and try to fill them by going to school and by learning all we can from others. Our computer programs will soon be able to read Wikipedia and Google Books to learn, just like their creators.

A perfect scientist?

But the design of our computer programs can be much more efficient in places where we, as humans, are rather limited. They will not get ‘bored’ in mathematics classes. They will work for hours on end, with no exhaustion, no fatigue, no wandering thoughts or daydreams. There would be no need for such a system to take a 2-hour lunch break, to sleep, or to worry about where its next meal will come from. The programs will also be able to analyze data in many more interesting ways than a human could, perhaps becoming a super-scientist. These programs will be far greater workers, far greater scholars, perhaps far greater citizens, than we could ever be.

It will be useful in analyzing the way such a machine would think about the world by starting with an analysis of humans. Why do humans want to learn things? I believe it is because there is a reward for doing so. If we excel in various subjects, we can get good jobs, a good income, and time to spend with others. By learning about the way the world works and becoming more intelligent, we can make our lives more comfortable. We know that if we put in the hard work, eventually it will pay off. There seem to be reward mechanisms built into humans, causing us to go out and do things in the world, knowing that there will be a payoff. These mechanisms act at such a deep level that we just follow them on a day-to-day basis – we don’t often think about why they might be there. Where do these reward mechanisms come from? Let’s take an example:

Why do you go to work every day?
To make money?
To pay for the education of your children?
To socialize and exchange information with your peers?
To gain respect and status in your organization?
To win prizes, to achieve success and fame?

I believe that ALL these rewards – and in fact EVERY reward – can be tied back to a basic human instinct. And that is the instinct to survive. We all want to survive and live happily in the world, and we also want to ensure that our children and those we care about have a good chance of surviving in the world too. In order to do this, and as our society becomes more and more complex, we have to become more and more intelligent to find ways to survive, such as those in the list above. When you trace back through the reasoning behind each of these things, when you strip back the complex social and personal layers, the driving motivations for everything we do are very simple. They form a small collection of desires. Furthermore, each one of those desires is something we do to maximize our chance at survival in the world.

So all these complex reward mechanisms we find in society are built up around simple desires. What are those desires? Those desires are to eat, to find water, to sleep, to be warm and comfortable, to avoid pain, to procreate and to protect those in our close social group. Our intelligence has evolved over thousands of years to make us better and better at fulfilling these desires. Why? Because if we weren’t good at doing that, we wouldn’t be here! And we have found more and more intelligent ways of wrapping these desires in complex reward mechanisms. Why do we obfuscate the underlying motivations? In a world where all the other members of the species are trying to do the same thing, we must find more intelligent, more complex ways of fulfilling these desires, so that we can outdo our rivals. Some of the ways in which we go about satisfying basic desires have become very complex and clever indeed! But I hope that you can see through that veil of complexity, to see that our intelligence is intrinsically linked to our survival, and this link is manifested in the world as these desires, these reward mechanisms, those things that drive us.

Building intelligent machines

Now, after that little deviation into human desires, I shall return to the main track of this article! Remember earlier I talked about building machines (computer systems) that may become much more intelligent than we are in the future. As I mentioned, the belief that this is possible is a commonly held view. In fact, most people not only believe that this is possible, but that such systems will self-improve, learn, and boost their own intelligence SO QUICKLY that once they surpass human level understanding they will become the dominant species on the planet, and may well wipe us out in the process. Such scenarios are often portrayed in the plotlines of movies, such as ‘Terminator’, or ‘The Matrix’.

I’m going to argue against this. I’m going to argue that the idea of building something that can ‘self-improve’ in an unlimited fashion is flawed. I believe there to be a hole in the argument. That flaw is uncovered when we try to apply the above analysis of desires and rewards in humans to machine intelligences. And I hope now that the title of this article starts to make sense – recall the famous experiments done by Pavlov [1] in which a dog was conditioned to expect rewards when certain things happened in the world. Hence, we will now try to assess what happens when you try to condition artificial intelligences (computer programs) in a similar way.

In artificial intelligence, just as with humans, we find that the idea of reward crops up all the time. There is a field of artificial intelligence called reinforcement learning [2], which is the idea of teaching a computer program new tricks by giving it a reward each time it gets something right. How can you give a computer program a reward? Well, just as an example, you could have within a computer program a piece of code (a mathematical function) which tries to maximize a number. Each time the computer does something which is ‘good’, the number gets increased.

The computer program therefore tries to increase the number, so you can make the computer do ‘good things’ by allowing it to ‘add 1’ to its number every time it performs a useful action. So a computer can discover which things are ‘good’ and which things are ‘bad’ simply by seeing if the value of the number is increasing. In a way the computer is being ‘rewarded’ for a good job. One would write the code such that the program was also able to remember which actions helped to increase its number, so that it can take those actions again in the future. (I challenge you to try to think of a way to write a computer program which can learn and take useful actions but doesn’t use a ‘reward’ technique similar to this one. It’s actually quite hard.)

Even in our deepest theories of machine intelligence, the idea of reward comes up. There is a theoretical model of intelligence called AIXI, developed by Marcus Hutter [3], which is basically a mathematical model which describes a very general, theoretical way in which an intelligent piece of code can work. This model is highly abstract, and allows, for example, all possible combinations of computer program code snippets to be considered in the construction of an intelligent system. Because of this, it hasn’t actually ever been implemented in a real computer. But, also because of this, the model is very general, and captures a description of the most intelligent program that could possibly exist. Note that in order to try and build something that even approximates this model is way beyond our computing capability at the moment, but we are talking now about computer systems that may in the future may be much more powerful. Anyway, the interesting thing about this model is that one of the parameters is a term describing… you guessed it… REWARD.

Changing your own code

We, as humans, are clever enough to look at this model, to understand it, and see that there is a reward term in there. And if we can see it, then any computer system that is based on this highly intelligent model will certainly be able to understand this model, and see the reward term too. But – and here’s the catch – the computer system that we build based on this model has the ability to change its own code! (In fact it had to in order to become more intelligent than us in the first place, once it realized we were such lousy programmers and took over programming itself!)

So imagine a simple example – our case from earlier – where a computer gets an additional ‘1’ added to a numerical value for each good thing it does, and it tries to maximize the total by doing more good things. But if the computer program is clever enough, why can’t it just rewrite it’s own code and replace that piece of code that says ‘add 1’ with an ‘add 2’? Now the program gets twice the reward for every good thing that it does! And why stop at 2? Why not 3, or 4? Soon, the program will spend so much time thinking about adjusting its reward number that it will ignore the good task it was doing in the first place!
It seems that being intelligent enough to start modifying your own reward mechanisms is not necessarily a good thing!

But wait a minute, I said earlier that humans are intelligent. Don’t we have this same problem? Indeed, humans are intelligent. In fact, we are intelligent enough that in some ways we CAN analyze our own code. We can look at the way we are built, we can see all those things that I mentioned earlier – all those drives for food, warmth, sex. We too can see our own ‘reward function’. But the difference in humans is that we cannot change it. It is just too difficult! Our reward mechanisms are hard-coded by biology. They have evolved over millions of years to be locked into our genes, locked into the structure of the way our brains are wired. We can try to change them, perhaps by meditation or attending a motivational course. But in the end, biology always wins out. We always seem to have those basic needs.

All those things that I mentioned earlier that seem to limit humans – that seem to make us ‘inferior’ to that super-intelligent-scientist-machine we imagined – are there for a very good reason. They are what drive us to do everything we do. If we could change them, we’d be in exactly the same boat as the computer program. We’d be obsessed with changing our reward mechanisms to give us more reward rather than actually being driven to do things in the world in order to get that reward. And the ability to change our reward mechanisms is certainly NOT linked to survival! We quickly forget about all those things that are there for a reason, there to protect us and drive us to continue passing on our genes into the future.

So here’s the dilemna – we either hard code reward mechanisms into our computer programs – which means they can never be as intelligent as we are – they must never be able to see or adjust those reward mechanisms or change them. The second option is that we allow the programs full access to be able to adjust their own code, in which case they are in danger of becoming obsessed with changing their own reward function, and doing nothing else. This is why I refer to as humans being self-consistent – we see our own reward function but we do not have access to our own code. It is also the reason why I believe super-intelligent computer programs would not be self-consistent, because any system intelligent enough to understand itself AND change itself will no longer be driven to do useful things in the world and to continue improving itself.

In Conclusion:

In the case of humans, everything that we do that seems intelligent is part of a large, complex mechanism in which we are engaged to ensure our survival. This is so hardwired into us that we do not see it easily, and we certainly cannot change it very much. However, superintelligent computer programs are not limited in this way. They understand the way that they work, can change their own code, and are not limited by any particular reward mechanism. I argue that because of this fact, such entities are not self-consistent. In fact, if our superintelligent program has no hard-coded survival mechanism, it is more likely to switch itself off than to destroy the human race willfully.


As this analysis stands, it is a very simple argument, and of course there are many cases which are not covered here. But that does not mean they have been neglected! I hope to address some of these problems in subsequent posts, as including them here would make this article way too long.

[1] – Pavlov’s dog experiment –

[2] – Reinforcement Learning –

[3] – AIXI Model, M Hutter el el. –

Transhumanism and objectivity: An introduction

I have been involved in the transhumanism community for a fair while now, and I have heard many arguments arising from both proponents and skeptics of the ‘movement’. However, many of these arguments seem to stem from instinctive reactions rather than critical thinking. Transhumanism proponents will sometimes dogmatically defend their assumptions without considering whether or not what they believe may actually be physically possible. The reasoning behind this is fairly easy to understand: Transhumanism promises escape from some of humanity’s deepest built in fears. However, the belief that something of value will arise if one’s assumptions are correct can often leave us afraid to question those assumptions.

I would currently class myself as neither a proponent or a skeptic of the transhumanism movement. However I do love to explore and investigate the subject, as it seems to dance close to the very limits of our understanding of what is possible in the Universe. Can we learn something from analyzing the assumptions upon which this philosophical movement is based? I would answer not only yes, but that to do so yields one of the most exciting applications of the scientific method that we have encountered as a society.

I find myself increasingly drawn toward talking about how we can explore transhumanism from a more rational and objective point of view. I think all transhumanists should be obliged to take this standpoint, to avoid falling into a trap of dogmatic delusion. By playing devil’s advocate and challenging some of the basic tenets and assumptions, I doubt any harm can be done. At the least those tenets and assumptions will have to be rethought. But moreover, we may find that the lessons learned from encountering philosophical green lights and stop signs may inform the way we steer our engineering of the future.

I’ve thus decided to shift the focus of this blog a little towards some of these ideas. In a way I have already implemented some of this shift: I have written a couple of essays and posts before. But from now on, expect to see a lot more of this in the future. A blog format is an excellent way of disseminating information on this subject: It is dynamic, and can in principle reach a large audience. I also think that it fits in well with the Physics and Cake ethos – applying the principles of Physics to this area will form a large part of the investigations. And, of course, everything should always be discussed over coffee and a slice of cake! Another advantage is that this is something that everyone can think about and contribute to. You don’t need an expensive lab or a PhD in theoretical Physics to muse over these issues. In a lot of cases, curiosity, rationality, and the patience to follow an argument is all that is necessary.

Humanity+ Conference 2010 Caltech

I gave a presentation yesterday at the H+ conference at Caltech. The session in which I spoke was the ‘Redefining Artificial Intelligence’ session. I’ll try to get the video of the talk up here as soon as possible along with slides.

Other talks in this session were given by Randal Koene, Geordie Rose, Alex Peake, Paul Rosenbloom, Adrian Stoica, Moran Cerf and Ben Goertzel.

My talk was entitled ‘Pavlov’s AI: What do superintelligences really want?’ I discussed the foundations of AGI, and what I believe to be a problem (or at least an interesting philosophical gold-seam) in the idea of building self-improving artificial intelligences. I’ll be writing a lot more on this topic in the future, hopefully in the form of essays, blogposts and papers. I think it is very important to assess what we are trying to do in the area of AI, what the overall objectives are, and looking at what we can build from an objective point of view is helpful in framing our progress.

The conference was livestreamed, which was great. I think my talk had around 500 viewers. Add to that the 200 or so in the lecture hall; 700 is a pretty big audience! Some of talks had over 1300 remote viewers. Livestreaming really is a great way to reach a much bigger audience than is possible with real-life events alone.

I didn’t get to see much of the Caltech campus, but the courtyard at the Beckman Institute where the conference was held was beautiful. I enjoyed the fact that coffee and lunch was served outside in the courtyard. It was very pleasant! Sitting around outside in L.A. in December was surprisingly similar to a British summer!

I got to talk to some great people. I enjoy transhumanism-focused conferences as the people you meet tend to have many diverse interests and multidisciplinary backgrounds.

I was very inspired to continue exploring and documenting my journey into the interesting world of AGI. One of the things I really love doing is looking into the fundamental science behind Singularity-focused technologies. I try to be impartial to this and give both an optimistic account of the promise of future technologies whilst maintaining a skeptical curiosity about whether such technologies are fundamentally possible, and what roadmaps might lead to their successful implementation. So stay tuned for more Skepto-advocate Singularity fun!

ASIM-2010 – not quite Singularity but close :)

So I’ll post something about the Singularity Summit soon, but first I just wanted to talk a little about the ASIM-2010 conference that I helped organise along with Randal Koene.

The main idea of the conference was to hold a satellite workshop to the Singularity Summit, with the purpose of sparking discussion around the topics of Substrate Independent Minds. See the carboncopies website for more information on that! Ah, I love the format of blogging. I’m explaining what happened at a workshop without having introduced the idea of what the workshop was trying to achieve or what our new organisation actually *is*. Well, I promise that I’ll get round to explaining it soon, but until then it will have to be a shadowy unknown. The carboncopies website is also in the process of being filled with content, so I apologise if it is a little skeletal at the moment!

One interesting thing that we tried to do with the workshop was to combine a real life and a virtual space component. It was an interesting experience trying to bring together VR and IRL. In a way it was very fitting for a workshop based around the idea of substrate independent minds. Here we were somewhat along the way to substrate independent speakers! I am hoping that this will inspire more people to run workshops in this way, which will force the technology to improve.

I was very pleased too see so many people turning out. We had about 30 people in meatspace and about another 15 in virtual space on both days. Giulio Prisco has some nice write-up material about the workshops, including PICTURES and VIDEOS! Here are the links to his posts:

General overview
First day in detail
Second day in detail

For a first attempt, I don’t think that things went too badly! The technology wasn’t perfect, but we gave it a good try. The main problem was with the audio. Teleplace, the conferencing software we were using, works well when everyone in online with a headset and mic, there are no feedback problems. However, when you try and include an entire room as one attendee, it becomes a little more tricky.

This could be improved by either everyone in the room having headsets and mics, and then having a mixer which incorporated all the input into a single Teleplace attendee. The other way is that everyone in the room could also be logged into Teleplace with their own headsets and mics. *Make* that Singularity happen, ooh yeah! (/sarcasm)

A particularly bad attack on the Singularity

Whilst I am not a raging advocate of ‘sitting back and ‘waiting’ for the Singularity to happen (I prefer to get excited about the technologies that underlie the concept of it), I feel that I have a responsibility to defend the poor meme in the case where an argument against is is actually very wrong, such as in this article from Science Not Fiction:

Genomics Has Bad News For The Singularity

The basic argument that the article puts forward is that the cost of sequencing the human genome has fallen following a super-exponential trend over the past 10 years. And yet, we do not have amazing breakthroughs in drug treatment and designer therapies. So how could we expect to have “genuine artificial intelligence, self-replicating nanorobots, and human-machine hybrids” even though Moore’s law is ensuring that the cost of processing power is falling? And it is falling at a much slower rate than genome sequencing costs!

The article states:

“In less than a decade, genomics has shown that improvements in the cost and speed of a technology do not guarantee real-world applications or immediate paradigm shifts in how we live our day-to-day lives.”

I feel however, that the article is somewhat comparing apples and oranges. I have two arguments against the comparison:

The first is that sequencing the genome just gives us data. There’s no algorithmic component. We still have little idea of how most of the code is actually implemented in the making of an organism. We don’t have the protein algorithmics. It’s like having the source code for an AGI without a compiler. But we do have reasonable physical and algorithmic models for neurons (and even entire brains!), we just lack the computational power to simulate billions of neurons in a highly connected structure. We can simulate larger and larger neural networks as hardware increases in speed, connectivity, and efficiency. And given that the algorithm is ‘captured’ in the very structure of the neural net, the algorithm advances as the hardware improves. This is not the case in genome sequencing.

The second argument is that sequencing genomes is not a process that can be bootstrapped. The very process of knowing a genome sequence isn’t going to help us sequence genomes faster or help you engineer designer drugs. But building smart AI systems – or “genuine artificial intelligence” as the article states – CAN enable you to bootstrap the process, as you will have access to copyable capital for almost no cost: Intelligent machines which can be put to the task of designing more intelligent machines. If we can build AIs that pass a particular threshold in terms of being able to design improved algorithmic versions of themselves, why should this be limited by hardware requirements at all? Moore’s law really just gives us an upper bound on the resources necessary to build intelligent systems if we approach the problem using a brute-force method.

We still need people working on the algorithmic side of things in AI – just as we need people working on how genes are actually expressed and give rise to characteristics in organisms. But in the case of AI, we already have an existence proof for such an object – the human brain, and so even with no algorithmic advances, we should be able to build one in-silico. Applications for genomics do not have such a clearly defined goal based on something that exists naturally (though harnessing effetcs like the way in which cancer cells avoid apoptosis might be a good place to start).

I’d be interested in hearing people thoughts on this.

Essay: Language, learning, and social interaction – Insights and pitfalls on the road to AGI

This is the first in a series of essays that I’m going to make available through this site. I’m going to put them in the resources page as I write them.

I’m doing this for a couple of reasons. The first is that I like writing. The second is that a blog is not the best format to present ideas that you need to frequently refer to, and often when you get asked the same question by multiple people it is better to point them to a permanent resource rather than repeating yourself each time. The third is that I would like to add some ideas to the general thought-provoking mash-up of AGI memes around the internet. The fourth is that I think people enjoy reading short articles and opinion pieces somewhat more than entire books. The fifth is that (somewhat in contradiction to my previous reason) I’m hoping to eventually write a neat book about related topics, and whilst I have already started doing this, I feel that I need a lot more practice writing several-page documents before I feel I can make something that is 100+ pages long which I am happy with. Note, the PhD thesis does not count!! 😉

So here we go. Click on the title to download the PDF.
You can also find a copy of the essay in the resources tab.

Language, learning, and social interaction
Insights and pitfalls on the road to AGI

Why is language important? What purpose does it serve on a fundamental level? How is it related to the ways in which we learn? Why did it evolve? In this short essay I’m going to take a Darwinian look at language and why we must be careful when considering the role played by language in the building of intelligent machines which are not human-like.

H+ Summit 2010 @ Harvard

I’m currently in Harvard listening to some pretty awesome talks at the H+ Summit. I always really enjoying attending these events, the atmosphere is truly awesome. So far we have had talks about brain preservation, diy genomics, neural networks, robots on stage, AI, consciousness, synthetic biology, crowdsourcing scientific discovery, and lots lots more.

The talks are all being livestreamed, which is pretty cool too. I can’t really describe the conference in words, so here are some pictures from the conference so far:

Audience pic:

General overview:

Here is a picture of Geordie’s talk about D-Wave, quantum computing and Intelligence:

Here is a picture of me next to the Aiken-IBM Automatic Sequence Controlled Calculator Mark I. This thing is truly amazing, a real piece of computing history.

The MIT museum was also really cool 🙂 More news soon!