The Physics World is my Oyster

Physics and Cake got a mention in Physics World this month! 🙂 As a long-time reader of Physics World, I’m really happy to see this! I guess this means I’ll have to blog more about Physics and less about the speculative promises and hidden possibilities of Artificial General Intelligence… (especially as AGI apparently didn’t make the transcription below). Though I’m afraid I cannot currently shake my desire to explore the intersection between AGI and Physics!

Hmm, looking at this post in the browser is oddly fractal! Though not quite enough to become a Strange Loop. (H/T Douglas Hofstadter, you are awesome).

New scheduling for ‘Thinking about the Hardware of thinking’

I was scheduled to give a live virtual seminar, streamed to the Transvision conference in Italy on October 23rd. Unfortunately I was not able to deliver the presentation due to technical problems at the conference venue.

But the good news is, I will be giving the talk this weekend instead!

Here is the abstract (slightly updated, as the talk will be a little longer than originally planned):

Thinking about the hardware of thinking:
Can disruptive technologies help us achieve uploading?

S. Gildert,
Teleplace, 28th November 2010
10am PST (1pm EST, 6pm UK, 7pm continental EU).

We are surrounded by devices that rely on general-purpose silicon processors, which are mostly very similar in terms of their design. But is this the only possibility? As we begin to run larger and more brain-like emulations, will our current methods of simulating neural networks be enough, even in principle? Why does the brain, with 100 billion neurons, consume less than 30 W of power, whilst our attempts to simulate tens of thousands of neurons (for example in the Blue Brain Project) consume tens of kW? As we seek to run computations faster and more efficiently, we may need to consider whether the design of the hardware that we all take for granted is optimal. In this presentation I will discuss the recent return to a focus upon co-design – that is, designing specialized software algorithms running on specialized hardware – and how this approach may help us create much more powerful applications in the future. As an example, I will discuss some possible ways of running AI algorithms on novel forms of computer hardware, such as superconducting quantum computing processors. These behave entirely differently to our current silicon chips, and help to emphasize just how important disruptive technologies may be to our attempts to build intelligent machines.
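To put the numbers in the abstract into perspective, here is a rough back-of-the-envelope comparison in Python. The brain figures are the ones quoted above; the simulation figures are illustrative placeholders standing in for “tens of thousands of neurons” and “tens of kW”, not exact Blue Brain values.

```python
# Rough efficiency comparison using the figures quoted in the abstract.
# The simulation numbers are illustrative placeholders, not exact values.
brain_neurons = 100e9   # ~100 billion neurons
brain_power_w = 30.0    # less than ~30 W

sim_neurons = 5e4       # "tens of thousands" of simulated neurons (placeholder)
sim_power_w = 5e4       # "tens of kW" of power (placeholder: 50 kW)

brain_npw = brain_neurons / brain_power_w   # neurons per watt
sim_npw = sim_neurons / sim_power_w

print(f"Brain:      ~{brain_npw:.1e} neurons per watt")
print(f"Simulation: ~{sim_npw:.1e} neurons per watt")
print(f"Gap:        ~{brain_npw / sim_npw:.0e}x")
```

On these rough numbers the brain comes out ahead by a factor of a few billion neurons per watt, which is the gap the abstract is asking about.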

Here is a link to the Teleplace announcement.

Hope to see you there!

Building more intelligent machines: Can ‘co-design’ help?

Here is a little essay I wrote in response to an article on HPCWire about hardware-software co-design and how it relates to D-Wave’s processors. I’ve also put this essay in the Resources section as a permanent link.

Building more intelligent machines: Can ‘co-design’ help?

S. Gildert, November 2010

There are many challenges that we face as we consider the future of computer architectures, and as the type of problem that people require such architectures to solve changes in scale and complexity. A recent article written for HPCwire [1] on ‘co-design’ highlights some of these issues, and demonstrates that the High Performance Computing community is very interested in new visions of breakthrough system architectures. Simply scaling up the number of cores in current technologies seems to be getting more difficult, more expensive, and more energy-hungry. One might imagine that in the face of such diminishing returns, there could be innovations in architectures that are vastly different from anything currently in existence. It seems clear that people are becoming more open to the idea that something revolutionary in this area may be required to make the leap to ‘exascale’ machines and beyond. The desire for larger and more powerful machines is driving people to try to create more ‘clever’ ways of solving problems (algorithmic and software development), rather than just increasing the speed and sheer number of transistors doing the processing. Co-design is one such buzzword, sneakily spreading these memes of ‘clever’ computing into the HPC community.

Generalization and specialization

I will explain the idea of co-design by using a colorful biological analogy. Imagine trying to design a general-purpose animal: our beast can fly, run, swim, dig tunnels and climb trees. It can survive in many different environments. However, anyone trying to design such an animal would soon discover that the large wings prevented it from digging tunnels effectively, and that the thick fur coat needed to survive extreme cold was no help in producing a streamlined, fast swimmer. Any animal that was even slightly more specialized in one of these areas would quickly out-compete our general design. Indeed, for this very reason, natural selection causes specialization and therefore great diversity amongst the species that we see around us. Particular species are very good at surviving in particular environments.

How does this tie in with computer processing?

The problems that processors are designed to solve today are mostly all very similar. One can view this as being a bit like the ‘environmental landscape’ that our general-purpose creatures live in. If the problems that they encounter around their environment on a day-to-day basis are of the same type, then there is no reason to diversify. Similarly, a large proportion of all computing resources today address some very similar problems, which can be solved quite well using general-purpose architectures such as Intel Centrino chips. These include the calculations that underlie familiar everyday tasks such as word-processing and displaying web pages. But there do exist problems that have previously been thought to be very difficult for computers to solve, problems which seem out of reach of conventional computing. Examples of such problems are face recognition, realistic speech synthesis, the discovery of patterns in large amounts of genetic data, and the extraction of ‘meaning’ from poetry or prose. These problems are like the trees and cliffs and oceans of our evolutionary landscape. The general-purpose animals simply cannot exploit these features: they cannot solve these problems, so the problems are typically ignored or deemed ‘too hard’ for current computing platforms.

But there are companies and industries that do care about these problems. They require computing power to be harnessed for some very specific tasks. A few examples include extracting information from genetic data in biotechnology companies, improving patient diagnosis and the medical knowledge of expert systems in the healthcare sector, improving computer graphics for gaming experiences in entertainment businesses, and developing intelligent military tools for the defense industry. These fields all require data to be searched and sorted in parallel, and manipulated at a much more abstract level, for the computation to be efficient and worthwhile. This parallel operation and abstraction is something that general-purpose processors are not very good at. They can attempt such a feat, but it takes the power of a supercomputer-sized machine to tackle even very small instances of these specialized problems, using speed and brute force to overwhelm the difficulty. The result is very expensive, very inefficient, and does not scale well to larger problems of the same type.

It is this incorporation of variety and structure into our computational problems, the addition of trees, cliffs and oceans, that causes our general-purpose processors to be so inefficient at these tasks. So why not allow the processors to specialize and diversify, just as natural selection explores the problem environment defined by our biological needs?

Following nature’s example

Co-design attempts to address this problem. It tries to design solutions around the structure of the problem type, resulting in an ability to solve that one problem very well indeed. In practice this is done by meticulous crafting of both software and hardware in synchrony. This allows software which complements the hardware and utilizes subtleties in the construction of the processor to help speed things up, rather than software which runs on a general architecture and incurs a much larger overhead. The result is a blindingly fast and efficient special purpose architecture and algorithm that is extremely good at tackling a particular problem. Though the resulting processor may not be very good at certain tasks we take for granted using general-purpose processors, solving specialized problems instead can be just as valuable, and perhaps will be even more valuable in the future.

A selection of processors which are starting to specialize are discussed in the HPCwire article. These include MDGRAPE-3, which calculates inter-atomic forces, and Anton, a system specifically designed to model the behaviour of molecules and proteins. More common names in the processor world are also beginning to explore possible specializations. Nvidia’s GPU based architectures are gaining in popularity, and FPGA and ASIC alternatives are now often considered for inclusion in HPC systems, such as some of Xilinx’s products. As better software and more special purpose algorithms are written to exploit these new architectures, they become cheaper and smaller than the brute-force general purpose alternatives. The size of the market for these products increases accordingly.

The quantum processors built by D-Wave Systems [2] are a perfect example of specialized animals, and give an insightful look into some of the ideas behind co-design. The D-Wave machines don’t look much like regular computers. They require complex refrigeration equipment and magnetic shielding. They use superconducting electronics rather than semiconducting transistors. They are, at first inspection, very unusual indeed. But they are carefully designed and built in a way that allows an intimate match between the hardware and the software algorithm that they run. As such they are very specialized, but this property allows them to tackle a particular class of problems, known as discrete optimization problems, very well. This class of problems may appear highly mathematical, but looks can be deceiving. It turns out that once you start looking, examples of these problems are found in many interesting areas of industry and research. Most importantly, optimization forms the basis of many of the problems mentioned earlier, such as pattern recognition, machine learning, and meaning analysis. These are exactly the problems which are deemed ‘too hard’ for most computer processors, and yet could be of incredible market value. In short, there are many, many trees, cliffs and oceans in our problem landscape, and a wealth of opportunity for specialized processors to exploit this wonderful evolutionary niche!
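For readers who have not met discrete optimization before, here is a minimal illustrative sketch in Python (a toy of my own, nothing to do with D-Wave’s actual software) of the kind of problem being described: choose values for a set of binary variables so as to minimise an energy function. The brute-force search below is fine for a handful of variables but scales as 2^n, which is exactly why specialized hardware for this class of problems is interesting in the first place.

```python
from itertools import product

# Toy discrete optimization problem: minimise
#   E(x) = sum_i h[i]*x[i] + sum_(i,j) J[(i,j)]*x[i]*x[j]
# over binary variables x[i] in {0, 1}. The h and J values are made up.
h = {0: 1.0, 1: -2.0, 2: 0.5}
J = {(0, 1): -1.5, (1, 2): 2.0}

def energy(x):
    linear = sum(h[i] * x[i] for i in h)
    pairwise = sum(J[i, j] * x[i] * x[j] for (i, j) in J)
    return linear + pairwise

# Exhaustive search: fine for 3 variables, hopeless for thousands (2^n states).
best = min(product((0, 1), repeat=len(h)), key=energy)
print("best assignment:", best, "with energy", energy(best))
```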

Co-design is an important idea in computing, and hopefully it will open people’s minds to the potential of new types of architecture that they may never have imagined before. I believe it will grow ever more important in the future, as we expect a larger and more complex variety of problems to be solved by our machines. The first time one sees footage of a tropical rainforest, one can but stare in awe at the wonders of never-before-seen species, each perfectly engineered to efficiently solve a particular biological problem. I hope that in the future, we will open our eyes to the possibility of an eco-sphere of computer architectures, populated by similarly diverse, beautiful and unusual creatures.

[1] http://www.hpcwire.com/features/Compilers-and-More-Hardware-Software-Codesign-106554093.html

[2] http://www.dwavesys.com/

Transvision2010 presentation: Thinking about the hardware of thinking

I will be giving a presentation at Transvision2010, which takes place next weekend. The talk will be about how we should consider novel computing substrates on which to develop AI and ASIM (advanced substrate independent minds) technologies, rather than relying on conventional silicon processors. My main example will be that of developing learning applications on Quantum Computing processors (not entirely unpredictable!), but the method is generalisable to other technologies such as GPUs, biologically based computer architectures, etc…


The conference is physically located in Italy, but I unfortunately cannot make it in person, as I will be attending another workshop. I will therefore be giving the talk remotely via the teleconferencing software Teleplace.

Anyway, here is some information about the talk, kindly posted by Giulio Prisco:

Thinking about the hardware of thinking:
Can disruptive technologies help us achieve uploading?

Interesting news coverage of Teleplace QC talk

So I enjoyed giving my Teleplace talk on Quantum Computing on Saturday, and I received quite a lot of feedback about it (mostly good!).

My talk was reported on Slashdot via a Next Big Future writeup, which in turn linked to Giulio’s Teleplace blog! This level of coverage for a talk has been very interesting; I’ve never had anything linked from /. before. They unfortunately got my NAME WRONG, which was most irritating. Although I’m fairly impressed that if you now Google for ‘my name spelt incorrectly + quantum computing’, it does actually ask if you meant ‘my name spelt correctly + quantum computing’, which is a small but not insignificant victory 🙂 Note: I’m not actually going to write my name spelt incorrectly out here, as it would diminish the SNR!!

The talk also prompted this guest post written by Matt Swayne on the Quantum Bayesian Networks blog. Matt was present at the talk.

I’ve had a lot of people asking if I will post the slides online. Well here they are:

LINK TO SLIDES for QUANTUM COMPUTING: SEPARATING HOPE FROM HYPE
Teleplace seminar, S. Gildert, 04/09/10

quantum computing

Or rather, that’s a direct link to them. They are also available, along with the VIDEOS of the talk and a bunch of other lectures and stuff, on the Resources page. Here are the links to the VIDEOS of the talk, and look, you have so many choices!!

  • VIDEO 1: 600×400 resolution, 1h 32 min
  • VIDEO 2: 600×400 resolution, 1h 33 min, taken from a fixed point of view
  • VIDEO 3: 600×400 resolution, 2h 33 min, including the initial chat and introductions and the very interesting last hour of discussion, recorded by Jameson Dungan
  • VIDEO 4: 600×400 resolution, 2h 18 min, including the very interesting last hour of discussion, recorded by Antoine Van de Ven
Here are a couple of screenshots from the talk:


    Online seminar on Quantum Computing

    I’m giving a VIRTUAL seminar in Teleplace this Saturday…

    I’m going to entitle the talk:

    ‘Quantum Computing: Separating Hope from Hype’
    Saturday 4th September, 10am PST

    “The talk will explain why quantum computers are useful, and also dispel some of the myths about what they can and cannot do. It will address some of the practical ways in which we can build quantum computers and give realistic timescales for how far away commercially useful systems might be.”

    Here’s Giulio’s advertisement for the talk:
    GIULIO’S BLOGPOST about quantum computing seminar which is much more explanatory than the briefly thrown together blogpost you are being subjected to here.

    Anyone wishing to watch the talk can obtain a Teleplace login by e-mailing Giulio Prisco (who can be contacted via the link above). Teleplace is a piece of software that is simple to download and quick to install on your computer, and has an interface a bit like Second Life. Now is a great time to get an account, as there will be many more interesting lectures and events hosted via this software as the community grows. Note the time – 10am PST Saturday morning (as in the West Coast U.S. time zone, California, Vancouver, etc.).

    The seminar is also listed as a Facebook Event if you would like to register interest that way!

    ASIM-2010 – not quite Singularity but close :)

    So I’ll post something about the Singularity Summit soon, but first I just wanted to talk a little about the ASIM-2010 conference that I helped organise along with Randal Koene.

    The main idea of the conference was to hold a satellite workshop to the Singularity Summit, with the purpose of sparking discussion around the topics of Substrate Independent Minds. See the carboncopies website for more information on that! Ah, I love the format of blogging. I’m explaining what happened at a workshop without having introduced the idea of what the workshop was trying to achieve or what our new organisation actually *is*. Well, I promise that I’ll get round to explaining it soon, but until then it will have to be a shadowy unknown. The carboncopies website is also in the process of being filled with content, so I apologise if it is a little skeletal at the moment!

    One interesting thing that we tried to do with the workshop was to combine a real life and a virtual space component. It was an interesting experience trying to bring together VR and IRL. In a way it was very fitting for a workshop based around the idea of substrate independent minds. Here we were somewhat along the way to substrate independent speakers! I am hoping that this will inspire more people to run workshops in this way, which will force the technology to improve.

    I was very pleased to see so many people turning out. We had about 30 people in meatspace and about another 15 in virtual space on both days. Giulio Prisco has some nice write-up material about the workshops, including PICTURES and VIDEOS! Here are the links to his posts:

    General overview
    First day in detail
    Second day in detail

    For a first attempt, I don’t think that things went too badly! The technology wasn’t perfect, but we gave it a good try. The main problem was with the audio. Teleplace, the conferencing software we were using, works well when everyone is online with a headset and mic, as there are no feedback problems. However, when you try to include an entire room as one attendee, it becomes a little more tricky.

    This could be improved in one of two ways: either everyone in the room has a headset and mic, with a mixer incorporating all the input into a single Teleplace attendee, or everyone in the room is individually logged into Teleplace with their own headset and mic. *Make* that Singularity happen, ooh yeah! (/sarcasm)

    Essay: Language, learning, and social interaction – Insights and pitfalls on the road to AGI

    This is the first in a series of essays that I’m going to make available through this site. I’m going to put them in the resources page as I write them.

    I’m doing this for a couple of reasons. The first is that I like writing. The second is that a blog is not the best format to present ideas that you need to frequently refer to, and often when you get asked the same question by multiple people it is better to point them to a permanent resource rather than repeating yourself each time. The third is that I would like to add some ideas to the general thought-provoking mash-up of AGI memes around the internet. The fourth is that I think people enjoy reading short articles and opinion pieces somewhat more than entire books. The fifth is that (somewhat in contradiction to my previous reason) I’m hoping to eventually write a neat book about related topics, and whilst I have already started doing this, I feel that I need a lot more practice writing several-page documents before I feel I can make something that is 100+ pages long which I am happy with. Note, the PhD thesis does not count!! 😉

    So here we go. Click on the title to download the PDF.
    You can also find a copy of the essay in the resources tab.

    ESSAY TITLE:
    Language, learning, and social interaction
    Insights and pitfalls on the road to AGI

    Abstract:
    Why is language important? What purpose does it serve on a fundamental level? How is it related to the ways in which we learn? Why did it evolve? In this short essay I’m going to take a Darwinian look at language and why we must be careful when considering the role played by language in the building of intelligent machines which are not human-like.

    Quantum Computing – cool new video!

    Here’s a neat video made by my friend and colleague Dr. Dominic Walliman, which gives a great introduction for all those budding Quantum Computer Engineers of the future 🙂


    Not only is this a Physics-based educational and entertainment extravaganza, but the video is interspersed with some cool shots of my old lab at Birmingham, and my old dilution refrigerator – I miss you, Frosty… *sniff*

    What is quantum co-tunneling and why is it cool?

    You may have seen this cool new paper on the arXiv:

    Observation of Co-tunneling in Pairs of Coupled Flux Qubits

    (I believe there is something called a ‘paper dance’ that I am supposed to be doing)….

    Anyway, here I’ll try and write a little review article describing what this paper is all about. I’m assuming some knowledge of elementary quantum mechanics here. You can read up about the background QM needed here and here.

    First of all, what is macroscopic resonant tunneling (MRT)?

    I’ll start by introducing energy wells. These are very common in the analysis of quantum mechanical systems. When you solve the Schrodinger equation, you put into the equation an energy landscape (also known as a ‘potential’), and out pop the wavefunctions and their associated eigenvalues (the energies that the system is allowed to have). This is usually illustrated with a square well potential, or a harmonic oscillator (parabolic) potential, like this:
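For the harmonic oscillator, for example, the textbook result is that a parabolic potential produces a ladder of evenly spaced allowed energies:

```latex
V(x) = \tfrac{1}{2} m \omega^2 x^2 ,
\qquad
E_n = \hbar \omega \left( n + \tfrac{1}{2} \right), \quad n = 0, 1, 2, \dots
```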

    Well, the flux qubit (quantum bit), which is what we build, has an energy landscape that looks a bit like a double well. This is useful for quantum computation as you can call one of the wells ‘0’ and the other ‘1’. When you measure the system, you find that the state will be in one well or the other, and the value of your ‘bit’ will be 0 or 1. The double well potential as you might imagine also contains energy levels, and the neat thing is that these energy levels can see each other through the barrier, because the wavefunction ‘leaks’ a little bit from one well into the neighbouring one:

    One can imagine tilting the two wells with respect to one another, so the system becomes asymmetric and the energy levels in each well move with respect to one another. In flux-qubit-land, we ‘tilt’ the wells by applying small magnetic fields to the superconducting loops which form the qubits. Very crudely, when the energy levels in the two wells ‘line up’, they can see each other, and you can get quantum tunneling between the two states.
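If you keep only the lowest level in each well, the standard two-level description of a flux qubit captures exactly this physics: ε is the tilt between the wells (set by the applied flux) and Δ is the tunnelling amplitude through the barrier. This is the usual textbook form, not necessarily the exact notation used in the paper.

```latex
H = -\tfrac{1}{2}\,\varepsilon\,\sigma_z \;-\; \tfrac{1}{2}\,\Delta\,\sigma_x ,
\qquad
E_\pm = \pm\tfrac{1}{2}\sqrt{\varepsilon^2 + \Delta^2}
```

At zero tilt the two lowest levels are therefore split by Δ, which is the quantity that comes up again further down.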

    This effect is known as macroscopic resonant tunneling. So how do you measure it? You start by initializing the system so that the state is localised in just one well (for example, by biasing the potential very hard in one direction so that there is effectively only one well), like this:

    and then tilt the well-system back a little bit. At each tilt value, you stochastically monitor which well the state ends up in, then return it to the initialisation state and repeat lots and lots of times for different levels of tilt. As mentioned before, when the energy levels line up, you can get some tunneling and you are more likely to find the system on the other side of the barrier.
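Schematically, the measurement loop looks something like the Python sketch below. The “hardware” has been replaced by a made-up switching probability purely so that the loop runs; the point is the prepare-tilt-measure-repeat structure, not the physics, and all the function names are my own inventions rather than any real control API.

```python
import random

def toy_switch_probability(tilt):
    # Stand-in for the real device physics: just a smooth, made-up function.
    return min(1.0, max(0.0, tilt))

def single_shot(tilt):
    # One experiment: initialise in the left well, apply the tilt, read out.
    return "right" if random.random() < toy_switch_probability(tilt) else "left"

def p_right(tilt, shots=1000):
    # Repeat many times at a fixed tilt to estimate the tunnelling probability.
    return sum(single_shot(tilt) == "right" for _ in range(shots)) / shots

# Sweep the tilt and build up the probability-versus-tilt curve.
curve = [(t / 50, p_right(t / 50)) for t in range(51)]
```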


    In this way you can build up a picture of when the system is tunneling and when it isn’t as a function of tilt. Classically, the particle would remain mostly in the state it started in, until the tilt gets so large that the particle can be thermally activated OVER the barrier. So classically the probability of the state being found on the right hand side ‘state 1’ as a function of tilt looks something like this:
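The rate of that classical escape over the barrier is controlled by the ratio of the barrier height ΔU to the thermal energy k_B T. Roughly, with ω₀ the oscillation frequency at the bottom of the well, it follows the standard Arrhenius-type estimate:

```latex
\Gamma_{\text{thermal}} \;\approx\; \frac{\omega_0}{2\pi}\,
\exp\!\left( -\frac{\Delta U}{k_B T} \right)
```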

    Quantum mechanically, as the energy levels ‘line up’, the particle can tunnel through the barrier – and so you get a little resonance in the probability of finding it on the other side (hence the name MRT). There are lots of energy levels in the wells, so as you tilt the system more and more, you encounter many such resonances. So the probability as a function of tilt now looks something like this:

    This is a really cool result as it demonstrates that your system is quantum mechanical. There’s just no way you can get these resonances classically, as there’s no way that the particle can get through the barrier classically.

    Note: This is slightly different from macroscopic quantum tunneling, when the state tunnels out of the well-system altogether, in the same way that an alpha particle ‘tunnels’ out of the nucleus during radioactive decay and flies off into the ether. But that is a topic for another post.

    So what’s all this co-tunneling stuff?

    It’s all very nice showing that a single qubit is behaving quantum mechanically. Big deal, that’s easy. But stacking them together like qubit lego and showing that the resulting structure is quantum mechanical is harder.

    Anyway, that is what this paper is all about. Two flux qubits are locked together by magnetic coupling, and therefore the double well potential is now actually 4-dimensional. If you don’t like thinking in 4D, you can imagine two separate double-wells, which are locked together so that they mimic each other. Getting the double well potentials similar enough to be able to lock them together in the first place is also really hard with superconducting flux qubits. It’s actually easier with atoms or ions than superconducting loops, because nature gives you identical systems to start with. But flux qubits are more versatile for other reasons, so the effort that has to go into making them identical is worthwhile.
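In the same two-level language as before, the coupled pair is commonly written with an extra term whose strength J is set by the magnetic coupling between the loops. Again, this is the generic textbook form rather than the paper’s exact notation:

```latex
H = -\tfrac{1}{2}\sum_{i=1,2}\left( \varepsilon_i\,\sigma_z^{(i)} + \Delta_i\,\sigma_x^{(i)} \right)
\;+\; J\,\sigma_z^{(1)}\sigma_z^{(2)}
```

Making the two double wells “mimic each other” then amounts to matching ε₁ with ε₂ and Δ₁ with Δ₂, which is the hard calibration step mentioned above.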

    Once they are locked together, you can again start tilting the ‘two-qubit-potential’. The spacing of the energy levels will now be different (think about a mass on the end of a spring – if you glue another mass to it, the resonant frequencies of the system will change, and the energy levels of the system along with them). We have sort of made our qubit ‘heavier’ by adding another one to it.
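To make the mass-on-a-spring analogy concrete: the oscillation frequency, and with it the quantum level spacing, falls as the mass grows, so gluing on a second mass pushes the levels closer together.

```latex
\omega = \sqrt{k/m}, \qquad \Delta E = \hbar\omega = \hbar\sqrt{k/m}
```

Doubling the mass, for instance, shrinks the level spacing by a factor of √2.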

    But we still see the resonant peaks! Which means that two qubits locked together still behave as a nice quantum mechanical object. The peaks don’t look quite as obvious as the ones I have drawn in my cartoon above. If you want to see what they really look like, check out Figure 3 of the preprint. (Note that the figure shows the MRT ‘rate’ rather than ‘probability’, but the two are very closely linked.)

    From the little resonant peaks that you see, you can extract Delta – which is a measure of the energy level spacing in the wells. In this particular flux-qubit system, the energy level spacing (and therefore Delta) can be tuned finely by using another superconducting loop attached to the main qubit loop. So you can make your qubit mass-on-a-spring effectively heavier or lighter by this method too. When the second tuning loop is adjusted, the resulting change in the energy level separation agrees well with theoretical predictions.

    As you add more and more qubits, it gets harder to measure Delta, as the energy levels get very close together, and the peaks start to become washed out by noise. You can use the ‘tuning’ loop to make Delta bigger, but it can only help so much, as the tuning also has a side effect: It lowers the overall ‘signal’ level of the resonant peaks that you measure.

    In summary:

    • Looking at the quantum properties of coupled qubits is very important, as it helps us experimentally characterise quantum computing systems.
    • Coupling qubits together makes them ‘heavier’, and their quantum energy levels become harder to measure.
    • Here, two coupled qubits are still behaving quantum mechanically, which is promising. It means that the quantum computation occurring on these chips involves at least 2 qubits interacting in a quantum mechanical way. Physicists call these ‘2-qubit processes’. There may be processes of much higher order happening too.
    • This is pretty impressive considering that these qubits are surrounded by lots of other qubits, and connected to many, many other elements in the circuitry. (Most other quantum computing devices explored so far are much more isolated from other nearby elements.)