Gödel’s Gift

Jürgen Schmidhuber, co-director of the Swiss AI lab IDSIA, explains why he is optimistic about the creation of a generalised, self-learning Artificial Intelligence

In his science fiction books, author Iain M. Banks created a utopian society called the ‘Culture’ that is governed by tremendously powerful but benevolent AIs. These almost god-like ‘Minds’ are capable of holding simultaneous conversations with millions of other beings, both biological and artificial, while subtly manipulating the geopolitical intrigues of neighbouring, inevitably inferior, societies. Science fiction indeed; but is it a feasible scenario? In the field of Artificial Intelligence today, opinions differ markedly. Today’s AIs are programmed to do a specific task; they cannot do anything else. But to create a Culture Mind, we’ll first need to create a generalised, self-learning, self-improving AI. For this article we spoke to one of the optimists, the pioneer of self-referential, self-improving AIs: Professor Jürgen Schmidhuber, who is co-director of the Swiss AI lab IDSIA in Lugano and head of the Cognitive Robotics Lab at the Technical University of Munich.

All current AIs are limited to specific, although increasingly complex, environments. Those that perform well do so on a single task, using clearly defined, pre-programmed problem solvers. If you were to drop a chess-playing AI into a game of Go, it would collapse. Similarly, you cannot ask a car-driving AI to fly a plane, or vice versa. That is the difference between AIs and human intelligence: unlike today’s machines, we are able to adapt to different environments and learn as we go. Human beings are constantly confronted with new problems, relevant to the situation they find themselves in, and somehow create new strategies or behaviours or programs to solve those problems. In other words, we have a rather general type of intelligence.

You state that you want to build an optimal scientist and then retire. But is it at all possible to create a generally intelligent AI? And even if it is theoretically possible, is it practically feasible to build one in the coming decades?

In the new millennium there has been substantial progress in the field of theoretically optimal and practically feasible algorithms for AI capabilities in environments of a very general type. Our work at IDSIA has led to the first theoretically optimal universal problem solvers, such as the asymptotically fastest algorithm for all well-defined problems, and the Gödel Machine.  
 
The challenge for general AI is that we typically do not know in advance either the exact goal or the environment in which the system will operate. The environment is also dynamic: as we act, we change it. To deal with this, my former post-doc Marcus Hutter (author of a book on this topic, ‘Universal Artificial Intelligence,’ and now a professor in Canberra) came up with a very interesting system called AIXI. It is basically a mathematical theory of universal artificial intelligence that assumes the availability of unlimited computational resources and extends Ray Solomonoff’s theoretically optimal way of predicting the future, given the past. AIXI defines an optimal rational agent that maximizes its expected reward in almost arbitrary environments. The breakthrough here is that this work represents the first mathematically sound theory of general AI. However, AIXI ignores computation time: it is mathematically optimal, but not computationally feasible. Of course, in reality we need to take time into account. The asymptotically best way of doing so is the so-called “asymptotically fastest algorithm for all well-defined problems”. Optimal in an even more general sense (not only asymptotically optimal) is the self-referential, self-improving Gödel Machine.
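
For readers who want to see the shape of these definitions, here is a sketch in the notation commonly used in presentations of Hutter's and Solomonoff's work. None of these symbols appear in the interview; they are assumptions made for illustration: U is a universal Turing machine, \ell(p) is the length of program p in bits, a, o and r are actions, observations and rewards, k is the current time step and m is the planning horizon.

```latex
% Solomonoff's universal prior: the weight of a sequence x is the sum over
% all programs p whose output on U starts with x; shorter programs count more.
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

% AIXI's action choice at step k: plan ahead to horizon m and weight every
% environment program q that is consistent with the history by 2^{-\ell(q)}.
a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
      \bigl( r_k + \cdots + r_m \bigr)
      \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Both sums range over all programs, which is why these definitions are optimal in theory but cannot be computed in practice.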
 
In 1931, Gödel explored the limits of mathematics and computation by creating self-referential formulae that speak about themselves. In the process he showed that math is either flawed in an algorithmic sense, or contains unprovable truths. Gödel’s self-reference trick inspired my Gödel machine, which is an agent-controlling program that speaks about itself, ready to rewrite itself in arbitrary fashion once it has found a proof that the rewrite is useful according to an arbitrary user-defined utility function. A little thought shows that the Gödel machine’s strategy is indeed optimal in a very general sense.
 
However, it is still not necessarily the most practical solution to all problems, because it leaves one essential question open: if the intelligent agent can execute only a fixed number of computational instructions per unit time interval (say, 10 trillion elementary operations per second), what is the best way of using them to get as close as possible to the recent theoretical limits of universal AIs? Once we have the answer to that we can all go home.
 
I am optimistic that we can answer this question, mainly because the basic principles of the new methods, which have made AI a formal science, are actually very simple. I am assuming that the answer to this last question will also be reasonably simple.

Historically, the field of AI has been split in two different methodological ‘schools’ of thought: the deductive and the statistical approaches.  Are we still pendulum-swinging from one approach to another or are these approaches finally beginning to merge?

It is true that deductive and statistical approaches have begun to merge. I think this has already progressed further than some AI researchers are aware of. Hutter’s above-mentioned book ‘Universal Artificial Intelligence’ is a major step in that regard. And way back in 1931, when Kurt Gödel laid the foundations of theoretical computer science, he already introduced some essential deduction-oriented concepts for the theory of AI. Although much of subsequent AI research has focused on heuristics, which still play a major role in many practical AI applications, general AI theory has now become a fully-fledged formal science, and we got there by combining probability theory and deduction theory along the lines of Gödel.
 
I am promoting AI as a formal science because theorems are for eternity, while heuristics come and go. I do admit, though, that some of the practical successes of late (including ours at IDSIA) are based not on theoretical optimality results but on ‘scruffy’ biology-inspired systems. I believe, however, that theory and practice will converge soon.

Can you give us some examples of your more practical work?

We have a number of projects based on our state-of-the-art brain-like recurrent neural nets (RNNs). The human brain is an RNN: a network of neurons with feedback connections, and as such it can learn many ‘behaviours’ (or sequence processing tasks/algorithms/programs) that are not learnable by traditional machine learning methods. This explains the interest in artificial RNNs for technical applications. Our recent applications include adaptive robotics and control, handwriting recognition, speech recognition, keyword spotting, music composition, attentive vision, protein analysis, stock market prediction, and many other sequence problems. Hardware advances are very important in this too. For example, we recently broke the world record in handwritten digit recognition using a 30-year-old algorithm for neural networks, but implemented on graphics cards, which made the system run 50 times faster than on standard computers.
 
A line of work that got quite a lot of attention was IDSIA’s Artificial Ants (AA) research. The basic idea here is that a large number of simple artificial agents are able to build good solutions to hard combinatorial optimization problems via low-level communication that involves aspects of the environment. The inspiration is real ants, which cooperate in their search for food by depositing chemical traces (pheromones) on the ground. Artificial ant colonies simulate this behaviour by communicating via artificial pheromones that evaporate over time. To solve optimization problems, my fellow co-director of IDSIA, Luca Maria Gambardella, and Marco Dorigo (ex-IDSIA, now professor in Belgium at the ULB) introduced AAs equipped with local search methods. This approach broke several important benchmark world records, including those for sequential ordering and routing problems. This success also led to a recent IDSIA spin-off company called ANTOPTIMA. Commercial applications include vehicle routing, logistics for the largest Mediterranean container terminal in La Spezia, and truck fleet management.
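
The pheromone mechanism described above can be illustrated with a minimal sketch of a generic ant colony optimizer for a tiny travelling-salesman instance. This is not IDSIA's or ANTOPTIMA's implementation, and it omits the local search step mentioned above. The function names and the parameters `alpha`, `beta` and `rho` (pheromone weight, distance weight and evaporation rate) are assumptions chosen for illustration.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that visits the cities in the given order."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_colony_tsp(points, n_ants=20, n_iters=100,
                   alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Generic ant colony sketch; assumes all points are distinct."""
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pher = [[1.0] * n for _ in range(n)]  # artificial pheromone per edge
    best_tour, best_len = None, float("inf")

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)
            tour = [start]
            unvisited = set(range(n)) - {start}
            while unvisited:
                i = tour[-1]
                candidates = sorted(unvisited)
                # Prefer edges with more pheromone and shorter distance.
                weights = [pher[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in candidates]
                nxt = rng.choices(candidates, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append((tour_length(tour, dist), tour))

        # Evaporation: old trails fade unless they keep being reinforced.
        for i in range(n):
            for j in range(n):
                pher[i][j] *= 1.0 - rho

        # Deposit: shorter tours leave more pheromone on their edges.
        for length, tour in tours:
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                pher[a][b] += 1.0 / length
                pher[b][a] += 1.0 / length
            if length < best_len:
                best_len, best_tour = length, tour

    return best_tour, best_len


# Four cities on the corners of a unit square.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
best_tour, best_len = ant_colony_tsp(square)
print(best_tour, best_len)
```

No single ant knows the whole problem; good tours emerge because edges that belong to short tours accumulate pheromone faster than it evaporates.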
 
We also have ongoing projects on robots implementing my general theory of surprise, fun and creativity (1990 onwards), which explains essential aspects of subjective beauty, novelty, interestingness, attention, curiosity, music, jokes and art. For a long time I have been arguing, using various wordings, that all of this is driven by a very simple algorithmic mechanism that uses reinforcement learning (RL) to maximize the fun or internal joy for the discovery or creation of novel patterns. Both concepts are essential: pattern, and novelty. A data sequence exhibits a pattern or regularity if it is compressible, that is, if there is a relatively short program that encodes it, for example, by predicting some of its components from others (irregular noise is unpredictable and boring). Relative to some subjective observer, a pattern is temporarily novel or interesting or surprising if the observer initially did not know the regularity but is able to learn it. The observer's learning progress can be precisely measured and translated into intrinsic reward for a separate RL controller selecting the actions causing the data. Hence the learning controller of our robot is continually motivated to create more surprising data. Curious artificial scientists and artists of this kind are interested in learnable but as yet unknown regularities, and get bored by both predictable and inherently unpredictable things. Art and science and humour are just by-products of the desire to create or discover more data that is predictable or compressible in previously unknown ways. By the way, applications of the new theory of humour can be found in the video below.
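
The idea of measuring learning progress as intrinsic reward can be sketched with a toy observer. Everything below is an illustrative assumption rather than the system used on the robots: the observer is a simple Laplace-smoothed predictor of the next bit given the previous bit, code length in bits stands in for compressed size, and the reward is the number of bits saved on a sequence after the observer has learned from it.

```python
import math
import random


class Order1Model:
    """Toy observer: predicts the next bit from the previous bit."""

    def __init__(self):
        self.counts = [[0, 0], [0, 0]]  # counts[prev][nxt]

    def prob(self, prev, nxt):
        row = self.counts[prev]
        return (row[nxt] + 1) / (row[0] + row[1] + 2)  # Laplace smoothing

    def code_length(self, seq):
        """Bits needed to encode the transitions of seq under this model."""
        return sum(-math.log2(self.prob(a, b)) for a, b in zip(seq, seq[1:]))

    def train(self, seq):
        for a, b in zip(seq, seq[1:]):
            self.counts[a][b] += 1


def learning_progress(model, seq):
    """Intrinsic reward: bits saved on seq by learning from it once."""
    before = model.code_length(seq)
    model.train(seq)
    after = model.code_length(seq)
    return before - after


rng = random.Random(0)
streams = {
    "already known": [0] * 64,                           # predictable, boring
    "learnable pattern": [i % 2 for i in range(64)],     # novel regularity
    "noise": [rng.randint(0, 1) for _ in range(64)],     # pseudo-random bits
}

rewards = {}
for name, seq in streams.items():
    model = Order1Model()
    if name == "already known":
        for _ in range(50):  # the observer has seen this stream many times
            model.train(seq)
    rewards[name] = learning_progress(model, seq)

for name, reward in rewards.items():
    print(f"{name}: {reward:.3f} bits of progress")
```

In this sketch the alternating pattern earns by far the largest reward, because the observer starts out ignorant of it but can learn it. The stream it already knows and the pseudo-random stream offer much less to learn, which mirrors the boredom with both predictable and unpredictable data described above.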
