Tuesday, March 26, 2013

Secrets of the Holy Grail, Part II

Part I, II, III, IV, V

Abstract

In Part I, I gave a brief description of the brain's memory architecture. In this post, I explain how the brain does pattern learning and catches "thieves" in its sleep.

Winner-Takes-All vs the Bayesian Brain

Although it feels like I am preaching in the wilderness, I have been railing against the use of Bayesian statistics in machine learning for some time now. The idea that the brain reasons or recognizes objects by juggling statistics is ridiculous when you think about it. The brain actually abhors uncertainty and goes to great lengths to eliminate it. As computer scientist Judea Pearl put it not too long ago, "people are not probability thinkers but cause-effect thinkers."

Even though it is continually bombarded with noisy and incomplete sensory data, internally, the brain is strictly deterministic. It uses a winner-takes-all mechanism in which sequences compete to fire and the winner is the one with the most hits. Once a winner is determined, the other competitors are immediately suppressed. The winning sequence is assumed to be perfect. To repeat, the brain is not a probability thinker. It learns every pattern and sequence that it can learn, anything that is more than mere random chance. Then it lets them compete for attention. Read The Myth of the Bayesian Brain for more on this topic.
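The winner-takes-all scheme described above can be sketched in a few lines of code. This is a toy illustration only, not a claim about the neural implementation; the hit-counting score and the data shapes are my own assumptions based on the description (the winner is the sequence with the most hits, and the losers are suppressed outright).

```python
# Toy sketch of winner-takes-all selection among competing sequences.
# Each sequence is scored by its "hits" against the observed input;
# the single best sequence wins and all competitors are suppressed.

def winner_takes_all(sequences, observed):
    """Pick the sequence with the most hits; suppress all others."""
    def hits(seq):
        return sum(1 for pattern in seq if pattern in observed)

    winner = max(sequences, key=hits)
    # Once a winner is determined, it is treated as certain, not merely
    # probable: the suppressed competitors play no further role.
    suppressed = [s for s in sequences if s is not winner]
    return winner, suppressed

candidates = [("a", "b", "c"), ("a", "x", "y"), ("b", "c", "d")]
winner, suppressed = winner_takes_all(candidates, {"a", "b", "c"})
# winner is ("a", "b", "c"), the sequence with three hits
```

Note the deterministic flavor: there is no probability attached to the winner, only a competition decided by hit counts.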

Pattern Learning

The job of the pattern learner is to discover as many unique patterns in the sensory space as possible. Pattern learning consists of randomly connecting sensory inputs to pattern neurons and checking to see if they fire concurrently. Keep in mind, however, that a pattern neuron fires whenever a majority of its input signals arrive concurrently; it does not require all of them.
The pattern learning rule can be stated thus:
In order to become permanent, an input connection must contribute to the firing of its pattern neuron at least X times in a row.
X is a number that depends on the desired learning speed of the system. In the human brain, X equals 10. With this rule, the brain can quickly find patterns in the sensory space. It is based on the observation that sensory signals are not always imperfect. Every once in a while, even if for a brief interval, they are perfect. This perfection is captured in pattern memory.
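The rule above can be sketched as a toy program. The class shape, the majority-firing threshold, and the streak bookkeeping are illustrative assumptions on my part; the post itself specifies only the rule that a connection must contribute to X consecutive firings to become permanent.

```python
# Toy sketch of the pattern learning rule: a trial connection becomes
# permanent only after it contributes to its pattern neuron's firing
# X times in a row. A miss resets the streak to zero.

X = 10  # consecutive contributions required (10 in the post's account)

class PatternNeuron:
    def __init__(self, trial_inputs):
        self.trial = {s: 0 for s in trial_inputs}  # sensor -> streak
        self.permanent = set()

    def observe(self, active_sensors):
        inputs = set(self.trial) | self.permanent
        hits = inputs & set(active_sensors)
        # The neuron fires when a majority of its inputs arrive together.
        fired = len(hits) * 2 > len(inputs)
        for s in list(self.trial):
            if fired and s in hits:
                self.trial[s] += 1           # contributed to the firing
                if self.trial[s] >= X:
                    self.permanent.add(s)    # connection made permanent
                    del self.trial[s]
            else:
                self.trial[s] = 0            # streak broken: start over
        return fired

n = PatternNeuron(["a", "b", "c"])
for _ in range(X):
    n.observe(["a", "b", "c"])  # perfect input arrives X times in a row
# all three trial connections are now permanent
```

This captures the "perfection" idea: only sensors that repeatedly show up together during the neuron's firings survive as permanent connections.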

Catching Thieves During Sleep

The pattern learning rule is simple and powerful but it suffers from a major flaw: it imposes no restrictions or boundaries on the growth of a pattern. Without proper boundaries, patterns become more and more complex and the simpler ones eventually disappear, crippling the system. Obviously, we need a way to prevent a pattern neuron from acquiring more complexity than its level within the hierarchy requires. The solution to the boundary problem consists of enforcing the boundary rule:
A pattern may not have duplicate sensory input connections.
Here is another way of putting it: a sensory signal may not arrive at a pattern neuron via more than one path. For example, in the illustration below, pattern neuron A behaves as if it were connected directly to sensors a, b, c, d, and e.
Suppose sensor c were connected (dotted red line) to pattern neuron C. Pattern neuron A would then receive duplicate inputs from sensor c, one arriving via C and the other via D. This is forbidden by the boundary rule. It means that somewhere along the paths leading from sensor c to pattern neuron A, there is a bad connection. The culprit is always the most recent or weakest one. It is called a thief because it "stole" something that does not belong to it. By purging thieves from all branches of the pattern hierarchy, the growth of every pattern is automatically limited to a degree of complexity commensurate with its level in the hierarchy.
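The boundary rule lends itself to a simple mechanical check. The sketch below is my own illustration, using a dictionary to represent the hierarchy from the example; the post does not prescribe a data structure. A connection is flagged as a thief if adding it would give any neuron more than one path to some sensor.

```python
# Toy sketch of the boundary rule: no sensor may reach a pattern
# neuron by more than one path. We count sensor-to-neuron paths and
# flag any candidate connection that would create a duplicate.

def path_counts(node, children):
    """Map each sensor (leaf) to its number of paths up to `node`."""
    if node not in children:            # leaf: a raw sensor
        return {node: 1}
    counts = {}
    for child in children[node]:
        for sensor, n in path_counts(child, children).items():
            counts[sensor] = counts.get(sensor, 0) + n
    return counts

def creates_thief(parent, new_child, children, roots):
    """True if wiring new_child into parent gives any neuron in the
    hierarchy a duplicate path to some sensor."""
    trial = {k: list(v) for k, v in children.items()}
    trial.setdefault(parent, []).append(new_child)
    return any(n > 1 for root in roots
               for n in path_counts(root, trial).values())

# Hierarchy from the example: A's inputs are C and D; C sees {a, b}
# and D sees {c, d, e}, so A sees each of a..e by exactly one path.
children = {"A": ["C", "D"], "C": ["a", "b"], "D": ["c", "d", "e"]}
assert not any(n > 1 for n in path_counts("A", children).values())

# Connecting sensor "c" directly to C would give A two paths to "c"
# (one via C, one via D): the new connection is a thief.
```

A program can run this check the moment a new connection is proposed, which is why, as noted below, a machine need not wait for "sleep" to purge thieves.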

The simplicity of the boundary rule belies its power. It prevents runaway pattern growth while facilitating the discovery of every possible unique pattern in the sensory space. It is indispensable to pattern learning and works for any type of sensory pattern, not just visual ones.
Note: As far as I know, the boundary rule is not in any books. Please make copies of this page on your computer. This is intended to serve as "prior art" in the public domain, i.e., it cannot be patented. :-D
The brain cannot eliminate thieves while it is awake because doing so requires test-firing all untested connections, which could cause problems during waking hours. This is one of the reasons that sleep is so important. An intelligent machine, by contrast, is not so limited. During learning, a computer program can examine a branch on the fly to see whether a new connection is a thief.

Coming up

In Part III, I will show how learning occurs in sequence memory.

See Also

The Myth of the Bayesian Brain
The Holy Grail of Robotics
Raiders of the Holy Grail
Jeff Hawkins Is Close to Something Big

Sunday, March 24, 2013

Secrets of the Holy Grail, Part I

Part I, II, III, IV, V

Abstract

According to brain and machine-learning expert Jeff Hawkins, goal-directed behavior is the holy grail of intelligence and robotics. He believes that the best way to solve the intelligence puzzle is to emulate the brain. Hawkins is right, of course. There is no question that we can learn everything we need to know about intelligence by studying the brain. The only problem is that some of the answers are so deeply buried in an ocean of complexity that a hundred years of painstaking research could not uncover them. In this multi-part article, I will describe some of the amazing secrets of the brain before revealing the surprising source of my knowledge (no, it's not the brain, sorry).

Liars and Thieves

Let me come right out with a bold statement: nobody can rightfully claim to understand the brain’s perceptual learning mechanism without also knowing exactly what the brain does during sleep and why. Sure, we know what neuroscientists and psychologists have told us, that the brain uses sleep to consolidate recent memories, whatever that means. Unfortunately, that is pretty much the extent of their knowledge on the subject. Hawkins doesn't know either, although he should. That is, assuming he wants to stay in this business. It turns out that the brain performs at least two essential functions while we are asleep: it purges liars (bad predictors) from sequence memory and eliminates thieves (redundant connections) from pattern memory.
Note: I will explain my choice of the liars and thieves metaphors in an upcoming post.
Without these frequent purges, the brain would get confused and eventually stop working. But why is that, you ask? That, my astute and inquisitive friend, is one of the secrets of the holy grail, which is why you must read the rest of the article. But before I can answer your question, I must first say a few things about how memory is organized.

Pattern Memory

I did not always think so, but the brain has two types of hierarchical memories: pattern memory and sequence memory. My original objection was that a pattern hierarchy cannot do invariant object recognition. That was before I realized that it doesn't have to; that is the purpose of sequence memory. Pattern memory is a hierarchy of pattern detectors that send their output signals directly to sequence memory. A pattern is a transient group of sensory signals that occur together often, and a pattern detector, or neuron, is best viewed as a complex event sensor. Pattern detectors (red-filled circles) can have an indefinite number of inputs.
A hierarchy makes sense for several reasons. First, it gives us a very compact storage structure because of the inherent reuse of lower level patterns. Second, and just as importantly, it provides a way to automatically limit the boundaries of patterns. This, in turn, makes it possible to discover all possible patterns in the sensory space. I'll have more to say on this later.

A peculiar but critical aspect of pattern memory is that incoming signals must propagate through the hierarchy very quickly. The cortex uses electric synapses to do this. The end result is that signal propagation through the hierarchy appears instantaneous to the rest of the brain. The reason for this has to do with timing integrity. For instance, if a high-level neuron (A) fires, all the pattern neurons in the branch below A in the hierarchy are assumed to have fired concurrently with A.

Sequence Memory

It would be accurate to say that sequence memory is the seat of intelligence. It is used for many functions such as recollection, prediction, attention, invariant object recognition, reasoning, goal-directed motor behavior and adaptation. Sequence memory contains sequences of patterns organized hierarchically, just like pattern memory. Note that, in the diagram below, the pattern hierarchy is shown as a single flat layer (red circles). This is because sequence memory (yellow circles) does not see pattern memory as a hierarchy. That is to say, the system must act as if sensory signals could travel through the pattern hierarchy instantaneously. Otherwise, pattern detection timing would be askew.
One of the more interesting design characteristics of sequence memory is that a sequence detector has a maximum of seven nodes or inputs. Why seven? For one, it explains the capacity of what psychologists call short-term or working memory. Second, it is a compromise that aims to minimize energy usage while maximizing the breadth of focus. As it turns out, the brain can focus on only one branch of sequence memory at a time. A branch should be seen as a grouping mechanism that represents a single object or concept. No need to look any further. The branch is the mechanism of both attention and invariant object recognition.
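The seven-node cap can be made concrete with a small sketch. The class name and the overflow behavior (simply refusing an eighth node) are illustrative assumptions of mine; the post states only that a sequence detector has a maximum of seven nodes, echoing the capacity of working memory.

```python
# Toy sketch of a sequence detector with the seven-input cap.
# A detector at capacity refuses further nodes; a longer sequence
# would have to be represented higher up in the hierarchy.

MAX_NODES = 7  # the stated capacity, matching working memory

class SequenceDetector:
    def __init__(self):
        self.nodes = []  # ordered pattern inputs

    def add_node(self, pattern):
        if len(self.nodes) >= MAX_NODES:
            return False  # at capacity: refuse the connection
        self.nodes.append(pattern)
        return True

seq = SequenceDetector()
accepted = [seq.add_node(p) for p in "abcdefgh"]  # try to add 8 nodes
# the first seven are accepted; the eighth is refused
```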

What is even more interesting from the point of view of invariant object recognition is that multiple sequences may and do share patterns. In fact, every complex recognized object in memory consists of multiple, tightly intertwined sequences. This will become clearer later.

Coming up

In Part II, I will explain how learning occurs in pattern memory and how to catch a thief.

See Also

The Holy Grail of Robotics
Raiders of the Holy Grail
Jeff Hawkins Is Close to Something Big

Wednesday, March 6, 2013

Raiders of the Holy Grail

Numenta Aims High

In my recent article, The Holy Grail of Robotics, I wrote that I was impressed with Jeff Hawkins' description of goal-oriented behavior in his book On Intelligence. In a recent blog post about the Obama administration's $3 billion science initiative known as the Brain Activity Map Project (BAM), Hawkins wrote the following:
The activity and connection maps envisioned by BAM will be useful, but brain theorists today are not lacking in empirical data. We haven’t come close to understanding the tremendous amount of data we already have. If we want to understand how brains work, then a better direction is to focus on brain theory, not brain mapping. We should set goals for brain theory and goals for machine intelligence tasks based on those theories. That is what we do at Numenta. For example, we set goals to understand how neurons in the neocortex form sparse distributed representations and then how they learn to predict future events. This resulted in Cortical Learning Algorithm (CLA) which is the heart of our Grok streaming prediction engine. The next big theoretical challenge we are working on is how the cortex generates behaviors from predictions, what is sometimes called the sensory-motor integration problem.
This is even bigger news than Numenta's recent announcement of Grok, in my opinion. I would not think so had it come from anybody other than Hawkins. As I wrote elsewhere, Hawkins is no dummy. This tells me that Numenta is aiming to solve the entire intelligence problem single-handedly. Why? Because, once you have figured out how to do both perceptual and motor learning, there isn't much more to add other than an appetitive/aversive learning mechanism. This is psychology lingo for a pain and pleasure (reward and punishment) mechanism, a must for adaptation. But this is a rather trivial problem once you've gotten this far down the road.

The Challenge

Creating a viable model for sensorimotor behavior is not an easy task. It starts with designing a working perceptual learning system (both pattern and sequence recognizers) and an attention mechanism. I don't think Numenta has perfected either of those, regardless of the hype emanating from Redwood City. An attention mechanism is a must because there can be no coherent motor behavior without the ability to focus.

There is more to motor learning and behavior than what happens in the neocortex, however. Mammals and birds have an additional sensorimotor control structure known as the cerebellum. Humans use it to help with a bunch of automatic tasks such as walking, standing, maintaining posture, and even driving. The cerebellum works on completely different motor control principles than the neocortex. It is needed because it frees the neocortex from having to handle routine motor behavior so it can focus on other things. But even without a cerebellum, a robot could still learn some sophisticated skills.

Even without a good perceptual learner, one can still build a very impressive learning robot with multiple degrees of freedom, assuming one has the motor learning part right. Hawkins can save his company a boatload of money by reading my recent article on goal-oriented motor learning. In it, I gave a biologically plausible definition of a goal and explained how the neocortex finds the right motor connections for any given goal. I have already done a major part of the homework on motor learning. It took me a while to arrive at my current model, but it is easy to explain to others once you know how it works. So, in my opinion, Hawkins would do well to skip ahead to something else. For example, he will have to figure out how to handle motor conflicts, but, from my perspective, this is not that hard either.

The AI Race Is On

There is something new in the air. There has been a frenzy of activity in AI and brain research in the last couple of weeks. A lot of money is suddenly being allocated for research by both the government and the private sector. Strangely, there is a sense of desperation about it all, as if time were of the essence. I don't know what, but something must have happened to trigger this. What is certain is that the race is on to be the first to understand how the brain works.

My prediction is that these initiatives will fail. Like Hawkins, I don't think that throwing money at the problem is the way to go. At this time, I think Jeff Hawkins has the best chance to unlock the secrets of the brain. He knows a few already. However, unless he can fully grok perceptual learning and attention (he doesn't, even if he thinks he does), his efforts will also fail in the end. He may come up with a useful gadget or some other product but the holy grail of intelligence and robotics will remain out of his grasp.

In the meantime, I continue with my own efforts, and I don't need a million-dollar budget. All I need is a personal computer and some spare time. May the best model win.

See Also

The Holy Grail of Robotics
Goal-Oriented Motor Learning