Monday, August 22, 2011

Rebel Cortex: Temporal Learning In the Tree of Knowledge, Part II

Part I, II, III

Abstract

In Part I, I described how memory is organized in Rebel Cortex and wrote that the Tree of Knowledge (TOK) uses neither Bayesian inference nor the Hidden Markov model. In this post, I will explain how sequence and pattern learning work in the TOK. But first, I would like to say something about a theory of cognition called Confabulation Theory.

Confabulation Theory

Back in May, one of my readers, Juha Ranta, mentioned a competitor to Numenta's Hierarchical Temporal Memory called Hecht-Nielsen's Confabulation Theory of cognition. I had not heard of confabulation theory before. Although, as a Christian, I dismiss as pathetic nonsense the idea that cognition is the evolutionary outcome of movement, there is one aspect of Confabulation Theory that I found compelling. Like Rebel Cortex, it uses neither Bayesian inference nor the Hidden Markov approach to prediction.
Confabulation theory hypothesizes that cognition involves one, and only one, information processing operation: confabulation, a specialized type of winners-take-all competition between the symbols of a module on the basis of the total input excitation each symbol is receiving from knowledge links.
Although I did not fully realize it at the time, this sounds amazingly similar to what I have been saying about how recognition works in Rebel Cortex:
Essentially, there is a huge number of sensory signals competing for the brain's attention. Various branches of the tree of knowledge (TOK) must decide on whether or not to wake up (activate) based on the number of signals that satisfy their nodes.
Recognition is the prerequisite of prediction. An intelligent system cannot make a prediction about a given situation unless the situation is first recognized. In Rebel Cortex, recognition occurs when a branch of the TOK wins over the other competing branches and wakes up. This happens primarily because the winning branch received more sensory stimuli than the others (there are other reasons but they are beyond the scope of this article). Since a branch is just a collection of sequences, it is easy for the system to trace the sequences to see where they lead. This is how, I believe, prediction occurs in the brain. I think that the use of Bayesian statistics and Markov chains in AI research, although useful in certain limited domain applications, will prove to be a major mistake with regard to the overall goal of achieving human-like intelligence in machines.
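
To make the idea of competing branches concrete, here is a minimal sketch in Python. It is not Rebel Cortex code. The names Branch, score, predict and recognize are hypothetical, a branch is reduced to a list of sequences of pattern labels, and the "other reasons" a branch may win are left out. The winner is simply the branch with the most patterns satisfied by the currently active signals, and prediction is done by tracing each of its sequences one step forward.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical stand-in for a TOK branch: a named collection of sequences."""
    name: str
    sequences: list  # each sequence is an ordered list of pattern labels

    def score(self, active):
        """Count the distinct patterns of this branch satisfied by active signals."""
        patterns = {p for seq in self.sequences for p in seq}
        return len(patterns & active)

    def predict(self, active):
        """Trace each sequence: return the pattern that follows the last active one."""
        predictions = []
        for seq in self.sequences:
            hits = [i for i, p in enumerate(seq) if p in active]
            if hits and hits[-1] + 1 < len(seq):
                predictions.append(seq[hits[-1] + 1])
        return predictions


def recognize(branches, active):
    """Winner-take-all: the branch with the highest score wakes up.
    Ties go to the first branch listed (an arbitrary assumption)."""
    return max(branches, key=lambda b: b.score(active))
```

For example, given a "cat" branch and a "dog" branch, the set of active signals {"d1", "d2", "c1"} wakes up the dog branch, and tracing a dog sequence ["d1", "d2", "d3"] yields "d3" as the predicted next pattern.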

That being said, Confabulation Theory has been around since at least 2005. One wonders why it has not been implemented in some compelling demo or product. I suspect that the reason is that other parts of the theory are either too weak or too complicated.

Learning Sequences of Patterns

A sequence is a temporal unit of recognition. Every recognized object or phenomenon (a single branch in the TOK) may consist of hundreds or even thousands of such units. A bottom-level sequence in the TOK consists of an array of seven special nodes called pattern neurons. A pattern neuron has an indefinite number of concurrent input connections originating from the sensory layer. In the brain, these synaptic connections can number in the thousands. Unlike Numenta's Hierarchical Temporal Memory, pattern learning in Rebel Cortex is not a separate process but occurs as an integral part of sequence learning. In other words, a pattern is not just a group of concurrent inputs. It is a group of concurrent inputs that belongs to one or more sequences, each consisting of multiple other similar groups. In order for a sensory input connection to become a permanent member of a pattern, it must fire in a specific order dictated by the pattern's position in its parent sequence. The number of concurrent inputs in a pattern is indefinite and is not artificially restricted to a small patch of the visual field as is done in Numenta's HTM.

At first glance, one could say that every pattern neuron in a sequence, except the first one, always fires after the preceding neuron in the sequence. However, given the uncertainties inherent in the sensory space and the fact that a pattern can be shared by more than one sequence, it is unreasonable to expect sequences to be perfect all the time. Neither sequences nor patterns are deterministic. More often than not, a sequence will miss a beat or may fail to happen altogether. Not to worry though. The working assumption for a good sequence learning mechanism is that patterns and sequences only need to occur every once in a while, but often enough and repeatedly enough to be permanently captured in the TOK. The learning method must be such that the input connections arriving from the sensory layer automatically find their proper places in the sequence. This is crucial because correct sequence formation is required for predictions. We can use these observations to design an effective sequence learner.

The Learning Mechanism

Here is how the sequence learning mechanism works in Rebel Cortex. We start by pre-connecting the sequence so that the first pattern neuron sends a master output connection to the second, the second to the third and so on, up to the last neuron. A pattern neuron uses its master output connection to signal its successor in the sequence every time it fires. The first pattern neuron in the sequence is pre-marked as an adult neuron. This is important because only adult pattern neurons can fire. The system should then make as many random sensory input connections (SICs) with the Separation layer (a sort of sensory signal sorting mechanism) as possible. Only a few of the initial SICs will survive the learning phase. New SICs are added continually as they become available. Every SIC is given an initial strength that will be weakened or strengthened according to the following rules.
  1. A SIC is weakened if it fires before the firing of the predecessor neuron.
  2. A SIC is strengthened if it fires immediately after the firing of the predecessor neuron.
  3. A pattern neuron fires if the majority of its adult SICs fire.
  4. If a prediction is successful, every SIC that correctly predicted it is strengthened.
  5. If a SIC did not contribute to the firing of its pattern neuron, it is weakened.
  6. If the strength of a SIC decreases below a predetermined value, the connection is considered unfit and is immediately severed.
  7. If the strength of a SIC increases above a predetermined value, the connection is marked as an adult and can no longer be changed or severed.
These simple rules will ensure that the right connections find their proper places in the sequence automatically. There are other important housekeeping rules but they don't contribute to either sequence or pattern learning.
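
The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Rebel Speech code. Time is modeled as discrete steps, "immediately after" means the predecessor fired on the previous step, and "before" means it did not. The integer strength values and thresholds are made up. Rule 4 is left out because the rules do not say how a successful prediction is scored, and the housekeeping rules and the adult status of the neuron itself are ignored. The names SIC, PatternNeuron, adjust and step are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed values; the post only says "initial strength" and "predetermined value".
INITIAL, SEVER_BELOW, ADULT_ABOVE, STEP = 5, 2, 8, 1


@dataclass
class SIC:
    """A sensory input connection from one sensor to a pattern neuron."""
    sensor: int
    strength: int = INITIAL
    adult: bool = False

    def adjust(self, delta):
        if self.adult:                      # rule 7: adult SICs no longer change
            return
        self.strength += delta
        if self.strength > ADULT_ABOVE:     # rule 7: promote to adult
            self.adult = True


@dataclass
class PatternNeuron:
    sics: list = field(default_factory=list)

    def step(self, fired_sensors, predecessor_just_fired):
        """Apply one time step of learning; return True if the neuron fires."""
        for sic in self.sics:
            if sic.sensor in fired_sensors:
                # rule 2 (in order: strengthen) or rule 1 (out of order: weaken)
                sic.adjust(STEP if predecessor_just_fired else -STEP)
        adults = [s for s in self.sics if s.adult]
        active = [s for s in adults if s.sensor in fired_sensors]
        fires = bool(adults) and 2 * len(active) > len(adults)   # rule 3
        if fires:
            for sic in self.sics:
                if sic.sensor not in fired_sensors:
                    sic.adjust(-STEP)       # rule 5: did not contribute
        # rule 6: sever unfit connections (adults are never severed)
        self.sics = [s for s in self.sics
                     if s.adult or s.strength >= SEVER_BELOW]
        return fires
```

With these assumed numbers, a SIC that fires right after the predecessor on four steps becomes an adult, while one that keeps firing out of order is weakened a little each time and is eventually severed. In a full sequence, the value returned by step for one neuron would be passed on the next step as predecessor_just_fired to its successor, which plays the role of the master output connection.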

Coming Up

I am in the process of writing TOK code for the Rebel Speech program. I'll let you know how it turns out after I have debugged and tested it. In Part III, I will explain how sequence learning works in the upper levels of the TOK.

See Also:

Rebel Speech Recognition
Rebel Cortex
Invariant Visual Recognition, Patterns and Sequences

2 comments:

juha.ranta said...

Thanks, Louis.

Hecht-Nielsen's model does have some demos. In his book "Confabulation Theory: The Mechanism of Thought" he shows how to make a simple demo with his model using a corpus of text. Right now I'm traveling with my laptop, but I'll send some examples of the results later.

One thing that's different from your approach is that he doesn't have a model of learning "on the run". Instead, the network is taught according to some quite simple mathematical rules.

One thing Hecht-Nielsen has is a biologically plausible mechanism for the "winner-takes-all" competition. You're talking about neurons, but I think sometimes they may actually be a group of neurons. For instance, if a neuron in your head dies, you don't suddenly forget the idea of your mother. So, I'd rather call them nodes (they should work the same programmatically). The confabulation theory holds that the neurons in a module of cortex send their input to the thalamus, where the competition takes place, and the thalamus then awakens the winning node (a group of neurons) in the cortex. I don't think you need to implement all of this with your program, just choose the node which wins.

I have more to say about "modules" in the brain and how they're perhaps connected, but I have to think more about it. But let's say that maybe you have a "sunflower" node in your temporal lobes. This is the general "sunflower" node in your brain. It has connections, for example, to the "yellow" node of the visual cortex, and to the aural word "sunflower" node of the aural cortex, etc. And maybe this "winner-takes-all" competition takes place in all of these cortexes.

One thing I love to meditate on is illusions like this:

http://home.snu.edu/~HCULBERT/illusion.gif

Do you see a young woman or an old one? The shift seems to take place in an instant.

Lots of shit, but I hope it'll feed some new flowers.

Louis Savain said...

Juha,

You wrote:

One thing that's different from your approach is that he doesn't have a model of learning "on the run". Instead, the network is taught according to some quite simple mathematical rules.

I am not sure I understand this. How can an intelligent system learn anything except on the run? Does his system learn from sensors?

One thing Hecht-Nielsen has is a biologically plausible mechanism for the "winner-takes-all" competition. You're talking about neurons, but I think sometimes they may actually be a group of neurons. For instance, if a neuron in your head dies, you don't suddenly forget the idea of your mother. So, I'd rather call them nodes (they should work the same programmatically). The confabulation theory holds that the neurons in a module of cortex send their input to the thalamus, where the competition takes place, and the thalamus then awakens the winning node (a group of neurons) in the cortex. I don't think you need to implement all of this with your program, just choose the node which wins.

Yeah. Well, somehow I doubt that anybody has yet shown that the thalamus is a competition arbiter for cortical modules but, be that as it may, I agree that an object is not represented in the brain by a single neuron. In hierarchical memory, an object as complex as one's mother is an entire branch consisting of tens of thousands of sequences and millions of neurons. The death of even thousands of neurons in a branch will only result in a slight degradation of the recognition performance of the branch.

One of the things that bothered me in reading the Scholarpedia article is this:

Actions are typically stored using cognitive knowledge links arranged in nested spatiotemporal symbol hierarchies.

I don't get this. I hope he is not making the same mistake as Jeff Hawkins and Dileep George by repeating pattern recognition (the spatio part in spatiotemporal) nodes at every level of the hierarchy. I have already shown the error of this approach in my articles.

One thing I love to meditate on is illusions like this:

http://home.snu.edu/~HCULBERT/illusion.gif

Do you see a young woman or an old one? The shift seems to take place in an instant.

Lots of shit, but I hope it'll feed some new flowers.
Interesting illusion. Not everybody will get it, though. This sort of illusion is easily explained, in my opinion, by the branch mechanism of recognition whereby branches compete for attention.