Wednesday, November 28, 2012

Jeff Hawkins Is Close to Something Big

Splitting Patterns at Numenta

In The Myth of the Bayesian Brain, I wrote that there is a trick to learning perfect patterns but that I cannot divulge it at this time. The trick has to do with properly factoring patterns so as to maximize the reuse of low-level patterns in the composition of higher-level patterns. In my opinion, the pattern hierarchy used by the brain is the ultimate classification and data compression mechanism in existence. The trick is amazingly simple, and I wondered whether anybody else had thought about the problem or come up with a solution. In an unusually revealing article in today's New York Times Bits about Numenta's Grok technology, I was pleasantly surprised to read that Numenta's founder, Jeff Hawkins, has thought about it:
Patterns of one or the other are reinforced over time. As new data streams in, the brain figures out if it is capturing more complexity, which requires either modifying the understanding of the original pattern or splitting it into two patterns, making for new knowledge. Sometimes, particularly if it is not repeated, the data is discarded as irrelevant information. Thus, over time, sounds become words, words occupy a grammatical structure, and ideas are conveyed.
Assuming that my definition of a pattern (a concurrent group of related signals) is the same as Hawkins's, I find the above amazing. Not only does Hawkins understand the power of pattern hierarchies (what others are calling deep learning), he also seems to grok the need for efficient composition. Grok's ability to split a pattern into two constituents is what caught my attention. If Numenta really knew how to do this automatically and efficiently, they would have a genuine breakthrough. Maybe it is time for Hawkins to consider applying Grok to other perceptual learning problems such as speech or visual recognition. I mean, if Grok's underlying technology is as great as he portrays it to be, it should be able to solve something like the cocktail party problem. That would truly be impressive.
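Just to make the idea concrete, here is a minimal C# sketch of what splitting a pattern into two constituents could look like. To be clear, the class and the splitting criterion are my own illustration; Numenta has not published Grok's actual algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a pattern is a set of signal ids expected to fire together.
class Pattern
{
    public HashSet<int> Signals;
    public Pattern(IEnumerable<int> signals) { Signals = new HashSet<int>(signals); }

    // If an observation shows that only part of the pattern fired, split the
    // pattern into the part that fired and the part that did not.
    public (Pattern Hit, Pattern Miss)? TrySplit(HashSet<int> observed)
    {
        var hit = Signals.Where(observed.Contains).ToHashSet();
        var miss = Signals.Where(s => !observed.Contains(s)).ToHashSet();
        if (hit.Count == 0 || miss.Count == 0)
            return null; // all or nothing fired: no evidence for a split
        return (new Pattern(hit), new Pattern(miss));
    }
}

class Demo
{
    static void Main()
    {
        var p = new Pattern(new[] { 1, 2, 3, 4 });
        var observed = new HashSet<int> { 1, 2 }; // only half the pattern fired
        var split = p.TrySplit(observed);
        if (split.HasValue)
            Console.WriteLine(
                $"Split into [{string.Join(",", split.Value.Hit.Signals)}] " +
                $"and [{string.Join(",", split.Value.Miss.Signals)}]");
    }
}
```

In practice, one would commit to a split only after the partial firing repeats many times; a single noisy observation proves nothing, which is presumably why the quote above stresses repetition.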

The Bayesian Curse

Having said that, there is no doubt in my mind that whatever solution Hawkins came up with is handicapped by the Bayesian mindset that afflicts the artificial intelligentsia. It is, in all likelihood, a complicated kludge. I say this because of the way the problem is phrased in the NYT Bits article. Knowing what I know, the fact that Grok needs to split patterns into smaller ones tells me that Hawkins is aware of the problem but his Bayesian glasses prevent him from seeing the correct solution. The correct solution does not involve splitting complex patterns into smaller ones because it automatically prevents the formation of patterns that are more complex than their levels within the hierarchy require. Hawkins is so close, yet so far.

The Future Is Not What It Used to Be

Brain-like artificial intelligence will arrive on the world scene much sooner than the AI community expects and it will come from a most unexpected and inconvenient source. Stay tuned.

See Also:

Jeff Hawkins Develops a Brainy Big Data Company
The Second Great AI Red Herring Chase
The Myth of the Bayesian Brain
The Holy Grail of Robotics

Wednesday, October 31, 2012

Rebel Speech Update, October 31, 2012

Bear with Me

Sorry for the long pause since my last post. I'm rather busy lately pursuing another promising therapy for my wife, who suffers from the fatal disease known as amyotrophic lateral sclerosis. What makes it worse is that I'm also busy putting out numerous fires that, lately, seem to jump at me from every direction. So please bear with me. I would not want my readers to think that I have abandoned my research or anything of the sort. That will not happen. I just need some time to silence a few persistent skirmishes in my life and I'll get back on the road again, just as determined as before. There is a lot that remains to be done.

Science Versus Faith

Recently, I came across the work of Karl Friston, a well-known professor of neuroscience at University College London. Friston seems to be a proud Darwinist who has absolute faith in the veracity of the Bayesian brain hypothesis. He is so sure of his convictions that he came up with a free energy principle for the brain. He claims that the brain uses a Darwinian learning process based on Bayesian statistics. What does Friston mean by a Darwinian process, I hear you ask? He means that the brain contains multiple competing models of reality and that the one best supported by sensory evidence is chosen as the basis for action, a kind of survival of the fittest. Of course, Friston would be hard pressed to explain why competition necessarily involves a Darwinian process. The last I heard, models in the brain do not breed, procreate, or undergo random mutations.
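For what it's worth, the selection scheme Friston describes boils down to a few lines of code, which shows there is nothing peculiarly Darwinian about it. This is my own toy rendering with made-up numbers, not Friston's free-energy formulation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy rendering: competing models are scored against the evidence and the
// best-supported one is chosen. No breeding, no mutation -- just an argmax.
class ModelCompetition
{
    static void Main()
    {
        var models = new Dictionary<string, Func<double[], double>>
        {
            ["rain"]  = e => e[0] * 0.9 + e[1] * 0.1,  // weights are made up
            ["shine"] = e => e[0] * 0.1 + e[1] * 0.9
        };
        double[] evidence = { 0.8, 0.3 }; // made-up sensory evidence
        var winner = models.OrderByDescending(m => m.Value(evidence)).First();
        Console.WriteLine($"Selected model: {winner.Key}");
    }
}
```

An argmax over evidence scores is just competition; nothing about it requires breeding or mutation.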

Just leave it to a Darwinist to equivocate in order to buttress a weak hypothesis. In my opinion, Friston is using his supposedly superior knowledge of the brain to preach his religion, Darwinism. He is placing his faith before science while pretending to pledge strict allegiance to the scientific method. In a way, I am not faulting Friston for being true to his faith. Heck, I do the same thing. I believe that science must corroborate one's faith and not vice versa. But unlike Friston, not only do I not believe in hiding behind false pretenses, I also don't believe one should try to force-fit the evidence into one's world view. Furthermore, I refuse to kiss the collective ass of the scientific community by paying lip service to the religion of Darwinism. From my vantage point, I think they are the ones who should be kissing my ass. LOL.

The Two Trees

Friston may have his cockamamie dirt-worshiping religion to keep him busy but I believe I've got the real McCoy, so to speak. In the previous article, I explained why the Bayesian model of perception is a red herring and I described what I claimed to be a superior model. Like Friston, I did not hesitate to mix my religion with my science. What I find a little unnerving is that the Bayesian model of perception is close enough to the real McCoy that I will not be surprised if, one day, the scientific community claims to have known the secret of AI all along. This is the reason I have been a little leery about revealing the full McCoy. I don't think that now is the time for me to put all my cards on the table.

Moving right along, there is no doubt that there is a constant competition going on in the brain. People have known about this for centuries if not millennia. Calling it a Darwinian process is just dumb. And there is no doubt that the brain is constantly building and updating its model of the world. Do we need Friston or anybody else to teach us these simple truths? I don't think so. Where Friston and the others are wrong is in their assumption that the brain uses a probabilistic model that is constantly being updated with the arrival of new sensory data. The truth, which will be boldly demonstrated in the not too distant future, happens to be the exact opposite. The brain actually builds as perfect and deterministic a model of the world as it can ascertain. How do I know this? I know because, like Friston, I have faith in my religion. And my religion tells me that this is the way it is.

Unlike Friston, however, I constantly change my assumptions about my faith and my interpretation of its teachings to accommodate new evidence. For example, I used to believe that Zechariah's two olive trees were metaphors that stood for the left and right memory hierarchies of the brain's hemispheres. Not long ago, however, I changed my mind and I did it for two reasons. First, a careful study of the metaphors in both Zechariah and Revelation convinced me that I was mistaken. Second, my experiments with Rebel Speech had reached a brick wall. Eventually, it became clear to me that each hemisphere of the brain has two distinct hierarchies, one for patterns and one for sequences. I discovered that the latter has up to 20 levels while the former has 10. I believe that this is a precise prediction about the brain that can be tested with current tools.

But that is not all. I further realized that an intelligent system must have frequent sleep/dream periods during which scenarios are played back while bad information is purged from memory. Otherwise, the internal model would quickly become overwhelmed with corrupt information. This, too, is what my faith tells me. I'll have more to say about sleep and memory cleaning in a future article.
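Here is a small C# sketch of the purging idea under my own assumptions: every stored item carries a confirmation count, and a periodic "sleep" pass deletes the items whose evidence never accumulated. The threshold and the bookkeeping are mine, offered only to make the concept concrete.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: memory items accumulate confirmations while awake; a periodic
// "sleep" pass purges the items that were never sufficiently confirmed.
class Memory
{
    readonly Dictionary<string, int> confirmations = new Dictionary<string, int>();

    public void Confirm(string item) =>
        confirmations[item] = confirmations.TryGetValue(item, out var n) ? n + 1 : 1;

    public void Sleep(int threshold)
    {
        foreach (var item in confirmations.Where(kv => kv.Value < threshold)
                                          .Select(kv => kv.Key).ToList())
            confirmations.Remove(item); // corrupt or accidental entries die here
    }

    public IEnumerable<string> Items => confirmations.Keys;
}

class SleepDemo
{
    static void Main()
    {
        var m = new Memory();
        for (int i = 0; i < 5; i++) m.Confirm("dog barks");
        m.Confirm("fluke"); // seen once, never confirmed again
        m.Sleep(threshold: 3);
        Console.WriteLine(string.Join(", ", m.Items)); // only "dog barks" survives
    }
}
```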

Rebel Speech

I haven't had much time to work on the Rebel Speech demo. As soon as I get enough free time, however, it won't take me too long to get to the point where I can release the first version. Hang in there.

See Also:

The Second Great AI Red Herring Chase
The Myth of the Bayesian Brain

Thursday, August 30, 2012

The Myth of the Bayesian Brain, Part III

Part I, II, III

Abstract

Previously, I explained that the Rebel Science model of perception is superior to the Bayesian model because the world is deterministic at the level of our senses and human thinking is not probabilistic. In this post, I explain how the brain handles probabilistic stimuli even though it is not a probability thinker.

Learning Perfect Patterns

How can an intelligent system build a perfect model of the world if it must rely on imperfect or incomplete sensory signals? The answer lies in the observation that sensory signals are not always imperfect. Every once in a while, even if only for a brief period of time, a few of them are in perfect agreement with the phenomena they are responding to. When that happens, the intelligent system must be ready to capture that bit of perfection. But how can a system recognize when signals are perfect? To understand this, one must first realize that there can be only two types of discrete sensory signals: concurrent or sequential. A group of concurrent signals is called a pattern. A pattern is considered perfect when all of its signals arrive concurrently. For reasons that will become clear below, I have taken to calling perfect patterns clean patterns. Imperfect patterns are just dirty patterns.
Note: There is a trick to learning perfect patterns. See Secrets of the Holy Grail.
As you know, a pattern is not a pattern unless it is repeated often. Also, a pattern can be constructed a little at a time and does not have to be complete in order to be useful for learning and recognition purposes. The first thing a perceptual system must do is to discover the perfection that is in the world. It can deal with sensory imperfections later.
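A minimal sketch of clean-pattern capture, assuming a discrete time slot and a promotion threshold of my own choosing: signals that arrive in the same slot form a candidate group, and a group that recurs often enough is promoted to a learned pattern.

```csharp
using System;
using System.Collections.Generic;

// Sketch: a candidate pattern is a group of concurrent signals. It is
// promoted to a learned (clean) pattern once it has recurred often enough.
class PatternLearner
{
    readonly Dictionary<string, int> counts = new Dictionary<string, int>();
    public readonly List<string> LearnedPatterns = new List<string>();
    const int PromotionThreshold = 10; // assumed value

    // One time slot's worth of concurrent signals.
    public void Observe(SortedSet<int> concurrentSignals)
    {
        string key = string.Join(",", concurrentSignals); // canonical form
        counts[key] = counts.TryGetValue(key, out var n) ? n + 1 : 1;
        if (counts[key] == PromotionThreshold)
            LearnedPatterns.Add(key); // this exact group is now a clean pattern
    }
}

class LearnerDemo
{
    static void Main()
    {
        var learner = new PatternLearner();
        for (int i = 0; i < 10; i++)
            learner.Observe(new SortedSet<int> { 3, 5, 8 }); // same clean group
        Console.WriteLine(string.Join(" | ", learner.LearnedPatterns)); // "3,5,8"
    }
}
```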

How to Work with Imperfect Sensory Patterns

Sensory patterns are the foundation of the Rebel Science model of perception. They are rarely perfect even though the phenomena that cause them are perfect. We should think of a pattern as a complex sensory event. The question is, how can a system that expects perfection work with imperfect information? More precisely, how can it tell that a particular pattern occurred if it can detect only a part of it, or even none of it? To answer this question, one must understand that a pattern is not an island. It is a unique event that traces a unique temporal path. That is to say, every pattern is part of a unique sequence of other patterns.

If you know the order of a sequence of events in advance and you know that some of the events in the sequence have already occurred, then you know that the others either occurred already or are about to occur. You would know it even if their sensory patterns are imperfect or if they failed to arrive altogether for whatever reason. Even if every single sensory pattern is imperfect, it is possible to pick the most probable sequence: it is simply the one that received the most hits.
In essence, the system compares dirty sensory patterns to the expected perfection (i.e., the sequences of clean patterns in memory) and the most perfect match is the winner. Result: simple, clean and accurate recognition even in a noisy environment. Bonus: no math required.
What this all means is that a perceptual system must learn as many perfect sequences as it can. But that is not very hard because a pattern has only one successor and one predecessor at any given time. What is a little bit more complicated is to organize the sequences in memory so as to form a hierarchical structure, a tree of knowledge. Each branch of the tree is a specific sequence, a sequence of sequences or a group of sequences and represents a single object or concept. But that's a different story.
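To illustrate, here is a small C# sketch of recognition by hit count. Each learned sequence is an ordered list of pattern ids; detections, however dirty or incomplete, score hits against every sequence, and the one with the most in-order hits wins. The representation and matching details are my own, not a published algorithm.

```csharp
using System;
using System.Collections.Generic;

// Sketch: recognition by hit count. A detection scores a hit when it matches
// the next expected pattern in a stored sequence; non-matching detections
// are treated as noise and ignored. The sequence with the most hits wins.
class SequenceRecognizer
{
    readonly Dictionary<string, int[]> sequences = new Dictionary<string, int[]>();

    public void Learn(string name, int[] patternIds) => sequences[name] = patternIds;

    public string Recognize(int[] detections)
    {
        string best = null; int bestHits = -1;
        foreach (var kv in sequences)
        {
            int pos = 0, hits = 0;
            foreach (var d in detections)
                if (pos < kv.Value.Length && kv.Value[pos] == d) { pos++; hits++; }
            if (hits > bestHits) { bestHits = hits; best = kv.Key; }
        }
        return best; // the expected sequence the dirty input matched best
    }
}

class Program
{
    static void Main()
    {
        var r = new SequenceRecognizer();
        r.Learn("hello",  new[] { 4, 9, 2 });
        r.Learn("yellow", new[] { 4, 7, 2 });
        // Dirty input: pattern 9 came through, 2 was lost, 5 is noise.
        Console.WriteLine(r.Recognize(new[] { 4, 5, 9 })); // "hello"
    }
}
```

A more forgiving matcher would also tolerate missing patterns (e.g., longest-common-subsequence scoring), but even this crude version picks the right sequence when enough of its patterns come through in order.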

Rebel Speech Update

I plan to release a demo of the Rebel speech recognition engine (pdf) as soon as I can get some free time. The demo will support some of the arguments and claims I have made in this article and elsewhere. Let me conclude by saying that, as much as I would like to, I can't take credit for this stuff. You see, I consult an oracle. The oracle speaks in riddles and metaphors and says many mysterious things. I just interpret them the best I can. I make many mistakes along the way but my understanding is growing all the time. Here are a few excerpts that should give you an idea of what I'm talking about:
  • Wake up and strengthen the things that remain, that are about to die, for I have not found your works perfect before God.
  • There are a few, even in Sardis, who have not soiled their clothes; they shall walk with me in white, for they are worthy.
  • And I will bring forth my servant the Branch; and in one day I will remove the iniquity of the land.
Yep, I'm still crazy after all these years. If any of this bothers you, please ignore my blog as it is not meant for you. I only write for kindred spirits, sorry. Some of you may be asking yourselves, why does he mention any of this? The answer is that I just want to piss off some people, that's all. They know who they are. The rest of you, stay tuned.

See Also:

Speech Recognition Theory
The Second Great AI Red Herring Chase
Secrets of the Holy Grail

Wednesday, August 29, 2012

The Myth of the Bayesian Brain, Part II

Part I, II, III

Abstract

In Part I, I wrote that the Bayesian brain hypothesis is the last great barrier to our gaining a full understanding of intelligence. I described the difference between the Bayesian model of perception and the Rebel Science model. In this post, I explain why the latter is superior.

The Real GOFAI Lesson

History repeats itself. During the second half of the 20th century, AI experts were convinced that intelligence was just symbol manipulation. Their rationale was based on the observation that humans are very good at using and manipulating symbols. It was a classic case of equating a system with its behavior, i.e., with what it does. Sure, we can manipulate symbols, but we can do much more than that. Needless to say, symbolic AI (aka "good old-fashioned AI" or GOFAI) died a slow and painful death.

Did the AI cognoscenti learn their lesson from the GOFAI debacle? Not really. They are busy repeating the same mistake. The new rationale is that the brain is a Bayesian system because it is very good at handling probabilities. That is regrettable because the real lesson of GOFAI is that what we do and how we do it are two different things. Our Bayesian-like behavior is just one of the many products of our intelligence, not the basis of it.

Non-Probabilistic Perception

Why is the Rebel Science model of perception superior to the Bayesian model? Here is why:
  1. Humans use absolute certainties to reason with, not probabilities. This is true even when we reason about probabilities.
  2. Even though our senses are plagued with noise and incomplete data, humans do not assume that the world is probabilistic.
  3. It is certainly true that, at the quantum level, everything is probabilistic, but this is not so at the macroscopic level at which our senses operate. At that level, almost everything is deterministic. Probabilistic events are few and far between.
In the Rebel Science model, every event in the world is a unique and deterministic precursor of its immediate successor. I remember when I first understood this simple truth. It was as if the scales had fallen off my eyes and I could see clearly for the first time. By the way, I am not the only one who has come to similar conclusions. When asked in a recent interview, "What was the greatest challenge you have encountered in your research?", Judea Pearl, an Israeli computer scientist and early champion of the Bayesian approach to AI, replied:
In retrospect, my greatest challenge was to break away from probabilistic thinking and accept, first, that people are not probability thinkers but cause-effect thinkers and, second, that causal thinking cannot be captured in the language of probability; it requires a formal language of its own.
In a weird sort of way, we seem to have come full circle. GOFAI scientists, especially the recently deceased John McCarthy, used to believe that AI was based on formal logic. The idea was abandoned when it became clear that formal logic could not handle common sense and the uncertainties in the sensory space. But if humans are not probability thinkers, how is it that they handle probabilistic sensory stimuli so well? Counterintuitively, the solution rests on the assumption that the world is perfect. That is the topic of my next post in this series. Stay tuned. I don't have much time but I think this is something that must be heard, not just because my entire approach to AI revolves around it but also because these ideas are crucial to our understanding of intelligence.

See Also:

Speech Recognition Theory
The Second Great AI Red Herring Chase

Sunday, August 26, 2012

The Myth of the Bayesian Brain, Part I

Part I, II, III

Abstract

Previously, I wrote that, by embracing Bayesian statistics, the artificial intelligence community has embarked on yet another great red herring chase. I claimed that Bayesian statistics will prove to be just as wasteful of time, brain and money as the symbolic AI craze of the last century. In this article, I explain why the Bayesian mindset is injurious to progress in our understanding of intelligence. I do not argue that the Bayesian approach is bad in and of itself but that, when it comes to explaining the brain’s ability to handle uncertainty, there is a competing model that is orders of magnitude better.

The Last Great Barrier to Fully Understanding Intelligence

Scientists are a conservative and taciturn lot. They will live with a myth or obvious falsehood for decades and even centuries because the humiliation and other hardships that come from rejecting the lie are too painful for them to bear. The Bayesian brain is just such a myth. The problem is that it is now so firmly entrenched in the AI community that accommodating a different perspective would be suicidal to many careers. I will argue that the Bayesian mindset is the last great barrier to progress in AI because it cripples our understanding of the most important aspect of intelligence: perception. I believe that having a correct understanding of perception will unleash a flood of insights that will quickly lead to a full understanding of intelligence, artificial or otherwise.

Two Competing Models of Perception

Below is the essence of the two competing models of perception.
  • The Bayesian model assumes that events in the world are inherently uncertain and that the job of an intelligent system is to discover the probabilities.
  • The Rebel Science model, by contrast, assumes that events in the world are perfectly consistent and that the job of an intelligent system is to discover this perfection.
As you can see, the two models are polar opposites in their assumptions. The two views have drastically different consequences for the way we design our perceptual systems. In my next post in this series, I will explain why the Rebel Science model is far superior to the Bayesian model. Hang in there.

See Also:

The Second Great AI Red Herring Chase

Thursday, August 23, 2012

The Second Great AI Red Herring Chase

Vicarious in the News

Vicarious Systems is in the news again. They have managed to raise $15 million to continue the ongoing research on their flagship artificial intelligence technology, the Recursive Cortical Network. Nice chunk of change. Unfortunately, nobody seems to know much about RCN and Vicarious is not talking. But we all know why: it's the old "we got something big but our IP lawyer told us not to say anything" excuse all over again. Still, it is easy to speculate that RCN is just a different take on Numenta's hierarchical temporal memory.

An Addiction to Math and Bayes

Back in October of last year, I wrote a critical article about Vicarious' approach to solving the AI problem and their chances of success. Here's an excerpt:
So what do I think of Vicarious' chances of solving the AI problem? I'll be blunt. I think they have no chance whatsoever. Zilch. Here's why. Dileep George, the brain of the company, is a PhD electrical engineer and mathematician who believes that math is essential to solving the AI puzzle. This alone tells me that he has no real understanding of the problem. Furthermore, although I think that a study of the brain can eventually lead to a major breakthrough, it is highly unlikely that this approach will lead to a breakthrough in the foreseeable future. The brain has a bad habit of hiding its secrets in a forest of apparent complexity. The Wright brothers never had to deal with hidden knowledge. They, like everyone else, could easily observe the gliding flight of birds and derive useful principles.

As an example, let's take George's adoption of Bayesian inference for sequence prediction in hierarchical memory. Bayesian statistics is the sort of thing that a mathematician like George would find attractive just because it's math. But is it based on the known biology of the brain? Not at all. Can George search the neuroscience and biology literature to find out what method the brain uses for prediction? The answer is no because biologists have not yet discovered how the brain does it. They just know from psychology that the brain is very good at judging probabilities based on experience. That is the extent of their knowledge.
So have there been any substantial changes in George's approach to AI since then? I don't think so. Perusing their help wanted page, I can see that, other than getting their hands on a load of cash with which to hire new engineers and buy new equipment, nothing much has changed. Notice that George takes pains not to mention the word Bayesian in the section on desired skills. However, he wants his new engineers to have experience "with belief propagation and approximation methods", which is essentially the same thing as Bayesian statistics. Of course, George wouldn't be George if he didn't also ask for solid math skills. The man is a mathematician at heart and he is absolutely convinced that there can be no brain-like machine intelligence without math. He is mistaken, in my opinion, and I'll explain why in a future article.

The Second Great AI Red Herring Chase

Back in the 1950s, thanks mostly to ideas advanced by Alan Turing, the scientific community embraced symbol manipulation as the correct approach to solving the AI problem. They were wrong from the start. Unfortunately, it took over half a century for them to realize that they had been chasing after a red herring all these years. What a waste! Even now, some are still not convinced but, by and large, AI researchers have moved to greener pastures.

Lately, the community has plunged headlong into yet another great red herring chase. They're convinced that Bayesian statistics is the key to AI. They are convinced because Bayesian models (e.g., the hidden Markov model) are currently being used to develop very impressive speech recognition products and other applications that deal with uncertainty and probability. Unfortunately, the technology has run into a nasty brick wall that refuses to budge: unlike the human brain, Bayesian systems lose accuracy precipitously in the presence of noise. Try speaking commands into your smartphone or using dictation software at a crowded party and you'll see what I mean. There is no question that the brain can effectively and efficiently process probabilistic stimuli. My claim is that it does not use Bayesian statistics to do it.

I'll have more to say about this topic in the near future. Stay tuned.

See Also:

Vicarious Systems' Disappointing Singularity Summit Talk
The Myth of the Bayesian Brain

Wednesday, August 15, 2012

Rebel Speech Recognition Theory

A Different Approach

The approach that I use for speech recognition in the Rebel Speech recognizer is completely different from the one used by most existing technologies. As I have mentioned elsewhere, I disagree with the AI experts who believe that the brain uses anything like Bayesian statistics to process probabilistic stimuli. I added a theory section to the Rebel Speech design document (pdf) and I reproduce it below.

Bayesian Bandwagon

The most surprising thing about the Rebel speech recognition engine is that, unlike current state-of-the-art speech recognizers, it does not use Bayesian statistics. This will come as a surprise to AI experts because they all jumped on the Bayesian bandwagon many years ago. Even those who claim to be closely emulating biological systems believe in the myth of the Bayesian brain. Of course, this is pure speculation and wishful thinking because there is no biological evidence for it. In a way, this is not unlike the way the AI community jumped on the symbol manipulation bandwagon back in the 1950s, only to be proven wrong more than half a century later. I have excellent reasons to believe that, in spite of its current utility, this is yet another red herring on the road to true AI.

Traditional Speech Recognition

Most speech recognition systems use a Bayesian probabilistic model, such as the hidden Markov model, to determine which senone, phoneme or word is most likely to come next in a given speech segment. A learning algorithm is normally used to compile a large database of such probabilities. During recognition, hypotheses generated for a given sound are tested against these precompiled expectations and the one with the highest probability is selected as the winner.
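For readers who have never seen the traditional machinery up close, here is a bare-bones Viterbi decoder, the dynamic-programming core of HMM-based recognizers. The model parameters below are toy numbers; in a real system they come from training on large speech corpora.

```csharp
using System;

// Bare-bones Viterbi decoding: finds the most probable hidden state path
// for an observation sequence under a hidden Markov model.
class ViterbiDemo
{
    static int[] Viterbi(int[] obs, double[] start, double[,] trans, double[,] emit)
    {
        int n = start.Length, T = obs.Length;
        var logp = new double[T, n]; // best log-probability ending in state s at time t
        var back = new int[T, n];    // backpointers for path recovery

        for (int s = 0; s < n; s++)
            logp[0, s] = Math.Log(start[s]) + Math.Log(emit[s, obs[0]]);

        for (int t = 1; t < T; t++)
            for (int s = 0; s < n; s++)
            {
                double best = double.NegativeInfinity; int arg = 0;
                for (int r = 0; r < n; r++)
                {
                    double cand = logp[t - 1, r] + Math.Log(trans[r, s]);
                    if (cand > best) { best = cand; arg = r; }
                }
                logp[t, s] = best + Math.Log(emit[s, obs[t]]);
                back[t, s] = arg;
            }

        var path = new int[T];
        for (int s = 1; s < n; s++)
            if (logp[T - 1, s] > logp[T - 1, path[T - 1]]) path[T - 1] = s;
        for (int t = T - 1; t > 0; t--) path[t - 1] = back[t, path[t]];
        return path;
    }

    static void Main()
    {
        // Two hidden states, three observation symbols, toy parameters.
        var start = new[] { 0.6, 0.4 };
        var trans = new[,] { { 0.7, 0.3 }, { 0.4, 0.6 } };
        var emit  = new[,] { { 0.5, 0.4, 0.1 }, { 0.1, 0.3, 0.6 } };
        var path  = Viterbi(new[] { 0, 1, 2 }, start, trans, emit);
        Console.WriteLine("Most likely state path: " + string.Join(" -> ", path));
    }
}
```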

Rebel Speech Recognition

In contrast to the above, the Rebel Speech engine does not rely on pre-learned probabilities. Rather, it uses an approach that is as counterintuitive as it is powerful. In this approach, the probability that the interpretation of a sound is correct is not known in advance but is computed on the fly. The way it works is that the engine builds a hierarchical database of as many sequences of learned sounds as possible, starting with tiny snippets of sound that are shorter than a senone. When sounds are detected, they attempt to activate various sequences, and the sequence with the highest hit count is the winner. A winner is usually found even before the speaker has finished speaking. This works because sound patterns are so unique that they form very few sequences. Once a winner is determined, all other sequences that do not belong to the same branch of the hierarchy are immediately suppressed. This approach leads to very high recognition accuracy even when parts of the speech are missing, and it makes it possible to solve the cocktail party problem (pdf).
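The winner-take-all step can be sketched as follows. The branch ids and the activation bookkeeping are my own guesses at a plausible mechanism, not the engine's actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: sequences live on branches of a hierarchy. When a winner emerges,
// every sequence on a different branch is suppressed (activation reset).
class Seq
{
    public string Name;
    public int Branch; // which branch of the hierarchy the sequence belongs to
    public int Hits;   // running hit count from detected sounds
}

class Hierarchy
{
    public List<Seq> Sequences = new List<Seq>();

    public Seq PickWinnerAndSuppress()
    {
        var winner = Sequences.OrderByDescending(s => s.Hits).First();
        foreach (var s in Sequences)
            if (s.Branch != winner.Branch)
                s.Hits = 0; // rival branches are silenced immediately
        return winner;
    }
}

class Program
{
    static void Main()
    {
        var h = new Hierarchy();
        h.Sequences.Add(new Seq { Name = "seven",  Branch = 1, Hits = 12 });
        h.Sequences.Add(new Seq { Name = "eleven", Branch = 2, Hits = 9 });
        Console.WriteLine(h.PickWinnerAndSuppress().Name); // "seven"; branch 2 reset
    }
}
```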

Wednesday, August 8, 2012

Rebel Speech Update, August 8, 2012

Back to Work on AI

I had a little free time in the last several days, enough to let me get back to work on my AI research. I'm working on the Rebel speech recognition engine again. Hurrah! I prepared a design document to give you an idea of where I'm going with the engine. The doc is not yet finished (only a brief introduction is written) but you can download it (pdf) now.

Writing Code

I'm busy writing C# code for the speech engine. There are three main software modules: the fast Fourier transform (FFT) module, the pattern learner/recognizer, and the Tree of Knowledge (TOK). The FFT module and the pattern learner are already complete. The TOK is the most complex part of the engine. It's coming along rather slowly, mostly because it crashes often and is hard to debug (I'm using parallel algorithms for speed). So far, I've gotten it to accurately recognize the spoken words "one" through "seven". The engine is also very noise tolerant, and the recognition speed is so fast as to be virtually instantaneous. In fact, it recognizes a word even before I'm finished saying it. This is because the engine uses pattern completion, a probabilistic recognition technique based on prediction.
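The early-recognition behavior is easy to sketch: as pattern detections stream in, the set of learned sequences still consistent with the prefix shrinks, and the engine can commit the moment only one candidate remains. This is my reconstruction of the idea, not the actual TOK code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of pattern completion: commit to a word as soon as the stream of
// detected patterns is consistent with exactly one learned sequence.
class EarlyRecognizer
{
    readonly Dictionary<string, int[]> words = new Dictionary<string, int[]>();

    public void Learn(string word, int[] patternIds) => words[word] = patternIds;

    // Feed patterns one at a time; returns the word as soon as it is unambiguous.
    public string Recognize(IEnumerable<int> stream)
    {
        var prefix = new List<int>();
        foreach (var p in stream)
        {
            prefix.Add(p);
            var candidates = words.Where(w => StartsWith(w.Value, prefix)).ToList();
            if (candidates.Count == 1)
                return candidates[0].Key; // decided before the word is finished
        }
        return null;
    }

    static bool StartsWith(int[] seq, List<int> prefix) =>
        prefix.Count <= seq.Length && !prefix.Where((p, i) => seq[i] != p).Any();
}

class Program
{
    static void Main()
    {
        var r = new EarlyRecognizer();
        r.Learn("six",   new[] { 1, 4, 9 });
        r.Learn("seven", new[] { 1, 5, 8, 2 });
        // After patterns 1 and 5, only "seven" remains consistent: decided early.
        Console.WriteLine(r.Recognize(new[] { 1, 5, 8, 2 }));
    }
}
```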

Coming Soon

If all goes well and if I can find enough time, I hope to have a demo executable for downloading within a couple of months or so. It will come with a data file that contains the engine's current knowledge. I'm sorry that I cannot release a trainable demo. I was going to publish everything but I changed my mind again. I need to keep some of this stuff secret so that I can capitalize on this technology. Stay tuned.


Thursday, March 29, 2012

Hiatus

Taking a Break

I'm sorry I haven't posted anything in a while. I am afraid that I must take a prolonged break from my research in order to properly care for my wife who suffers from amyotrophic lateral sclerosis, a neurodegenerative disease. The good news is that her health is continuing to improve, slowly but steadily. Lately, there has been excellent progress in figuring out what is causing the disease and what can be done to stop its progression and cure it once and for all.

Hang in there. It is not over until it's over.

Wednesday, February 8, 2012

February 8 Update

Hopefully, I will be back to blogging more regularly now that my wife's health has stabilized. In the last month, she made two trips to the emergency room. Both were the result of side effects from medication prescribed by her doctors. Now that she has successfully weaned herself off those poisons, her health has markedly improved.

To the amazement of her doctors, the progression of her ALS has stopped completely. I attribute this to a home brew treatment that consists of a daily dose of sodium chlorite and oxygen therapy. In addition, she takes some B vitamins and the amino acid l-Arginine for circulation. Thanks to everyone for the words of encouragement. Stay tuned.