Friday, June 20, 2014

I Am Paranoid About the Future

A Potential Death Sentence for Humanity

The more I think about the consequences of artificial intelligence, the more I tremble with fear and apprehension. No, I'm not worried about some mythological sci-fi scenario in which robots rebel against their owners and wipe out humanity. That's just nonsense coming from the Singularity movement. Those who hold those views are clueless as to the true nature of intelligence. They are lost in a lost world.

I am paranoid because I understand the power of intelligence, artificial or otherwise. Any government, organization or individual who manages to control artificial intelligence will have the power to turn our beautiful planet into hell. The introduction of true AI into this world, as it currently is, with all its wars, corruption, crime and countless other horrors, would be a death sentence for humanity. I have seen the enemy and he is not a machine. He is us.

Hang in there.

Wednesday, May 28, 2014

The Rebel Speech Recognition Project

Progress Update

I am making rapid progress on the Rebel Speech project, and it will not be long before I release a demo. Please have patience. Rebel Speech will be a game changer in more ways than one. There are many things I need to consider as far as when and how to publish the results of my research. I cannot divulge the current state of the engine, but I can say that it will take many by surprise.

My plan, which is subject to change, is to release a program that will demonstrate most of the capabilities of the model. The demo will consist of an executable program and a single data file for the neural network, aka the brain. The latter will be pre-trained to recognize the digits 1 to 20 (or more) in three or four different languages. I will not release the learning module or the source code, at least not for a while. The reason is that I need to monetize this technology to raise enough money to continue my AI research. What follows is a general description of Rebel Speech.

The Rebel Speech Recognition Engine

The Rebel Speech recognition engine is a biologically plausible spiking neural network designed for general audio learning and recognition. The engine uses two hierarchical subnetworks (one for patterns and one for sequences) to convert audio waveform data into discrete classifications that represent phonemes, syllables, words and even whole phrases and sentences. The following is a list of some of the characteristics that distinguish Rebel Speech’s architecture from other speech recognizers and neural networks:
  • It can learn to recognize speech in any language, simply by listening through a microphone.
  • It can learn multiple languages concurrently.
  • It can learn to recognize any type of sound, e.g., music, machinery, animal sounds, etc.
  • Learning is fully unsupervised.
  • It is as accurate as humans on trained data, or better.
  • It is noise and speaker tolerant.
  • It can recognize partial words and sentences.
  • It uses no math other than simple arithmetic.
Rebel Speech has multiple layers of neurons in two hierarchical networks, but this is where the similarity with deep learning ends. Unlike deep neural networks, the layers in Rebel Speech are not pre-wired and synaptic connections have no weights. A synapse is either connected or it is not. In fact, when Rebel Speech begins training, both networks are empty. Neurons and synapses are created and added on the fly during learning, and only when needed.
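To make the idea concrete, here is a minimal sketch, in Python, of what on-the-fly growth with weightless synapses might look like. All names are my own; this illustrates the principle and is not taken from the actual Rebel Speech code:

```python
# Illustrative sketch only: neurons and binary synapses created on demand.
# Nothing here is from the actual Rebel Speech implementation.

class Neuron:
    def __init__(self, inputs):
        self.inputs = frozenset(inputs)   # a synapse is connected or it is not

class GrowingNetwork:
    def __init__(self):
        self.neurons = {}                 # the network starts out empty

    def observe(self, active_inputs):
        """Add a neuron for a never-before-seen group of concurrent inputs."""
        signature = frozenset(active_inputs)
        if signature not in self.neurons:     # grow only when needed
            self.neurons[signature] = Neuron(signature)
        return self.neurons[signature]

net = GrowingNetwork()
net.observe({"sensor_3", "sensor_7"})     # first occurrence: a neuron is created
net.observe({"sensor_3", "sensor_7"})     # second occurrence: reused, no growth
```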

Program Design

The engine consists of three software modules, as depicted below.

[Diagram: the three modules of the engine: sensory layer, pattern memory and sequence memory]

The sensory layer is a collection of audio sensors. It uses a Fast Fourier Transform algorithm and threshold detectors (sensors) to convert audio waveform data into multiple streams of discrete signals (pulses) representing changes in amplitude. These raw signals are fed directly to pattern memory where they are combined into concurrent groups called patterns. Pattern detectors send their signals to sequence memory where they are organized into temporal hierarchies called branches. Each branch is a classification structure that represents a specific sound or sequence of sounds.
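As a rough illustration of this pipeline, here is a Python sketch of a sensory layer that runs an FFT over audio frames and emits onset/offset pulses whenever a frequency band crosses an amplitude threshold. The frame size, thresholds and units are all arbitrary choices of mine:

```python
# Illustrative sensory layer: FFT plus amplitude threshold detectors that
# emit discrete onset/offset pulses. All parameters are arbitrary.
import numpy as np

def sensory_pulses(waveform, rate=16000, frame=256, thresholds=(1.0, 5.0, 20.0)):
    """Yield (time, band, level, 'on'|'off') pulses for amplitude crossings."""
    hop = frame // 2
    window = np.hanning(frame)
    prev = None
    for i in range(0, len(waveform) - frame, hop):
        spectrum = np.abs(np.fft.rfft(waveform[i:i + frame] * window))
        state = spectrum[:, None] >= np.asarray(thresholds)   # bands x levels
        if prev is not None:
            t = i / rate
            for band, level in zip(*np.where(state & ~prev)):
                yield (t, int(band), int(level), "on")    # amplitude rose past level
            for band, level in zip(*np.where(~state & prev)):
                yield (t, int(band), int(level), "off")   # amplitude fell below level
        prev = state
```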

Winner-Takes-All

Most speech recognition systems use a Bayesian probabilistic model, such as the hidden Markov model, to determine which phoneme or word is most likely to come next in a given speech segment. A special algorithm is used to compile a large database of such probabilities. During recognition, hypotheses generated for a given sound segment are tested against these precompiled expectations and the one with the highest probability is selected.
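For contrast, here is a toy version of the conventional approach: a miniature Viterbi decoder that picks the most probable state sequence from precompiled probability tables. In a real system, the states would be phonemes and all the numbers would come from a large training corpus; this only shows the flavor of the computation:

```python
# Toy Viterbi decoder illustrating the conventional HMM pipeline described
# above (not Rebel Speech). All probability tables would be precompiled.

def viterbi(observations, states, start_p, trans_p, emit_p):
    # best maps each state to (probability, best path ending in that state)
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        best = {
            s: max(((p * trans_p[prev][s] * emit_p[s][obs], path + [s])
                    for prev, (p, path) in best.items()), key=lambda t: t[0])
            for s in states
        }
    return max(best.values(), key=lambda t: t[0])[1]   # most probable path
```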

In Rebel Speech, by contrast, the probability that the interpretation of a sound is correct is not known in advance. During learning, the engine creates a hierarchical database of as many non-random sequences of patterns as possible. Sequences compete for activation. When certain sound segments are detected, they attempt to activate various pre-learned sequences in memory and the one with the highest hit count is the winner. A winner usually pops up before the speaker has finished speaking. Once a winner is found, all other competing sequences are suppressed. This approach leads to high recognition accuracy even in noisy environments or when parts of the speech are missing.
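Here is a toy sketch of this competition, written by me to illustrate the principle: learned sequences accumulate hits as patterns arrive, and a winner is declared as soon as one sequence takes a clear lead. The data and the lead margin are invented:

```python
# Toy winner-takes-all competition among learned sequences. My own
# illustration of the principle; not code from the actual engine.

def recognize(pattern_stream, sequences, lead=3):
    """sequences: dict mapping a name to its expected list of pattern ids."""
    hits = {name: 0 for name in sequences}
    for p in pattern_stream:
        for name, seq in sequences.items():
            if hits[name] < len(seq) and seq[hits[name]] == p:
                hits[name] += 1                      # this candidate advances
        ranked = sorted(hits.values(), reverse=True)
        if len(ranked) > 1 and ranked[0] - ranked[1] >= lead:
            break              # a clear winner pops up before the speech ends
    return max(hits, key=hits.get)                   # all others are suppressed

seqs = {"seven": [3, 9, 4, 1], "eleven": [5, 9, 4, 7, 1]}
print(recognize(iter([3, 9, 4, 1]), seqs))           # -> 'seven'
```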

Stay tuned.

Tuesday, February 25, 2014

Artificial Intelligence and the Bible: Sensory Learning in Smyrna, Part II

Part I, II

Abstract

In Part I, I wrote that, according to my interpretation of the message to the church of Smyrna in the Book of Revelation, the brain uses two types of sensors: rich and poor. I explained that it takes an onset sensor and an offset sensor to properly represent a single sensory phenomenon or stimulus at a given amplitude. In today's post, I interpret verses 10 and 11 of the message to Smyrna, which describe how sensory learning works. But first, a word about the importance of sensory timing.

The Timing of Sensory Signals in the Brain

Why is it so important that there be two complementary sensors for a stimulus? The reason is that perception is primarily concerned with the evolution of events, i.e., with how things change relative to one another. Phenomena come and go at precise times.

Certain changes happen concurrently and these are called patterns. Patterns succeed each other to form precisely timed sequences. For example, sensors A and B in the diagram from Part I will sometimes fire concurrently with other sensors. Knowing when this happens is valuable information. Sensory learning consists of capturing the temporal correlations in the sensory space, and this allows the brain to gain an understanding of how changes occur in the environment. With a good temporal model of the world, an intelligent system can form predictions, plan future actions, adapt to changes and achieve various goals. This is what intelligence is about.

Message to the Church of Smyrna

Revelation 2:8-11
8 “And to the angel of the church in Smyrna write, ‘These things says the First and the Last, who was dead, and came to life:
9 “I know your works, tribulation, and poverty (but you are rich); and I know the blasphemy of those who say they are Jews and are not, but are a synagogue of Satan.
10 Do not fear any of those things which you are about to suffer. Indeed, the devil is about to throw some of you into prison, that you may be tested, and you will have tribulation ten days. Be faithful until death, and I will give you the crown of life.
11 “He who has an ear, let him hear what the Spirit says to the churches. He who overcomes shall not be hurt by the second death.”’

Commentary (continued)

10 Do not fear any of those things which you are about to suffer. Indeed, the devil is about to throw some of you into prison, that you may be tested, and you will have tribulation ten days. Be faithful until death, and I will give you the crown of life.

Sensory learning is a relentless and uncompromising trial. The phrase "some of you" means that new connections for patterns are chosen randomly and then subjected to a brutal trial period during which they are tested 10 times. As I will explain in a future article, a day symbolizes a single neuronal or firing cycle, which is the duration of a single pulse (about 10 milliseconds in the brain). In other words, there are 10 tests and each one lasts a single firing cycle. From this, we can logically deduce that every connection is tested for concurrency with other connections. Why? Because concurrency is the only thing that can be tested during the time of a single pulse.

One of the important things to note here is that connections either survive or they don't. The ones that fail are put to death, that is, they are disconnected. The "crown of life" means that disconnected synapses are reborn and tried again elsewhere. The "prison" metaphor symbolizes the fact that connections are not allowed to "earn a living" during the time of their trial. In other words, the connections cannot contribute to their churches (or patterns) until they pass all 10 tests and are released from prison.
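Under this interpretation, the trial is easy to express in code. The sketch below is my own rendering of it, with invented names: a randomly chosen candidate connection must fire together with its pattern on 10 consecutive tests, or it is disconnected and tried again elsewhere:

```python
# My own sketch of the interpreted trial: 10 concurrency tests, one per
# firing cycle; survive all 10 or be disconnected and retried elsewhere.
import random

TRIALS = 10   # "ten days" = ten firing cycles, per the interpretation above

def trial(candidate, pattern, history):
    """history: a list of sets of inputs that fired on each firing cycle."""
    passed = 0
    for active in history:
        if pattern & active:                 # the pattern's inputs fired...
            if candidate not in active:      # ...but the candidate did not
                return False                 # failure: put to death (disconnect)
            passed += 1
            if passed == TRIALS:
                return True                  # survived all 10 tests: connected
    return False

def find_home(candidate, patterns, history):
    random.shuffle(patterns)                 # "some of you": chosen randomly
    for pattern in patterns:                 # failed candidates are reborn
        if trial(candidate, set(pattern), history):
            return pattern                   # the "crown of life"
    return None
```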

It goes without saying that the Biblical model sharply contradicts current neural network models that use fixed, pre-wired connections. Also, the Biblical model strongly suggests that synaptic learning is an either-or process: either a connection is made or it isn't. There are no in-betweens, i.e., there is no need for a range of connection weights to encode knowledge. This is why I maintain that deep learning will go the way of symbolic AI and that the high-tech industry is building a billion dollar AI castle in the air.

Finally, we must ask, why 10 test cycles? Why not 5 or 20? To answer this question, we must understand what exactly is being learned. The brain is looking for all possible patterns that occur often enough to be considered non-random. It does not care about their actual probabilities of occurrence because it uses a winner-takes-all mechanism whereby patterns and sequences in memory compete for activation: the ones with the greatest number of hits are the winners. A compromise must be reached between conducting too many tests, which would retard learning and miss low probability patterns, and not conducting enough tests, which would result in learning useless patterns. We can surmise that 10 is just an optimum number. On a side note, this would be a fairly easy hypothesis to falsify. The finding that sensory learning in the brain is based on a mechanism that counts to 10 would go a long way to corroborate this theory.

Note: I am still working on the Rebel Speech demo program and I hope to release it soon. I will also release the source code for the recognizer but not the learner. Rebel Speech incorporates all the principles I have described in this series on AI and the Bible.

11 “He who has an ear, let him hear what the Spirit says to the churches. He who overcomes shall not be hurt by the second death.”’

That first sentence in verse 11 is repeated in every message to the seven churches. It is a sign that the messages do not mean what they appear to mean on the surface. What is the meaning of the "second death" metaphor? I am not 100% sure at this point but it seems to mean that, once a connection is established through testing, it becomes permanent. In other words, unlike sequences which can be forgotten, patterns are retained forever. Note that I am still investigating this metaphor because it is mentioned elsewhere in the book of Revelation.

See Also:

The Billion Dollar AI Castle in the Air
Secrets of the Holy Grail
Artificial Intelligence and the Bible: Message to Sardis
Artificial Intelligence and the Bible: Joshua the High Priest
Artificial Intelligence and the Bible: The Golden Lampstand and the Two Olive Trees

Saturday, February 22, 2014

Artificial Intelligence and the Bible: Sensory Learning in Smyrna, Part I

Part I, II

Abstract

Previously in this series, I wrote that I get my understanding of intelligence and the brain (see Secrets of the Holy Grail) from metaphorical Biblical texts that are thousands of years old. (Yeah, yeah, yeah, I know I am a crank and a lunatic; what else is new?) The message to the church of Smyrna in the book of Revelation is particularly interesting because it describes sensory learning, the most important aspect of perception. In this article, I interpret the metaphors in the message and I argue that experts in deep learning neural networks (the current rage in artificial intelligence research) are lost in the wilderness because they got sensory processing all wrong.

A Note About Sensory Signals in the Brain

The brain uses two types of sensors and each type serves a completely different purpose. The book of Revelation uses two metaphors to describe them: the poor and the rich. "Poor" sensors are used by the sensory cortex for unsupervised perceptual learning and pattern recognition whereas "rich" sensors are used by the cerebellum for fully supervised sensorimotor learning. I will get back to the cerebellum in a future article.

A poor sensor fires either at the onset or offset of a phenomenon or stimulus. By contrast, a rich sensor fires repeatedly during the entire duration of the phenomenon. This is illustrated in the diagram below. The curved line represents the varying intensity of a sensed phenomenon, say, the changing amplitude of an audio frequency signal over time. The brain uses multiple discrete sensors to detect different signal amplitudes. For simplicity's sake, the diagram is concerned only with the detection of a single amplitude shown as a horizontal line.
[Diagram: a varying amplitude curve with onset and offset pulses in red and repeated rich-sensor pulses in between]

It takes two poor sensors (A and B) to sense a phenomenon at a given amplitude, one to detect the onset and another to detect the offset of the phenomenon. By contrast, a single rich sensor associated with the same phenomenon at the same amplitude fires repeatedly while the phenomenon lasts. The short vertical lines in the diagram represent the firing pulses of a rich sensor. The two red vertical lines at the beginning and end of the series are the pulses emitted by the onset and offset sensors. As seen below, the message to Smyrna is concerned only with poor sensors, i.e., with the first and the last pulses.
Note: It goes without saying that the sensory cortex responds only to changes in the environment. If you are an AI expert and your machine learning program does not use onset and offset sensors as described above, you are doing it wrong. This is especially important in visual and auditory processing. Visual processing requires frequent jerky motions (microsaccades) of the eye in order to effect changes that the sensors in the retina can respond to. Those of you who are convinced that deep learning and convolutional neural networks are God's gifts to humanity have a surprise coming.
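To make the distinction concrete, here is a small Python sketch of the two sensor types, written by me for illustration: a rich sensor fires on every cycle the amplitude is at or above its threshold, while the two poor sensors fire only at the onset and the offset:

```python
# Illustrative only: "rich" vs. "poor" (onset/offset) sensors for one
# amplitude threshold, sampled once per firing cycle.

def rich_pulses(samples, threshold):
    """Fires on every cycle the amplitude is at or above the threshold."""
    return [t for t, a in enumerate(samples) if a >= threshold]

def poor_pulses(samples, threshold):
    """Sensor A fires at onset, sensor B at offset: the first and the last."""
    onsets, offsets, above = [], [], False
    for t, a in enumerate(samples):
        if a >= threshold and not above:
            onsets.append(t)                 # sensor A: the onset pulse
        elif a < threshold and above:
            offsets.append(t)                # sensor B: the offset pulse
        above = a >= threshold
    return onsets, offsets

amplitude = [0.0, 0.2, 0.6, 0.8, 0.7, 0.3, 0.1]
print(rich_pulses(amplitude, 0.5))           # [2, 3, 4]: fires while it lasts
print(poor_pulses(amplitude, 0.5))           # ([2], [5]): first and last only
```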

Message to the Church of Smyrna

Revelation 2:8-11
8 “And to the angel of the church in Smyrna write, ‘These things says the First and the Last, who was dead, and came to life:
9 “I know your works, tribulation, and poverty (but you are rich); and I know the blasphemy of those who say they are Jews and are not, but are a synagogue of Satan.
10 Do not fear any of those things which you are about to suffer. Indeed, the devil is about to throw some of you into prison, that you may be tested, and you will have tribulation ten days. Be faithful until death, and I will give you the crown of life.
11 “He who has an ear, let him hear what the Spirit says to the churches. He who overcomes shall not be hurt by the second death.”’

Commentary

The message to Smyrna is the shortest of all the messages to the seven churches of Asia in the book of Revelation, but don't let that fool you. It manages to pack an amazing amount of crucial information about sensory signals and sensory learning in just a few short sentences.

8 “And to the angel of the church in Smyrna write, ‘These things says the First and the Last, who was dead, and came to life:

"The First and the Last", of course, symbolizes the onset and offset sensory pulses explained above. As we shall see in the interpretation of verse 10, the phrase "who was dead, and came to life" alludes to the fact that, during pattern learning, sensory connections almost always die (are disconnected) and are then resurrected (reconnected somewhere else).

9 “I know your works, tribulation, and poverty (but you are rich);

The church of Smyrna goes through tribulation. This symbolizes that every sensory connection must go through a testing period. Even though the church is poor, it becomes rich through hard work and by overcoming tribulation.

9 [...] and I know the blasphemy of those who say they are Jews and are not, but are a synagogue of Satan.

This is both humorous and powerful. As I will explain in a future article, the false Jews, or the "synagogue of Satan", represent the church of Laodicea, which I interpret to symbolize the cerebellum, a supervised sensorimotor mechanism used for routine or automated tasks. The cerebellum receives sensory signals only from rich sensors.

Coming Up

In Part II, I will interpret verses 10 and 11 of the message to Smyrna, which describe the heart of sensory and pattern learning.

See Also:

The Billion Dollar AI Castle in the Air
Secrets of the Holy Grail
Artificial Intelligence and the Bible: Message to Sardis
Artificial Intelligence and the Bible: Joshua the High Priest
Artificial Intelligence and the Bible: The Golden Lampstand and the Two Olive Trees

Saturday, February 15, 2014

The Billion Dollar AI Castle in the Air

Abstract

High tech companies (e.g., Microsoft, Google, Facebook, Netflix, Intel, Baidu and Amazon) are pouring billions of dollars into a branch of artificial intelligence called machine learning. The two main areas of interest are deep learning and the Bayesian brain. The goal of the industry is to use these technologies to emulate the capabilities of the human brain. Below, I argue that, in spite of their initial successes, current approaches to machine learning will fail, primarily because this is not the way the brain works.

This Is Not the Way the Brain Works

Some in the business have argued that the goal of machine learning is not to copy biological brains but to achieve useful intelligence by whatever means. To this I say, phooey. Symbolic AI, or GOFAI, failed precisely because it ignored neuroscience and psychology. The irony is that the most impressive results in machine learning occurred when researchers began to design artificial neural networks (ANNs) that were somewhat inspired by the architecture of the brain. Deep learning neural networks, especially convolutional neural nets, are attempts at copying the brain's cortical architecture and the early results are impressive, relatively speaking. But this is unfortunate because researchers are now under the false impression that they have struck the mother lode, so to speak. Below, I list some of the reasons why, in my opinion, they are not even close.
  • Deep learning nets encode knowledge by adjusting connection strengths. There is no evidence that this is the way the brain does it.
  • Deep learning nets use a fixed pre-wired architecture. The evidence is that the cortex starts out with a huge number of connections, the majority of which disappear as the brain learns.
  • Convolutional neural nets are hard wired for translational invariance. The evidence is that the brain uses a single mechanism for universal invariance.
  • Unlike the visual cortex, convolutional neural nets do not depend on saccades or microsaccades. This tells us that the brain uses a different method to process visual signals.
  • Deep learning nets use a single hierarchy for pattern learning and recognition. The evidence is that the brain's perceptual system uses two hierarchies, one for patterns and one for sequences of patterns.
  • The Bayesian brain hypothesis assumes that the brain uses probabilities for prediction and reasoning. The evidence is that the brain is not a probability thinker but a cause-effect thinker.
  • Proponents of the Bayesian brain assume that events in the world are inherently uncertain and that the job of an intelligent system is to compute the probabilities. The evidence is that events in the world are perfectly consistent and that the job of an intelligent system is to discover this perfection.

A Castle in the Air

It feels like I am preaching in the wilderness, but someone has to do it. Of course, wherever there is a lot of money changing hands, self-preservation and politics are sure to be lurking right under the surface. My arguments will be dismissed by those who stand to profit from it all, and I will be painted as a crackpot and a lunatic (I don't deny that I'm insane), but I don't really care. My message is simple. There is no doubt that the industry is building an expensive castle in the air. Sure, they will have a few so-so successes here and there that will be heralded as proof that they know what they are doing. Google's much ballyhooed cat-recognizing neural network comes to mind. But sooner or later, out of nowhere, and when they least expect it, someone else will come out with the real McCoy and the castle will come crashing down. The writing is on the wall.

See Also:

The Myth of the Bayesian Brain
The Second Great AI Red Herring Chase
Why Deep Learning Will Go the Way of Symbolic AI
Why Convolutional Neural Networks Miss the Mark
Secrets of the Holy Grail


Wednesday, February 12, 2014

Why Convolutional Neural Networks Miss the Mark

Abstract

Convolutional neural networks (CNNs) are a type of deep learning neural network that has been successfully applied to visual recognition. They owe their success to being faster to train (probably because of their sparse connectivity) and to being invariant to certain spatial transformations, such as translation. In this article, I argue that CNNs miss the mark because they have a rather limited form of invariance, whereas the brain is universally invariant.

Universal Versus Translation Invariance

If you hold your hand in front of your face and rotate it, move it side to side, up and down, shine a blue or red light on it, make a fist, a thumb up or peace sign, etc., at no point during the transformations will there be any doubt in your mind that you are looking at your hand. This is in spite of the fact that, during the transformations, your visual cortex is presented with literally hundreds of very different images. This is an example of universal invariance, something that the brain accomplishes with ease. CNNs can handle only a subset of these transformations because, as seen in the diagram below, they are hardwired for translation invariance.

With some modifications, it should even be possible to get a CNN to tolerate rotations. But CNNs suffer from an even bigger problem. They may be invariant to translations, but they have no way of telling whether all the successive images represent the same hand. They can only recognize each image as a hand and that's about it. This lack of continuity makes them ill-suited to future robot intelligence.
[Diagram: spatial pooling in a convolutional network. Source: deeplearning.net]
CNNs are invariant to translations thanks to a technique known as spatial pooling. Essentially, neighboring units in a given layer are pooled together to activate a unit in the layer immediately above. The pooling operation can be a sum, an average or a maximum. The end result is that the activation of a top-layer unit is invariant to the position of a stimulus at the bottom layer.
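A few lines of NumPy are enough to show the mechanism and the resulting invariance. This is a generic max-pooling illustration of mine, not taken from any particular CNN library:

```python
# Generic 2x2 max pooling: shifting a stimulus within a pooling window
# leaves the pooled output unchanged (translation invariance).
import numpy as np

def max_pool(layer, size=2):
    h, w = layer.shape
    return layer.reshape(h // size, size, w // size, size).max(axis=(1, 3))

a = np.zeros((4, 4)); a[0, 0] = 1.0          # stimulus at one position
b = np.zeros((4, 4)); b[1, 1] = 1.0          # same stimulus, shifted
print(np.array_equal(max_pool(a), max_pool(b)))   # True: outputs identical
```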

Biological Implausibility

It is highly unlikely that the visual cortex uses pre-wired spatial pooling to obtain translation invariance. Why? First off, if the brain used a different type of invariant architecture for every type of transformation, the cortex would be a wiring mess. Second, one would expect the auditory cortex to have a different architecture for invariance than the visual cortex, but this is not observed. The global uniformity of the cortex is one of its most striking features. A ferret whose optic nerves were rerouted to its auditory cortex in the embryonic stage was able to use its auditory cortex to learn to see and navigate fairly normally.

How Does the Brain Do It?

It should be fairly obvious that the brain uses a single method to achieve universal invariance. The most likely hypothesis is that the brain has two memory hierarchies, one for concurrent patterns and one for sequences of patterns. Learning in the brain is 100% unsupervised. The sequence hierarchy is a powerful memory structure that serves multiple functions. It is a common storage mechanism for attention, prediction, planning, adaptation, short and long-term memory, analogies, and last but not least, temporal pooling. Every invariant object is represented by a single branch in the hierarchy. I hypothesize that temporal pooling is the way the cortex achieves universal invariance. To emulate the brain's universal invariance, one must first design a good pattern learner/recognizer that feeds its output signals to a sequence learning mechanism. The latter must be able to automatically stitch patterns and related sequences together to form invariant object representations. I will have more to say about pattern and sequence learning in future articles.
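To show what I mean by stitching, here is a deliberately oversimplified sketch of temporal pooling. Patterns that follow each other in time are assigned to the same branch, so every view encountered during a continuous transformation ends up under one invariant representation. The code and names are mine, purely illustrative:

```python
# Oversimplified temporal pooling: patterns that succeed each other in time
# are stitched into one branch (one invariant object). Illustrative only.

class SequenceMemory:
    def __init__(self):
        self.branch_of = {}                  # pattern id -> branch id
        self.next_branch = 0

    def observe(self, pattern, prev=None):
        if prev is not None and prev in self.branch_of:
            self.branch_of[pattern] = self.branch_of[prev]   # stitch to branch
        elif pattern not in self.branch_of:
            self.branch_of[pattern] = self.next_branch       # start a new branch
            self.next_branch += 1
        return self.branch_of[pattern]

mem = SequenceMemory()
prev = None
for view in ["palm", "fist", "thumbs_up"]:   # one continuous transformation
    mem.observe(view, prev)
    prev = view
print(mem.observe("fist"))                   # 0: same branch, invariant object
```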

See Also

The Billion Dollar AI Castle in the Air
Why Deep Learning Will Go the Way of Symbolic AI

Sunday, February 9, 2014

Why Deep Learning Will Go the Way of Symbolic AI

Abstract

Deep learning is a machine learning technique for pattern representation and recognition based on multi-layered statistical neural networks. Deep learning is all the rage lately. Big corporations like Google, Facebook and others are spending billions to set up labs and acquire experts and companies with experience in the technology. In this article, I argue that the current approach to deep learning will not lead to human-like intelligence because this is not the way the brain does it.
Related:
Why Convolutional Neural Networks Miss the Mark
The Billion Dollar AI Castle in the Air

Hierarchical Representation

There is no question that the brain classifies knowledge using a hierarchical architecture. The representation of objects in memory is compositional. That is to say, higher level representations are built on top of lower level ones. For example, low level visual representations might consist of edges and lines. These can be combined to form higher level objects such as a nose or an eye. So the one thing deep learning neural networks have going for them is that they use multiple layers to form a hierarchical structure of representations.

Weighted Connections

A deep learning network consists of multiple layers of neurons. Each layer is a restricted Boltzmann machine or RBM.
[Diagram: a restricted Boltzmann machine]
The visible units of an RBM receive data from input sensors and the hidden units are the outputs of the machine. In a deep learning network, the hidden units of one RBM serve as the visible units of the RBM residing immediately above it in the hierarchy. Each neuron (or hidden unit) in an RBM has a number of inputs represented by connections. Each connection is weighted, that is, it has a strength that is tuned by a learning algorithm during training on a set of examples. Loosely speaking, a connection strength represents the belief (or degree of certainty) that a particular input activation contributes to the activation of a hidden unit. A hidden unit's activation is computed as a nonlinear function of its weighted inputs.
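A bare-bones numerical sketch of a single RBM layer may help. The sizes and the random data are arbitrary, and training (e.g., contrastive divergence) is omitted:

```python
# Bare-bones RBM layer: every visible unit connects to every hidden unit
# through a weight; hidden activations are a sigmoid of the weighted inputs.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 6, 3
W = rng.normal(0.0, 0.1, (n_visible, n_hidden))  # weighted connections
b_hidden = np.zeros(n_hidden)

v = rng.integers(0, 2, n_visible)                # one binary visible vector
p_h = sigmoid(v @ W + b_hidden)                  # degrees of certainty
h = (rng.random(n_hidden) < p_h).astype(int)     # stochastic hidden sample
print(p_h, h)
```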

Biologically Implausible

There are a number of problems with deep learning networks that make them unsuitable to the goal of emulating the brain. I list them below.
  1. A deep learning network encodes knowledge by adjusting the strengths of the connections between visible and hidden units. There is no evidence that the brain uses variable synaptic strengths to encode degrees of certainty during sensory learning.
  2. Every visible unit is connected to every hidden unit in an RBM. There is no evidence that sensors make connections with every downstream neuron in the brain's cortex. In fact, as the brain learns, the number of connections (synapses) between sensors and the cortex is drastically reduced. The same is true for intracortical connections.
  3. Deep learning networks must be fine-tuned using supervised learning or backpropagation. There is no evidence that sensory learning in the brain is supervised.
  4. Deep learning networks are ill-suited for invariant pattern recognition, something that the brain does with ease.
  5. Deep learning networks use learning algorithms based on complex mathematical functions that require fast processors. There is no evidence that cortical neurons solve complex functions.
  6. Deep learning networks use static examples whereas the brain is bombarded with a constantly changing stream of sensory signals. Timing is essential to learning in the brain.

Winner Takes All

Current approaches to deep learning assume that the brain learns visual representations by computing input statistics. As a result, one would expect a gradation in the way patterns are recognized, especially in ambiguous images. However, psychological experiments with optical illusions suggest otherwise.
[Image: an ambiguous picture containing a hidden cow]

When looking at the picture above, two things can happen. Either you see a cow or you don't. There is no in-between. Some people never see the cow. Furthermore, if you do see the cow, the recognition seems to happen instantly.

It seems much more likely that the cortex uses a winner-takes-all strategy whereby all possible patterns and sequences are learned regardless of probability. The only criterion is that they must occur often enough to be considered above mere random noise. During recognition, the patterns and sequences compete for activation and the ones with the highest number of hits are the winners. This kind of pattern learning is simple (no fancy math is needed), fast and requires no supervision.
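The sketch below, my own and purely illustrative, captures both halves of this idea: learning keeps every pattern that recurs often enough to beat random noise, without storing probabilities, and recognition simply hands victory to the stored pattern with the most hits:

```python
# Illustrative winner-takes-all pattern store: keep whatever recurs often
# enough to beat noise; recognize by raw hit count. Threshold is arbitrary.
from collections import Counter

def learn_patterns(observations, noise_threshold=10):
    """observations: iterable of groups of concurrently active input ids."""
    counts = Counter(frozenset(group) for group in observations)
    return {p for p, n in counts.items() if n >= noise_threshold}

def recognize(active_inputs, learned):
    """The stored pattern with the greatest number of hits is the winner."""
    return max(learned, key=lambda p: len(p & active_inputs), default=None)
```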

See Secrets of the Holy Grail, Part II for more on this alternative approach to pattern learning.

Conclusion

In view of the above, I conclude that, in spite of its initial success, deep learning is just a red herring on the road to true AI. It is not true that the brain maintains internal probabilistic models of the world. After all is said and done, deep learning will be just a footnote in the annals of AI history. The same can be said about the Bayesian brain hypothesis, by the way.

See Also

The Myth of the Bayesian Brain
Why Convolutional Neural Networks Miss the Mark