I started yet another artificial intelligence project, Rebel Speech. Actually, it is part of the Rebel Cortex project, which now consists of Rebel Vision and Rebel Speech. Both subprojects will use the same sensory cortex for learning and recognition. Programming-wise, speech recognition is less complex than visual recognition because the sensory mechanism is easier to implement. It's mostly a matter of using a Fast Fourier Transform (FFT) to convert a time-domain audio signal from a microphone into a frequency-domain signal. In addition, only a fraction of the detected frequency spectrum is required for good performance. I envision that someone will one day design a microphone that works like the human ear, i.e., one that uses many microscopic hair-like sensors that respond directly to various frequencies. In the meantime, a regular microphone and an FFT will do.
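To make the FFT step concrete, here is a minimal sketch in plain Python rather than the C# used in the project. It is not the Rebel Speech code: the function names, the 8000 Hz sample rate, the 256-sample frame and the 300 to 3400 Hz band are all assumptions chosen for illustration. It transforms one frame of samples and keeps only the bins that fall inside the chosen band.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def band_magnitudes(samples, sample_rate, low_hz, high_hz):
    """Return {frequency_hz: magnitude} for the bins inside [low_hz, high_hz]."""
    n = len(samples)
    spectrum = fft(samples)
    bin_width = sample_rate / n
    result = {}
    # For real input the upper half of the spectrum mirrors the lower half,
    # so only the first n/2 bins are inspected.
    for k in range(n // 2):
        freq = k * bin_width
        if low_hz <= freq <= high_hz:
            # Scale so that a unit-amplitude sine gives a magnitude near 1.
            result[freq] = abs(spectrum[k]) / (n / 2)
    return result

# Hypothetical test frame: a 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

# Keep only an assumed speech band instead of the whole spectrum.
bands = band_magnitudes(FRAME, SAMPLE_RATE, 300, 3400)
loudest = max(bands, key=bands.get)
```

In a real capture loop the frame would come from the microphone buffer instead of a synthetic tone, and a production FFT library would replace the recursive function above.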
I've been writing some Windows C# code for Rebel Speech in my spare time over the last few days. I have already implemented the microphone capture and FFT code. Well, it's not all that hard, considering that there is a lot of good, free FFT code on the net and Microsoft provides a handy Microphone class in its XNA framework. I am now working on designing the audio sensors and the sensory layer. It's a little complicated, not just because I need to design both signal onset and offset sensors but also because dealing with stimulus amplitude is counterintuitive. In the brain, all signals are carried by pulses that have pretty much equal amplitude. One would think that changes in the intensity of a stimulus should be converted into frequency modulation, but that is not the way it works either. The brain uses a technique that I call population modulation to encode amplitude. In other words, there are many sensors that handle a single phenomenon. The number of sensors that fire in response to a sensory stimulus is a function of the intensity of the stimulus.
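Population modulation can be sketched in a few lines. The following is only an illustration under my own assumptions, not the actual sensor design: the names are hypothetical, the mapping from amplitude to cell count is assumed to be linear, and each cell is assumed to emit one onset event when it becomes active and one offset event when it goes quiet. Note that the events themselves carry no amplitude; intensity shows up only in how many cells are involved.

```python
def active_count(amplitude, num_cells, max_amplitude=1.0):
    """Number of cells that fire for a given amplitude (linear mapping, an assumption)."""
    level = max(0.0, min(amplitude, max_amplitude)) / max_amplitude
    return int(level * num_cells + 0.5)  # round half up

def population_events(amplitudes, num_cells=8):
    """Return (frame, cell, kind) events for one frequency band over time."""
    events = []
    previous = 0  # how many cells were active in the previous frame
    for frame, amplitude in enumerate(amplitudes):
        current = active_count(amplitude, num_cells)
        # Amplitude rose: the newly recruited cells signal an onset.
        for cell in range(previous, current):
            events.append((frame, cell, "onset"))
        # Amplitude fell: the cells that dropped out signal an offset.
        for cell in range(previous - 1, current - 1, -1):
            events.append((frame, cell, "offset"))
        previous = current
    return events

# Hypothetical amplitude track for a single band: silence, medium, loud, soft.
events = population_events([0.0, 0.5, 1.0, 0.25], num_cells=8)
```

With eight cells, the jump from silence to half amplitude recruits cells 0 through 3, the jump to full amplitude recruits cells 4 through 7, and the drop to quarter amplitude releases cells 7 down to 2.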
In the brain, this sort of sparse activation is accomplished with the use of inhibitory connections between the cells in a group. Luckily, in a computer brain simulation, all we need is a list of cells. Stay tuned.
Invariant Visual Recognition, Patterns and Sequences
Rebel Cortex: Temporal Learning in the Tree of Knowledge