Wednesday, April 29, 2015

No, a Deep Learning Machine Did Not Solve the Cocktail Party Problem

Irresponsible Hype from MIT Technology Review

MIT Technology Review is running a story claiming that a group of machine learning researchers used a convolutional deep learning neural network to solve the cocktail party problem. Don't you believe it. The network that was used has to be pre-trained separately on individual vocals and musical instruments in order to separate out the vocals from the background music. In other words, it can only separate voice from music.

The human brain needs no such training. We can instantly latch on to any voice or sound, even one that we have never heard before, while ignoring all others. We have no trouble focusing on a strange voice speaking a foreign language in a room full of talking people, with or without music playing. This is what the true cocktail party problem is about. A deep learning network cannot pay attention to an arbitrary voice while ignoring the others. To do this, it would have to be pre-trained on all the voices individually.

Note: I posted a protest comment at the end of the article but MIT Tech Review editors chose to censor it. I guess it is easier to attract visitors with a lie than the truth.

It Is Not about Speech

Contrary to rumors, the cocktail party problem has nothing specifically to do with speech or sounds. To focus on individual sounds, the brain uses the same mechanism that it normally uses to pay attention to anything, be it a bird, the letters and words on the computer screen or grandma's voice. The attention mechanism of the brain is universal and is an inherent part of the architecture of memory and how objects are represented in it. Unlike deep learning neural networks, it does not have to be trained separately for every sound or object. The ability of the cortex to instantly model a novel visual or auditory object is a major part of the brain's attention mechanism.

It is clear that the auditory cortex can quickly model a new sound on the fly and tune its attention mechanism to it. No deep learning network can do that. And knowing what I know about how the brain's attention mechanism works, I can confidently say that no deep learning network can ever do that.

See Also:

Did OSU Researchers Solve the Cocktail Party Problem?
In Spite of the Successes, Mainstream AI is Still Stuck in a Rut
Why Deep Learning Is a Hindrance to Progress Toward True AI

Wednesday, April 8, 2015

200 Million Horsemen and the Corpus Callosum

In my previous post, I claimed that the books of Revelation and Zechariah contain a detailed description of the brain, intelligence and consciousness. In this post, I just want to give interested readers a small taste of things to come. Here is a little gem from the book of Revelation that blew me away when I first understood it.

Corpus Callosum

In chapter 9 of the book of Revelation, we read the following:
Then the sixth angel sounded, and I heard a voice from the four horns of the golden altar which is before God, one saying to the sixth angel who had the trumpet, “Release the four angels who are bound at the great river Euphrates.” And the four angels, who had been prepared for the hour and day and month and year, were released, so that they would kill a third of mankind. The number of the armies of the horsemen was two hundred million; I heard the number of them.
It took me a while, but I finally figured out that "two hundred million horsemen" is just a metaphor for the neuronal signals riding on the corpus callosum, the bundle of nerve fibers that connects the two hemispheres of the brain. Surprisingly enough, a quick search on Google reveals that the number of axonal fibers in the corpus callosum is estimated to be about 200 million!

More to Come

The book of Revelation is a treasure trove of information about the brain. It gives precise metrics for a number of brain structures and processes. For example:
  • The "four angels" mentioned in the text above symbolize four distinct signal pathways or gateways within the corpus callosum.
  • The exact duration of human attention span is 12.6 seconds.
  • It takes the brain exactly 35 milliseconds to switch its focus from one subject to another.
This is just the tip of the iceberg, but there is a time for everything. Please be patient and stay tuned.

See Also:

Zechariah and Revelation: Bombshells in the Way