And so what is it about the way the human being understands dialogue that's different from the way we understand sound effects? Because clearly there is a difference there. And I think what the difference is, is a question of what I came to call encoded sound and embodied sound. Language is an encoded sound because language is a code. These are just syllables that we can make with our lips and mouth and throat; we have determined that "cat" is a sound that means an animal with a furry body and whiskers and pointy ears and a tail.
And in a different language, that's a different word. You have to know the language in order to break the code. So, when somebody is speaking to you, your brain is very busy. Even if you know the language very well, if you know English very well and somebody's speaking to you in English, your brain is nonetheless very busy behind the scenes cracking open this code and extracting the meaning out of it.
And under those circumstances, I think, my hypothesis is that the brain lets go of directionality in a cinematic situation. It says, 'I'm too busy to worry about that; I will believe it's coming from wherever my eyes say the person is.' So, in that case, seeing where the person is located steers our belief that the sound is actually coming from there. Because otherwise we're too busy... The brain is occupied decoding it.