Conventional headphones receive exactly the same input signals as loudspeakers. As a result, the sound the listener hears lacks the directional cues (described by the HRTF) that are normally present in a natural listening environment. Because the reproduced sound image stays in a "fixed position in the listener's head" and its directional quality is lost, the result does not convey a true-to-life image.
Human beings locate the source of a sound through a kind of triangulation based on the difference in the level of the sound reaching the left and right ears and the subtle difference in its arrival time. Binaural technology*5 is an application of this principle. It uses a human-shaped microphone called a dummy-head microphone, and is a recording/reproduction format that closely resembles how we naturally hear sound.
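The arrival-time cue described above can be illustrated with a short sketch. It uses Woodworth's spherical-head approximation, which is a textbook model and not something this article specifies; the head radius, the speed of sound, and the function name are all assumptions chosen for illustration. The level difference between the ears depends strongly on frequency and is not modeled here.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
HEAD_RADIUS_M = 0.0875      # assumption: a commonly quoted average head radius


def interaural_time_difference(azimuth_deg: float) -> float:
    """Estimate the interaural time difference (ITD) in seconds.

    Uses Woodworth's spherical-head formula for a distant source:
        ITD = (r / c) * (theta + sin(theta))
    Convention (assumed): 0 degrees is straight ahead, +90 is directly to
    the right, and a positive result means the right ear hears it first.
    The formula is only applied to the frontal half-plane (-90 to +90).
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and +90 degrees")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 15, 30, 60, 90):
        itd_us = interaural_time_difference(azimuth) * 1e6
        print(f"azimuth {azimuth:3d} deg -> ITD {itd_us:6.1f} microseconds")
```

Under these assumptions the delay grows from zero for a source straight ahead to roughly 0.66 ms for a source directly to one side, which gives a sense of how subtle the time difference is.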
The transformation characteristics of sound from the source to each ear are called the head-related transfer function (HRTF). These characteristics are essential for creating a sound image outside the listener's head, and give the sound its sense of direction and distance. Expressed in the time domain, they are described as the head-related impulse response (HRIR). The diagram below is an example of an HRIR showing the sound transformation characteristics from the source to a dummy-head microphone.
Sony developed Virtualphones Technology (VPT) by applying and improving upon the principle of binaural technology. As a result, Sony succeeded in creating a sound image outside the listener's head. With a small electronic device, it became possible to reproduce through headphones a sound field equivalent to that of a 5.1-channel speaker system.
In theory, this method can reproduce an unlimited number of sound channels. With this headphone system, a listener can enjoy movies or music as if always seated in the best position in a home theater or a music hall.
When a human-shaped microphone called a dummy-head microphone is used to record 2-channel stereo sound, the reproduced sound does not remain within the listener's head (the sound image is not stuck in a "fixed position in the head"), which gives an enhanced sense of live presence. This technique is called binaural recording/reproduction. Sony recognized the potential of this technology and has continued its research and development.
The benefit of this technology is, as stated above, that it achieves immersive, lifelike sound with directional quality through headphone reproduction. It has two disadvantages: 1) it is difficult to get a clear sense of the distance and direction of sound coming from the front; 2) signals recorded with this method tend to sound high-pitched when reproduced through loudspeakers, so separate sources are required for headphone reproduction and loudspeaker reproduction. It is mainly because of these weaknesses that the technology has failed to gain wide acceptance despite its excellent quality.
To solve the first problem, Sony conducted research on auditory physiology and acoustic theory, and developed proprietary technologies including head-tracking technology*6. This, in turn, made it possible to obtain a realistic localization of sound coming from the front.
The solution to the second problem was to use digital signal processing to add sound characteristics measured in test listening sessions in a listening room. In other words, Sony solved the problem by using a signal processing method called convolution to add the sound field information and HRTFs corresponding to the desired sound field. As a result, it became possible to recreate on an electronic circuit the various sound fields in which the dummy head was placed.
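The convolution step can be sketched in a few lines. This is a minimal illustration of the general technique, not Sony's implementation: the function names are hypothetical, and the impulse responses are invented toy values (three samples each) rather than measured HRIRs, which would normally contain hundreds of samples and include the room's reflections.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sample sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            # Each input sample launches a scaled, delayed copy of the response.
            out[i + j] += sample * tap
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one source signal with the HRIR of each ear.

    Returns a (left, right) pair of headphone signals.
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


if __name__ == "__main__":
    # Toy values (assumption): a source on the listener's right, so the
    # right ear receives the sound first and at full level, while the
    # left ear receives it two samples later and attenuated.
    hrir_right = [1.0, 0.0, 0.0]
    hrir_left = [0.0, 0.0, 0.5]
    source = [1.0, 2.0]

    left, right = render_binaural(source, hrir_left, hrir_right)
    print("left :", left)
    print("right:", right)
```

A multichannel source could be handled the same way by rendering each channel with the HRIR pair for its virtual speaker position and summing the results per ear. The nested loop is chosen for readability; real-time systems typically use faster FFT-based convolution.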
A long-standing challenge of headphone reproduction is that the sound field moves with the rotation of the listener's head. The VIP-1000 and MDR-DS8000 have gyro sensors attached to the headsets. They calculate the rotation angle of the head and, based on that angle, adjust the data through digital signal processing to keep the sound field fixed. In other words, the HRIR from the source to the ears is adjusted in real time as the head moves.
Therefore, even when the listener turns to the side while wearing the headphones, the sound field stays fixed, allowing him/her to enjoy close-to-live sound images with a realistic sound field. From the standpoint of acoustic physiology, this can be explained as the loop from sound stimulation to the motor center in the brain becoming closed. Switching the head-tracking switch of the headphones indicated in the above diagram on and off makes it clear that head movement greatly affects sound positioning.
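The head-tracking adjustment can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm used in the VIP-1000 or MDR-DS8000: it assumes the gyro reports yaw rate in degrees per second, that positive angles mean "to the right", and that HRIRs have been measured at a small set of azimuths. All function names are hypothetical.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap an angle into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def integrate_yaw(yaw_deg: float, yaw_rate_deg_s: float, dt_s: float) -> float:
    """Update the head angle from one gyro reading.

    A gyro measures angular velocity, so the angle is obtained by
    accumulating rate * time step.
    """
    return wrap_degrees(yaw_deg + yaw_rate_deg_s * dt_s)


def relative_azimuth(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Direction of a room-fixed source as seen from the rotated head."""
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)


def nearest_measured_azimuth(measured_deg, target_deg: float) -> float:
    """Pick the measured HRIR direction closest to the target direction."""
    return min(measured_deg, key=lambda a: abs(wrap_degrees(a - target_deg)))


if __name__ == "__main__":
    measured = [-90, -45, 0, 45, 90]  # assumption: azimuths with stored HRIRs
    yaw = 0.0
    # The listener turns right at 90 degrees per second for half a second,
    # sampled here in ten steps of 50 ms.
    for _ in range(10):
        yaw = integrate_yaw(yaw, 90.0, 0.05)
    # A virtual front speaker at 0 degrees must now be rendered to the left
    # of the head so that it stays fixed in the room.
    target = relative_azimuth(0.0, yaw)
    print("head yaw:", round(yaw, 1))
    print("render front speaker with HRIR measured at:",
          nearest_measured_azimuth(measured, target))
```

Selecting the nearest stored HRIR keeps the sketch short; a practical system would more likely interpolate between neighboring measurements to avoid audible jumps, and would need to correct for gyro drift, since small errors in the integrated rate accumulate over time.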
Human beings determine the distance and direction of a sound source through a kind of triangulation. However, when a person rotates his/her head, the brain gets confused unless the characteristics of the sound reaching the ears change in synchrony with the head movements. As a result, the sound image tends to become fixed within the listener's head or at its original position. This is one of the reasons why sound appears to originate inside the listener's head with conventional headphones. Although it is technically possible to keep the sound image outside the listener's head by adjusting the patterns of waves reflected from the walls, the sound image tends to become fixed inside the head when the person is only listening, without a visual image. Head-tracking technology is a solution to this problem. By constantly tracking head movements, it adjusts in real time the characteristics of sound transmission from the source to the ears, thus enabling continuous, accurate triangulation and resulting in a clear, fixed sound image in front of the listener. Piezoelectric vibration gyros were adopted for tracking.