There is a growing need for high-quality musical metadata (data characteristics) to support new ways of enjoying music, including advanced music search and recommendation. Conventional manual metadata assignment is costly and can lead to other problems, such as data inconsistency.
Sony has developed a unique 12 Tone Analysis system that automatically extracts a variety of metadata, including beat, chord progression, song structure, genre, instruments and mood, by using signal processing and statistical processing to analyze musical waveforms. This technology has been used in Sony's GIGA JUKE and Rolly, and also in VAIO software.

Figure 1: 12 Tone Analysis



| Feature | Description |
|---|---|
| Perceived speed | The speed of the music as perceived by the human ear. This differs from tempo: songs with an identical tempo can be perceived as faster or slower because of rhythm patterns and other factors. |
| Perceived energy | The energy of the music as perceived by the human ear. A quiet song will seem to have less energy, while a bright and lively song will seem more energetic. |
| Genre | Whether or not the song fits a particular genre, such as rock, jazz or classical. |
| Instrumental sound | Whether or not the music includes particular instruments, such as piano, bass or guitar. |
| Mood | Whether or not the song fits particular mood keywords, such as "bright" or "refined." |
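
To make the role of this metadata in music search more concrete, the following is a minimal sketch of how the features listed in the table might be stored per track and used as search filters. It is not Sony's implementation: the `TrackMetadata` record, the `search` function, the 0.0 to 1.0 scales for perceived speed and energy, and the sample tracks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TrackMetadata:
    """Hypothetical per-track record for the features in the table."""
    title: str
    perceived_speed: float   # assumed scale: 0.0 (slow) to 1.0 (fast)
    perceived_energy: float  # assumed scale: 0.0 (quiet) to 1.0 (lively)
    genres: set = field(default_factory=set)       # genres the track fits
    instruments: set = field(default_factory=set)  # instruments detected
    moods: set = field(default_factory=set)        # mood keywords that fit


def search(tracks, mood=None, instrument=None, min_energy=0.0):
    """Return titles of tracks that satisfy every criterion that was given."""
    results = []
    for track in tracks:
        if mood is not None and mood not in track.moods:
            continue
        if instrument is not None and instrument not in track.instruments:
            continue
        if track.perceived_energy < min_energy:
            continue
        results.append(track.title)
    return results


# Made-up sample data, standing in for automatically extracted metadata.
library = [
    TrackMetadata("Track A", 0.8, 0.9, {"rock"}, {"guitar", "bass"}, {"bright"}),
    TrackMetadata("Track B", 0.3, 0.2, {"jazz"}, {"piano", "bass"}, {"refined"}),
    TrackMetadata("Track C", 0.6, 0.7, {"jazz"}, {"piano"}, {"bright"}),
]

print(search(library, mood="bright", min_energy=0.5))  # ['Track A', 'Track C']
```

Because genre, instrumental sound and mood are described as "whether or not" judgments, the sketch models them as sets of labels, while perceived speed and energy are modeled as continuous values that can be thresholded.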