The freedom to extract audio gives you the freedom to create new music

Audio source separation

AI that understands the characteristics of music can extract and rearrange individual instruments from a recording in which many sounds are mixed together. This makes it possible, for example, to create multitrack data from old records. It gives artists the freedom to create completely new music, and gives listeners musical experiences the likes of which never existed before.

Yuki Mitsufuji Fundamental Technology Research and Development Div. 1, R&D Center Sony Corporation

AI creates new possibilities with sound

The human brain differentiates the sounds of individual instruments, even when multiple instruments are playing at once. For a long time machines could not do this, and the ability was unique to humans. However, AI has now made audio source separation possible. It is common in music production to record multiple tracks and mix them down to two channels, but AI has made it possible to perform the opposite operation: creating multiple tracks from audio on one or two channels. Incidentally, we have also used our audio source separation technology in aibo. aibo's built-in microphone directly picks up the noises aibo makes as it moves, so separating and reducing those noises makes it easier for aibo to hear human voices and react accordingly.

Using the variety of sound sources of the Sony Group in AI development

Around 2013, we began to see results from AI-driven speech recognition, and we started development with the confidence that AI could be used for music as well. Our pioneering application of AI in music allowed us to win three consecutive global competitions for audio source separation. The AI we used for audio source separation contained neural networks that understand sound, and was designed specifically for music. We designed the AI to simultaneously learn musical progressions, the characteristics of various instruments, and more. Data diversity is extremely important for enhancing the performance of this AI, and Sony has a major advantage in that respect: Sony Group comprises Sony Music Entertainment and other companies that have a vast amount of diverse audio recordings.

Audio source separation further increases the value of recorded audio

Even though it is called "audio source separation", the actual purpose is not to separate sounds; it is to remix sounds after they are separated. For example, old recordings that no longer have a master tape can be multitracked to enable processing into 5.1 ch and other formats. The technology can be used in many other ways as well. An obvious example is orchestral recordings, which are not multitracked; all the instruments are recorded together, simultaneously. Audio source separation technology separates and rearranges the audio of each instrument. With this technology, listeners can feel as though they are standing in the middle of the stage where the orchestra is performing. If they move closer to the flute player, the flute sounds louder to them. In addition, they can remove the vocals from their favorite tunes to enjoy authentic backing tracks for their karaoke performances. This will eventually support the business of musicians as well.
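
To make the remixing idea concrete, here is a minimal sketch in Python. It assumes the separation step has already produced one mono sample list per instrument (the "stems"). It does not show how the separation itself is done, and it is not Sony's implementation. The function names `remix` and `distance_gains`, the stem names, and the simple inverse-distance loudness rule are all assumptions chosen for illustration.

```python
import math


def remix(stems, gains):
    """Mix equal-length mono stems into one track.

    stems: dict mapping a stem name to a list of float samples.
    gains: dict mapping a stem name to a linear gain; missing names use 1.0.
    """
    length = len(next(iter(stems.values())))
    out = [0.0] * length
    for name, samples in stems.items():
        gain = gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            out[i] += gain * sample
    return out


def distance_gains(listener, positions, min_distance=1.0):
    """Return a gain per stem that grows as the listener gets closer.

    Assumption: gain = 1 / distance, clamped so that standing on top of
    an instrument does not produce an unbounded gain.
    """
    gains = {}
    for name, (x, y) in positions.items():
        distance = math.hypot(x - listener[0], y - listener[1])
        gains[name] = 1.0 / max(distance, min_distance)
    return gains


# Hypothetical separated stems (two samples each, for brevity).
stems = {
    "vocals": [1.0, 1.0],
    "flute": [0.5, -0.5],
}

# Karaoke: mute the vocals and keep everything else at unit gain.
karaoke = remix(stems, {"vocals": 0.0})

# Virtual stage: the listener stands next to the flute, far from the singer.
stage = {"vocals": (0.0, 10.0), "flute": (0.0, 0.5)}
walkaround = remix(stems, distance_gains((0.0, 0.0), stage))
```

In this sketch the karaoke case is just a remix with the vocal gain set to zero, and the "move closer to the flute" case is a remix whose gains are recomputed from the listener's position. A real spatial renderer would also handle panning, delay, and reverberation, which are omitted here.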

Using AI and blockchain technology to expand the possibilities of music

Sony AI enhances artists' abilities. It gives artists the ability to control sound as they desire. They can use AI tools to compose on the spot at live venues. They can resurrect the recordings of deceased artists to release new albums. We are confident that there will come a day when these kinds of things are simple to do. Sony Music Entertainment (Japan) Inc. recently released a service called soundmain. AI and blockchain technology are the core elements of soundmain. Sampling is on the rise throughout the world, but in Japan rights-related issues are an obstacle to creating sampled music. In response, we have created a system that uses blockchain to handle the rights in an effort to ensure that the owners of the samples are compensated. Technology is created from music, and that technology is used to create new music. We intend to further expand the possibilities.
