Pursue compact and high-performance AI
We are developing technology that accurately recognizes users’ natural speech amid background noise and reverberation. Our focus is on improving the real-world performance of audio signal processing and speech recognition technologies. We use deep learning to optimally integrate audio signal processing and speech recognition, which enables advanced speech recognition in unfavorable conditions, such as when a robot produces mechanical noise. By tailoring these optimizations to specific devices and use cases, we aim to make the technology thoroughly user-friendly.
We are developing Spoken Language Understanding technology to understand user utterances. This technology converts the text strings produced by speech recognition into machine-understandable information (a semantic representation). Our models account for linguistic phenomena such as disfluencies and abbreviations, and draw on a semantic database that links spoken language with the real world. To further the understanding of natural language itself, we are developing Natural Language Processing technology that analyzes text. This process involves tokenizing the text, assigning parts of speech and semantic attributes, and parsing its structure. We are also developing Knowledge Information Processing technology, which is applied to disambiguate language.
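As a rough illustration of what converting recognized text into a semantic representation can look like, the following minimal sketch removes disfluencies, expands abbreviations, and then fills an intent-and-slot frame. It is a toy, rule-based stand-in and not the models described above; the filler list, the abbreviation table, the intent patterns, and the function names `normalize` and `understand` are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical filler words and abbreviation table, for illustration only.
FILLERS = {"um", "uh", "er"}
ABBREVIATIONS = {"tv": "television", "temp": "temperature"}

# Hypothetical intent patterns; a production system would use learned models.
INTENT_PATTERNS = [
    ("turn_on", re.compile(r"^(?:please )?turn on the (?P<device>.+)$")),
    ("turn_off", re.compile(r"^(?:please )?turn off the (?P<device>.+)$")),
]


def normalize(text):
    """Lowercase, tokenize, drop fillers, and expand abbreviations."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    tokens = [t for t in tokens if t not in FILLERS]
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]
    return " ".join(tokens)


def understand(text):
    """Map a recognized utterance to a simple intent-and-slot frame."""
    cleaned = normalize(text)
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return {"intent": intent, "slots": match.groupdict()}
    return {"intent": "unknown", "slots": {}}


if __name__ == "__main__":
    print(understand("Um, turn on the, uh, TV"))
```

Here a disfluent utterance such as "Um, turn on the, uh, TV" is first reduced to "turn on the television" and then mapped to a frame with the intent `turn_on` and a `device` slot. An utterance that matches no pattern falls back to the `unknown` intent.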
Sony employs deep learning in various technology fields, including video, audio, and sensing data analysis. This technology has been widely used in our products and services, such as the swinging motion recognition feature of Xperia Ear and the real estate price estimation engine at Sony Real Estate Corp. We are conducting cutting-edge research and development on high-accuracy methods that require little power and few arithmetic operations, and on methods that achieve highly accurate recognition even with limited data. We have also released our deep learning framework, Neural Network Libraries, to developers as open source, and are actively contributing to the global expansion of the AI community.
Behavior Learning, including but not limited to deep reinforcement learning, is the technology that enables an autonomous system to learn optimal behavior through its own trial-and-error experience. We aim to develop these technologies for planning actions in environments that are too complex or varied for humans to deal with, and for online optimization control mechanisms that adapt effectively to environments that vary more than anyone could anticipate in advance. We aim to apply this technology in robotics, including both navigation and manipulation, and in gaming AI. We are also actively working on joint research projects with overseas universities and laboratories that use cutting-edge technologies.
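To make the idea of learning behavior through trial and error concrete, the following minimal sketch uses tabular Q-learning, a classic reinforcement learning method, on a tiny five-cell corridor. It is an assumption-laden toy rather than the deep reinforcement learning systems described above: the environment, the reward, the hyperparameters, and the names `step`, `train`, and `greedy_policy` are all hypothetical.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical environment: reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by epsilon-greedy trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update toward the bootstrapped target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned action for each non-goal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent starts with no knowledge of the corridor and, after repeated episodes, its greedy policy chooses to step right in every cell. Deep reinforcement learning replaces the table `q` with a neural network so that the same principle scales to far larger state spaces.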
We are developing a development system for multi-modal agent technology, which adds user sensing and visual feedback capabilities to voice interaction. It comprises hardware capable of speech recognition and image sensing, a cloud service where dialog applications can be developed, and an SDK. Applications in which the agent interacts with users can also be developed on the system, using scenario-based dialog applications and image sensing.
Every Sony Group business is actively using data to propose new customer value and create new business. Because the Sony Group spans not only electronics but also finance and entertainment, a diverse array of big data is generated every day. To help business units solve their management issues and create and grow new businesses, we are developing a platform for putting this big data to use quickly, with analytical methods and machine learning technology from Sony and others at its core. Through these analysis platforms, we will provide advanced technologies to society and contribute to the development of AI talent.