Researchers from Spotify, the popular streaming service, have filed a patent on a technology that could in the future be used to improve music suggestions. While this sounds interesting and promising, the controversial part is that it involves recording and analyzing users' voices so that an algorithm can 'judge' a person's mood and suggest suitable tracks.
In its search for a better way to determine a user's preferences and find suitable music, Spotify could in the future rely on listeners' voices, among other things. According to the patent, prosodic information, such as the intonation or rhythm of the voice, could be used to determine the user's mood. Based on this, appropriate songs or musicians could then be suggested.
In addition to the voice of a user, background noises could also be included in such suggestions. These could be used, for example, to determine whether the listener is alone or surrounded by other people.
Optimizing music suggestions through speech recognition
One of the reasons for such research is the technology currently used to "get to know" users. To receive personalized suggestions, users often have to work through long, time-consuming dialogs or select their preferred musicians from a list.
According to the patent, "an entirely different approach to collecting taste attributes of a user" is needed. The goal is to make manual input at least partially unnecessary. This is exactly where the analysis of an audio signal, i.e. speech and background noise, is supposed to come in.
What speech can reveal about a user
The patent gives some examples of what the new system should be able to recognize. In addition to understanding what a user has said, it should also be able to extract metadata from the recording. This includes, for example, the emotional state, gender, age and possible accent of the user.
Based on the background noise, it could also be possible to recognize the environment. This includes not only whether you are on a train, in a car, in a park or in a shop, but also whether you are alone, in a small group or at a party.
All of this information will then be combined with, for example, previous requests from the listener, their existing music library and also reviews from friends to make suggestions for upcoming content.
Audio recordings: what about privacy?
The patent leaves open how exactly Spotify intends to get this audio data from its users. A function for direct voice input in the app would be conceivable, for example. But what happens to recordings after they have been analyzed? And what about the results of these analyses themselves? These questions remain open.
However, since this is only a patent so far, we can only speculate about this. Microphone access for an app must be explicitly granted by the user, so unwanted eavesdropping by Spotify is unlikely.
Furthermore, companies often patent technologies that never end up in their final products. It is therefore quite possible that such a feature will never be integrated into Spotify.