Smart speakers need localization techniques

March 26, 2020 - By Tracy Cozzens

Smart speakers such as the Amazon Alexa or Google Home cannot yet reliably determine where a user is in the home. A University of Illinois at Urbana-Champaign research team takes on this problem in a recently published paper.

The team, led by Coordinated Science Lab graduate student Sheng Shen, is developing VoLoc, a system that uses the microphone array on Alexa, together with room echoes of the human voice, to infer the user’s location inside the home.

If deployed, the system would let Alexa know which light or which room is intended when it receives a command such as “turn on the light” or “increase the temperature.” Using a technique known as reverse triangulation, Shen and advisor Romit Roy Choudhury are getting closer to voice localization.

“Applying this technique to smart speakers entails quite a few challenges,” Shen said. “First, we must separate the direct human voice and each of the room echoes from the microphone recording. Then, we must accurately compute the direction for each of these echoes. Both challenges are difficult because the microphones simply record a mixture of all the sounds altogether.”

VoLoc addresses these obstacles with an align-and-cancel algorithm that iteratively isolates the direction of each arriving voice signal and, from those directions, reverse triangulates the user’s location. Some aspects of the room’s geometry are learned spontaneously, which helps with the triangulation.
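To make the geometry concrete, here is a minimal Python sketch of how two arrival directions can pin down a location. It is an illustration under simplifying assumptions, not the VoLoc algorithm itself: the problem is reduced to two dimensions, the microphone array sits at the origin, a single wall lies along the line y = -wall_distance, and both arrival angles (the direct voice and one wall echo) are assumed to have been estimated already. The function name `locate_user` and its parameters are hypothetical.

```python
import math


def locate_user(direct_aoa, echo_aoa, wall_distance, eps=1e-9):
    """Estimate a 2D user position from two arrival angles (radians).

    Assumed setup (illustrative only): array at (0, 0), one wall along
    y = -wall_distance, direct_aoa is the direction of the direct voice,
    echo_aoa is the direction from which the wall echo appears to arrive.
    """
    # Unit vector along the direct path, from the array toward the user.
    ux, uy = math.cos(direct_aoa), math.sin(direct_aoa)

    # The echo appears to come from the user's mirror image behind the wall.
    # Mirroring the whole picture across the wall instead moves the array to
    # (0, -2 * wall_distance) and flips the y component of the echo direction.
    vx, vy = math.cos(echo_aoa), -math.sin(echo_aoa)
    ax, ay = 0.0, -2.0 * wall_distance

    # Intersect the two rays: t * u = a + s * v, solved for t and s.
    det = vx * uy - ux * vy
    if abs(det) < eps:
        raise ValueError("arrival directions are parallel; no unique location")
    t = (vx * ay - vy * ax) / det
    s = (ux * ay - uy * ax) / det
    if t <= 0 or s <= 0:
        raise ValueError("rays do not intersect in front of the array")

    return (t * ux, t * uy)


if __name__ == "__main__":
    # User at (2, 1), wall 0.5 m behind the array: the mirror image is (2, -2).
    direct = math.atan2(1.0, 2.0)
    echo = math.atan2(-2.0, 2.0)
    print(locate_user(direct, echo, wall_distance=0.5))
```

In this toy setup the wall distance is given as an input. In practice it would have to be measured or learned, and the arrival angles would carry estimation error, so a real system would need to fit the location to noisy measurements rather than intersect two exact rays.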

While this is an important breakthrough, Shen and Roy Choudhury plan to expand the research to more applications. “Our immediate next step is to build to the smart speaker’s frame of reference,” Shen said. “This could mean superimposing the locations, as provided by VoLoc, on a floorplan to determine that the user is in the laundry room. Alternatively, if the smart speaker picks up the sounds made by the washer and dryer in the same location as the voice command, it can come to the same conclusion.”

Citation. S. Shen, R. Choudhury et al., “Voice Localization Using Nearby Wall Reflections,” to be presented at MobiCom 2020: 26th Annual International Conference on Mobile Computing and Networking, Sept. 21–25, London, UK.

Photo: Michael Wapp/iStock Editorial / Getty Images Plus/Getty Images

About the Author: Tracy Cozzens

Senior Editor Tracy Cozzens joined GPS World magazine in 2006. She also is editor of GPS World’s newsletters and the sister website Geospatial Solutions. She has worked in government, for non-profits, and in corporate communications, editing a variety of publications for audiences ranging from federal government contractors to teachers.