VoiceSeeker FAQ

Q: What is VoiceSeeker?

A: VoiceSeeker is an audio front end designed specifically for voice UI applications. The algorithm uses multi-microphone beamforming and acoustic echo cancellation (AEC) to separate target speech from unwanted noise before further processing by a local wake-word engine, such as NXP's VoiceSpot or VIT, or by a cloud service.

Q: How many microphones and speakers can be supported?

A: The solution is scalable: the number of microphones and speakers that can be supported depends on the memory and MHz constraints of the target platform. For example, a two-microphone solution without AEC uses about 14 MHz and 60 kB of memory on a Cortex-M7, whereas a three-microphone, two-speaker system with AEC uses about 320 MHz and 290 kB of memory on a Cortex-M7.
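
The two figures above can be used for a rough fit check against a target platform. The following is a minimal sketch and not part of VoiceSeeker: the `Footprint` type, the `fits` helper, and the 20% CPU headroom are hypothetical choices made for illustration. Only the MHz and kB values come from the answer above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Approximate cost of one configuration (hypothetical container type)."""
    label: str
    mhz: float
    memory_kb: float


# The two data points quoted in the answer above, both measured on a Cortex-M7.
TWO_MIC_NO_AEC = Footprint("2 mics, no AEC", mhz=14, memory_kb=60)
THREE_MIC_TWO_SPK_AEC = Footprint("3 mics, 2 speakers, AEC", mhz=320, memory_kb=290)


def fits(fp: Footprint, cpu_mhz: float, free_memory_kb: float,
         headroom: float = 0.2) -> bool:
    """Return True if the configuration fits the platform.

    `headroom` is the fraction of the CPU kept free for the rest of the
    application. The 20% default is an assumption, not a recommendation
    from the source.
    """
    cpu_budget_mhz = cpu_mhz * (1.0 - headroom)
    return fp.mhz <= cpu_budget_mhz and fp.memory_kb <= free_memory_kb
```

Under these assumptions, `fits(TWO_MIC_NO_AEC, 600, 512)` is true, while `fits(THREE_MIC_TWO_SPK_AEC, 300, 512)` is false because 320 MHz exceeds the 240 MHz budget.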

Q: What is the latency?

A: VoiceSeeker has 12 ms of latency without AEC and 14 ms of latency when using AEC.
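
For buffer sizing it can help to express these latencies in samples. A minimal sketch, assuming a 16 kHz sample rate (an assumption; the answer above does not state the rate VoiceSeeker runs at) and a hypothetical helper name:

```python
ASSUMED_SAMPLE_RATE_HZ = 16_000  # assumption, not stated in the source


def latency_in_samples(latency_ms: float,
                       sample_rate_hz: int = ASSUMED_SAMPLE_RATE_HZ) -> int:
    """Convert a latency in milliseconds to a whole number of samples."""
    return round(latency_ms * sample_rate_hz / 1000)


LATENCY_WITHOUT_AEC_MS = 12  # from the answer above
LATENCY_WITH_AEC_MS = 14     # from the answer above
```

At the assumed rate, 12 ms corresponds to 192 samples and 14 ms to 224 samples per channel.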

Q: What is the optimal placement of the microphones?

A: VoiceSeeker works with a variety of microphone geometries, such as linear, triangular and circular microphone arrays. A spacing of 4 cm is recommended between microphones, but a spacing between 2 cm and 8 cm is acceptable.
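
A candidate layout can be checked against these spacing guidelines before committing to a board design. The sketch below is illustrative only: it assumes that "spacing" means the distance from each microphone to its nearest neighbor, and the function name and tolerance are hypothetical.

```python
import math

RECOMMENDED_CM = 4.0         # recommended spacing from the answer above
MIN_CM, MAX_CM = 2.0, 8.0    # acceptable range from the answer above


def check_spacing(positions_cm, tol_cm=0.01):
    """Classify each microphone by the distance to its nearest neighbor.

    `positions_cm` is a list of (x, y) coordinates in centimeters and must
    contain at least two microphones. Returns one (distance, verdict) pair
    per microphone.
    """
    results = []
    for i, p in enumerate(positions_cm):
        nearest = min(math.dist(p, q)
                      for j, q in enumerate(positions_cm) if j != i)
        if abs(nearest - RECOMMENDED_CM) <= tol_cm:
            verdict = "recommended"
        elif MIN_CM - tol_cm <= nearest <= MAX_CM + tol_cm:
            verdict = "acceptable"
        else:
            verdict = "out of range"
        results.append((nearest, verdict))
    return results
```

For example, a three-microphone linear array at `[(0, 0), (4, 0), (8, 0)]` is classified as "recommended" for every microphone, while two microphones 1 cm apart are reported as "out of range".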

Q: Does VoiceSeeker rely on matched microphones?

A: No, VoiceSeeker does not have a strict tolerance for microphone matching.

Q: What is the optimal placement of the speaker?

A: The speaker or speakers should be placed as far from the microphones as possible and should be mechanically isolated to reduce vibrations. The acoustic and mechanical paths from the speaker to the microphones should be designed to be as linear as possible.