There are many applications that would benefit from determining the physical position and orientation of users, including service robots, video conferencing, intelligent living environments, security systems, and speech separation for hands-free communication devices. Without information on the spatial location of users in a given environment, such applications cannot react naturally to the users' needs.
To localize a user, sound source localization techniques are widely used. Sound localization is the process of determining the spatial location of a sound source from multiple observations of the received sound signals. Current sound localization techniques are generally based on computing time difference of arrival (TDOA) information with microphone arrays.
An efficient method to obtain the TDOA between two signals is to compute their cross-correlation: the lag at which the correlation reaches its maximum corresponds to the relative delay between the signals received at the two microphones. When only two isotropic microphones (i.e., not directional as in the mammalian ear) are used, the system suffers from the front-back confusion effect: it cannot determine whether the sound originates in front of or behind the array. To overcome this problem, more microphones can be incorporated.
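The cross-correlation-based TDOA estimation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this thesis; the function name and the synthetic test signal are chosen here for demonstration.

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, fs):
    """Estimate the delay of sig_a relative to sig_b, in seconds,
    as the lag that maximizes their cross-correlation."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    # Index (len(sig_b) - 1) of the full correlation corresponds to zero lag.
    lag = np.argmax(corr) - (len(sig_b) - 1)
    return lag / fs

# Example: broadband noise reaching microphone B 25 samples later
# than microphone A, sampled at 8 kHz.
fs = 8000
rng = np.random.default_rng(0)
src = rng.standard_normal(400)
delay = 25
mic_a = src
mic_b = np.concatenate((np.zeros(delay), src))[: len(src)]
tdoa = estimate_tdoa(mic_b, mic_a, fs)  # positive: B lags A
print(round(tdoa * fs))  # recovered delay in samples
```

A broadband (noise-like) source is used in the example because a narrowband tone would produce periodic correlation peaks and an ambiguous lag estimate.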
Various weighting functions or prefilters, such as Roth, SCOT, PHAT, the Eckart filter, and HT, can be used to improve the performance of time delay estimation. However, this performance improvement comes at the cost of a large hardware overhead if the system is implemented in VLSI.
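As an example of such a prefilter, the PHAT weighting can be applied in the frequency domain: the cross-spectrum is normalized to unit magnitude so that only phase information contributes, which sharpens the correlation peak. The following is a minimal sketch of a generalized cross-correlation with PHAT weighting (GCC-PHAT); names and signal parameters are illustrative, not taken from the thesis.

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the delay of sig relative to ref (seconds) using
    generalized cross-correlation with the PHAT weighting."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-15  # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Rearrange so index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
    lag = np.argmax(np.abs(cc)) - max_shift
    return lag / fs

# Example: noise delayed by 40 samples at a 16 kHz sampling rate.
fs = 16000
rng = np.random.default_rng(1)
src = rng.standard_normal(1024)
delay = 40
mic_a = src
mic_b = np.concatenate((np.zeros(delay), src))[: len(src)]
tdoa = gcc_phat(mic_b, mic_a, fs)
print(round(tdoa * fs))  # recovered delay in samples
```

The extra FFT, normalization, and inverse FFT in this sketch make concrete where the hardware overhead mentioned above comes from: each weighted estimate requires frequency-domain processing on top of the basic correlation.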
In this thesis, we propose an efficient sound source localization technique based on angle division, under the assumption that three isotropic microphones are used to avoid the front-back confusion effect. In the proposed approach, the region from 0° to 180° is divided into three regions, and only one of the three is considered when estimating the sound direction. Verilog simulations show that the proposed approach reduces both computation time and hardware complexity considerably. In addition, the accuracy of the estimation is improved by the proper choice of the selected region.