Direction-of-arrival (DOA) estimation of sound sources with a spherical microphone array is usually performed in the spherical harmonic (SH) domain. In a noise-free environment, the zeroth- and first-order spherical harmonic beams (SHBs) alone suffice for DOA estimation. One such method is based on the pseudo-intensity vector (PIV), which is attractive because of its low computational complexity. To improve the performance of the PIV method in reverberant environments, several recently proposed methods further exploit high-order SHBs. However, these methods ignore the effect of noise on high-order SHBs, which may lead to poor performance in low signal-to-noise ratio (SNR) environments. To address this problem, this paper proposes an order-aware scheme that selects the high-order SHBs that are reliable enough for robust DOA estimation of multiple speech sources. Simulation and real-world experimental results demonstrate that methods based on the order-aware scheme outperform their existing counterparts in both accuracy and robustness of DOA estimation, with lower computational complexity. Moreover, the performance improvement is more significant in low-SNR environments and in scenarios with small angular separation between sources.
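
To make the low-order PIV idea concrete, the following is a minimal sketch of a pseudo-intensity-style estimate for a single plane wave. It is not the order-aware scheme proposed in the paper and uses no high-order SHBs. It assumes idealised, mode-strength-compensated beams: one omnidirectional (zeroth-order) beam `w` and three first-order dipole beams `x`, `y`, `z` whose responses equal the direction cosines of the DOA. All function names, the noise model, and the sign convention (vector pointing toward the source) are illustrative assumptions.

```python
import math
import random


def unit_vector(azimuth, elevation):
    """Cartesian unit vector for azimuth/elevation given in radians."""
    return (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )


def simulate_bins(doa, snr_db, n_bins, rng):
    """Idealised (w, x, y, z) beams for one plane wave in white noise.

    Assumption: the beams are already mode-strength compensated, so each
    dipole responds with the matching direction cosine of the DOA.
    """
    # Unit-power source; complex noise power is 10^(-snr_db / 10) per beam.
    noise_std = 10.0 ** (-snr_db / 20.0) / math.sqrt(2.0)
    bins = []
    for _ in range(n_bins):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2.0)
        noise = [
            complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for _ in range(4)
        ]
        w = s + noise[0]
        x, y, z = (s * d + n for d, n in zip(doa, noise[1:]))
        bins.append((w, x, y, z))
    return bins


def piv_doa(bins):
    """Average Re{conj(w) * dipole} over all bins, then normalise."""
    acc = [0.0, 0.0, 0.0]
    for w, x, y, z in bins:
        for i, dipole in enumerate((x, y, z)):
            acc[i] += (w.conjugate() * dipole).real
    norm = math.sqrt(sum(a * a for a in acc))
    return tuple(a / norm for a in acc)


def angular_error_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(dot))


if __name__ == "__main__":
    rng = random.Random(0)
    true_doa = unit_vector(math.radians(40.0), math.radians(10.0))
    for snr_db in (30, 10, 0):
        est = piv_doa(simulate_bins(true_doa, snr_db, 2000, rng))
        print(snr_db, "dB:", round(angular_error_deg(est, true_doa), 3), "deg")
```

Running the script prints the angular error at a few SNR values. Because the sketch models only a single source in spatially white noise, it says nothing about reverberation or closely spaced sources, which are the conditions the high-order and order-aware methods target.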
