Publications

Keyword: ambisonics (2)

2026
Beyond Omnidirectional: Neural Ambisonics Encoding for Arbitrary Microphone Directivity Patterns using Cross-Attention [Conference]

M. Heikkinen, A. Politis, K. Drossos, and T. Virtanen, "Beyond Omnidirectional: Neural Ambisonics Encoding for Arbitrary Microphone Directivity Patterns using Cross-Attention," in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2026.

We present a deep neural network approach for encoding microphone array signals into Ambisonics that generalizes to arbitrary microphone array configurations with fixed microphone count but varying locations and frequency-dependent directional characteristics. Unlike previous methods that rely only on array geometry as metadata, our approach uses directional array transfer functions, enabling accurate characterization of real-world arrays. The proposed architecture employs separate encoders for audio and directional responses, combining them through cross-attention mechanisms to generate array-independent spatial audio representations. We evaluate the method on simulated data in two settings: a mobile phone with complex body scattering, and a free-field condition, both with varying numbers of sound sources in reverberant environments. Evaluations demonstrate that our approach outperforms both conventional digital signal processing-based methods and existing deep neural network solutions. Furthermore, using array transfer functions instead of geometry as metadata input improves accuracy on realistic arrays.
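
The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes the audio features act as queries and the directional-response features act as keys and values, omits all learned projections, and uses hypothetical names (`cross_attention`, `audio_feats`, `dir_feats`) and toy dimensions chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def cross_attention(audio_feats, dir_feats):
    """Scaled dot-product cross-attention without learned projections.

    Assumption: queries come from audio features (frames x d); keys and
    values come from directional-response features (directions x d).
    Returns the attended features (frames x d) and the attention weights.
    """
    d = len(audio_feats[0])
    scores = matmul(audio_feats, transpose(dir_feats))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, dir_feats), weights


# Toy example: 2 audio frames and 4 direction tokens, feature size 3.
audio_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
dir_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
attended, weights = cross_attention(audio_feats, dir_feats)
```

Each row of `weights` sums to one, so every audio frame receives a convex combination of the direction tokens. A trained model would add learned query, key, and value projections and multiple heads.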

2025
Gen-A: Generalizing Ambisonics Neural Encoding to Unseen Microphone Arrays [Conference]

M. Heikkinen, A. Politis, K. Drossos, and T. Virtanen, "Gen-A: Generalizing Ambisonics Neural Encoding to Unseen Microphone Arrays," in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025.

Using deep neural networks (DNNs) for encoding of microphone array (MA) signals to the Ambisonics spatial audio format can surpass certain limitations of established conventional methods, but existing DNN-based methods need to be trained separately for each MA. This paper proposes a DNN-based method for Ambisonics encoding that can generalize to arbitrary MA geometries unseen during training. The method takes as inputs the MA geometry and MA signals and uses a multi-level encoder consisting of separate paths for geometry and signal data, where geometry features inform the signal encoder at each level. The method is validated in simulated anechoic and reverberant conditions with one and two sources. The results indicate improvement over conventional encoding across the whole frequency range for dry scenes, while for reverberant scenes the improvement is frequency-dependent.
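
For readers unfamiliar with the target format, the sketch below shows what first-order Ambisonics channels look like for a single ideal source with a known direction. It is not the microphone-array encoding studied in the paper, which has to estimate these channels from array signals. The channel ordering and normalization (ACN order, SN3D normalization) and the function names `foa_gains` and `encode_foa` are assumptions made for illustration; the abstract does not state which convention or Ambisonics order is used.

```python
import math


def foa_gains(azimuth_deg, elevation_deg):
    """First-order Ambisonics gains for a plane wave from the given direction.

    Assumptions: ACN channel order (W, Y, Z, X), SN3D normalization,
    azimuth measured counterclockwise from the front, elevation upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0
    y = math.sin(az) * math.cos(el)
    z = math.sin(el)
    x = math.cos(az) * math.cos(el)
    return [w, y, z, x]


def encode_foa(samples, azimuth_deg, elevation_deg):
    """Scale a mono signal by the directional gains, one list per channel."""
    gains = foa_gains(azimuth_deg, elevation_deg)
    return [[g * s for s in samples] for g in gains]


# A short mono signal placed directly to the left (azimuth 90 degrees).
channels = encode_foa([0.5, -0.25, 1.0], azimuth_deg=90.0, elevation_deg=0.0)
```

With this convention a source at the front excites only the W and X channels, and a source to the left excites only W and Y, up to floating-point rounding.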
