Publications


2026
Method and apparatus for training and using a microphone geometry assisted encoder model to generate spatial audio signals

M. O. Heikkinen, K. Drosos, A. Politis, and T. Virtanen, “Method and apparatus for training and using a microphone geometry assisted encoder model to generate spatial audio signals,” U.S. Patent US20260065918A1, filed Aug. 28, 2025; published Mar. 5, 2026

A system is provided for training a microphone geometry assisted encoder model and then using the trained model to generate spatial audio signals from audio captured by a plurality of microphones. A method for generating spatial audio signals includes receiving geometry data related to a plurality of microphones of an audio capturing device and audio signal data captured by the plurality of microphones. The method also includes generating a spatial audio signal based on an output of a trained microphone geometry assisted encoder model. The trained model includes a geometry encoder configured to encode the geometry data and a signal encoder configured to encode the audio signal data. It further includes a signal decoder having a plurality of layers and configured to generate the output upon which the spatial audio signal is based.

Speech and noise disentanglement for acoustic echo cancellation

K. Drosos, M. O. Heikkinen, S. Vesa, and M. T. Vilermo, “Speech and noise disentanglement for acoustic echo cancellation,” U.S. Patent US20260080885A1, filed Aug. 27, 2025; published Mar. 19, 2026

The present disclosure relates to an apparatus that obtains a far-end signal and a near-end microphone signal; determines, based on at least the far-end signal, a far-end speech signal estimate and a far-end noise signal estimate; determines, based on at least the near-end microphone signal, a near-end microphone speech signal estimate and a near-end microphone noise signal estimate; determines, based on at least the far-end speech signal estimate and the near-end microphone speech signal estimate, a predicted near-end speech signal; determines, based on at least the far-end noise signal estimate and the near-end microphone noise signal estimate, a predicted near-end noise signal; and outputs at least the predicted near-end speech signal and the predicted near-end noise signal.

2025
Apparatus, methods and computer programs for noise suppression

P. Tsiaflakis, M. T. Tammi, and K. Drosos, “Apparatus, methods and computer programs for noise suppression,” U.S. Patent US20250210055A1, filed Dec. 20, 2024; published Jun. 26, 2025

Examples of the disclosure relate to noise suppression for audio signals in a communication setting. An apparatus obtains at least one audio signal for a current frame or one or more previous frames, based on at least two microphone signals for the current frame or one or more previous frames. The apparatus uses program code to predict an output signal for a future frame based, at least in part, on the at least one audio signal for the current frame or one or more previous frames. To enable noise suppression, the apparatus uses the output signal to process the future frame of the at least two microphone signals in a first audio signal process, and to process the future frame of an output of the first audio signal process in a second audio signal process.

2023
Privacy-preserving sound representation

T. Virtanen, T. Heittola, S. Zhao, S. Gharib, and K. Drosos, “Privacy-preserving sound representation,” U.S. Patent US20230317086A1, filed Oct. 5, 2022; published Oct. 12, 2023

According to an example embodiment, a method (200) for audio-based monitoring is provided, the method (200) comprising: deriving (202), via usage of a predefined conversion model (M), based on audio data that represents sounds captured in a monitored space, one or more audio features that are descriptive of at least one characteristic of said sounds; identifying (204) respective occurrences of one or more predefined acoustic events in said space based on the one or more audio features; and carrying out (206), in response to identifying an occurrence of at least one of said one or more predefined acoustic events, one or more predefined actions associated with said at least one of said one or more predefined acoustic events, wherein said conversion model (M) is trained to provide said one or more audio features such that they include information that facilitates identification of respective occurrences of said one or more predefined acoustic events while preventing identification of speech characteristics.
