Publications


2026

Method and apparatus for training and using a microphone geometry assisted encoder model to generate spatial audio signals

M. O. Heikkinen, K. Drosos, A. Politis, and T. Virtanen, “Method and apparatus for training and using a microphone geometry assisted encoder model to generate spatial audio signals,” U.S. Patent US20260065918A1, filed Aug. 28, 2025; published Mar. 05, 2026

A system for training a microphone geometry assisted encoder model and then utilizing the trained model to generate spatial audio signals that have been captured by a plurality of microphones. In a method for generating spatial audio signals, the method includes receiving geometry data related to a plurality of microphones of an audio capturing device and audio signal data captured by the plurality of microphones. The method also includes generating a spatial audio signal based on an output of a trained microphone geometry assisted encoder model. The trained microphone geometry assisted encoder model includes a geometry encoder configured to encode the geometry data and a signal encoder configured to encode the audio signal data. The trained microphone geometry assisted encoder model further includes a signal decoder having a plurality of layers and configured to generate the output upon which the spatial audio signal is based.

https://patents.google.com/patent/US20260065918A1/en

Model for speech enhancement

K. Drosos, M. O. Heikkinen, J. T. Vilkamo, and P. Tsiaflakis, “Model for speech enhancement,” U.S. Patent US20260065922A1, filed Aug. 15, 2025; published Mar. 05, 2026

Examples of the disclosure relate to a model that can be used for speech enhancement. The model comprises an encoder part comprising a sequence of encoding layers and caused to receive input data. The input data is based on a current frame of a noisy speech signal and one or more past frames of the noisy speech signal. The sequence of encoding layers is caused to process the input data so that output data of the encoder part comprises a reduced number of the multiple frequency positions and a single temporal position. The model also comprises a decoder part comprising a sequence of decoding layers caused to receive data from a prior decoding layer. The output data of the decoder part comprises multiple frequency positions and a single temporal position. The output data of the decoder part is for post processing to provide an output signal for speech enhancement.

https://patents.google.com/patent/US20260065922A1/en

Speech and noise disentanglement for acoustic echo cancellation

K. Drosos, M. O. Heikkinen, S. Vesa, and M. T. Vilermo, “Speech and noise disentanglement for acoustic echo cancellation,” U.S. Patent US20260080885A1, filed Aug. 27, 2025; published Mar. 19, 2026

The present disclosure relates to an apparatus that obtains a far-end signal and a near-end microphone signal; determines, based on at least the far-end signal, a far-end speech signal estimate and a far-end noise signal estimate; determines, based on at least the near-end microphone signal, a near-end microphone speech signal estimate and a near-end microphone noise signal estimate; determines, based on at least the far-end speech signal estimate and the near-end microphone speech signal estimate, a predicted near-end speech signal; determines, based on at least the far-end noise signal estimate and the near-end microphone noise signal estimate, a predicted near-end noise signal; and outputs at least the predicted near-end speech signal and the predicted near-end noise signal.

https://patents.google.com/patent/US20260080885A1/en


Konstantinos Drossos®
