GENERATIVE SPEECH CODING WITH PREDICTIVE VARIANCE REGULARIZATION

Alejandro Luebs
Andrew Storus
Bastiaan Kleijn
Michael Chinen
Tom Denton
Yero Yeh
ICASSP 2021 (2021)

Abstract

The recent emergence of machine-learning-based generative models for speech suggests that a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum-likelihood criterion to outliers and the ineffectiveness of modeling a sum of independent signals with a single autoregressive model. We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance. We also show that noise reduction to remove unwanted signals can further improve performance. We provide extensive subjective performance evaluations showing that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for real-world speech signals at reasonable computational complexity.
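To make the idea concrete, below is a minimal sketch of how a predictive-variance regularizer might be attached to a Gaussian negative-log-likelihood loss. The exact regularizer used in the paper is not given in this abstract; the specific penalty form, the `lam` weight, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_nll(x, mu, log_var):
    """Per-sample negative log-likelihood of a Gaussian predictive
    distribution with mean mu and log-variance log_var (constants dropped)."""
    return 0.5 * (log_var + (x - mu) ** 2 / np.exp(log_var))

def regularized_loss(x, mu, log_var, lam=0.1):
    """Maximum-likelihood loss plus a hypothetical predictive-variance
    penalty: discouraging large predicted variance limits how much the
    model can absorb outliers by simply inflating its uncertainty."""
    nll = gaussian_nll(x, mu, log_var).mean()
    var_penalty = np.exp(log_var).mean()  # assumed penalty form
    return nll + lam * var_penalty
```

In this sketch, raising `lam` trades likelihood for lower predictive variance; pure maximum likelihood corresponds to `lam = 0`.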
