Complex-valued independent component analysis for online blind speech extraction
Authors
Summary, in English
This paper presents a theoretical analysis of a
certain criterion for complex-valued independent component
analysis (ICA) with a focus on blind speech extraction (BSE) of a
spatio–temporally nonstationary speech source. In the paper, the
proposed criterion, denoted KSICA, is related to the well-known FastICA
method with the Kurtosis contrast function. The proposed
method is shown to share the important fixed-point feature with
the FastICA method. Unlike FastICA, however, the proposed method
does not tend to diverge for a mixture of Gaussian-only sources,
and it performs better in online implementations.
Compared to the FastICA, the KSICA method provides a 10 dB
higher source extraction performance and a 10 dB lower standard
deviation in a data batch approach when the data batch size is
less than 100 samples. For larger batch sizes, the KSICA method
performs equally well. In an online application with spatially
stationary sources, the KSICA method provides around 10 dB
higher interference suppression, and 1 MOS-unit lower speech
distortion compared to the FastICA for a 0.15 s time constant in
the algorithm update parameter. Thus, the FastICA performance
matches the KSICA performance for a time constant above 1 s.
Finally, in an online application with a moving speech source, the
KSICA method provides 10 dB higher interference suppression,
compared to the FastICA for the same algorithm settings. All in
all, the proposed KSICA method is shown to be a viab
Department(s)
Publication year
2008-11
Language
English
Pages
1624-1632
Publication/Journal/Series
IEEE Transactions on Audio, Speech, and Language Processing
Volume
16
Issue
8
Document type
Journal article
Publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
Subject
- Signal Processing
Status
Published
ISBN/ISSN/Other
- ISSN: 1558-7916