|
These examples demonstrate a method for encoding the spectral characteristics of
speech at rates below 180 b/s using hierarchical temporal decomposition (HTD).
A set of log-area-ratio (LAR) parameters, extracted from a given block of
speech, is approximated by Gaussian interpolation between the most-steady
frames detected by the HTD. This yields a smaller set of parameters, which is
encoded using vector quantization. We have shown that the new coder at 180 b/s
gives the same spectral distortion as a temporal decomposition (TD) based coder
with scalar quantization at 600 b/s.
Coding Sample 1: Male Speaker (each 134K)
Coding Sample 2: Male Speaker (each 170K)
Coding Sample 3: Female Speaker (each 145K)
Coding Sample 4: Male Speaker (each 137K)
(Ref: S. Ghaemmaghami, S. Sridharan, "Very low rate speech coding using temporal
decomposition", IEE Electronics Letters, vol. 35, no. 6, pp. 456-457, 1999.)
|