Voice transformation and speech synthesis

Table of contents

Abstract & Résumé
Notations, acronyms and usual expressions
1 Introduction
1.1 Problematics
1.1.1 Source-filter model vs. acoustic model
1.1.2 The chosen approach: the glottal model
1.1.3 Evaluation and validation of the proposed methods
1.2 Structure of the document
I Voice production and its model
2 The glottal source
2.1 Vocal folds & glottal area
2.2 Laryngeal mechanisms & voice quality
2.3 Glottal flow vs. Glottal source
2.4 Glottal models
2.5 Time and spectral characteristics of the glottal pulse
2.5.1 Time properties: glottal instants and shape parameters
2.5.2 Spectral properties: glottal formant and spectral tilt
2.5.3 Mixed-phase property of the glottal source
2.5.4 Vocal folds asymmetry, pulse scattering and limited band glottal model
2.6 Aspiration noise
Conclusions
3 Filtering elements and voice production model
3.1 The Vocal-Tract Filter (VTF)
3.1.1 Structures of the vocal-tract
3.1.2 Minimum-phase hypothesis
3.2 Lips and nostrils radiation
3.3 The complete voice production model
Conclusions
II Voice analysis
4 Source-filter separation
4.1 General forms of the vocal-tract filter and the glottal source
4.2 Estimation of glottal parameters
4.3 Spectral envelope models
4.4 The state of the art
4.4.1 Analysis-by-Synthesis
4.4.2 Pre-emphasis
4.4.3 Closed-phase analysis
4.4.4 Pole or pole-zero models with exogenous input (ARX/ARMAX)
4.4.5 Minimum/Maximum-phase decomposition, complex cepstrum and ZZT
4.4.6 Inverse filtering quality assessment
Conclusions
5 Joint estimation of the pulse shape and its position
5.1 Phase minimization
5.1.1 Conditions of convergence
5.1.2 Measure of confidence
5.1.3 Polarity
5.1.4 The iterative algorithm using MSP
5.2 Difference operator for phase distortion measure
5.2.1 The method using MSPD
5.3 Estimation of multiple shape parameters
5.3.1 Oq/m vs. Iq/Aq
Conclusions
6 Estimation of the shape parameter without pulse position
6.1 Parameter estimation using the 2nd order phase difference
6.1.1 The method based on MSPD2
6.2 Parameter estimation using function of phase-distortion
6.2.1 The method FPD⁻¹ based on FPD inversion
6.2.2 Conditioning of the FPD inversion
Conclusions
7 Estimation of the Glottal Closure Instant
7.1 The minimum of the radiated glottal source
7.2 The method using a glottal shape estimate
7.2.1 Estimation of a GCI in one period
7.2.2 Estimation of GCIs in a complete utterance
7.3 Evaluation of the error related to the shape parameter
Conclusions
8 Evaluation of the proposed estimation methods
8.1 Evaluation with synthetic signals
8.1.1 Error related to the fundamental frequency
8.1.2 Error related to the noise levels
8.2 Comparison with electroglottographic signals
8.2.1 Evaluation of GCI estimates
8.2.2 Evaluation of the shape parameter estimate
8.3 Examples of Rd estimates on real signals
Conclusions
III Voice transformation and speech synthesis
9 Analysis/synthesis method
9.1 Methods for voice transformation and speech synthesis
9.2 Choice of approach for the proposed method
9.3 The analysis step: estimation of the SVLN parameters
9.3.1 The parameters of the deterministic source: f0, Rd, Ee
9.3.2 The parameter of the random source: g
9.3.3 The estimation of the vocal-tract filter
9.4 The synthesis step using SVLN parameters
9.4.1 Segment position and duration
9.4.2 The noise component: filtering, modulation and windowing
9.4.3 Glottal pulse and filtering elements
Conclusions
10 Evaluation of the SVLN method
10.1 Influence of the estimated parameters on the SVLN results
10.2 Preference tests for pitch transposition
10.3 Evaluation of breathiness modification
10.4 Speech synthesis based on Hidden Markov Models (HMM)
Conclusions
11 General conclusions
11.1 Future directions
A Minimum, zero and maximum phase signals
A.1 The real cepstrum and the zero and minimum phase realizations
A.2 Generalized realization
B Error minimization and glottal parameters estimation
B.1 Error minimization
B.2 Parameter estimation based on glottal source estimate
B.2.1 Procedure for the Iterative Adaptive Inverse Filtering (IAIF)
B.2.2 Procedure for the Complex Cepstrum (CC) and the Zeros of the Z-Transform (ZZT)
C The glottal area estimation method
C.1 The proposed method for glottal area estimation
C.2 Spectral aliasing
C.3 Equipment
D Maeda’s voice synthesizer
D.1 Space and time sampling
D.2 Minimum-phase property of the vocal-tract
Bibliography
Publications during the study