Table of contents
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Automatic Speech Recognition
1.2 Pronunciation variation
1.3 Grapheme-to-phoneme conversion
1.4 The confusability problem
1.5 Motivation
1.6 Thesis outline
2 Background and State-of-the-art
2.1 Grapheme-to-phoneme conversion
2.2 Phonemic confusability
2.2.1 Confusability: an ASR error analysis
2.2.2 Moderating confusability
2.2.3 Speech-dependent lexicons
2.2.4 Combining g2p conversion and speech-dependent lexicons
2.2.5 Phonemic confusability in the Keyword-Spotting task
2.3 FST background
2.3.1 Generalities
2.3.2 Semiring
2.3.3 Weighted Finite-State Transducers
2.3.4 Some useful semirings
2.3.5 Algorithms
2.3.6 Entropy semiring
2.3.7 Matchers
2.3.8 FST-based speech recognition
3 SMT-inspired pronunciation generation
3.1 Introduction
3.2 Methodology
3.2.1 Moses as g2p and p2p converter
3.2.2 Pivot paraphrasing approach
3.3 Experimental setup
3.4 Evaluation
3.4.1 Definition of evaluation measures
3.4.2 G2P conversion results
3.4.3 P2P conversion results
3.5 Speech recognition experiments
3.6 Conclusion
4 Pronunciation confusability
4.1 Introduction
4.2 A new confusability measure
4.2.1 ASR decoding with FSTs
4.2.2 Decomposing the acoustic and linguistic modeling
4.2.3 Definition of pronunciation entropy
4.3 Phoneme recognition
4.4 Pronunciation entropy results
4.5 Conclusion
5 Phoneme confusion model in ASR
5.1 Introduction
5.2 Problem set-up
5.3 Training criteria
5.3.1 The CRF model
5.3.2 Soft-margin CRF
5.3.3 Large-margin methods
5.3.3.1 Perceptron
5.3.3.2 Max-margin
5.3.4 Optimization algorithm
5.4 An FST-based implementation
5.4.1 Preprocessing
5.4.2 Defining the input and output FSTs
5.4.3 Computing the edit distance with FSTs
5.4.4 Discriminative training algorithms
5.4.4.1 Perceptron
5.4.4.2 Max-margin
5.4.4.3 CRF
5.4.4.4 Soft-margin CRF
5.5 Experimental setup
5.6 Phonemic analysis
5.7 Evaluation
5.7.1 Computation of the objective
5.7.2 Phoneme Accuracy
5.7.3 Decoding process
5.7.4 Discussion of the results
5.8 Conclusion
6 Confusion model for KWS
6.1 Introduction
6.2 Keyword spotting system
6.2.1 Indexing and searching representation
6.2.2 Confusion model
6.2.3 Confusion model initialization
6.3 Confusion model training
6.3.1 The Figure of Merit
6.3.2 Discriminatively optimizing the Figure of Merit
6.4 Experimental setup
6.5 Results
6.6 Conclusion
7 Conclusion and Perspectives
7.1 Thesis summary
7.2 Perspectives
Appendix A Phoneme set for American English
Appendix B Publications
Bibliography




