Voice Technologies
1. Introduction to Voice Technologies
2. Fundamentals of Sound and Speech
3. Digital Signal Processing for Speech
4. Automatic Speech Recognition
5. Text-to-Speech Synthesis
6. Spoken Language Understanding
7. Advanced Voice Technologies
8. Voice Technology Applications
9. Implementation Challenges
10. Future Directions and Research
Text-to-Speech Synthesis
TTS System Architecture
  Frontend Processing
    Text Analysis
    Linguistic Processing
    Phonetic Conversion
  Backend Processing
    Prosody Generation
    Waveform Synthesis
    Post-Processing
Text Analysis and Preprocessing
  Text Normalization
    Number Expansion
    Abbreviation Handling
    Symbol Processing
    Currency and Date Formats
  Linguistic Analysis
    Part-of-Speech Tagging
    Syntactic Parsing
    Semantic Analysis
    Discourse Analysis
  Phonetic Transcription
    Grapheme-to-Phoneme Conversion
      Dictionary Lookup
      Rule-Based Methods
      Statistical Methods
    Homograph Resolution
      Context Analysis
      Disambiguation Rules
      Machine Learning Approaches
Prosody Modeling
  Prosodic Features
    Fundamental Frequency
    Duration Modeling
    Intensity Patterns
    Pause Insertion
  Prosody Prediction
    Rule-Based Approaches
    Statistical Models
    Neural Prosody Models
  Emotional Prosody
    Emotion Categories
    Prosodic Correlates
    Style Transfer
Synthesis Techniques
  Concatenative Synthesis
    Unit Selection
      Database Design
      Unit Types
      Selection Criteria
      Concatenation Methods
    Diphone Synthesis
      Diphone Inventory
      Smoothing Techniques
      Prosody Modification
  Parametric Synthesis
    Source-Filter Models
      Vocal Tract Modeling
      Excitation Generation
      Filter Design
    HMM-Based Synthesis
      Statistical Parameter Generation
      Global Variance
      Speaker Adaptation
    Vocoder Technologies
      WORLD Vocoder
      STRAIGHT Vocoder
      Neural Vocoders
  Neural Synthesis
    Sequence-to-Sequence Models
      Tacotron Architecture
      Attention Mechanisms
      Decoder Design
    Autoregressive Models
      WaveNet Architecture
      Dilated Convolutions
      Conditioning Mechanisms
    Non-Autoregressive Models
      FastSpeech Architecture
      Duration Prediction
      Parallel Generation
    Flow-Based Models
      WaveGlow Architecture
      Normalizing Flows
      Invertible Transformations
Voice Quality and Adaptation
  Speaker Modeling
    Speaker Embeddings
    Voice Conversion
    Speaker Adaptation
  Multi-Speaker Systems
    Speaker Selection
    Voice Cloning
    Few-Shot Learning
  Style Control
    Speaking Style Transfer
    Emotion Control
    Prosodic Modification
TTS Evaluation
  Subjective Evaluation
    Mean Opinion Score
    Preference Tests
    Listening Test Design
  Objective Evaluation
    Intelligibility Measures
    Naturalness Metrics
    Prosody Evaluation
  Automatic Evaluation
    Mel Cepstral Distortion
    Fundamental Frequency Error
    Duration Accuracy