Deep Learning for Audio Processing

  1. Key Applications and Tasks
    1. Speech Processing Applications
      1. Automatic Speech Recognition
        1. ASR System Components
          1. Acoustic Model
          2. Language Model
          3. Decoder
        2. Feature Extraction for ASR
          1. MFCC Features
          2. Filter Bank Features
          3. Raw Waveform Input
        3. Acoustic Modeling
          1. Phoneme Recognition
          2. Subword Unit Modeling
          3. Context-dependent Modeling
        4. Language Modeling
          1. N-gram Models
          2. Neural Language Models
          3. Transformer Language Models
        5. End-to-End ASR Systems
          1. Connectionist Temporal Classification
          2. Attention-based Sequence-to-Sequence
          3. RNN-Transducer
        6. ASR Evaluation
          1. Word Error Rate
          2. Character Error Rate
          3. Real-time Factor
      2. Speaker Recognition Systems
        1. Speaker Identification
          1. Closed-set Identification
          2. Open-set Identification
          3. Speaker Embedding Learning
        2. Speaker Verification
          1. Enrollment and Verification
          2. Threshold Selection
          3. Score Normalization
        3. Speaker Diarization
          1. Speaker Segmentation
          2. Speaker Clustering
          3. Overlap Detection
        4. Anti-spoofing
          1. Replay Attack Detection
          2. Synthetic Speech Detection
          3. Presentation Attack Detection
      3. Speech Synthesis
        1. Text-to-Speech Pipeline
          1. Text Analysis
          2. Phonetic Conversion
          3. Prosody Prediction
          4. Audio Generation
        2. Neural Vocoding
          1. WaveNet Vocoder
          2. WaveGlow
          3. HiFi-GAN
          4. Parallel WaveGAN
        3. End-to-End TTS Systems
          1. Tacotron Architecture
          2. FastSpeech Models
          3. VITS (Variational Inference TTS)
        4. Voice Conversion
          1. Parallel Voice Conversion
          2. Non-parallel Voice Conversion
          3. Many-to-Many Voice Conversion
      4. Speech Enhancement
        1. Noise Reduction
          1. Spectral Subtraction
          2. Wiener Filtering
          3. Deep Learning Denoising
        2. Speech Separation
          1. Single-channel Separation
          2. Multi-channel Separation
          3. Cocktail Party Problem
        3. Dereverberation
          1. Room Impulse Response Estimation
          2. Blind Dereverberation
        4. Bandwidth Extension
          1. Artificial Bandwidth Extension
          2. Super-resolution for Audio
      5. Emotion and Paralinguistic Analysis
        1. Emotion Recognition
          1. Categorical Emotion Models
          2. Dimensional Emotion Models
          3. Feature Engineering for Emotion
        2. Stress and Fatigue Detection
          1. Vocal Stress Indicators
          2. Cognitive Load Assessment
        3. Age and Gender Recognition
          1. Demographic Classification
          2. Biometric Applications
        4. Pathological Speech Analysis
          1. Disease Detection from Speech
          2. Speech Therapy Applications
    2. Music Information Retrieval
      1. Music Classification Tasks
        1. Genre Classification
          1. Genre Taxonomy
          2. Feature Engineering for Genre
          3. Deep Learning Approaches
        2. Mood and Emotion Classification
          1. Emotion Models in Music
          2. Valence-Arousal Mapping
          3. Cultural Considerations
        3. Instrument Recognition
          1. Single Instrument Classification
          2. Multi-instrument Detection
          3. Timbre Analysis
        4. Style and Artist Classification
          1. Musical Style Analysis
          2. Artist Identification
          3. Similarity Metrics
      2. Music Transcription and Analysis
        1. Automatic Music Transcription
          1. Monophonic Transcription
          2. Polyphonic Transcription
          3. Multi-F0 Estimation
        2. Chord Recognition
          1. Chord Vocabulary
          2. Harmonic Analysis
          3. Chord Progression Modeling
        3. Key Detection
          1. Tonal Analysis
          2. Key Profile Methods
          3. Statistical Approaches
        4. Structure Analysis
          1. Segment Boundary Detection
          2. Repetition Analysis
          3. Form Analysis
      3. Rhythm and Tempo Analysis
        1. Beat Tracking
          1. Onset Detection
          2. Beat Period Estimation
          3. Dynamic Programming Approaches
        2. Downbeat Detection
          1. Meter Analysis
          2. Strong Beat Identification
        3. Tempo Estimation
          1. Global Tempo Estimation
          2. Local Tempo Variations
          3. Tempo Curve Extraction
      4. Music Recommendation and Similarity
        1. Content-based Recommendation
          1. Audio Feature Similarity
          2. Semantic Similarity
        2. Collaborative Filtering
          1. User-based Methods
          2. Item-based Methods
        3. Hybrid Recommendation Systems
          1. Feature Combination
          2. Deep Learning Approaches
      5. Music Generation and Synthesis
        1. Symbolic Music Generation
          1. MIDI Generation
          2. Score Generation
          3. Style Transfer
        2. Audio Music Generation
          1. Raw Audio Synthesis
          2. Spectrogram Generation
          3. Conditional Generation
    3. Environmental Sound Analysis
      1. Acoustic Scene Classification
        1. Scene Taxonomy
          1. Indoor vs Outdoor Scenes
          2. Urban vs Natural Environments
        2. Feature Engineering for Scenes
          1. Long-term Spectral Features
          2. Temporal Dynamics
        3. Multi-label Scene Classification
          1. Overlapping Scene Categories
          2. Hierarchical Classification
      2. Sound Event Detection
        1. Event Taxonomy
          1. Event Ontologies
          2. Hierarchical Event Structure
        2. Temporal Event Detection
          1. Onset and Offset Detection
          2. Event Duration Modeling
        3. Polyphonic Event Detection
          1. Overlapping Events
          2. Multi-label Classification
        4. Weakly Supervised Learning
          1. Weak Label Learning
          2. Multiple Instance Learning
      3. Audio Tagging and Annotation
        1. Tag Vocabulary Design
          1. Semantic Tag Hierarchies
          2. Tag Relationships
        2. Multi-label Tagging
          1. Tag Co-occurrence
          2. Label Correlation
        3. Automatic Annotation
          1. Unsupervised Tagging
          2. Semi-supervised Learning
      4. Bioacoustic Analysis
        1. Animal Sound Recognition
          1. Species Classification
          2. Individual Recognition
        2. Biodiversity Monitoring
          1. Ecosystem Health Assessment
          2. Population Monitoring
        3. Marine Acoustics
          1. Whale Song Analysis
          2. Underwater Sound Classification
                                                                                                                                                                                                                                                                                2. Urban Sound Analysis
                                                                                                                                                                                                                                                                                  1. Traffic Noise Analysis
                                                                                                                                                                                                                                                                                    1. Vehicle Classification
                                                                                                                                                                                                                                                                                      1. Traffic Flow Estimation
                                                                                                                                                                                                                                                                                      2. Construction Noise Monitoring
                                                                                                                                                                                                                                                                                        1. Noise Level Assessment
                                                                                                                                                                                                                                                                                          1. Source Identification
                                                                                                                                                                                                                                                                                          2. Public Safety Applications
                                                                                                                                                                                                                                                                                            1. Gunshot Detection
                                                                                                                                                                                                                                                                                              1. Emergency Sound Recognition