Transformer deep learning architecture

  1. Applications and Adaptations
    1. Natural Language Processing
      1. Machine Translation
        1. Sequence-to-Sequence Translation
        2. Multilingual Models
        3. Zero-shot Translation
      2. Text Summarization
        1. Extractive Summarization
        2. Abstractive Summarization
        3. Multi-document Summarization
      3. Question Answering
        1. Reading Comprehension
        2. Open-domain QA
        3. Conversational QA
      4. Language Generation
        1. Text Completion
        2. Creative Writing
        3. Code Generation
      5. Natural Language Understanding
        1. Sentiment Analysis
        2. Named Entity Recognition
        3. Relation Extraction
        4. Text Classification
    2. Computer Vision
      1. Vision Transformer (ViT)
        1. Image Patch Tokenization
        2. Position Embeddings for Images
        3. Classification Token
        4. Comparison with CNNs
      2. Detection Transformer (DETR)
        1. Object Detection
        2. Set Prediction
        3. Bipartite Matching
      3. Hybrid CNN-Transformer Models
        1. Feature Extraction Combination
        2. Multi-scale Processing
      4. Video Understanding
        1. Temporal Modeling
        2. Action Recognition
        3. Video Captioning
    3. Speech and Audio Processing
      1. Automatic Speech Recognition
        1. Wav2Vec 2.0
        2. Whisper
        3. End-to-end ASR
      2. Text-to-Speech Synthesis
        1. FastSpeech
        2. Transformer TTS
        3. Neural Vocoding
      3. Audio Classification
        1. Environmental Sound Classification
        2. Music Information Retrieval
      4. Speech Translation
        1. Direct Speech-to-Speech
        2. Cascaded Systems
    4. Multimodal Applications
      1. Vision-Language Models
        1. CLIP
        2. DALL-E
        3. Flamingo
      2. Video-Language Understanding
        1. Video Captioning
        2. Video Question Answering
      3. Audio-Visual Learning
        1. Cross-modal Retrieval
        2. Multimodal Fusion
    5. Scientific and Technical Domains
      1. Protein Structure Prediction
        1. AlphaFold
        2. Protein Language Models
        3. Structure-Function Relationships
      2. Drug Discovery
        1. Molecular Property Prediction
        2. Drug-Target Interaction
        3. Chemical Reaction Prediction
      3. Code Understanding
        1. Code Completion
        2. Bug Detection
        3. Code Translation
      4. Mathematical Reasoning
        1. Theorem Proving
        2. Equation Solving
        3. Mathematical Language Understanding