Deep Learning for Computer Vision
1. Foundations of Computer Vision and Deep Learning
2. Convolutional Neural Networks
3. Training Deep Vision Models
4. Classical CNN Architectures
5. Modern CNN Architectures
6. Core Computer Vision Tasks
7. Advanced Topics and Applications
8. Practical Implementation and Deployment
Advanced Topics and Applications
Generative Models for Vision
Autoencoders
Encoder-Decoder Architecture
Bottleneck Representation
Reconstruction Loss
Applications
Dimensionality Reduction
Denoising
Anomaly Detection
Variational Autoencoders
Probabilistic Encoding
Latent Space Distribution
Reparameterization Trick
KL Divergence Loss
Image Generation
Latent Space Interpolation
Generative Adversarial Networks
Two-player Game
Generator Network
Discriminator Network
Adversarial Loss
Training Dynamics
Nash Equilibrium
Mode Collapse
Training Instability
GAN Variants
DCGAN
Convolutional Architecture
Training Guidelines
Conditional GAN
Class-conditional Generation
Pix2Pix
Image-to-image Translation
CycleGAN
Unpaired Translation
StyleGAN
Style-based Generation
Progressive Growing
BigGAN
Large-scale Generation
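As an illustration of the Reparameterization Trick and KL Divergence Loss entries under Variational Autoencoders, the following is a minimal sketch assuming a diagonal Gaussian encoder and a standard normal prior. The function names are hypothetical and plain Python lists stand in for tensors; a real model would use a deep learning framework.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    The randomness lives in eps, so z stays a differentiable
    function of mu and log_var (the point of the trick).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# A posterior that already equals the prior has zero KL.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
# One latent sample from an assumed 2-dimensional posterior.
print(reparameterize([0.5, -1.0], [0.0, -2.0], random.Random(0)))
```

In a VAE objective this KL term is added to the reconstruction loss; how the two terms are weighted is a design choice not specified in this outline.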
Attention Mechanisms
Attention Concept
Selective Focus
Weighted Aggregation
Spatial Attention
Location-based Attention
Attention Maps
Channel Attention
Feature Channel Selection
Global Context
Self-Attention
Query-Key-Value Framework
Multi-head Attention
Non-local Operations
Cross-Attention
Multi-modal Attention
Feature Fusion
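The Query-Key-Value Framework and Weighted Aggregation entries can be made concrete with a small sketch of scaled dot-product attention for a single head. This is an illustrative pure-Python version under the usual softmax(QK^T / sqrt(d_k))V formulation; the function names are hypothetical, and learned projection matrices are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for one attention head.

    queries: list of d_k-dim vectors, keys: list of d_k-dim vectors,
    values: list of d_v-dim vectors (one per key).
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted aggregation of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


out, attn = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(attn)  # identical keys give uniform weights
print(out)   # so the output is the mean of the values
```

Self-attention uses the same feature set for queries, keys, and values, while cross-attention takes queries from one source and keys and values from another.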
Vision Transformers
Transformer Architecture
Self-attention Mechanism
Multi-head Attention
Position Encoding
Feed-forward Networks
Vision Transformer (ViT)
Patch Embedding
Linear Projection
Classification Token
Positional Embeddings
Training Considerations
Large Dataset Requirements
Pre-training Strategies
ViT Variants
DeiT
Data-efficient Training
Distillation Token
Swin Transformer
Hierarchical Architecture
Shifted Windows
PVT
Pyramid Vision Transformer
Hybrid Architectures
CNN-Transformer Combinations
ConViT
CvT
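To illustrate the Patch Embedding, Linear Projection, Classification Token, and Positional Embeddings entries under Vision Transformer (ViT), here is a rough sketch of how an image becomes a token sequence. It assumes a single-channel image stored as nested lists and non-overlapping square patches; all names and the tiny weights are hypothetical placeholders for learned parameters.

```python
def split_into_patches(image, patch_size):
    """Split an H x W image into flattened, non-overlapping patches (row-major)."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[r][c]
                            for r in range(top, top + patch_size)
                            for c in range(left, left + patch_size)])
    return patches


def linear_projection(patch, weight):
    """Project a flattened patch with an (embed_dim x patch_len) weight matrix."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weight]


def embed_image(image, patch_size, weight, cls_token, pos_embed):
    """Build the token sequence: [class token] + projected patches, plus positions."""
    tokens = [list(cls_token)]
    tokens += [linear_projection(p, weight)
               for p in split_into_patches(image, patch_size)]
    if len(pos_embed) != len(tokens):
        raise ValueError("need one positional embedding per token")
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(tokens, pos_embed)]


# 4 x 4 toy image with pixel values 0..15, split into 2 x 2 patches.
image = [[4 * r + c for c in range(4)] for r in range(4)]
tokens = embed_image(image, 2, weight=[[1, 1, 1, 1]],
                     cls_token=[0.0], pos_embed=[[0.0]] * 5)
print(len(tokens))  # 4 patches + 1 class token = 5 tokens
```

By the same arithmetic, a 224 x 224 image with 16 x 16 patches yields 14 x 14 = 196 patch tokens, or 197 tokens once the classification token is prepended.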
Video Understanding
Video Data Characteristics
Temporal Dimension
Motion Information
Computational Challenges
Video Representation
Frame Sampling
Optical Flow
Motion Vectors
3D CNNs
Spatiotemporal Convolutions
3D Filters
C3D Architecture
I3D Networks
Two-stream Networks
RGB Stream
Optical Flow Stream
Late Fusion
Recurrent Approaches
LSTM for Video
ConvLSTM
Sequence Modeling
Video Tasks
Action Recognition
Temporal Action Localization
Spatio-temporal Detection
Video Object Detection
Temporal Consistency
Feature Aggregation
Video Segmentation
Temporal Propagation
Object Tracking
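For the Frame Sampling entry under Video Representation, one common strategy is uniform sampling: divide the clip into equal segments and take one frame per segment. The sketch below picks the frame at the center of each segment; this is one assumed variant among several (random-offset and dense sampling are alternatives), and the function name is hypothetical.

```python
def uniform_frame_indices(num_frames, num_samples):
    """Return num_samples frame indices spread evenly over a clip.

    The clip is split into num_samples equal segments and the frame
    at the center of each segment is selected. If the clip is shorter
    than num_samples, indices repeat.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    segment = num_frames / num_samples
    return [min(num_frames - 1, int(segment * (i + 0.5)))
            for i in range(num_samples)]


print(uniform_frame_indices(100, 4))  # [12, 37, 62, 87]
```

Sampling a fixed number of frames keeps the input size constant regardless of clip length, which helps with the computational cost of processing the temporal dimension.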
Few-shot and Meta-learning
Problem Definition
Limited Training Data
Quick Adaptation
Few-shot Learning
N-way K-shot Classification
Support and Query Sets
Meta-learning Approaches
Model-Agnostic Meta-Learning
Gradient-based Meta-learning
Metric Learning
Siamese Networks
Triplet Networks
Prototypical Networks
Memory-augmented Networks
External Memory
Attention Mechanisms
Applications
Rare Disease Diagnosis
New Class Recognition
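The N-way K-shot Classification, Support and Query Sets, and Prototypical Networks entries can be tied together with a small sketch of prototype-based classification. It assumes embeddings have already been produced by some embedding network (not shown), uses squared Euclidean distance, and all names are hypothetical.

```python
def class_prototypes(support):
    """Map each label to the mean of its support embeddings.

    support: dict of label -> list of K embedding vectors.
    """
    return {label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
            for label, vectors in support.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Assign the query embedding to the label of the nearest prototype."""
    return min(prototypes,
               key=lambda label: squared_distance(query, prototypes[label]))


# A toy 2-way 2-shot episode with 2-dimensional embeddings.
support = {
    "cat": [[0.0, 0.0], [0.0, 2.0]],
    "dog": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = class_prototypes(support)
print(prototypes)                        # {'cat': [0.0, 1.0], 'dog': [5.0, 4.0]}
print(classify([1.0, 1.0], prototypes))  # cat
```

During training, the negative distances to the prototypes are typically passed through a softmax to give class probabilities for the query set; only the inference step is shown here.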