Vector Search and Embeddings

  1. ANN Indexing Algorithms and Data Structures
    1. Tree-based Methods
      1. KD-Trees
        1. Binary Space Partitioning
        2. Construction Algorithm
        3. Search Process
        4. Performance in High Dimensions
      2. Ball Trees
        1. Hypersphere Partitioning
        2. Distance Metric Flexibility
      3. Annoy (Approximate Nearest Neighbors Oh Yeah)
        1. Random Projection Trees
        2. Forest of Trees
        3. Memory Mapping
        4. Use Cases and Limitations
    2. Hashing-based Methods
      1. Locality-Sensitive Hashing
        1. Hash Function Families
        2. Collision Probability Theory
        3. Multi-hash Tables
        4. Parameter Tuning
      2. Random Projection Hashing
        1. Johnson-Lindenstrauss Lemma
        2. Gaussian Random Projections
        3. Sparse Random Projections
      3. Learning to Hash
        1. Supervised Hashing
        2. Deep Hashing Methods
    3. Graph-based Methods
      1. Hierarchical Navigable Small World
        1. Multi-layer Graph Structure
        2. Construction Algorithm
        3. Search Algorithm
        4. Parameter Optimization
      2. NSW (Navigable Small World)
        1. Single-layer Graphs
        2. Connection Strategies
      3. Vamana
        1. Degree-bounded Graphs
        2. Robust Pruning
        3. Search Efficiency
      4. Other Graph Methods
        1. SPTAG
        2. FAISS Graph Indexes
    4. Quantization-based Methods
      1. Product Quantization
        1. Vector Space Decomposition
        2. Codebook Learning
        3. Asymmetric Distance Computation
        4. Memory Compression
      2. Scalar Quantization
        1. Uniform Quantization
        2. Non-uniform Quantization
        3. Quantization Error Analysis
      3. Optimized Product Quantization
        1. Optimized PQ
        2. Additive Quantization
    5. Inverted File Systems
      1. IVF Index Structure
        1. Voronoi Cell Partitioning
        2. Centroid-based Clustering
        3. Search Process
      2. IVF-PQ Combination
        1. Two-level Quantization
        2. Memory and Speed Benefits
        3. Parameter Selection
      3. IVF-HNSW Hybrid
        1. Combining Partitioning and Graphs
        2. Performance Characteristics