Optimization Theory

  1. Unconstrained Optimization
    1. Theoretical Foundations
      1. Necessary Optimality Conditions
        1. First-Order Necessary Conditions
          1. Gradient Conditions
          2. Critical Points
        2. Second-Order Necessary Conditions
          1. Hessian Conditions
      2. Sufficient Optimality Conditions
        1. Second-Order Sufficient Conditions
        2. Strict Convexity Implications
    2. Line Search Strategies
      1. Exact Line Search
        1. One-Dimensional Optimization
        2. Golden Section Search
        3. Fibonacci Search
      2. Inexact Line Search
        1. Armijo Rule (Sufficient Decrease)
        2. Wolfe Conditions
          1. Curvature Condition
          2. Strong Wolfe Conditions
        3. Goldstein Conditions
      3. Step Size Selection
        1. Fixed Step Size
        2. Adaptive Step Size
        3. Diminishing Step Size Rules
    3. Gradient-Based Methods
      1. Steepest Descent Method
        1. Algorithm Description
        2. Convergence Analysis
          1. Linear Convergence Rate
          2. Dependence on Condition Number
        3. Advantages and Limitations
      2. Conjugate Gradient Methods
        1. Linear Conjugate Gradient
          1. Conjugate Directions
          2. Krylov Subspaces
        2. Nonlinear Conjugate Gradient
          1. Fletcher-Reeves Formula
          2. Polak-Ribière Formula
          3. Hestenes-Stiefel Formula
      3. Momentum Methods
        1. Heavy Ball Method
        2. Nesterov Acceleration
        3. Convergence Acceleration Properties
    4. Newton-Type Methods
      1. Newton's Method
        1. Algorithm Derivation
        2. Quadratic Convergence
        3. Computational Requirements
        4. Modified Newton Methods
      2. Quasi-Newton Methods
        1. Secant Equation
        2. BFGS Method
          1. Update Formula
          2. Positive Definiteness Preservation
        3. DFP Method
        4. Limited-Memory BFGS (L-BFGS)
        5. Broyden Family of Updates
      3. Trust Region Methods
        1. Trust Region Subproblem
        2. Cauchy Point
        3. Dogleg Method
        4. Trust Region Radius Updates
    5. Specialized First-Order Methods
      1. Stochastic Gradient Descent
        1. Algorithm Variants
        2. Convergence in Expectation
        3. Mini-Batch Variants
      2. Adaptive Gradient Methods
        1. AdaGrad
        2. RMSprop
        3. Adam Optimizer
        4. AdaDelta
      3. Coordinate Descent Methods
        1. Cyclic Coordinate Descent
        2. Random Coordinate Descent
        3. Block Coordinate Descent