Optimization · Lecture 3

Multivariable Unconstrained Optimization

Classical Optimization Techniques

Dr. Mithun Mondal
SECTION 01

Introduction

  • Multivariable optimization: Finding extrema of functions of several variables without constraints.
  • Key tools: Gradient and Hessian provide critical information for finding extrema.
  • Applications: Used in engineering, economics, and machine learning for multidimensional problems.
  • Objective: This lecture covers definitions, conditions, and methods for classifying extrema.
SECTION 02

Multivariable Optimization Concepts


Key Definitions

  • Function: \( f(\mathbf{x}) \), where \( \mathbf{x} = (x_1, x_2, \ldots, x_n)^T \).
  • Gradient vector: \( \nabla f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)^T \).
  • Hessian matrix: \( H(f) = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]_{n \times n} \).

Theorem: Necessary Condition

If \( f(\mathbf{x}) \) has a local extremum at \( \mathbf{x}^* \) and is differentiable, then \( \nabla f(\mathbf{x}^*) = \mathbf{0} \).
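The necessary condition can be checked numerically. The sketch below (an illustrative example, not from the lecture) approximates the gradient by central finite differences and confirms it vanishes at the minimizer of \( f(x_1, x_2) = x_1^2 + x_2^2 \):

```python
import numpy as np

def num_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

f = lambda x: x[0]**2 + x[1]**2      # illustrative function; minimum at the origin
g0 = num_grad(f, np.array([0.0, 0.0]))
print(g0)   # ~ [0. 0.]: the gradient vanishes at the extremum
```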

SECTION 03

Taylor Series Expansion


Second-Order Approximation

For a function \( f(\mathbf{x}) \) about point \( \mathbf{x}^* \):

\[ f(\mathbf{x}) \approx f(\mathbf{x}^*) + \nabla f(\mathbf{x}^*)^T (\mathbf{x} - \mathbf{x}^*) + \frac{1}{2} (\mathbf{x} - \mathbf{x}^*)^T H(\mathbf{x}^*) (\mathbf{x} - \mathbf{x}^*) \]

  • First-order term: Gradient provides slope information.
  • Second-order term: Hessian provides curvature information.
  • Basis: This quadratic model underlies Newton-type optimization algorithms.
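For a quadratic function the second-order Taylor model is exact, which makes it easy to verify numerically. The sketch below expands the quadratic \( f(x, y) = x^2 + 2y^2 - 2xy - 2y \) (the function used in the worked example later in this lecture) about its critical point \( (1, 1) \):

```python
import numpy as np

f = lambda x: x[0]**2 + 2*x[1]**2 - 2*x[0]*x[1] - 2*x[1]

xs = np.array([1.0, 1.0])                    # expansion point (a critical point)
g  = np.array([0.0, 0.0])                    # gradient at xs (zero at a critical point)
H  = np.array([[2.0, -2.0], [-2.0, 4.0]])    # Hessian (constant for a quadratic)

def taylor2(x):
    """Second-order Taylor model of f about xs."""
    d = x - xs
    return f(xs) + g @ d + 0.5 * d @ H @ d

x = np.array([1.5, 0.5])
print(f(x), taylor2(x))   # both 0.25: the model is exact for a quadratic
```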
SECTION 04

Hessian Analysis Methods


Theorem: Second-Order Sufficient Condition

Let \( \nabla f(\mathbf{x}^*) = \mathbf{0} \). Then:

  • Positive definite Hessian: \( \mathbf{x}^* \) is a local minimum.
  • Negative definite Hessian: \( \mathbf{x}^* \) is a local maximum.
  • Indefinite Hessian: \( \mathbf{x}^* \) is a saddle point.

Checking Definiteness

  • Eigenvalue Test:
    • All \( \lambda_i > 0 \): Positive definite.
    • All \( \lambda_i < 0 \): Negative definite.
    • Mixed signs: Indefinite.
  • Sylvester’s Criterion:
    • All leading principal minors \( > 0 \): Positive definite.
    • Minors alternate in sign starting with \( D_1 < 0 \) (i.e. \( D_1 < 0, D_2 > 0, \ldots \)): Negative definite.
SECTION 05

Saddle Point

  • Definition: A stationary point that is neither a local maximum nor a local minimum.
  • Gradient Condition: All first partial derivatives are zero.
  • Surface Behavior: Curves upward in one direction, downward in another.

Example: For \( f(x, y) = x^2 - y^2 \):

  • Along \( x \)-axis: \( f(x, 0) = x^2 \) (upward curve).
  • Along \( y \)-axis: \( f(0, y) = -y^2 \) (downward curve).

Saddle point at \( (0,0) \): First partial derivatives are zero, Hessian is indefinite.
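The indefiniteness of the Hessian at the saddle point can be confirmed directly. For \( f(x, y) = x^2 - y^2 \) the Hessian is constant, and its eigenvalues have mixed signs:

```python
import numpy as np

# Hessian of f(x, y) = x**2 - y**2 (constant everywhere)
H = np.array([[2.0, 0.0], [0.0, -2.0]])
lam = np.linalg.eigvalsh(H)
print(lam)   # [-2.  2.]: mixed signs, so (0, 0) is a saddle point
```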

Saddle point of the function \(f(x,y)=x^2-y^2\)
SECTION 06

Sylvester’s Criterion for Positive Definiteness

  • Positive definite matrix: A symmetric matrix \( A \) with \( \mathbf{x}^T A \mathbf{x} > 0 \) for all non-zero \( \mathbf{x} \).
  • Leading principal minors: Determinants \( D_k \) of the upper-left \( k \times k \) submatrices of \( A \).
  • Sylvester’s Criterion: \( A \) is positive definite if and only if all leading principal minors are positive.
  • Steps: Compute \( D_1, D_2, \ldots, D_n \); every \( D_k \) must be positive.
SECTION 07

Example: Sylvester’s Criterion


Example: Consider the matrix:

\[ A = \begin{pmatrix} 4 & 1 & 2 \\ 1 & 3 & 1 \\ 2 & 1 & 2 \end{pmatrix} \]

  • Leading Principal Minors:
    • \( D_1 = 4 \).
    • \( D_2 = \det \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix} = 11 \).
    • \( D_3 = \det(A) = 10 \).
  • Conclusion: All minors positive (\( D_1 > 0, D_2 > 0, D_3 > 0 \)): Positive definite.
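The three minors can be verified with a few lines of numpy, computing \( D_1, D_2, D_3 \) as determinants of the leading submatrices of \( A \):

```python
import numpy as np

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 1.0],
              [2.0, 1.0, 2.0]])

# Leading principal minors D_1, D_2, D_3
minors = [np.linalg.det(A[:k, :k]) for k in (1, 2, 3)]
print([round(m) for m in minors])   # [4, 11, 10] -> all positive: A is positive definite
```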
SECTION 08

Example: Quadratic Function


Example

Find and classify critical points of \( f(x, y) = x^2 + 2y^2 - 2xy - 2y \).

Solution

  • Gradient: \( \nabla f = \begin{bmatrix} 2x - 2y \\ 4y - 2x - 2 \end{bmatrix} \).
  • Critical Point: Set \( \nabla f = \mathbf{0} \): \( 2x - 2y = 0 \), \( -2x + 4y = 2 \). Solution: \( x = 1, y = 1 \).
  • Hessian: \( H = \begin{bmatrix} 2 & -2 \\ -2 & 4 \end{bmatrix} \).
  • Definiteness:
    • Eigenvalues: \( \lambda_1 \approx 5.236 > 0 \), \( \lambda_2 \approx 0.764 > 0 \).
    • Principal minors: \( D_1 = 2 > 0 \), \( D_2 = 4 > 0 \).
  • Conclusion: Positive definite, so \( (1,1) \) is a local minimum.
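The solution above can be reproduced numerically. Since \( \nabla f = H \mathbf{x} + \mathbf{b} \) with \( \mathbf{b} = (0, -2)^T \), the critical point solves the linear system \( H \mathbf{x} = -\mathbf{b} \), and the eigenvalues of \( H \) classify it:

```python
import numpy as np

# f(x, y) = x**2 + 2*y**2 - 2*x*y - 2*y  =>  grad f = H @ (x, y) + b
H = np.array([[2.0, -2.0], [-2.0, 4.0]])
b = np.array([0.0, -2.0])

xstar = np.linalg.solve(H, -b)    # critical point: grad f = 0
lam = np.linalg.eigvalsh(H)       # curvature test via eigenvalue signs
print(xstar, lam)   # [1. 1.] with both eigenvalues positive -> local minimum
```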
SECTION 09

Summary

  • Key concepts: Gradient and Hessian are essential for multivariable optimization.
  • Optimality conditions: Identify and classify extrema using definiteness tests.
  • Sylvester’s Criterion: Determines Hessian definiteness via principal minors.
  • Applications: Examples illustrate practical use in engineering and data science.