

# -*- mode: org -*-

Testing unconstrained optimization in C++

“What I cannot create, I do not understand.” - Feynman

The goal is to test my own implementations of gradient descent and Quasi-Newton methods. In the process I also created my own Vector and Matrix classes.

Using: cmake, C++11. Testing with: Armadillo, Eigen.

Reading sources

Read C++ tips by Herbert Simon

Look into Numerical Recipes for efficiency

Look into decompositions

LU, Cholesky, QR, Eigendecomposition and SVD in particular

Look into optimization code

Read proof of Gradient Descent convergence with Line Search

The key element in the proof is to bound the sum $\sum_k \cos^2(\theta_k)\,\|\nabla f_k\|^2$ (Zoutendijk's condition).

The term appearing in the linear rate of convergence depends on the minimum and maximum eigenvalues of the curvature.
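Written out, the two ingredients above are (following the standard treatment in Nocedal & Wright; $\lambda_1 \le \dots \le \lambda_n$ denote the Hessian eigenvalues):

```latex
% Zoutendijk condition (global convergence of line-search descent):
\sum_{k \ge 0} \cos^2(\theta_k)\, \|\nabla f_k\|^2 < \infty

% Linear rate of steepest descent on a strongly convex quadratic:
f(x_{k+1}) - f^{*} \;\le\; \left( \frac{\lambda_n - \lambda_1}{\lambda_n + \lambda_1} \right)^{2} \bigl( f(x_k) - f^{*} \bigr)
```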

Read Newton’s Method convergence proof

Newton’s method has a local convergence proof, which depends on the Hessian being Lipschitz continuous (so in particular the function f is C^2).

Global convergence is achieved ONLY if local convergence holds and, from a far-away starting point, descent directions can be guaranteed with a modified matrix-factorization approach.

Read QN local convergence proof

The QN local convergence relies on the Dennis–Moré condition: the Hessian-approximation error along the QN descent directions $p_k$ is small, i.e. $\|(B_k - \nabla^2 f(x^*))\, p_k\| = o(\|p_k\|)$.

Read QN global convergence proof

Implement linear algebra library

Implement vector and matrix types

Including typedefs for the vector matrix types

Allocate/deallocate vectors and matrices

Initially used malloc and free; checking for leaks with valgrind was very useful. Then moved to C++ new/delete.

Introduce class structure for vectors and matrices

Classes contain the raw vector/matrix types. These are std::vectors instead of raw arrays; the destructors were removed after introducing std::vector.

Introduce matrix-vector multiplication

To do that without violating encapsulation I introduced an inner_product method in the Vector class. Let’s see if it is fast enough!
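A minimal sketch of what this could look like (the class layout and method signatures are illustrative, not the actual project code): the matrix stores its rows as Vectors, so A*x reduces to one inner_product per row and the raw storage never leaks out of the class.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Illustrative Vector: wraps std::vector<double> and exposes
// inner_product so callers never touch the raw storage.
class Vector {
public:
    explicit Vector(std::vector<double> d) : data_(std::move(d)) {}
    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }
    double inner_product(const Vector& other) const {
        return std::inner_product(data_.begin(), data_.end(),
                                  other.data_.begin(), 0.0);
    }
private:
    std::vector<double> data_;
};

// Illustrative Matrix stored as rows; operator* computes one
// inner product per row.
class Matrix {
public:
    explicit Matrix(std::vector<Vector> rows) : rows_(std::move(rows)) {}
    Vector operator*(const Vector& x) const {
        std::vector<double> y;
        y.reserve(rows_.size());
        for (const Vector& row : rows_)
            y.push_back(row.inner_product(x));
        return Vector(y);
    }
private:
    std::vector<Vector> rows_;
};
```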

Compare speed with Eigen and Armadillo matrices

Comparing 1000x1000 matrix multiplication with a 1000x1 vector, Eigen seems to be the fastest on my laptop. My implementation is of course slower, but not too bad!

Asserting that my implementation takes no more than 3 times the time of Armadillo/Eigen!
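The 3x assertion could be structured roughly like this with std::chrono (the helper names and benchmark closures here are hypothetical, not the project's actual test code):

```cpp
#include <chrono>
#include <functional>

// Illustrative timing helper: run a callable once and return the
// elapsed wall-clock time in seconds.
double time_seconds(const std::function<void()>& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Hypothetical check: my mat-vec run must be at most 3x slower than
// the reference (Eigen/Armadillo) run. `my_run` and `ref_run` stand
// in for the real benchmark closures.
bool within_3x(const std::function<void()>& my_run,
               const std::function<void()>& ref_run) {
    return time_seconds(my_run) <= 3.0 * time_seconds(ref_run);
}
```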

Compare accuracy with Eigen and Armadillo

Comparing the accuracy in a unit test that I created, requiring that the entries of the output vector equal the Armadillo result. Both the Armadillo and my own matrices and vectors are seeded with the C++11 random number generator and a fixed seed.

Add matrix inversion

Compare speed of matrix inversion Eigen/Arma

Test accuracy of matrix inversion

Test gradient descent for an arbitrary function

Use a learning rate

Added a learning rate, and xtol and ftol termination criteria. I also had to introduce -=, +=, *, and norm operations on vectors.
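A minimal sketch of such a fixed-step loop with xtol/ftol termination, using std::vector as a stand-in for the project's Vector type (function names and the termination logic are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative fixed-learning-rate gradient descent. Terminates when
// the step size drops below xtol, the decrease in f drops below ftol,
// or max_iter iterations are reached.
std::vector<double> gradient_descent(
    const std::function<double(const std::vector<double>&)>& f,
    const std::function<std::vector<double>(const std::vector<double>&)>& grad,
    std::vector<double> x, double rate,
    double xtol, double ftol, int max_iter) {
    double f_prev = f(x);
    for (int k = 0; k < max_iter; ++k) {
        std::vector<double> g = grad(x);
        double step_sq = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] -= rate * g[i];                    // x -= rate * grad f(x)
            step_sq += (rate * g[i]) * (rate * g[i]);
        }
        double f_cur = f(x);
        if (std::sqrt(step_sq) < xtol || std::fabs(f_prev - f_cur) < ftol)
            break;                                  // xtol / ftol criteria
        f_prev = f_cur;
    }
    return x;
}
```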

Add a common interface to grad descent

Added a templated gradient descent function that can work with the Armadillo and Eigen libraries.

I had to wrap Eigen’s x.norm() as norm(x) for compatibility. In the end there are three specializations which have to be explicitly spelled out in the source file for the compiler.
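The wrapper pattern could look roughly like this (std::vector stands in for my own Vector type; the Eigen/Armadillo overloads are shown only as comments, since their headers may not be available):

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Illustrative free-function wrapper: a generic norm(x) with one
// overload per vector type gives the templated optimizer a single
// interface. The library overloads would simply forward, e.g.:
//   double norm(const Eigen::VectorXd& x) { return x.norm(); }
//   double norm(const arma::vec& x)       { return arma::norm(x); }
double norm(const std::vector<double>& x) {
    double s = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    return std::sqrt(s);
}
```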

Add unit tests comparing the results with three libs

Added a unit test for optimizing f(x) = x’*x in ten dimensions. I again had to introduce three different specializations for the cost function and its gradient.

Add line search

Added templates for gradient descent with line search and with fixed learning rate.

I additionally had to introduce vdot and vnorm(.) template functions that are specialized to Vector, arma::vec, and Eigen::VectorXd vectors inside the implementation files (i.e. optim_impl_xx.cpp).

Test with various test functions

After failed attempts to find a concise way to template multiple test functions (with gradients), I now wrap each one in a structure and pass a vector of such structures to gradient descent.
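A sketch of the structure-plus-vector idea (the field names and the sample problem are illustrative, not the project's actual code): each test problem bundles the objective, its gradient, a starting point, and a name for reporting, and the driver loops over a vector of them.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Illustrative bundle of one unconstrained test problem.
struct TestFunction {
    std::string name;
    std::function<double(const std::vector<double>&)> f;
    std::function<std::vector<double>(const std::vector<double>&)> grad;
    std::vector<double> x0;  // starting point
};

// Build a suite of test problems; here just the sphere function
// f(x) = sum x_i^2 with gradient 2x as a stand-in example.
std::vector<TestFunction> make_test_suite() {
    std::vector<TestFunction> suite;
    suite.push_back({
        "sphere",
        [](const std::vector<double>& x) {
            double s = 0.0;
            for (double xi : x) s += xi * xi;
            return s;
        },
        [](const std::vector<double>& x) {
            std::vector<double> g(x.size());
            for (std::size_t i = 0; i < x.size(); ++i) g[i] = 2.0 * x[i];
            return g;
        },
        {1.0, 1.0, 1.0}});
    return suite;
}
```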

Test Newton’s method

Compute Hessian matrix

Test with line search

Add trust region method (Nocedal et al.)

Add Quasi-Newton method (BFGS)
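For reference, the standard BFGS update of the inverse-Hessian approximation (as given in Nocedal & Wright):

```latex
% BFGS update of the inverse-Hessian approximation H_k, with
% s_k = x_{k+1} - x_k, \quad y_k = \nabla f_{k+1} - \nabla f_k,
% \quad \rho_k = 1 / (y_k^T s_k):
H_{k+1} = \left(I - \rho_k s_k y_k^T\right) H_k \left(I - \rho_k y_k s_k^T\right) + \rho_k s_k s_k^T
```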

Add extensive test functions and evaluations

I can first check NLOPT to see which functions they test for unconstrained optimization.

Check internet for unconstr. optim examples

Report termination info

Check other libraries

Extend vector & matrix computations with BLAS/LAPACK

Compare with Eigen unsupported module (MINPACK in C++)

Compare with NLOPT routines for speed