Introduction
HLBFGS is a hybrid L-BFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno) optimization framework which unifies the L-BFGS method[1], the preconditioned L-BFGS method[2] and the preconditioned conjugate gradient method[3,9]. With different options, HLBFGS can also behave like the gradient-descent method, Newton's method or the conjugate-gradient method. I originally wrote this routine for the fast computation of large-scale centroidal Voronoi diagrams[4].
HLBFGS minimizes a multivariable function F(X) without constraints. Users only need to provide an initial guess for X and routines that compute the function value F(X0) and its gradient dF(X0) at a given point X0. HLBFGS can speed up the optimization process if users also provide the true Hessian of F(X). For technical details and the optimization theory behind HLBFGS, please consult the references listed at the end of this page.
Availability
HLBFGS is freely available for non-commercial purposes. The third-party software ICFS[5] is integrated into HLBFGS to modify the true Hessian when it is available. The line-search routine[6] comes from the original L-BFGS[1] code and TNPACK[7].
You can download the HLBFGS C++ package from here.
The package includes the HLBFGS routine, a sparse matrix class, a line-search routine and the ICFS library.
History
- March 9, 2010: released HLBFGS 1.2 to the public. Changes: OpenMP support and other minor adjustments.
- July 13, 2009: released HLBFGS 1.0 to the public.
Usage
- add all the HLBFGS files to your C++ project;
- include "HLBFGS.h" in your source;
- include "Lite_Sparse_Matrix.h" in your source if you would like to provide the Hessian of F(X);
- initialize HLBFGS routine by calling
INIT_HLBFGS(PARAMETER, INFO);
here PARAMETER is a double array (dimension >= 20) and INFO is an integer array (dimension >= 20).
The INIT_HLBFGS function sets the default values of PARAMETER and INFO;
- customize the values of PARAMETER and INFO, see the section HLBFGS Settings;
- optimize F(X) by calling
HLBFGS(N, M, INIT_X, EVALFUNC, EVALFUNC_H, HLBFGS_UPDATE_HESSIAN, NEWITERATION, PARAMETER, INFO);
here N is the number of variables, M is the number of L-BFGS updates (typical choices: 3-20), and INIT_X is the initial guess for the variables.
The users need to implement the functions:
void EVALFUNC(int N, double* X, double *PREV_X, double* F, double* G),
void NEWITERATION(int ITER, int CALL_ITER, double *X, double* F, double *G, double* GNORM).
EVALFUNC is used to evaluate the function value F(X) and its gradient G(X). PREV_X stores the variables from the last iteration.
NEWITERATION is used to collect internal information after each iteration: ITER stores the current iteration count, CALL_ITER stores the number of EVALFUNC calls made so far, and GNORM stores the Euclidean norm of the gradient. If the Hessian of F(X) is used, the function
void EVALFUNC_H(int N, double *X, double *PREV_X, double *F, double *G, HESSIAN_MATRIX& HESSIAN) must also be implemented. Users need to use the Lite_Sparse_Matrix class to store the lower triangular part of the Hessian. Please see the sample code for details.
Sample code is available (html version, cpp version); a minimal sketch is also given below.
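The following is a minimal, hypothetical driver (not part of the package) that minimizes f(x) = sum_i (x_i - 1)^2 with plain L-BFGS and no user-supplied Hessian. The callback and driver signatures follow the descriptions above; HLBFGS_UPDATE_HESSIAN is the default Hessian-update routine named in the call above, and passing 0 for EVALFUNC_H when no Hessian is supplied is an assumption based on the shipped sample. Check HLBFGS.h and the sample code for the exact names in your version.

#include <cstdio>
#include <vector>
#include "HLBFGS.h"

void EVALFUNC(int N, double* X, double* PREV_X, double* F, double* G)
{
	(void)PREV_X; // the previous iterate is not needed for this simple objective
	// f(x) = sum_i (x_i - 1)^2,  dF/dx_i = 2*(x_i - 1)
	*F = 0.0;
	for (int i = 0; i < N; i++)
	{
		double d = X[i] - 1.0;
		*F += d * d;
		G[i] = 2.0 * d;
	}
}

void NEWITERATION(int ITER, int CALL_ITER, double* X, double* F, double* G, double* GNORM)
{
	// report progress after each iteration
	std::printf("iter %d, calls %d, f = %g, ||g|| = %g\n", ITER, CALL_ITER, *F, *GNORM);
}

int main()
{
	const int N = 100;                // number of variables
	const int M = 7;                  // number of L-BFGS updates (3-20 is typical)
	std::vector<double> x(N, 0.0);    // initial guess

	double PARAMETER[20];
	int INFO[20];
	INIT_HLBFGS(PARAMETER, INFO);     // set the defaults listed in HLBFGS Settings
	INFO[4] = 1000;                   // maximum number of iterations
	INFO[5] = 1;                      // print progress messages

	// no user Hessian here (INFO[7] = 0), so 0 is passed in place of EVALFUNC_H
	HLBFGS(N, M, &x[0], EVALFUNC, 0, HLBFGS_UPDATE_HESSIAN, NEWITERATION, PARAMETER, INFO);
	return 0;
}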
HLBFGS Settings
The values of PARAMETER and INFO define the behavior of HLBFGS.
Entry | Default value | Meaning
PARAMETER[0] | 1.0e-4 | function tolerance used in line-search
PARAMETER[1] | 1.0e-16 | variable tolerance used in line-search
PARAMETER[2] | 0.9 | gradient tolerance used in line-search
PARAMETER[3] | 1.0e-20 | stpmin used in line-search
PARAMETER[4] | 1.0e+20 | stpmax used in line-search
PARAMETER[5] | 1.0e-9 | the stop criterion ( ||G||/max(1,||X||) < PARAMETER[5] )
PARAMETER[6] | 1.0e-10 | the stop criterion ( ||G|| < PARAMETER[6] )
PARAMETER[7..19] | | reserved
INFO[0] | 20 | the maximum number of evaluations in line-search
INFO[1] | 0 | the total number of EVALFUNC calls
INFO[2] | 0 | the current number of iterations
INFO[3] | 0 | the L-BFGS strategy; 0: standard, 1: M1QN3 strategy[8] (recommended)
INFO[4] | 100000 | the maximum number of iterations
INFO[5] | 1 | 1: print messages, 0: do nothing
INFO[6] | 10 | T, the update interval of the Hessian (typical choices: 0-200)
INFO[7] | 0 | 0: without Hessian, 1: with accurate Hessian
INFO[8] | 15 | ICFS parameter
INFO[9] | 0 | 0: classical line-search, 1: modified line-search (not useful in practice)
INFO[10] | 0 | 0: disable preconditioned CG, 1: enable preconditioned CG
INFO[11] | 1 | 0 or 1: different methods for choosing beta in CG
INFO[12] | 1 | internal usage; 0: only update the diagonal in USER_DEFINED_HLBFGS_UPDATE_H, 1: default
INFO[13..19] | | reserved
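For example, a caller that wants tighter convergence or a silent run might adjust the entries above right after INIT_HLBFGS; the values below are illustrative only, not required settings:

INIT_HLBFGS(PARAMETER, INFO);
PARAMETER[5] = 1.0e-10;   // stop when ||G|| / max(1, ||X||) < 1.0e-10
PARAMETER[6] = 1.0e-12;   // stop when ||G|| < 1.0e-12
INFO[4] = 500;            // allow at most 500 iterations
INFO[5] = 0;              // run silently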
In general, users only need to specify the values of INFO[3,4,5,6,7,10]. For different optimization methods, please check the following table.
Method | Setting
Gradient Descent | M=0, INFO[7]=0, INFO[10]=0
Conjugate Gradient | M=0, INFO[7]=0, INFO[10]=1
Newton's Method | M=0, INFO[6]=0, INFO[7]=1, INFO[10]=0
L-BFGS | M>=1, INFO[3]=0, INFO[7]=0, INFO[10]=0
M1QN3 | M>=1, INFO[3]=1, INFO[7]=0, INFO[10]=0
Preconditioned L-BFGS | M>=1, INFO[3]=0, INFO[6]=T>=0, INFO[7]=1, INFO[10]=0
Preconditioned M1QN3 | M>=1, INFO[3]=1, INFO[6]=T>=0, INFO[7]=1, INFO[10]=0
Preconditioned Conjugate Gradient without Hessian | M>=1, INFO[3]=0 or 1, INFO[7]=0, INFO[10]=1
Preconditioned Conjugate Gradient with Hessian | M>=1, INFO[3]=0 or 1, INFO[6]=T>=0, INFO[7]=1, INFO[10]=1
Other variants | ...
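As a concrete illustration (values taken directly from the table above), the fragment below selects M1QN3-style L-BFGS; M is the update count passed to HLBFGS. This is only a sketch of how the entries combine, not the only valid configuration:

// M1QN3 (recommended): M >= 1, no user Hessian, no preconditioned CG
int M = 7;
INFO[3]  = 1;    // M1QN3 updating strategy
INFO[7]  = 0;    // no accurate Hessian supplied
INFO[10] = 0;    // preconditioned CG disabled

// For preconditioned conjugate gradient with a user Hessian, keep M >= 1 and set
// INFO[3] = 0 or 1;  INFO[6] = T >= 0;  INFO[7] = 1;  INFO[10] = 1;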
To do
- Make the C++ interface and code more friendly;
- Understand and integrate nonsmooth L-BFGS.
Feedback
If you find any bugs or have any suggestions, please send an email to me (yangliu@microsoft.com).
References
- D. C. Liu and J. Nocedal, On the Limited Memory BFGS Method for Large Scale Optimization, Mathematical Programming B, 45, pp. 503-528, 1989.
- L. Jiang, R. H. Byrd, E. Eskow and R. B. Schnabel, A Preconditioned L-BFGS Algorithm with Application to Molecular Energy Minimization, Technical Report CU-CS-982-04, Department of Computer Science, 2004.
- R. Pytlak, Conjugate Gradient Algorithms in Nonconvex Optimization, Springer, 2008.
- Y. Liu, W. Wang, B. Lévy, F. Sun, D.-M. Yan, L. Lu and C. Yang, On Centroidal Voronoi Tessellation – Energy Smoothness and Fast Computation, ACM Transactions on Graphics, 28(4), pp. 1-17, 2009.
- C.-J. Lin and J. J. Moré, Incomplete Cholesky Factorizations with Limited Memory, SIAM Journal on Scientific Computing, 21, 1, pp. 24-45, 1999.
- J. J. Moré and D. J. Thuente, Line Search Algorithms with Guaranteed Sufficient Decrease, ACM Trans. Math. Softw., 20, 3, pp. 286-307, 1994.
- D. Xie and T. Schlick, Remark on the Updated Truncated Newton Minimization Package, Algorithm 702, ACM Trans. Math. Softw., 25, 1, pp. 108-122, 1999.
- J. Ch. Gilbert and C. Lemaréchal, Some Numerical Experiments with Variable-Storage Quasi-Newton Algorithms, Mathematical Programming, 45, pp. 407-435, 1989.
- R. Pytlak and T. Tarnawski, Preconditioned Conjugate Gradient Algorithms for Nonconvex Problems, in Proceedings of the 43rd IEEE Conference on Decision and Control (CDC), vol. 3, pp. 3191-3196, 2004.
Misc.
- There are some good L-BFGS libraries that I have used before: L-BFGS (Fortran, C++), L-BFGS-B (Fortran, C++), M1QN3 (Fortran).
- I am also interested in nonsmooth L-BFGS: LMBM (Limited Memory Bundle Method), Prof. Ladislav Luksan's code.
Updated on March 9, 2010