CGNM: Cluster Gauss-Newton Method

library(CGNM)
library(knitr)

When and when not to use CGNM

Use CGNM

Not to use CGNM

How to use CGNM

To illustrate the use of CGNM, we estimate two sets of best-fit parameters of a pharmacokinetic model for an orally administered drug (a situation known as flip-flop kinetics).

Prepare the model (\(\boldsymbol f\))

model_function=function(x){

  observation_time=c(0.1,0.2,0.4,0.6,1,2,3,6,12)
  Dose=1000
  F=1

  ka=x[1]
  V1=x[2]
  CL_2=x[3]
  t=observation_time

  Cp=ka*F*Dose/(V1*(ka-CL_2/V1))*(exp(-CL_2/V1*t)-exp(-ka*t))

  log10(Cp)
}

Prepare the data (\(\boldsymbol y^*\))

observation=log10(c(4.91, 8.65, 12.4, 18.7, 24.3, 24.5, 18.4, 4.66, 0.238))

Run Cluster_Gauss_Newton_method

Here we specify the lower and upper ranges of the initial guess (initial_lowerRange and initial_upperRange).

CGNM_result=Cluster_Gauss_Newton_method(nonlinearFunction=model_function,
targetVector = observation,
initial_lowerRange =rep(0.01,3),initial_upperRange =  rep(100,3),lowerBound = rep(0,3), saveLog=TRUE, num_minimizersToFind = 500, ParameterNames = c("Ka","V1","CL"))
#> [1] "checking if the nonlinearFunction can be evaluated at the initial_lowerRange"
#> [1] "Evaluation Successful"
#> [1] "checking if the nonlinearFunction can be evaluated at the initial_upperRange"
#> [1] "Evaluation Successful"
#> [1] "checking if the nonlinearFunction can be evaluated at the (initial_upperRange+initial_lowerRange)/2"
#> [1] "Evaluation Successful"
#> [1] "Generating initial cluster. 494 out of 500 done"
#> [1] "Generating initial cluster. 500 out of 500 done"
#> [1] "Iteration:1  Median sum of squares residual=5.78695374635533"
#> [1] "Iteration:2  Median sum of squares residual=2.78492629823195"
#> [1] "Iteration:3  Median sum of squares residual=1.30447583054192"
#> [1] "Iteration:4  Median sum of squares residual=0.926981167548036"
#> [1] "Iteration:5  Median sum of squares residual=0.870756157193424"
#> [1] "Iteration:6  Median sum of squares residual=0.456204091263342"
#> [1] "Iteration:7  Median sum of squares residual=0.0195680204393633"
#> [1] "Iteration:8  Median sum of squares residual=0.00742974026899623"
#> [1] "Iteration:9  Median sum of squares residual=0.0073500512787216"
#> [1] "Iteration:10  Median sum of squares residual=0.00734923654505464"
#> [1] "Iteration:11  Median sum of squares residual=0.00734923415124815"
#> [1] "Iteration:12  Median sum of squares residual=0.0073492340839065"
#> [1] "Iteration:13  Median sum of squares residual=0.00734923408294082"
#> [1] "Iteration:14  Median sum of squares residual=0.00734923408246023"
#> [1] "Iteration:15  Median sum of squares residual=0.00734923408238856"
#> [1] "Iteration:16  Median sum of squares residual=0.00734923408238727"
#> [1] "Iteration:17  Median sum of squares residual=0.0073492340823824"
#> [1] "Iteration:18  Median sum of squares residual=0.00734923408238163"
#> [1] "Iteration:19  Median sum of squares residual=0.00734923408238108"
#> [1] "Iteration:20  Median sum of squares residual=0.00734923408238098"
#> [1] "Iteration:21  Median sum of squares residual=0.00734923408238092"
#> [1] "Iteration:22  Median sum of squares residual=0.00734923408238089"
#> [1] "Iteration:23  Median sum of squares residual=0.00734923408238084"
#> [1] "Iteration:24  Median sum of squares residual=0.00734923408238084"
#> [1] "Iteration:25  Median sum of squares residual=0.00734923408238082"
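The "sum of squares residual" printed in the log is the squared Euclidean distance between the model output and the target vector. As a minimal sketch in base R, the value for one of the accepted minimizers reported in this vignette can be recomputed by hand. The helper name `ssr` and the variable `bioavailability` are ours and are not part of the CGNM package.

```r
# Model and data copied from the sections above so the block is self-contained.
model_function <- function(x) {
  t <- c(0.1, 0.2, 0.4, 0.6, 1, 2, 3, 6, 12)
  Dose <- 1000
  bioavailability <- 1  # called F in the original model
  ka <- x[1]; V1 <- x[2]; CL <- x[3]
  Cp <- ka * bioavailability * Dose / (V1 * (ka - CL / V1)) *
    (exp(-CL / V1 * t) - exp(-ka * t))
  log10(Cp)
}
observation <- log10(c(4.91, 8.65, 12.4, 18.7, 24.3, 24.5, 18.4, 4.66, 0.238))

# Hypothetical helper: sum of squared residuals on the log10 scale.
ssr <- function(x) sum((model_function(x) - observation)^2)

# One accepted approximate minimizer (Ka, V1, CL) reported in this vignette.
x_hat <- c(0.5178956, 10.66084, 9.877326)
ssr_hat <- ssr(x_hat)
print(ssr_hat)
```

Because most of the cluster converges to the same SSR, the printed value should be close to the median SSR of about 0.00735 shown in the final iterations of the log.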

Obtain the approximate minimizers

kable(head(acceptedApproximateMinimizers(CGNM_result)))
| Ka | V1 | CL |
|---|---|---|
| 0.5178956 | 10.66084 | 9.877326 |
| 0.9265056 | 19.07204 | 9.877326 |
| 0.9265056 | 19.07204 | 9.877326 |
| 0.9265056 | 19.07204 | 9.877326 |
| 0.5178844 | 10.66075 | 9.877737 |
| 0.5178956 | 10.66084 | 9.877326 |
kable(table_parameterSummary(CGNM_result))
| CGNM: | Minimum | 25 percentile | Median | 75 percentile | Maximum |
|---|---|---|---|---|---|
| Ka | 0.5178466 | 0.5178956 | 0.5178956 | 0.9265056 | 0.9265984 |
| V1 | 10.6593220 | 10.6608370 | 10.6608373 | 19.0720415 | 19.0742024 |
| CL | 9.8773041 | 9.8773257 | 9.8773258 | 9.8773258 | 9.8779889 |
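The two groups of values in the tables above (Ka near 0.518 with V1 near 10.66, and Ka near 0.927 with V1 near 19.07) are the flip-flop pair: exchanging the absorption rate ka with the elimination rate CL/V1 while keeping CL fixed leaves the predicted concentration unchanged. The following is a minimal sketch in base R that checks this numerically. The helper `flip_flop` and the variable `bioavailability` are our own illustrative names, not part of the CGNM package.

```r
# Same model as in this vignette, repeated so the block is self-contained.
model_function <- function(x) {
  t <- c(0.1, 0.2, 0.4, 0.6, 1, 2, 3, 6, 12)
  Dose <- 1000
  bioavailability <- 1  # called F in the original model
  ka <- x[1]; V1 <- x[2]; CL <- x[3]
  Cp <- ka * bioavailability * Dose / (V1 * (ka - CL / V1)) *
    (exp(-CL / V1 * t) - exp(-ka * t))
  log10(Cp)
}

# Hypothetical helper: swap ka with ke = CL / V1 and keep CL unchanged.
flip_flop <- function(x) {
  ka <- x[1]; V1 <- x[2]; CL <- x[3]
  c(CL / V1, CL / ka, CL)  # new ka = old ke, new V1 = CL / old ka
}

x1 <- c(0.5178956, 10.66084, 9.877326)  # first row of the table above
x2 <- flip_flop(x1)
print(x2)

# Largest difference between the two model predictions.
max_diff <- max(abs(model_function(x1) - model_function(x2)))
print(max_diff)
```

The flipped parameter vector should agree with the second row of the table of approximate minimizers to the printed precision, and the difference between the two predictions should be at the level of floating-point rounding. This is why the data cannot distinguish the two parameter sets.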

Run residual resampling bootstrap analyses using CGNM

CGNM_bootstrap=Cluster_Gauss_Newton_Bootstrap_method(CGNM_result, nonlinearFunction=model_function)
#> [1] "checking if the nonlinearFunction can be evaluated at the initial_lowerRange"
#> [1] "Evaluation Successful"
#> [1] "checking if the nonlinearFunction can be evaluated at the initial_upperRange"
#> [1] "Evaluation Successful"
#> [1] "checking if the nonlinearFunction can be evaluated at the (initial_upperRange+initial_lowerRange)/2"
#> [1] "Evaluation Successful"
#> [1] "Generating initial cluster. 200 out of 200 done"
#> [1] "Iteration:1  Median sum of squares residual=0.0109890485860532"
#> [1] "Iteration:2  Median sum of squares residual=0.0104515843897709"
#> [1] "Iteration:3  Median sum of squares residual=0.0102955416616394"
#> [1] "Iteration:4  Median sum of squares residual=0.0102486575619086"
#> [1] "Iteration:5  Median sum of squares residual=0.0102486569005819"
#> [1] "Iteration:6  Median sum of squares residual=0.0102486569005819"
#> [1] "Iteration:7  Median sum of squares residual=0.0102486569005819"
#> [1] "Iteration:8  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:9  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:10  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:11  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:12  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:13  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:14  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:15  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:16  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:17  Median sum of squares residual=0.0102486568873426"
#> [1] "Iteration:18  Median sum of squares residual=0.0102486568873425"
#> [1] "Iteration:19  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:20  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:21  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:22  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:23  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:24  Median sum of squares residual=0.0102486568873415"
#> [1] "Iteration:25  Median sum of squares residual=0.0102486568873415"
kable(table_parameterSummary(CGNM_bootstrap))
| CGNM Bootstrap: | Minimum | 25 percentile | Median | 75 percentile | Maximum | RSE (%) |
|---|---|---|---|---|---|---|
| Ka | 0.4827564 | 0.5157973 | 0.5420444 | 0.9219526 | 1.131742 | 29.94393 |
| V1 | 9.4983428 | 10.6044787 | 11.4661871 | 18.7915696 | 22.693062 | 29.92585 |
| CL | 9.2841762 | 9.6574491 | 9.7964538 | 10.0802214 | 10.921496 | 2.93821 |
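For readers unfamiliar with residual resampling, the idea can be sketched in base R as follows: fit the model, compute the residuals, resample them with replacement, and add them back to the fitted values to create a synthetic data set that is then refitted. This is a generic illustration of the technique under our own variable names. It is not a description of how Cluster_Gauss_Newton_Bootstrap_method is implemented internally.

```r
# Model and data from this vignette, repeated so the block is self-contained.
model_function <- function(x) {
  t <- c(0.1, 0.2, 0.4, 0.6, 1, 2, 3, 6, 12)
  Dose <- 1000
  bioavailability <- 1  # called F in the original model
  ka <- x[1]; V1 <- x[2]; CL <- x[3]
  Cp <- ka * bioavailability * Dose / (V1 * (ka - CL / V1)) *
    (exp(-CL / V1 * t) - exp(-ka * t))
  log10(Cp)
}
observation <- log10(c(4.91, 8.65, 12.4, 18.7, 24.3, 24.5, 18.4, 4.66, 0.238))

# One accepted approximate minimizer reported in this vignette.
x_hat <- c(0.5178956, 10.66084, 9.877326)

fitted_values <- model_function(x_hat)
residuals_hat <- observation - fitted_values

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
resampled <- sample(residuals_hat, size = length(residuals_hat), replace = TRUE)

# One synthetic (bootstrap) target vector; refitting the model to many of
# these gives a distribution of parameter estimates.
bootstrap_observation <- fitted_values + resampled
print(round(bootstrap_observation, 3))
```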

Visualize the CGNM model fit analysis results

To use the plot functions the user needs to manually load ggplot2.

library(ggplot2)

Inspect the distribution of SSR of approximate minimizers found by CGNM

Despite the robustness of the algorithm, not all approximate minimizers converge, so here we visually inspect how many of the approximate minimizers have an SSR similar to the minimum SSR. Currently the algorithm automatically chooses “acceptable” approximate minimizers based on Grubbs’ test for outliers. If for whatever reason this criterion is not satisfactory, users can manually set the indices of the acceptable approximate minimizers.
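To give a feel for the outlier criterion, the following base R sketch computes the one-sided Grubbs statistic for the largest value in a small, made-up vector of SSR values and compares it with the standard critical value. The SSR values, the significance level, and all variable names are assumptions chosen for illustration. How CGNM applies the test internally (for example, repeatedly or to transformed SSR values) is not shown here.

```r
# Hypothetical SSR values: seven converged minimizers and one that is not.
ssr <- c(0.00735, 0.00735, 0.00736, 0.00735, 0.00737, 0.00735, 0.00736, 0.92)
N <- length(ssr)
alpha <- 0.05  # assumed significance level

# Grubbs statistic for the largest observation.
G <- (max(ssr) - mean(ssr)) / sd(ssr)

# One-sided critical value based on the t distribution with N - 2 df.
t_crit <- qt(1 - alpha / N, df = N - 2)
G_crit <- (N - 1) / sqrt(N) * sqrt(t_crit^2 / (N - 2 + t_crit^2))

# TRUE means the largest SSR is flagged as an outlier.
is_outlier <- G > G_crit
print(c(G = G, G_crit = G_crit))
print(is_outlier)
```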

plot_Rank_SSR(CGNM_result)

plot_paraDistribution_byHistogram(CGNM_bootstrap, bins = 50)+scale_x_continuous(trans="log10")

Visually inspect the goodness of fit of the top 50 approximate minimizers

plot_goodnessOfFit(CGNM_result, plotType = 1, independentVariableVector = c(0.1,0.2,0.4,0.6,1,2,3,6,12), plotRank = seq(1,50))

Plot model prediction with uncertainties based on residual resampling bootstrap analysis

plot_goodnessOfFit(CGNM_bootstrap, plotType = 1, independentVariableVector = c(0.1,0.2,0.4,0.6,1,2,3,6,12))

Plot profile likelihood

plot_profileLikelihood(c("CGNM_log","CGNM_log_bootstrap"))+scale_x_continuous(trans="log10")
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log_bootstrap is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log_bootstrap is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log

kable(table_profileLikelihoodConfidenceInterval(c("CGNM_log","CGNM_log_bootstrap"), alpha = 0.25))
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log_bootstrap is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log_bootstrap is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log
| Parameter Name | 25 percentile | 75 percentile |
|---|---|---|
| Ka | 0.4822585 | 1.090332 |
| V1 | 8.9758355 | 21.911281 |
| CL | 9.2751727 | 10.488478 |

Plot profile likelihood surface

plot_2DprofileLikelihood(CGNM_result, showInitialRange=FALSE, alpha = 0.05)+scale_x_continuous(trans="log10")+scale_y_continuous(trans="log10")
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log
#> [1] "log saved in /private/var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/CGNM_log_bootstrap is used to draw SSR/likelihood surface"
#> Warning in prepSSRsurfaceData(logLocation, ParameterNames,
#> ReparameterizationDef, : the nonlinear function used in this log in /private/
#> var/folders/n8/mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/
#> CGNM/vignettes/CGNM_log_bootstrap is not the same as /private/var/folders/n8/
#> mrjc8j9j45x93m7t_hmvg9nm0000gn/T/RtmpiLeHBq/Rbuild37582cb5afd4/CGNM/vignettes/
#> CGNM_log

Parallel computation

The Cluster Gauss-Newton method implementation in the CGNM package (versions above 0.6) can use a nonlinear function that takes multiple input vectors stored in a matrix (each row is one input vector) and returns a matrix (each row is the corresponding output vector), as in the examples that follow. This interface is intended for parallelizing the computation. See below for examples of parallelized implementations on different platforms. The Cluster Gauss-Newton method is embarrassingly parallelizable, so the computation speed is almost proportional to the number of cores used, especially for nonlinear functions that take a long time to compute (e.g., models that numerically solve a large system of ODEs).

model_matrix_function=function(X){
  Y_list=lapply(split(X, rep(seq(1:nrow(X)),ncol(X))), model_function)
  Y=t(matrix(unlist(Y_list),ncol=length(Y_list)))
}

testX=t(matrix(c(rep(0.01,3),rep(10,3),rep(100,3)), nrow = 3))
print("testX")
#> [1] "testX"
print(testX)
#>       [,1]  [,2]  [,3]
#> [1,] 1e-02 1e-02 1e-02
#> [2,] 1e+01 1e+01 1e+01
#> [3,] 1e+02 1e+02 1e+02
print("model_matrix_function(testX)")
#> [1] "model_matrix_function(testX)"
print(model_matrix_function(testX))
#>           [,1]      [,2]     [,3]      [,4]      [,5]      [,6]       [,7]
#> [1,] 1.9782455 2.2578754 2.517166 2.6529261 2.7982741 2.9311513  2.9684634
#> [2,] 1.7756978 1.8804296 1.860008 1.7832148 1.6114094 1.1771685  0.7428740
#> [3,] 0.9609136 0.9175059 0.830647 0.7437881 0.5700703 0.1357758 -0.2985186
#>            [,8]      [,9]
#> [1,]  2.9771626  2.952246
#> [2,] -0.5600094 -3.165776
#> [3,] -1.6014021 -4.207169

print("model_matrix_function(testX)-rbind(model_function(testX[1,]),model_function(testX[2,]),model_function(testX[3,]))")
#> [1] "model_matrix_function(testX)-rbind(model_function(testX[1,]),model_function(testX[2,]),model_function(testX[3,]))"
print(model_matrix_function(testX)-rbind(model_function(testX[1,]),model_function(testX[2,]),model_function(testX[3,])))
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> [1,]    0    0    0    0    0    0    0    0    0
#> [2,]    0    0    0    0    0    0    0    0    0
#> [3,]    0    0    0    0    0    0    0    0    0
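The same row-wise mapping can also be written with `apply`. This is a sketch of an equivalent serial implementation, assuming, as in the example above, that each row of the input matrix is one parameter vector and that the model returns more than one observation. The name `model_matrix_function_apply` is ours.

```r
# Model from this vignette, repeated so the block is self-contained.
model_function <- function(x) {
  t <- c(0.1, 0.2, 0.4, 0.6, 1, 2, 3, 6, 12)
  Dose <- 1000
  bioavailability <- 1  # called F in the original model
  ka <- x[1]; V1 <- x[2]; CL <- x[3]
  Cp <- ka * bioavailability * Dose / (V1 * (ka - CL / V1)) *
    (exp(-CL / V1 * t) - exp(-ka * t))
  log10(Cp)
}

# apply() over rows returns one column per input row, so transpose the
# result to get one output row per input row.
model_matrix_function_apply <- function(X) {
  t(apply(X, 1, model_function))
}

testX <- rbind(rep(0.01, 3), rep(10, 3), rep(100, 3))
Y <- model_matrix_function_apply(testX)

# Row-by-row evaluation for comparison.
Y_rowwise <- rbind(model_function(testX[1, ]),
                   model_function(testX[2, ]),
                   model_function(testX[3, ]))
print(dim(Y))
print(max(abs(Y - Y_rowwise)))
```

If the model returned a single value per parameter vector, `apply` would return a plain vector rather than a matrix, so the transpose step would need to be adapted.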

An example of parallel implementation for Mac using the parallel package


# library(parallel)
# 
#  model_matrix_function=function(X){
#   Y_list=mclapply(split(X, rep(seq(1:nrow(X)),ncol(X))), model_function,mc.cores = (parallel::detectCores()-1), mc.preschedule = FALSE)
#   
#   # sometimes the ODE solver quits prematurely and gives a partial result,
#   # so these need to be replaced with vectors of NAs
#   obsLength=max(lengths(Y_list))  
#   failed_indicies=which(lengths(Y_list)!=obsLength)
#   for(i in failed_indicies){
#     Y_list[[i]]=rep(NA,obsLength)
#   }
#   Y=t(matrix(unlist(Y_list),ncol=length(Y_list)))
# 
#   return(Y)
#  }

An example of parallel implementation for Windows using the foreach and doParallel packages

#library(foreach)
#library(doParallel)

#numCore=8
#registerDoParallel(numCore-1)
#cluster=makeCluster(numCore-1, type = "PSOCK")
#registerDoParallel(cl=cluster)

#model_matrix_function=function(X){
#  Y_list=foreach(i=1:dim(X)[1], .export = c("model_function"))%dopar%{ #make sure to include all related functions in .export and all used packages in .packages for more information read documentation of dopar
#      model_function((X[i,]))
#    }
  
#  Y=t(matrix(unlist(Y_list),ncol=length(Y_list)))
#}
CGNM_result=Cluster_Gauss_Newton_method(nonlinearFunction=model_matrix_function,
targetVector = observation,
initial_lowerRange =rep(0.01,3),initial_upperRange =  rep(100,3),lowerBound = rep(0,3), saveLog=TRUE, num_minimizersToFind = 500, ParameterNames = c("Ka","V1","CL"))
#> [1] "nonlinearFunction is given as matrix to matrix function"
#> [1] "NonlinearFunction evaluation at initial_lowerRange Successful."
#> [1] "NonlinearFunction evaluation at (initial_upperRange+initial_lowerRange)/2 Successful."
#> [1] "NonlinearFunction evaluation at initial_upperRange Successful."
#> Warning in dir.create(saveFolderName): 'CGNM_log' already exists
#> [1] "Generating initial cluster. 495 out of 500 done"
#> [1] "Generating initial cluster. 500 out of 500 done"
#> [1] "Iteration:1  Median sum of squares residual=6.01021661304427"
#> [1] "Iteration:2  Median sum of squares residual=2.33708287183305"
#> [1] "Iteration:3  Median sum of squares residual=0.930338686081594"
#> [1] "Iteration:4  Median sum of squares residual=0.92697891647959"
#> [1] "Iteration:5  Median sum of squares residual=0.846413496910536"
#> [1] "Iteration:6  Median sum of squares residual=0.510641710857903"
#> [1] "Iteration:7  Median sum of squares residual=0.234173581512939"
#> [1] "Iteration:8  Median sum of squares residual=0.0127010305067836"
#> [1] "Iteration:9  Median sum of squares residual=0.00735046937134848"
#> [1] "Iteration:10  Median sum of squares residual=0.00734923476977574"
#> [1] "Iteration:11  Median sum of squares residual=0.00734923409206861"
#> [1] "Iteration:12  Median sum of squares residual=0.0073492340832614"
#> [1] "Iteration:13  Median sum of squares residual=0.00734923408246989"
#> [1] "Iteration:14  Median sum of squares residual=0.0073492340824019"
#> [1] "Iteration:15  Median sum of squares residual=0.00734923408238841"
#> [1] "Iteration:16  Median sum of squares residual=0.00734923408238326"
#> [1] "Iteration:17  Median sum of squares residual=0.00734923408238241"
#> [1] "Iteration:18  Median sum of squares residual=0.00734923408238213"
#> [1] "Iteration:19  Median sum of squares residual=0.00734923408238134"
#> [1] "Iteration:20  Median sum of squares residual=0.00734923408238105"
#> [1] "Iteration:21  Median sum of squares residual=0.00734923408238092"
#> [1] "Iteration:22  Median sum of squares residual=0.0073492340823809"
#> [1] "Iteration:23  Median sum of squares residual=0.00734923408238084"
#> [1] "Iteration:24  Median sum of squares residual=0.00734923408238084"
#> [1] "Iteration:25  Median sum of squares residual=0.00734923408238084"

What is CGNM?

For the complete description and a comparison with conventional algorithms, please see (https://doi.org/10.1007/s11081-020-09571-2):

Aoki, Y., Hayami, K., Toshimoto, K., & Sugiyama, Y. (2020). Cluster Gauss–Newton method. Optimization and Engineering, 1-31.

The mathematical problem CGNM solves

The Cluster Gauss-Newton method is an algorithm for obtaining multiple minimisers of nonlinear least squares problems \[ \min_{\boldsymbol{x}}|| \boldsymbol{f}(\boldsymbol x)-\boldsymbol{y}^*||_2^{\,2} \] which do not have a unique solution (global minimiser), that is to say, there exist \(\boldsymbol x^{(1)}\neq\boldsymbol x^{(2)}\) such that \[ \min_{\boldsymbol{x}}|| \boldsymbol{f}(\boldsymbol x)-\boldsymbol{y}^*||_2^{\,2}=|| \boldsymbol{f}(\boldsymbol x^{(1)})-\boldsymbol{y}^*||_2^{\,2}=|| \boldsymbol{f}(\boldsymbol x^{(2)})-\boldsymbol{y}^*||_2^{\,2} \,. \]

Parameter estimation problems of mathematical models can often be formulated as nonlinear least squares problems. Typically these problems are solved numerically using iterative methods. The local minimiser obtained using these iterative methods usually depends on the choice of the initial iterate. Thus, the estimated parameter and subsequent analyses using it depend on the choice of the initial iterate. One way to reduce the analysis bias due to the choice of the initial iterate is to repeat the algorithm from multiple initial iterates (i.e. use a multi-start method). However, the procedure can be computationally intensive and is not always used in practice.

To overcome this problem, we propose the Cluster Gauss-Newton method (CGNM), an efficient algorithm for finding multiple approximate minimisers of nonlinear least squares problems. CGNM simultaneously solves the nonlinear least squares problem from multiple initial iterates. It then iteratively improves the approximations from these initial iterates, similarly to the Gauss-Newton method. However, it uses a global linear approximation instead of the Jacobian. The global linear approximations are computed collectively among all the iterates to minimise the computational cost associated with the evaluation of the mathematical model.
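For reference, the classical Gauss-Newton step that CGNM builds on can be written as follows. The notation is ours: the subscript \(k\) indexes the iteration, \(J_k\) is the Jacobian of \(\boldsymbol f\) at \(\boldsymbol x_k\), and \(J_k\) is assumed to have full column rank. CGNM replaces \(J_k\) with a linear approximation computed from the whole cluster of iterates; the exact construction is given in the paper cited above and is not reproduced here.

```latex
\[
\boldsymbol x_{k+1}
  = \boldsymbol x_{k}
  - \left(J_k^{\mathsf T} J_k\right)^{-1} J_k^{\mathsf T}
    \left(\boldsymbol f(\boldsymbol x_{k}) - \boldsymbol y^*\right),
\qquad
J_k = \left.\frac{\partial \boldsymbol f}{\partial \boldsymbol x}\right|_{\boldsymbol x = \boldsymbol x_{k}}
\]
```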