MDS (multidimensional scaling) is a statistical tool for dimensionality reduction that takes an n × n distance matrix as input. When n is large, classical algorithms run into computational problems and the MDS configuration cannot be obtained.
This package addresses these problems with three algorithms: divide-and-conquer MDS, fast MDS, and interpolation MDS.
The main idea of these methods is to partition the dataset into small pieces on which classical methods can work. Fast MDS was designed by Yang, T., J. Liu, L. McMillan, and W. Wang (2006), whereas divide-and-conquer MDS and interpolation MDS were designed by Delicado, P. and C. Pachón-García (2021).
To obtain more information, please read this paper.
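The partitioning idea can be illustrated with a short base R sketch. It is a simplified illustration, not the package's implementation: the helper names `procrustes_align` and `sketch_partition_mds` are hypothetical, the arguments `l`, `c_points`, and `r` only mirror the names used in the package's examples, and the choice of one shared set of anchor points for every partition is an assumption made to keep the code short. Each partition is solved with `stats::cmdscale` and then aligned to the first partition by an orthogonal Procrustes transformation computed on the anchors.

```r
# Hypothetical helper: find the rotation/reflection and translation that
# map `source` onto `target`, then apply them to every row of `to_move`.
procrustes_align <- function(target, source, to_move) {
  target_mean <- colMeans(target)
  source_mean <- colMeans(source)
  target_c <- sweep(target, 2, target_mean)
  source_c <- sweep(source, 2, source_mean)
  # Orthogonal Procrustes solution: rotation = U V' from svd(source' target)
  s <- svd(t(source_c) %*% target_c)
  rotation <- s$u %*% t(s$v)
  sweep(sweep(to_move, 2, source_mean) %*% rotation, 2, target_mean, "+")
}

# Hypothetical sketch of partition-based MDS (not the bigmds code).
# x: data matrix, l: maximum partition size, c_points: number of anchor
# points shared by all partitions, r: number of output dimensions.
sketch_partition_mds <- function(x, l, c_points, r) {
  stopifnot(l > c_points, c_points > r)
  n <- nrow(x)
  idx <- sample(n)
  anchors <- idx[seq_len(c_points)]
  rest <- idx[-seq_len(c_points)]
  # Each group plus the anchors has at most l rows
  groups <- split(rest, ceiling(seq_along(rest) / (l - c_points)))
  points <- matrix(NA_real_, nrow = n, ncol = r)
  reference <- NULL
  for (g in groups) {
    rows <- c(anchors, g)
    # Classical MDS on a piece small enough to handle
    conf <- stats::cmdscale(stats::dist(x[rows, , drop = FALSE]), k = r)
    anchor_conf <- conf[seq_len(c_points), , drop = FALSE]
    if (is.null(reference)) {
      reference <- anchor_conf          # first partition fixes the frame
    } else {
      conf <- procrustes_align(reference, anchor_conf, conf)
    }
    points[rows, ] <- conf              # anchors are overwritten each time
  }
  points
}

# Small demo on exactly two-dimensional data, where the stitched
# configuration should reproduce the original distances.
set.seed(42)
x_demo <- matrix(rnorm(2 * 300), nrow = 300)
conf_sketch <- sketch_partition_mds(x_demo, l = 50, c_points = 4, r = 2)
max_error <- max(abs(stats::dist(conf_sketch) - stats::dist(x_demo)))
```

In this sketch no distance matrix larger than `l` × `l` is ever built, which is the point of the partitioning approach.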
You can install directly from CRAN with:
# install.packages("bigmds")
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("pachoning/bigmds")
This is a basic example which shows you how to solve a common problem:
set.seed(42)
library(bigmds)
# Simulate 10,000 points in 4 dimensions with column standard deviations 9, 4, 1, 1
x <- matrix(data = rnorm(4*10000), nrow = 10000) %*% diag(c(9, 4, 1, 1))
divide_mds_conf <- divide_conquer_mds(x = x, l = 200, c_points = 2*2, r = 2, n_cores = 1, dist_fn = stats::dist)
head(divide_mds_conf$points)
#> [,1] [,2]
#> [1,] -12.0029447 4.5482795
#> [2,] 5.3135571 -0.6207096
#> [3,] -3.0272576 -1.0857873
#> [4,] -6.5402649 -1.9113426
#> [5,] -3.3311073 2.8156667
#> [6,] 0.9705889 -6.5670390
divide_mds_conf$eigen
#> [1] 83.26941 16.27533
divide_mds_conf$GOF
#> [1] 0.9795777 0.9795777
fast_mds_conf <- fast_mds(x = x, l = 200, s_points = 2*2, r = 2, n_cores = 1, dist_fn = stats::dist)
head(fast_mds_conf$points)
#> [,1] [,2]
#> [1,] 13.439660 5.0882344
#> [2,] -5.180648 -0.6150152
#> [3,] 3.894922 -0.8759228
#> [4,] 5.248688 -1.6144764
#> [5,] 3.520470 3.1887151
#> [6,] -1.329876 -6.7787889
fast_mds_conf$eigen
#> [1] 81.72223 16.07915
fast_mds_conf$GOF
#> [1] 0.9796994 0.9796994
interpolation_mds_conf <- interpolation_mds(x = x, l = 200, r = 2, n_cores = 1, dist_fn = stats::dist)
head(interpolation_mds_conf$points)
#> [,1] [,2]
#> [1,] -12.3616929 -4.6878946
#> [2,] 4.9424093 0.7621167
#> [3,] -3.3580614 1.1415676
#> [4,] -5.7834592 1.5567990
#> [5,] -3.6974408 -2.8075217
#> [6,] 0.8118489 6.4465272
interpolation_mds_conf$eigen
#> [1] 80.06032 16.53691
interpolation_mds_conf$GOF
#> [1] 0.9785652 0.9785652
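The three configurations above differ in the signs of their coordinates. This is expected, because an MDS configuration is only determined up to rotation and reflection. The following sketch, which uses simulated configurations rather than the package output (all variable names are illustrative), shows why raw coordinates should not be compared directly, whereas inter-point distances can be.

```r
set.seed(1)
# A made-up two-dimensional configuration of 100 points
conf_a <- matrix(rnorm(200), ncol = 2)

# The same configuration after a rotation by 60 degrees and a reflection
theta <- pi / 3
rotation <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
reflection <- diag(c(1, -1))
conf_b <- conf_a %*% rotation %*% reflection

# Coordinates differ noticeably between the two configurations...
max_coord_diff <- max(abs(conf_a - conf_b))

# ...but the pairwise distances, which are what MDS preserves, agree
max_dist_diff <- max(abs(stats::dist(conf_a) - stats::dist(conf_b)))
```

To compare two configurations coordinate by coordinate, one would first align them, for example with a Procrustes transformation.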
With classical MDS, obtaining an MDS configuration takes much longer, because the full distance matrix must be computed and decomposed. Try it yourself!
x <- matrix(data = rnorm(4*10000, sd = 10), nrow = 10000)
dist_matrix <- stats::dist(x = x)
mds_result <- stats::cmdscale(d = dist_matrix, k = 2, eig = TRUE)
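Before running the full example, it may help to estimate its cost. The sketch below uses only base R and illustrative variable names. It computes the size of the distance object for n = 10000, then times classical MDS on a smaller sample. It assumes that a `dist` object stores the lower triangle as 8-byte doubles; the memory actually used by `stats::cmdscale` will be higher, since it works with a full n × n matrix.

```r
n <- 10000

# A "dist" object holds n * (n - 1) / 2 distances, 8 bytes each
dist_bytes <- n * (n - 1) / 2 * 8        # about 0.4 GB
# A full n x n double matrix needs twice as much again
full_matrix_bytes <- n^2 * 8             # about 0.8 GB

# Time classical MDS on a smaller sample to get a feel for the growth
set.seed(42)
n_small <- 1000
x_small <- matrix(rnorm(4 * n_small, sd = 10), nrow = n_small)
timing <- system.time({
  dist_small <- stats::dist(x_small)
  mds_small <- stats::cmdscale(d = dist_small, k = 2, eig = TRUE)
})
elapsed_seconds <- unname(timing["elapsed"])
```

Increasing `n_small` step by step shows how quickly the running time grows as the sample approaches the full 10,000 rows.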