R packages under analysis were retrieved from CRAN/Bioconductor on <%=readLines(url("https://pkgndep.github.io/date.txt"))[1]%>. There are <%=n_cran%> packages from CRAN and <%=n_bioc%> packages from Bioconductor (Bioconductor version 3.15).
Legends:
High heaviness: Packages with adjusted heaviness on child packages higher than <%=CUTOFF$adjusted_heaviness_on_children[2]%>.
Median heaviness: Packages with adjusted heaviness on child packages between <%=CUTOFF$adjusted_heaviness_on_children[1]%> and <%=CUTOFF$adjusted_heaviness_on_children[2]%>.
reducible: Packages whose parent's heaviness could be reduced, i.e., only a limited number of functions are imported from the heaviest parent.
Columns: Heaviness from parent packages; Heaviness on child/downstream packages.
The full table of the dependency heaviness analysis can be obtained with df = pkgndep::all_pkg_stat_snapshot().
A package P lists its dependencies in the Depends, Imports, LinkingTo, Suggests and Enhances fields of its DESCRIPTION file. We define the following dependency categories for package P:
Strong parent packages: the packages listed in Depends, Imports, and LinkingTo of P (red box in the figure). They are also called the strong direct dependency packages of P. Strong parent packages must be installed when installing P.
Weak parent packages: the packages listed in Suggests and Enhances of P (green box in the figure). They are optional when installing P.
Various metrics for the heaviness are defined as follows:
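If package A is a strong parent of P, let $n_1$ be the number of strong dependencies of P and let $n_2$ be the number of strong dependencies of P after changing A to a weak parent of P, i.e., by moving A to Suggests of P. The heaviness of A on P is then $n_1 - n_2$. Thus, the heaviness measures the number of additional strong dependencies that A brings to P and that are not brought by any other parent.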
If package B is a weak parent of P, $n_2$ is defined as the number of strong dependencies of P after changing B to a strong parent of P, i.e., by moving B to Imports of P. In this scenario, the heaviness of the weak parent is calculated as $n_2 - n_1$.
If packages are grouped by $K$, which can be the number of parent, child or downstream packages depending on the type of heaviness metric, the distributions of heaviness values always have long tails, and the tails are longer for smaller $K$. Thus, if packages are simply ranked by their original heaviness values, the top-ranked packages tend to be those with small $K$. In general, packages with small $K$ are of less interest because they have only a small impact on the ecosystem. To prioritize packages with a broader impact, the original definitions of the heaviness metrics are adjusted to decrease the weights of packages with smaller $K$. Please note that the designs of the adjusted heaviness metrics are empirical: the absolute values of adjusted heaviness carry no meaning and are only used for ranking packages. A detailed explanation of the adjusted heaviness metrics can be found in the tab "Heaviness analysis".
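The heaviness of a parent can be illustrated on a toy dependency graph. The following is a minimal sketch in base R, not the pkgndep implementation: the package names, the `strong_parents` list and the helper functions are all hypothetical, and it assumes that the strong dependencies of P are all packages reachable through strong parents, with the parent itself counted.

```r
# Hypothetical toy graph: each package maps to its strong parents
# (the packages in its Depends, Imports and LinkingTo fields).
strong_parents <- list(
  P  = c("A", "B"),
  A  = c("X", "C"),
  B  = c("Y", "C"),
  C  = character(0),
  X  = character(0),
  Y  = character(0),
  W  = c("Z1", "Z2"),   # W is only a weak parent (Suggests) of P
  Z1 = character(0),
  Z2 = character(0)
)

# All strong dependencies reachable from a set of direct strong parents.
all_strong_deps <- function(direct, graph) {
  seen <- character(0)
  queue <- direct
  while (length(queue) > 0) {
    pkg <- queue[1]
    queue <- queue[-1]
    if (pkg %in% seen) next
    seen <- c(seen, pkg)
    queue <- c(queue, graph[[pkg]])
  }
  seen
}

# Heaviness of a strong parent: n1 - n2, where n2 is counted after
# moving the parent out of the strong parents (i.e., to Suggests).
heaviness_strong <- function(parent, pkg, graph) {
  direct <- graph[[pkg]]
  stopifnot(parent %in% direct)
  n1 <- length(all_strong_deps(direct, graph))
  n2 <- length(all_strong_deps(setdiff(direct, parent), graph))
  n1 - n2
}

# Heaviness of a weak parent: n2 - n1, where n2 is counted after
# adding the parent to the strong parents (i.e., to Imports).
heaviness_weak <- function(parent, pkg, graph) {
  direct <- graph[[pkg]]
  n1 <- length(all_strong_deps(direct, graph))
  n2 <- length(all_strong_deps(union(direct, parent), graph))
  n2 - n1
}

h_A <- heaviness_strong("A", "P", strong_parents)
h_B <- heaviness_strong("B", "P", strong_parents)
h_W <- heaviness_weak("W", "P", strong_parents)
```

In this toy graph, C is shared by A and B, so it is not counted in the heaviness of either parent alone: removing A still leaves C reachable through B.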
The co-heaviness measures the number of additional dependency packages simultaneously brought by two parent packages. Let A and B be two parents of P. Denote by $S_A$ the set of dependency packages removed when only A is changed to a weak parent of P, by $S_B$ the set of dependency packages removed when only B is changed to a weak parent of P, and by $S_{AB}$ the set of dependency packages removed when both A and B are changed to weak parents of P. The co-heaviness of A and B on P, denoted $h_{co}$, is then defined as $h_{co} = \left|S_{AB} \setminus (S_A \cup S_B)\right|$, where $X \setminus Y$ is the set of elements in $X$ but not in $Y$, and $|X|$ is the number of elements in set $X$. The co-heaviness thus counts the packages that are removed only through the joint action of A and B.
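As a small worked example, consider a hypothetical package P with exactly two strong parents A and B, where A strongly depends on X and C, B strongly depends on Y and C, and X, Y and C have no further dependencies. Assuming the removed sets include the parent itself, changing only A to a weak parent removes A and X (C is still brought by B), and changing only B removes B and Y:

$$S_A = \{A, X\}, \qquad S_B = \{B, Y\}, \qquad S_{AB} = \{A, B, X, Y, C\}$$

$$h_{co} = \left|S_{AB} \setminus (S_A \cup S_B)\right| = \left|\{C\}\right| = 1$$

The shared dependency C disappears only when both A and B become weak parents, so it is the single package counted by the co-heaviness.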