gof-statistics {btergm} | R Documentation |
Statistics for goodness-of-fit assessment of network models.
dsp(mat, ...)
esp(mat, ...)
nsp(mat, ...)
deg(mat, ...)
b1deg(mat, ...)
b2deg(mat, ...)
odeg(mat, ...)
ideg(mat, ...)
kstar(mat, ...)
b1star(mat, ...)
b2star(mat, ...)
ostar(mat, ...)
istar(mat, ...)
kcycle(mat, ...)
geodesic(mat, ...)
triad.directed(mat, ...)
triad.undirected(mat, ...)
comemb(vec)
walktrap.modularity(mat, ...)
walktrap.roc(sim, obs, ...)
walktrap.pr(sim, obs, ...)
fastgreedy.modularity(mat, ...)
fastgreedy.roc(sim, obs, ...)
fastgreedy.pr(sim, obs, ...)
louvain.modularity(mat, ...)
louvain.roc(sim, obs, ...)
louvain.pr(sim, obs, ...)
maxmod.modularity(mat, ...)
maxmod.roc(sim, obs, ...)
maxmod.pr(sim, obs, ...)
edgebetweenness.modularity(mat, ...)
edgebetweenness.roc(sim, obs, ...)
edgebetweenness.pr(sim, obs, ...)
spinglass.modularity(mat, ...)
spinglass.roc(sim, obs, ...)
spinglass.pr(sim, obs, ...)
rocpr(sim, obs, roc = TRUE, pr = TRUE, joint = TRUE, pr.impute = "poly4", ...)
mat |
A sparse network matrix as created by the |
... |
Additional arguments. This must be present in all auxiliary GOF statistics. |
vec |
A vector of community memberships in order to create a community co-membership matrix. |
sim |
A list of simulated networks. Each element in the list should be a
sparse matrix as created by the |
obs |
A list of observed (= target) networks. Each element in the list
should be a sparse matrix as created by the |
roc |
Compute receiver-operating characteristics (ROC)? |
pr |
Compute precision-recall curve (PR)? |
joint |
Merge all time steps into a single big prediction task and compute predictive fit (instead of computing GOF for all time steps separately)? |
pr.impute |
In some cases, the first precision value of the
precision-recall curve is undefined. The |
These functions can be plugged into the statistics argument of the gof
methods in order to compare observed with simulated networks (see the
gof-methods help page). There are three types of statistics:
Univariate statistics, which aggregate a network into a single
quantity, for example modularity measures or density. The distribution
of the statistic can be displayed using histograms, density plots, and
median bars. Univariate statistics take a sparse matrix (mat) as an
argument and return a single numeric value that summarizes a network
matrix.
Multivariate statistics, which aggregate a network into a vector of
quantities, for example the distribution of geodesic distances, edgewise
shared partners, or indegree. These statistics typically have multiple
values, e.g., esp(1), esp(2), esp(3) etc. The results can be displayed
using multiple boxplots for simulated networks and a black curve for the
observed network(s). Multivariate statistics take a sparse matrix (mat)
as an argument and return a vector of numeric values that summarize a
network matrix.
Tie prediction statistics, which predict dyad states in the observed
network(s) from the dyad states in the simulated networks. Examples are
receiver operating characteristics (ROC) or precision-recall curves (PR)
of simulated networks based on the model, or ROC or PR predictions of
community co-membership matrices of the simulated vs. the observed
network(s). Tie prediction statistics take a list of simulated sparse
network matrices and another list of observed sparse network matrices
(possibly containing only a single sparse matrix) as arguments and return
a rocpr, roc, or pr object (as created by the rocpr function).
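As a rough illustration of the multivariate case, the sketch below computes an edge-wise shared partner distribution in base R for an undirected, binary network without self-loops. The name esp_sketch is hypothetical and this is not the package's implementation of esp, which may count and label values differently. It only shows the expected shape: a matrix goes in and a named numeric vector comes out.

```r
# Hypothetical helper, not part of btergm: edge-wise shared partner counts
# for an undirected, binary adjacency matrix without self-loops.
esp_sketch <- function(mat, ...) {
  mat <- as.matrix(mat)                        # sparse matrix -> normal matrix
  n <- nrow(mat)
  sp <- mat %*% mat                            # sp[i, j] = partners shared by i and j
  edges <- which(mat == 1 & upper.tri(mat))    # visit each undirected edge once
  d <- tabulate(sp[edges] + 1, nbins = n - 1)  # counts for 0, 1, ..., n - 2 partners
  names(d) <- 0:(n - 2)
  attr(d, "label") <- "Edge-wise shared partners"
  return(d)
}

# A triangle (nodes 1, 2, 3) with a pendant node 4 attached to node 3
m <- matrix(0, 4, 4)
m[1, 2] <- m[2, 3] <- m[1, 3] <- m[3, 4] <- 1
m <- m + t(m)
esp_sketch(m)
```

In this toy network the three triangle edges each have one shared partner and the pendant edge has none, so the vector has one edge at 0 and three edges at 1.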
Users can create their own statistics for use with the gof methods. To
do so, one needs to write a function that accepts and returns the respective
objects described in the enumeration above. It is advisable to look at the
definitions of some of the existing functions before adding custom functions.
It is also possible to add an attribute called label to the return object,
which describes what is being returned by the function. This label will be
used as a descriptive label in the plot and for verbose output during
computations. The examples section contains an example of a custom user
statistic. Note that all statistics must contain the ... argument to ensure
that custom arguments of other statistics do not cause an error.
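The two conventions just described, the label attribute and the ... argument, can be sketched in base R as follows. The statistic mean_degree and the argument some.other.option are made up for illustration. The point is only that a function declared with ... silently ignores arguments meant for other statistics, whereas one without ... fails.

```r
# Hypothetical univariate statistic: mean degree of an undirected network
mean_degree <- function(mat, ...) {
  mat <- as.matrix(mat)                 # sparse matrix -> normal matrix
  d <- sum(mat) / nrow(mat)             # each tie is counted once per endpoint
  attr(d, "label") <- "Mean degree"     # descriptive label attached to the result
  return(d)
}

# Same computation without '...': extra arguments are not tolerated
mean_degree_strict <- function(mat) {
  mat <- as.matrix(mat)
  sum(mat) / nrow(mat)
}

m <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE)

ok <- mean_degree(m, some.other.option = TRUE)  # extra argument is ignored
attr(ok, "label")

# The strict version stops with an "unused argument" error
res <- try(mean_degree_strict(m, some.other.option = TRUE), silent = TRUE)
inherits(res, "try-error")
```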
To aid the development of custom statistics, the helper function comemb
is available: it accepts a vector of community memberships and converts
it to a co-membership matrix. This function is also used internally by
statistics like walktrap.roc and others.
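What a co-membership matrix looks like can be shown with a short base R stand-in. The function comemb_sketch is hypothetical and is not the package's code. It assumes that entry [i, j] is 1 when nodes i and j belong to the same community and 0 otherwise, with the diagonal set to 1 because each node shares a community with itself.

```r
# Hypothetical stand-in for illustration; not the package's implementation
comemb_sketch <- function(vec) {
  # compare every pair of membership labels, then turn TRUE/FALSE into 1/0
  outer(vec, vec, FUN = "==") * 1
}

membership <- c(1, 1, 2, 2, 2)          # made-up memberships for five nodes
comemb_sketch(membership)
```

The result is a symmetric 5 x 5 matrix with one block of ones for nodes 1 and 2 and another for nodes 3 to 5.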
dsp: Multivariate GOF statistic: dyad-wise shared partner distribution
esp: Multivariate GOF statistic: edge-wise shared partner distribution
nsp: Multivariate GOF statistic: non-edge-wise shared partner distribution
deg: Multivariate GOF statistic: degree distribution
b1deg: Multivariate GOF statistic: degree distribution for the first mode
b2deg: Multivariate GOF statistic: degree distribution for the second mode
odeg: Multivariate GOF statistic: outdegree distribution
ideg: Multivariate GOF statistic: indegree distribution
kstar: Multivariate GOF statistic: k-star distribution
b1star: Multivariate GOF statistic: k-star distribution for the first mode
b2star: Multivariate GOF statistic: k-star distribution for the second mode
ostar: Multivariate GOF statistic: outgoing k-star distribution
istar: Multivariate GOF statistic: incoming k-star distribution
kcycle: Multivariate GOF statistic: k-cycle distribution
geodesic: Multivariate GOF statistic: geodesic distance distribution
triad.directed: Multivariate GOF statistic: triad census in directed networks
triad.undirected: Multivariate GOF statistic: triad census in undirected networks
comemb: Helper function: create community co-membership matrix
walktrap.modularity: Univariate GOF statistic: Walktrap modularity
distribution
walktrap.roc: Tie prediction GOF statistic: ROC of Walktrap
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Walktrap algorithm.
walktrap.pr: Tie prediction GOF statistic: PR of Walktrap
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Walktrap algorithm.
fastgreedy.modularity: Univariate GOF statistic: fast and greedy
modularity distribution
fastgreedy.roc: Tie prediction GOF statistic: ROC of fast and
greedy community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the fast and
greedy algorithm. Only sensible with undirected networks.
fastgreedy.pr: Tie prediction GOF statistic: PR of fast and
greedy community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the fast and greedy algorithm.
Only sensible with undirected networks.
louvain.modularity: Univariate GOF statistic: Louvain clustering
modularity distribution
louvain.roc: Tie prediction GOF statistic: ROC of Louvain
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Louvain algorithm.
louvain.pr: Tie prediction GOF statistic: PR of Louvain
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Louvain algorithm.
maxmod.modularity: Univariate GOF statistic: maximal modularity
distribution
maxmod.roc: Tie prediction GOF statistic: ROC of maximal
modularity community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
modularity maximization algorithm.
maxmod.pr: Tie prediction GOF statistic: PR of maximal
modularity community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the modularity maximization
algorithm.
edgebetweenness.modularity: Univariate GOF statistic: edge betweenness
modularity distribution
edgebetweenness.roc: Tie prediction GOF statistic: ROC of edge
betweenness community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
Girvan-Newman edge betweenness community detection method.
edgebetweenness.pr: Tie prediction GOF statistic: PR of edge
betweenness community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Girvan-Newman edge
betweenness community detection method.
spinglass.modularity: Univariate GOF statistic: spinglass modularity
distribution
spinglass.roc: Tie prediction GOF statistic: ROC of spinglass
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Spinglass algorithm.
spinglass.pr: Tie prediction GOF statistic: PR of spinglass
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Spinglass algorithm.
rocpr: Tie prediction GOF statistic: ROC and PR curves.
Receiver-operating characteristics (ROC) and precision-recall curve (PR).
Prediction of the dyad states of the observed network(s) by the dyad states
of the simulated networks.
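To make the idea of predicting observed dyad states from simulated ones concrete, here is a base R sketch under simplifying assumptions. Tie probabilities are estimated as the share of simulated networks in which a dyad is tied, the diagonal is dropped, a single observed matrix is used instead of a list, and the area under the ROC curve is obtained from the rank-sum (Mann-Whitney) identity. The function auc_sketch is hypothetical; rocpr itself returns curve objects and may compute them differently.

```r
# Hypothetical helper, not part of btergm: area under the ROC curve for
# predicting observed ties from the average of simulated networks.
auc_sketch <- function(sim, obs) {
  # average the simulated networks -> predicted tie probability per dyad
  prob <- Reduce(`+`, lapply(sim, as.matrix)) / length(sim)
  obs <- as.matrix(obs)
  off <- row(obs) != col(obs)           # ignore the diagonal (self-loops)
  score <- prob[off]
  truth <- obs[off] == 1
  r <- rank(score)                      # midranks handle tied scores
  n_pos <- sum(truth)
  n_neg <- sum(!truth)
  (sum(r[truth]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

obs <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0), nrow = 3, byrow = TRUE)

sim_good <- list(obs, obs)                          # simulations reproduce the ties
sim_flat <- list(matrix(0, 3, 3), matrix(0, 3, 3))  # simulations carry no signal

auc_sketch(sim_good, obs)   # perfect separation of ties and non-ties
auc_sketch(sim_flat, obs)   # all scores tied, no better than chance
```

With identical simulated and observed networks the area is 1, and with uninformative simulations it is 0.5, which are the two reference points usually used when reading ROC-based fit.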
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi: 10.18637/jss.v083.i06.
# To see how these statistics are used, look at the examples section of
# ?"gof-methods". The following example illustrates how custom
# statistics can be created. Suppose one is interested in the density
# of a network. Then a univariate statistic can be created as follows.

dens <- function(mat, ...) {        # univariate: one argument
  mat <- as.matrix(mat)             # sparse matrix -> normal matrix
  d <- sna::gden(mat)               # compute the actual statistic
  attributes(d)$label <- "Density"  # add a descriptive label
  return(d)                         # return the statistic
}

# Note that the '...' argument must be present in all statistics.
# Now the statistic can be used in the statistics argument of one of
# the gof methods.

# For illustrative purposes, let us consider an existing statistic, the
# indegree distribution, a multivariate statistic. It also accepts a
# single argument. Note that the sparse matrix is converted to a
# normal matrix object when it is used. First, statnet's summary
# method is used to compute the statistic. Names are attached to the
# resulting vector for the different indegree values. Then the vector
# is returned.

ideg <- function(mat, ...) {
  d <- summary(mat ~ idegree(0:(nrow(mat) - 1)))
  names(d) <- 0:(length(d) - 1)
  attributes(d)$label <- "Indegree"
  return(d)
}

# See the gofstatistics.R file in the package for more complex examples.