-- This time, the Fruchterman-Reingold layout algorithm is computed
- within the plot function and thus applied to the “reduced” network
- without singletons
-- Labels are not scaled to node sizes
-- Single nodes are removed
-- Node sizes are scaled to the column sums of clr-transformed data
-- Node colors represent the determined clusters
-- Border color of hub nodes is changed from black to darkgray
-- Label size of hubs is enlarged
+
-``` r
-set.seed(123456)
-
-plot(props_genus,
- layout = "layout_with_fr",
- shortenLabels = "intelligent",
- labelLength = 10,
- labelPattern = c(5, "'", 3, "'", 3),
- labelScale = FALSE,
- rmSingles = TRUE,
- nodeSize = "clr",
- nodeColor = "cluster",
- hubBorderCol = "darkgray",
- cexNodes = 2,
- cexLabels = 1.5,
- cexHubLabels = 2,
- title1 = "Network on genus level with Pearson correlations",
- showTitle = TRUE,
- cexTitle = 2.3)
-
-legend(0.7, 1.1, cex = 2.2, title = "estimated correlation:",
- legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/single_genus_3-1.png)
-
-Let’s check whether the largest nodes are actually those with the highest
-column sums in the matrix of normalized counts returned by
-`netConstruct()`.
-
-``` r
-sort(colSums(net_genus$normCounts1), decreasing = TRUE)[1:10]
-```
-
- ## Bacteroides Klebsiella Faecalibacterium
- ## 1200.7971 1137.4928 708.0877
- ## 5_Clostridiales(O) 2_Ruminococcaceae(F) 3_Lachnospiraceae(F)
- ## 549.2647 502.1889 493.7558
- ## 6_Enterobacteriaceae(F) Roseburia Parabacteroides
- ## 363.3841 333.8737 328.0495
- ## Coprococcus
- ## 274.4082
-
-In order to further improve our plot, we use the following
-modifications:
-
-- This time, we choose the “spring” layout as part of `qgraph()` (the
- function is generally used for network plotting in NetCoMi)
-- A repulsion value below 1 places the nodes further apart
-- Labels are not shortened anymore
-- Nodes (bacteria on genus level) are colored according to the
- respective phylum
-- Edges representing positive associations are colored in dark turquoise,
-  negative ones in orange (just to give an example of alternative edge
-  coloring)
-- Transparency is increased for edges with high weight to improve the
- readability of node labels
-
-``` r
-# Get phyla names
-taxtab <- as(tax_table(amgut_genus_renamed), "matrix")
-phyla <- as.factor(gsub("p__", "", taxtab[, "Rank2"]))
-names(phyla) <- taxtab[, "Rank6"]
-#table(phyla)
-
-# Define phylum colors
-phylcol <- c("cyan", "blue3", "red", "lawngreen", "yellow", "deeppink")
-
-plot(props_genus,
- layout = "spring",
- repulsion = 0.84,
- shortenLabels = "none",
- charToRm = "g__",
- labelScale = FALSE,
- rmSingles = TRUE,
- nodeSize = "clr",
- nodeSizeSpread = 4,
- nodeColor = "feature",
- featVecCol = phyla,
- colorVec = phylcol,
- posCol = "darkturquoise",
- negCol = "orange",
- edgeTranspLow = 0,
- edgeTranspHigh = 40,
- cexNodes = 2,
- cexLabels = 2,
- cexHubLabels = 2.5,
- title1 = "Network on genus level with Pearson correlations",
- showTitle = TRUE,
- cexTitle = 2.3)
-
-# Colors used in the legend should be equally transparent as in the plot
-phylcol_transp <- colToTransp(phylcol, 60)
-
-legend(-1.2, 1.2, cex = 2, pt.cex = 2.5, title = "Phylum:",
- legend=levels(phyla), col = phylcol_transp, bty = "n", pch = 16)
-
-legend(0.7, 1.1, cex = 2.2, title = "estimated correlation:",
- legend = c("+","-"), lty = 1, lwd = 3, col = c("darkturquoise","orange"),
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/single_genus_5-1.png)
-
-------------------------------------------------------------------------
-
-### Using an association matrix as input
-
-The QMP data set provided by the `SPRING` package is used to demonstrate
-how NetCoMi is used to analyze a precomputed network (given as an
-association matrix).
-
-The data set contains quantitative count data (true absolute values),
-which SPRING can deal with. See `?QMP` for details.
-
-`nlambda` and `rep.num` are set to 10 for a decreased execution time,
-but should be higher for real data.
-
-``` r
-library(SPRING)
-
-# Load the QMP data set
-data("QMP")
-
-# Run SPRING for association estimation
-fit_spring <- SPRING(QMP,
- quantitative = TRUE,
- lambdaseq = "data-specific",
- nlambda = 10,
- rep.num = 10,
- seed = 123456,
- ncores = 1,
- Rmethod = "approx",
- verbose = FALSE)
-
-# Optimal lambda
-opt.K <- fit_spring$output$stars$opt.index
-
-# Association matrix
-assoMat <- as.matrix(SpiecEasi::symBeta(fit_spring$output$est$beta[[opt.K]],
- mode = "ave"))
-rownames(assoMat) <- colnames(assoMat) <- colnames(QMP)
-```
-
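-As a rough illustration of the symmetrization step, the following
-base-R sketch averages a non-symmetric coefficient matrix with its
-transpose. It assumes that `mode = "ave"` amounts to `(B + t(B)) / 2`;
-the matrix `beta_toy` and its node names are made up for illustration
-and are not taken from the QMP fit.
-
-``` r
-# Hypothetical 3 x 3 coefficient matrix (not SPRING output)
-beta_toy <- matrix(c(0.0,  0.4,  0.0,
-                     0.2,  0.0, -0.3,
-                     0.0, -0.1,  0.0),
-                   nrow = 3, byrow = TRUE)
-
-# Average each entry with its mirrored counterpart
-asso_toy <- (beta_toy + t(beta_toy)) / 2
-rownames(asso_toy) <- colnames(asso_toy) <- c("taxon1", "taxon2", "taxon3")
-
-# The result is symmetric and can serve as an association matrix
-isSymmetric(asso_toy)
-```
-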
-The association matrix is now passed to `netConstruct` to start the
-usual NetCoMi workflow. Note that the `dataType` argument must be set
-appropriately.
-
-``` r
-# Network construction and analysis
-net_asso <- netConstruct(data = assoMat,
- dataType = "condDependence",
- sparsMethod = "none",
- verbose = 0)
-
-props_asso <- netAnalyze(net_asso, clustMethod = "hierarchical")
-```
-
-![](man/figures/readme/association_input_2-1.png)
-
-``` r
-plot(props_asso,
- layout = "spring",
- repulsion = 1.2,
- shortenLabels = "none",
- labelScale = TRUE,
- rmSingles = TRUE,
- nodeSize = "eigenvector",
- nodeSizeSpread = 2,
- nodeColor = "cluster",
- hubBorderCol = "gray60",
- cexNodes = 1.8,
- cexLabels = 2,
- cexHubLabels = 2.2,
- title1 = "Network for QMP data",
- showTitle = TRUE,
- cexTitle = 2.3)
-
-legend(0.7, 1.1, cex = 2.2, title = "estimated association:",
- legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/association_input_3-1.png)
-
-------------------------------------------------------------------------
-
-### Network comparison
-
-Now let’s look at how NetCoMi is used to compare two networks.
-
-#### Network construction
-
-The data set is split by `"SEASONAL_ALLERGIES"` leading to two subsets
-of samples (with and without seasonal allergies). We ignore the “None”
-group.
-
-``` r
-# Split the phyloseq object into two groups
-amgut_season_yes <- phyloseq::subset_samples(amgut2.filt.phy,
- SEASONAL_ALLERGIES == "yes")
-amgut_season_no <- phyloseq::subset_samples(amgut2.filt.phy,
- SEASONAL_ALLERGIES == "no")
-
-amgut_season_yes
-```
-
- ## phyloseq-class experiment-level object
- ## otu_table() OTU Table: [ 138 taxa and 121 samples ]
- ## sample_data() Sample Data: [ 121 samples by 166 sample variables ]
- ## tax_table() Taxonomy Table: [ 138 taxa by 7 taxonomic ranks ]
-
-``` r
-amgut_season_no
-```
-
- ## phyloseq-class experiment-level object
- ## otu_table() OTU Table: [ 138 taxa and 163 samples ]
- ## sample_data() Sample Data: [ 163 samples by 166 sample variables ]
- ## tax_table() Taxonomy Table: [ 138 taxa by 7 taxonomic ranks ]
-
-The 50 nodes with highest variance are selected for network construction
-to get smaller networks.
-
-We filter the 121 samples (sample size of the smaller group) with
-highest frequency to make the sample sizes equal and thus ensure
-comparability.
-
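-Conceptually, the two filters amount to ranking taxa by variance and
-samples by total count. The base-R sketch below illustrates this on
-simulated counts. It assumes that "frequency" refers to a sample's
-total read count, applies the filters in an arbitrary order, and does
-not reproduce NetCoMi's internal code; all object names are made up.
-
-``` r
-set.seed(1)
-
-# Simulated count matrix: 20 samples (rows) x 8 taxa (columns)
-counts_sim <- matrix(rpois(160, lambda = rep(c(2, 5, 10, 50), each = 40)),
-                     nrow = 20,
-                     dimnames = list(paste0("s", 1:20), paste0("t", 1:8)))
-
-# Keep the 3 taxa with the highest variance
-taxa_var <- apply(counts_sim, 2, var)
-keep_taxa <- names(sort(taxa_var, decreasing = TRUE))[1:3]
-
-# Keep the 10 samples with the highest total count
-samp_sum <- rowSums(counts_sim)
-keep_samp <- names(sort(samp_sum, decreasing = TRUE))[1:10]
-
-counts_filt <- counts_sim[keep_samp, keep_taxa]
-dim(counts_filt)
-```
-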
-``` r
-n_yes <- phyloseq::nsamples(amgut_season_yes)
-
-# Network construction
-net_season <- netConstruct(data = amgut_season_no,
- data2 = amgut_season_yes,
- filtTax = "highestVar",
- filtTaxPar = list(highestVar = 50),
- filtSamp = "highestFreq",
- filtSampPar = list(highestFreq = n_yes),
- measure = "spring",
- measurePar = list(nlambda = 10,
- rep.num = 10,
- Rmethod = "approx"),
- normMethod = "none",
- zeroMethod = "none",
- sparsMethod = "none",
- dissFunc = "signed",
- verbose = 2,
- seed = 123456)
-```
-
- ## Checking input arguments ... Done.
- ## Data filtering ...
- ## 42 samples removed in data set 1.
- ## 0 samples removed in data set 2.
- ## 96 taxa removed in each data set.
- ## 1 rows with zero sum removed in group 2.
- ## 42 taxa and 121 samples remaining in group 1.
- ## 42 taxa and 120 samples remaining in group 2.
- ##
- ## Calculate 'spring' associations ... Done.
- ##
- ## Calculate associations in group 2 ... Done.
-
-Alternatively, a group vector could be passed to `group`, according to
-which the data set is split into two groups:
-
-``` r
-# Get count table
-countMat <- phyloseq::otu_table(amgut2.filt.phy)
-
-# netConstruct() expects samples in rows
-countMat <- t(as(countMat, "matrix"))
-
-group_vec <- phyloseq::get_variable(amgut2.filt.phy, "SEASONAL_ALLERGIES")
-
-# Select the two groups of interest (level "none" is excluded)
-sel <- which(group_vec %in% c("no", "yes"))
-group_vec <- group_vec[sel]
-countMat <- countMat[sel, ]
-
-net_season <- netConstruct(countMat,
- group = group_vec,
- filtTax = "highestVar",
- filtTaxPar = list(highestVar = 50),
- filtSamp = "highestFreq",
- filtSampPar = list(highestFreq = n_yes),
- measure = "spring",
- measurePar = list(nlambda=10,
- rep.num=10,
- Rmethod = "approx"),
- normMethod = "none",
- zeroMethod = "none",
- sparsMethod = "none",
- dissFunc = "signed",
- verbose = 3,
- seed = 123456)
-```
-
-#### Network analysis
-
-The object returned by `netConstruct()` containing both networks is
-again passed to `netAnalyze()`. Network properties are computed for both
-networks simultaneously.
-
-To demonstrate further functionalities of `netAnalyze()`, we play around
-with the available arguments, even if the chosen setting might not be
-optimal.
-
-- `centrLCC = FALSE`: Centralities are calculated for all nodes (not
- only for the largest connected component).
-- `avDissIgnoreInf = TRUE`: Nodes with an infinite dissimilarity are
- ignored when calculating the average dissimilarity.
-- `sPathNorm = FALSE`: Shortest paths are not normalized by average
- dissimilarity.
-- `hubPar = c("degree", "eigenvector")`: Hubs are nodes with highest
- degree and eigenvector centrality at the same time.
-- `lnormFit = TRUE` and `hubQuant = 0.9`: A log-normal distribution is
-  fitted to the centrality values to identify nodes with “highest”
-  centrality values. Here, a node is identified as hub if for each of
-  the two centrality measures given in `hubPar` (degree and
-  eigenvector), the node’s centrality value is above the 90% quantile
-  of the fitted log-normal distribution.
-- The non-normalized centralities are used for all four measures.
-
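-The hub rule described in the list above can be sketched in base R.
-This is only an illustration under simplifying assumptions: the
-centrality values are made up, the log-normal parameters are estimated
-from the mean and standard deviation of the log-values, and the helper
-`above_lnorm_quantile()` is hypothetical, not a NetCoMi function.
-
-``` r
-# Made-up centralities for six nodes (all values must be positive)
-degree_toy <- c(a = 2, b = 3, c = 2, d = 12, e = 3, f = 2)
-eigen_toy <- c(a = 0.10, b = 0.15, c = 0.12, d = 0.90, e = 0.14, f = 0.11)
-
-# TRUE for values above the given quantile of a fitted log-normal
-above_lnorm_quantile <- function(x, prob = 0.9) {
-  cutoff <- qlnorm(prob, meanlog = mean(log(x)), sdlog = sd(log(x)))
-  x > cutoff
-}
-
-# A node is a hub only if it exceeds the cutoff for every measure
-is_hub <- above_lnorm_quantile(degree_toy) & above_lnorm_quantile(eigen_toy)
-names(which(is_hub))
-```
-
-With these toy values, only node `d` exceeds both cutoffs. Requiring
-all measures at once is what makes the hub definition stricter when
-more measures are listed.
-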
-**Note! The arguments must be set carefully, depending on the research
-questions. NetCoMi’s default values are not generally preferable in all
-practical cases!**
-
-``` r
-props_season <- netAnalyze(net_season,
- centrLCC = FALSE,
- avDissIgnoreInf = TRUE,
- sPathNorm = FALSE,
- clustMethod = "cluster_fast_greedy",
- hubPar = c("degree", "eigenvector"),
- hubQuant = 0.9,
- lnormFit = TRUE,
- normDeg = FALSE,
- normBetw = FALSE,
- normClose = FALSE,
- normEigen = FALSE)
-```
-
-![](man/figures/readme/netcomp_spring_3-1.png)
-
-``` r
-summary(props_season)
-```
-
- ##
- ## Component sizes
- ## ```````````````
- ## group '1':
- ## size: 28 1
- ## #: 1 14
- ## group '2':
- ## size: 31 8 1
- ## #: 1 1 3
- ## ______________________________
- ## Global network properties
- ## `````````````````````````
- ## Largest connected component (LCC):
- ## group '1' group '2'
- ## Relative LCC size 0.66667 0.73810
- ## Clustering coefficient 0.15161 0.27111
- ## Modularity 0.62611 0.45823
- ## Positive edge percentage 86.66667 100.00000
- ## Edge density 0.07937 0.12473
- ## Natural connectivity 0.04539 0.04362
- ## Vertex connectivity 1.00000 1.00000
- ## Edge connectivity 1.00000 1.00000
- ## Average dissimilarity* 0.67251 0.68178
- ## Average path length** 3.40008 1.86767
- ##
- ## Whole network:
- ## group '1' group '2'
- ## Number of components 15.00000 5.00000
- ## Clustering coefficient 0.15161 0.29755
- ## Modularity 0.62611 0.55684
- ## Positive edge percentage 86.66667 100.00000
- ## Edge density 0.03484 0.08130
- ## Natural connectivity 0.02826 0.03111
- ## -----
- ## *: Dissimilarity = 1 - edge weight
- ## **: Path length = Sum of dissimilarities along the path
- ##
- ## ______________________________
- ## Clusters
- ## - In the whole network
- ## - Algorithm: cluster_fast_greedy
- ## ````````````````````````````````
- ## group '1':
- ## name: 0 1 2 3 4 5
- ## #: 14 7 6 5 4 6
- ##
- ## group '2':
- ## name: 0 1 2 3 4 5
- ## #: 3 5 14 4 8 8
- ##
- ## ______________________________
- ## Hubs
- ## - In alphabetical/numerical order
- ## - Based on log-normal quantiles of centralities
- ## ```````````````````````````````````````````````
- ## group '1' group '2'
- ## 307981 322235
- ## 363302
- ##
- ## ______________________________
- ## Centrality measures
- ## - In decreasing order
- ## - Computed for the complete network
- ## ````````````````````````````````````
- ## Degree (unnormalized):
- ## group '1' group '2'
- ## 307981 5 2
- ## 9715 5 5
- ## 364563 4 4
- ## 259569 4 5
- ## 322235 3 9
- ## ______ ______
- ## 322235 3 9
- ## 363302 3 9
- ## 158660 2 6
- ## 188236 3 5
- ## 259569 4 5
- ##
- ## Betweenness centrality (unnormalized):
- ## group '1' group '2'
- ## 307981 231 0
- ## 331820 170 9
- ## 158660 162 80
- ## 188236 161 85
- ## 322235 159 126
- ## ______ ______
- ## 322235 159 126
- ## 363302 74 93
- ## 188236 161 85
- ## 158660 162 80
- ## 326792 17 58
- ##
- ## Closeness centrality (unnormalized):
- ## group '1' group '2'
- ## 307981 18.17276 7.80251
- ## 9715 15.8134 9.27254
- ## 188236 15.7949 23.24055
- ## 301645 15.30177 9.01509
- ## 364563 14.73566 21.21352
- ## ______ ______
- ## 322235 13.50232 26.36749
- ## 363302 12.30297 24.19703
- ## 158660 13.07106 23.31577
- ## 188236 15.7949 23.24055
- ## 326792 14.61391 22.52157
- ##
- ## Eigenvector centrality (unnormalized):
- ## group '1' group '2'
- ## 307981 0.53313 0.06912
- ## 9715 0.44398 0.10788
- ## 301645 0.41878 0.08572
- ## 326792 0.27033 0.15727
- ## 188236 0.25824 0.21162
- ## ______ ______
- ## 322235 0.01749 0.29705
- ## 363302 0.03526 0.28512
- ## 188236 0.25824 0.21162
- ## 194648 0.00366 0.19448
- ## 184983 0.0917 0.1854
-
-#### Visual network comparison
-
-First, the layout is computed separately in both groups (qgraph’s
-“spring” layout in this case).
-
-Node sizes are scaled according to the mclr-transformed data since
-`SPRING` uses the mclr transformation as normalization method.
-
-Node colors represent clusters. Note that by default, two clusters have
-the same color in both groups if they have at least two nodes in common
-(`sameColThresh = 2`). Set `sameClustCol` to `FALSE` to get different
-cluster colors.
-
-``` r
-plot(props_season,
- sameLayout = FALSE,
- nodeColor = "cluster",
- nodeSize = "mclr",
- labelScale = FALSE,
- cexNodes = 1.5,
- cexLabels = 2.5,
- cexHubLabels = 3,
- cexTitle = 3.7,
- groupNames = c("No seasonal allergies", "Seasonal allergies"),
- hubBorderCol = "gray40")
-
-legend("bottom", title = "estimated association:", legend = c("+","-"),
- col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4,
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/netcomp_spring_4-1.png)
-
-Using different layouts leads to a “nice-looking” network plot for each
-group; however, it is difficult to identify group differences at first
-glance.
-
-Thus, we now use the same layout in both groups. In the following, the
-layout is computed for group 1 (the left network) and taken over for
-group 2.
-
-`rmSingles` is set to `"inboth"` because only nodes that are unconnected
-in both groups can be removed if the same layout is used.
-
-``` r
-plot(props_season,
- sameLayout = TRUE,
- layoutGroup = 1,
- rmSingles = "inboth",
- nodeSize = "mclr",
- labelScale = FALSE,
- cexNodes = 1.5,
- cexLabels = 2.5,
- cexHubLabels = 3,
- cexTitle = 3.8,
- groupNames = c("No seasonal allergies", "Seasonal allergies"),
- hubBorderCol = "gray40")
-
-legend("bottom", title = "estimated association:", legend = c("+","-"),
- col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4,
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/netcomp_spring_5-1.png)
-
-In the above plot, we can see clear differences between the groups. The
-OTU “322235”, for instance, is more strongly connected in the “Seasonal
-allergies” group than in the group without seasonal allergies, which is
-why it is a hub on the right, but not on the left.
-
-However, if the layout of one group is simply taken over to the other,
-one of the networks (here the “seasonal allergies” group) is usually not
-that nice-looking due to the long edges. Therefore, NetCoMi (\>= 1.0.2)
-offers a further option (`layoutGroup = "union"`), where a union of the
-two layouts is used in both groups. In doing so, the nodes are placed as
-optimally as possible for both networks.
-
-*The idea and R code for this functionality were provided by [Christian
-L. Müller](https://github.com/muellsen?tab=followers) and [Alice
-Sommer](https://www.iq.harvard.edu/people/alice-sommer)*
-
-``` r
-plot(props_season,
- sameLayout = TRUE,
- repulsion = 0.95,
- layoutGroup = "union",
- rmSingles = "inboth",
- nodeSize = "mclr",
- labelScale = FALSE,
- cexNodes = 1.5,
- cexLabels = 2.5,
- cexHubLabels = 3,
- cexTitle = 3.8,
- groupNames = c("No seasonal allergies", "Seasonal allergies"),
- hubBorderCol = "gray40")
-
-legend("bottom", title = "estimated association:", legend = c("+","-"),
- col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4,
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/netcomp_spring_6-1.png)
-
-#### Quantitative network comparison
-
-Since runtime is considerably increased if permutation tests are
-performed, we set the `permTest` parameter to `FALSE`. See the
-`tutorial_createAssoPerm` file for a network comparison including
-permutation tests.
-
-Since permutation tests are still conducted for the Adjusted Rand Index,
-a seed should be set for reproducibility.
-
-``` r
-comp_season <- netCompare(props_season,
- permTest = FALSE,
- verbose = FALSE,
- seed = 123456)
-
-summary(comp_season,
- groupNames = c("No allergies", "Allergies"),
- showCentr = c("degree", "between", "closeness"),
- numbNodes = 5)
-```
-
- ##
- ## Comparison of Network Properties
- ## ----------------------------------
- ## CALL:
- ## netCompare(x = props_season, permTest = FALSE, verbose = FALSE,
- ## seed = 123456)
- ##
- ## ______________________________
- ## Global network properties
- ## `````````````````````````
- ## Largest connected component (LCC):
- ## No allergies Allergies difference
- ## Relative LCC size 0.667 0.738 0.071
- ## Clustering coefficient 0.152 0.271 0.120
- ## Modularity 0.626 0.458 0.168
- ## Positive edge percentage 86.667 100.000 13.333
- ## Edge density 0.079 0.125 0.045
- ## Natural connectivity 0.045 0.044 0.002
- ## Vertex connectivity 1.000 1.000 0.000
- ## Edge connectivity 1.000 1.000 0.000
- ## Average dissimilarity* 0.673 0.682 0.009
- ## Average path length** 3.400 1.868 1.532
- ##
- ## Whole network:
- ## No allergies Allergies difference
- ## Number of components 15.000 5.000 10.000
- ## Clustering coefficient 0.152 0.298 0.146
- ## Modularity 0.626 0.557 0.069
- ## Positive edge percentage 86.667 100.000 13.333
- ## Edge density 0.035 0.081 0.046
- ## Natural connectivity 0.028 0.031 0.003
- ## -----
- ## *: Dissimilarity = 1 - edge weight
- ## **: Path length = Sum of dissimilarities along the path
- ##
- ## ______________________________
- ## Jaccard index (similarity betw. sets of most central nodes)
- ## ```````````````````````````````````````````````````````````
- ## Jacc P(<=Jacc) P(>=Jacc)
- ## degree 0.556 0.957578 0.144846
- ## betweenness centr. 0.333 0.650307 0.622822
- ## closeness centr. 0.231 0.322424 0.861268
- ## eigenvec. centr. 0.100 0.017593 * 0.996692
- ## hub taxa 0.000 0.296296 1.000000
- ## -----
- ## Jaccard index in [0,1] (1 indicates perfect agreement)
- ##
- ## ______________________________
- ## Adjusted Rand index (similarity betw. clusterings)
- ## ``````````````````````````````````````````````````
- ## wholeNet LCC
- ## ARI 0.232 0.355
- ## p-value 0.000 0.000
- ## -----
- ## ARI in [-1,1] with ARI=1: perfect agreement betw. clusterings
- ## ARI=0: expected for two random clusterings
- ## p-value: permutation test (n=1000) with null hypothesis ARI=0
- ##
- ## ______________________________
- ## Graphlet Correlation Distance
- ## `````````````````````````````
- ## wholeNet LCC
- ## GCD 1.577 1.863
- ## -----
- ## GCD >= 0 (GCD=0 indicates perfect agreement between GCMs)
- ##
- ## ______________________________
- ## Centrality measures
- ## - In decreasing order
- ## - Computed for the whole network
- ## ````````````````````````````````````
- ## Degree (unnormalized):
- ## No allergies Allergies abs.diff.
- ## 322235 3 9 6
- ## 363302 3 9 6
- ## 469709 0 4 4
- ## 158660 2 6 4
- ## 223059 0 4 4
- ##
- ## Betweenness centrality (unnormalized):
- ## No allergies Allergies abs.diff.
- ## 307981 231 0 231
- ## 331820 170 9 161
- ## 259569 137 34 103
- ## 158660 162 80 82
- ## 184983 92 12 80
- ##
- ## Closeness centrality (unnormalized):
- ## No allergies Allergies abs.diff.
- ## 469709 0 21.203 21.203
- ## 541301 0 20.942 20.942
- ## 181016 0 19.498 19.498
- ## 361496 0 19.349 19.349
- ## 223059 0 19.261 19.261
- ##
- ## _________________________________________________________
- ## Significance codes: ***: 0.001, **: 0.01, *: 0.05, .: 0.1
-
-------------------------------------------------------------------------
-
-### Differential networks
-
-We now build a differential association network, where two nodes are
-connected if they are differentially associated between the two groups.
-
-Due to its very short execution time, we use Pearson’s correlations for
-estimating associations between OTUs.
-
-Fisher’s z-test is applied for identifying differentially correlated
-OTUs. Multiple testing adjustment is done by controlling the local false
-discovery rate.
-
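-For a single pair of OTUs, Fisher's z-test compares the two
-correlations after applying the Fisher transformation (`atanh()`). The
-following base-R sketch shows the standard form of the test for two
-independent samples; the helper `fisher_z_test()` and all input values
-are hypothetical, and the sketch is not NetCoMi's internal
-implementation.
-
-``` r
-# Two-sided test of H0: rho1 == rho2 for two independent samples
-fisher_z_test <- function(r1, r2, n1, n2) {
-  z <- (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
-  p <- 2 * pnorm(-abs(z))
-  c(z = z, p.value = p)
-}
-
-# Made-up correlations and sample sizes
-fisher_z_test(r1 = 0.5, r2 = -0.3, n1 = 162, n2 = 120)
-```
-
-In a differential network, one such p-value is obtained per pair of
-OTUs, which is why an adjustment for multiple testing is needed.
-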
-Note: `sparsMethod` is set to `"none"`, just to be able to include all
-differential associations in the association network plot (see below).
-However, the differential network is always based on the estimated
-association matrices before sparsification (the `assoEst1` and
-`assoEst2` matrices returned by `netConstruct()`).
-
-``` r
-net_season_pears <- netConstruct(data = amgut_season_no,
- data2 = amgut_season_yes,
- filtTax = "highestVar",
- filtTaxPar = list(highestVar = 50),
- measure = "pearson",
- normMethod = "clr",
- sparsMethod = "none",
- thresh = 0.2,
- verbose = 3)
-```
-
- ## Checking input arguments ... Done.
- ## Infos about changed arguments:
- ## Zero replacement needed for clr transformation. "multRepl" used.
- ##
- ## Data filtering ...
- ## 95 taxa removed in each data set.
- ## 1 rows with zero sum removed in group 1.
- ## 1 rows with zero sum removed in group 2.
- ## 43 taxa and 162 samples remaining in group 1.
- ## 43 taxa and 120 samples remaining in group 2.
- ##
- ## Zero treatment in group 1:
- ## Execute multRepl() ... Done.
- ##
- ## Zero treatment in group 2:
- ## Execute multRepl() ... Done.
- ##
- ## Normalization in group 1:
- ## Execute clr(){SpiecEasi} ... Done.
- ##
- ## Normalization in group 2:
- ## Execute clr(){SpiecEasi} ... Done.
- ##
- ## Calculate 'pearson' associations ... Done.
- ##
- ## Calculate associations in group 2 ... Done.
-
-``` r
-# Differential network construction
-diff_season <- diffnet(net_season_pears,
- diffMethod = "fisherTest",
- adjust = "lfdr")
-```
-
- ## Checking input arguments ...
- ## Done.
- ## Adjust for multiple testing using 'lfdr' ...
- ## Execute fdrtool() ...
-
- ## Step 1... determine cutoff point
- ## Step 2... estimate parameters of null distribution and eta0
- ## Step 3... compute p-values and estimate empirical PDF/CDF
- ## Step 4... compute q-values and local fdr
-
- ## Done.
-
-``` r
-# Differential network plot
-plot(diff_season,
- cexNodes = 0.8,
- cexLegend = 3,
- cexTitle = 4,
- mar = c(2,2,8,5),
- legendGroupnames = c("group 'no'", "group 'yes'"),
- legendPos = c(0.7,1.6))
-```
-
-![](man/figures/readme/diffnet_1-1.png)
-
-In the differential network shown above, edge colors represent the
-direction of associations in the two groups. If, for instance, two OTUs
-are positively associated in group 1 and negatively associated in
-group 2 (such as ‘191541’ and ‘188236’), the respective edge is colored
-in cyan.
-
-We also take a look at the corresponding associations by constructing
-association networks that include only the differentially associated
-OTUs.
-
-``` r
-props_season_pears <- netAnalyze(net_season_pears,
- clustMethod = "cluster_fast_greedy",
- weightDeg = TRUE,
- normDeg = FALSE,
- gcmHeat = FALSE)
-```
-
-``` r
-# Identify the differentially associated OTUs
-diffmat_sums <- rowSums(diff_season$diffAdjustMat)
-diff_asso_names <- names(diffmat_sums[diffmat_sums > 0])
-
-plot(props_season_pears,
- nodeFilter = "names",
- nodeFilterPar = diff_asso_names,
- nodeColor = "gray",
- highlightHubs = FALSE,
- sameLayout = TRUE,
- layoutGroup = "union",
- rmSingles = FALSE,
- nodeSize = "clr",
- edgeTranspHigh = 20,
- labelScale = FALSE,
- cexNodes = 1.5,
- cexLabels = 3,
- cexTitle = 3.8,
- groupNames = c("No seasonal allergies", "Seasonal allergies"),
- hubBorderCol = "gray40")
-
-legend(-0.15,-0.7, title = "estimated correlation:", legend = c("+","-"),
- col = c("#009900","red"), inset = 0.05, cex = 4, lty = 1, lwd = 4,
- bty = "n", horiz = TRUE)
-```
-
-![](man/figures/readme/diffnet_3-1.png)
-
-We can see that the correlation between the aforementioned OTUs ‘191541’
-and ‘188236’ is strongly positive in the left group and negative in the
-right group.
-
-------------------------------------------------------------------------
-
-### Dissimilarity-based Networks
-
-If a dissimilarity measure is used for network construction, nodes are
-subjects instead of OTUs. The estimated dissimilarities are transformed
-into similarities, which are used as edge weights so that subjects with
-a similar microbial composition are placed close together in the network
-plot.
-
-We construct a single network using the Aitchison distance, which is
-suitable for compositional data.
-
-Since the Aitchison distance is based on the clr-transformation, zeros
-in the data need to be replaced.
-
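-To make the link between the Aitchison distance and the clr
-transformation concrete, here is a small base-R sketch. It uses a toy
-count matrix and a simple pseudo count of 1 instead of `multRepl()`, so
-the numbers are only illustrative; `clr_rows()` is a hypothetical
-helper and not part of NetCoMi.
-
-``` r
-# Toy count matrix: 3 samples (rows) x 4 taxa (columns)
-counts_toy <- matrix(c(10,  0,  5, 85,
-                       20,  3,  0, 77,
-                        1, 40, 30, 29),
-                     nrow = 3, byrow = TRUE,
-                     dimnames = list(paste0("s", 1:3), paste0("t", 1:4)))
-
-# Centered log-ratio transformation applied to each sample (row)
-clr_rows <- function(x) {
-  logx <- log(x)
-  logx - rowMeans(logx)
-}
-
-# Aitchison distance = Euclidean distance between clr-transformed samples
-clr_toy <- clr_rows(counts_toy + 1)
-aitchison_toy <- dist(clr_toy, method = "euclidean")
-aitchison_toy
-```
-
-Because the logarithm is undefined for zero counts, some form of zero
-replacement has to precede the transformation.
-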
-The network is sparsified using the k-nearest neighbor (knn) algorithm.
-
-``` r
-net_diss <- netConstruct(amgut1.filt,
- measure = "aitchison",
- zeroMethod = "multRepl",
- sparsMethod = "knn",
- kNeighbor = 3,
- verbose = 3)
-```
-
- ## Checking input arguments ... Done.
- ## Infos about changed arguments:
- ## Counts normalized to fractions for measure "aitchison".
- ##
- ## 127 taxa and 289 samples remaining.
- ##
- ## Zero treatment:
- ## Execute multRepl() ... Done.
- ##
- ## Normalization:
- ## Counts normalized by total sum scaling.
- ##
- ## Calculate 'aitchison' dissimilarities ... Done.
- ##
- ## Sparsify dissimilarities via 'knn' ... Registered S3 methods overwritten by 'proxy':
- ## method from
- ## print.registry_field registry
- ## print.registry_entry registry
- ## Done.
-
-For cluster detection, we use hierarchical clustering with average
-linkage. Internally, `k=3` is passed to
-[`cutree()`](https://www.rdocumentation.org/packages/dendextend/versions/1.13.4/topics/cutree)
-from the `stats` package so that the tree is cut into 3 clusters.
-
-``` r
-props_diss <- netAnalyze(net_diss,
- clustMethod = "hierarchical",
- clustPar = list(method = "average", k = 3),
- hubPar = "eigenvector")
-```
-
-![](man/figures/readme/example14-1.png)
-
-``` r
-plot(props_diss,
- nodeColor = "cluster",
- nodeSize = "eigenvector",
- hubTransp = 40,
- edgeTranspLow = 60,
- charToRm = "00000",
- shortenLabels = "simple",
- labelLength = 6,
- mar = c(1, 3, 3, 5))
-
-# Get green color with 40% transparency
-green2 <- colToTransp("#009900", 40)
-
-legend(0.4, 1.1,
- cex = 2.2,
- legend = c("high similarity (low Aitchison distance)",
- "low similarity (high Aitchison distance)"),
- lty = 1,
- lwd = c(3, 1),
- col = c("darkgreen", green2),
- bty = "n")
-```
-
-![](man/figures/readme/example15-1.png)
-
-In this dissimilarity-based network, hubs are interpreted as samples
-with a microbial composition similar to that of many other samples in
-the data set.
-
-------------------------------------------------------------------------
-
-### Soil microbiome example
-
-Here is the code for reproducing the network plot shown at the
-beginning.
-
-``` r
-data("soilrep")
-
-soil_warm_yes <- phyloseq::subset_samples(soilrep, warmed == "yes")
-soil_warm_no <- phyloseq::subset_samples(soilrep, warmed == "no")
-
-net_seas_p <- netConstruct(soil_warm_yes, soil_warm_no,
- filtTax = "highestVar",
- filtTaxPar = list(highestVar = 500),
- zeroMethod = "pseudo",
- normMethod = "clr",
- measure = "pearson",
- verbose = 0)
-
-netprops1 <- netAnalyze(net_seas_p, clustMethod = "cluster_fast_greedy")
-
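-# Convert cluster names to numbers before taking the maximum
-# (max() on character names would compare them as strings)
-nclust <- max(as.numeric(names(table(netprops1$clustering$clust1))))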
-
-col <- c(topo.colors(nclust), rainbow(6))
-
-plot(netprops1,
- sameLayout = TRUE,
- layoutGroup = "union",
- colorVec = col,
- borderCol = "gray40",
- nodeSize = "degree",
- cexNodes = 0.9,
- nodeSizeSpread = 3,
- edgeTranspLow = 80,
- edgeTranspHigh = 50,
- groupNames = c("Warming", "Non-warming"),
- showTitle = TRUE,
- cexTitle = 2.8,
- mar = c(1,1,3,1),
- repulsion = 0.9,
- labels = FALSE,
- rmSingles = "inboth",
- nodeFilter = "clustMin",
- nodeFilterPar = 10,
- nodeTransp = 50,
- hubTransp = 30)
-```
-
-------------------------------------------------------------------------
-
-### References
-
-Badri, Michelle, Zachary D. Kurtz, Richard Bonneau, and Christian L.
-Müller. 2020. “Shrinkage Improves Estimation of Microbial Associations
-Under Different Normalization Methods.” *NAR Genomics and
-Bioinformatics* 2 (December).
-
-Martin-Fernández, Josep A, M Bren, Carles Barceló-Vidal, and Vera
-Pawlowsky-Glahn. 1999. “A Measure of Difference for Compositional Data
-Based on Measures of Divergence.” In *Proceedings of IAMG*, 99:211–16.
-
-Yoon, Grace, Christian L. Müller, and Irina Gaynanova. 2020. “Fast
-Computation of Latent Correlations.” *Journal of Computational and
-Graphical Statistics*, June.
+Peschel, Stefanie, Christian L Müller, Erika von Mutius, Anne-Laure
+Boulesteix, and Martin Depner. 2020. “NetCoMi:
+network construction and comparison for microbiome data in R.”
+*Briefings in Bioinformatics* 22 (4): bbaa290.
+<https://doi.org/10.1093/bib/bbaa290>.
diff --git a/references.bib b/references.bib
index c49564b..64e7974 100644
--- a/references.bib
+++ b/references.bib
@@ -32,6 +32,21 @@ @inproceedings{martin1999measure
year={1999}
}
+@article{peschel2020netcomi,
+ author = {Peschel, Stefanie and Müller, Christian L and von Mutius, Erika and Boulesteix, Anne-Laure and Depner, Martin},
+ title = {{NetCoMi: network construction and comparison for microbiome data in R}},
+ journal = {Briefings in Bioinformatics},
+ volume = {22},
+ number = {4},
+ pages = {bbaa290},
+ year = {2020},
+ month = {12},
+ issn = {1477-4054},
+ doi = {10.1093/bib/bbaa290},
+ url = {https://doi.org/10.1093/bib/bbaa290}
+}
+
+
@article{yoon2020fast,
author = {Grace Yoon and Christian L. Müller and Irina Gaynanova},
journal = {Journal of Computational and Graphical Statistics},