I'm having trouble building a simulation that re-creates this issue, but across multiple different data sources I often see the Z-score going negative.
In a simulation I can roughly recreate it via fastRG by setting all the non-zero values of A to 1, but the effect is not as severe as what I see here.
set.seed(1)
library(fastRG)
library(magrittr)  # for the %>% pipe used below

# Simulate a degree-corrected SBM with 10 blocks.
mod <- fastRG::dcsbm(n = 10000, k = 10, expected_degree = 50)
# Strengthen the diagonal of the block matrix so communities are assortative.
diag(mod$S) <- diag(mod$S) + mean(mod$S)
A <- sample_sparse(mod) + 0  # "+ 0" coerces to a numeric sparse matrix
mean(rowSums(A))  # sanity-check the average degree
out_A <- gdim::eigcv(A, k_max = 50, num_bootstraps = 1, laplacian = FALSE)
# notice how the first 5 p-values are small,
# then they become uniform.
# This is what we expect.
out_A$summary$pvals %>% plot
# if you threshold non-zero entries of A to be one...
A1 <- A
A1@x[] <- 1  # set every stored (non-zero) entry of the sparse matrix to 1
out_A1 <- gdim::eigcv(A1, k_max = 50, num_bootstraps = 1, laplacian = FALSE)
# then you get 4 small p-values... then they jump to 1!
# they should be uniform, but the thresholding
# makes the test biased.
out_A1$summary$pvals %>% plot
I think we understand this from a different perspective... we will want to give users clear guidance on how to interpret this. It is safe/conservative to simply select the leading significant p-values. There might be more games to play (e.g. picking dimMax too large and working backwards).
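A minimal sketch of that conservative rule, assuming `out_A$summary$pvals` is a numeric vector of p-values ordered by dimension (`pick_k` is a hypothetical helper, not part of gdim):

```r
# Choose k conservatively: count the leading p-values below alpha,
# stopping at the first non-significant one. Later small p-values
# (which may be artifacts, as in the thresholded example above) are ignored.
pick_k <- function(pvals, alpha = 0.05) {
  nonsig <- which(pvals >= alpha)
  if (length(nonsig) == 0) return(length(pvals))  # all significant
  nonsig[1] - 1  # dimensions before the first non-significant p-value
}

# Toy illustration; on the simulation above you would call
# pick_k(out_A$summary$pvals).
pick_k(c(0.001, 0.002, 0.01, 0.3, 0.6))  # returns 3
```

Because it stops at the first non-significant p-value, this rule can only under-select, which is the safe direction when later p-values are biased toward 1.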
I'm working with a symmetric graph and assume that is what's causing an error somewhere. This is not intended behavior, correct?