Signatures Contributing to Driver Genes #36

Open
mkinnaman opened this issue Feb 18, 2020 · 2 comments

@mkinnaman

I keep getting the same error every time I try to draw the drivers vs signatures plot:

matprob <- matrix(nrow=length(drivers),ncol=length(SBS_OSCE1),dimnames=list(drivers, SBS_OSCE1))
sig.cols <- paste0(rownames(SBS_OSCE1_sigs),".prob")#grep("prob",colnames(vcf.cod))
for(i in 1:nrow(matprob)){
  g <- rownames(matprob)[i]
  ind <- which(vcf.cod$gene_name==g)
  matprob[i,] <- apply(vcf.cod[ind,sig.cols],2,sum,na.rm=T)
}

Error in matprob[i, ] <- apply(vcf.cod[ind, sig.cols], 2, sum, na.rm = T) :
  number of items to replace is not a multiple of replacement length

Any thoughts?

@mkinnaman
Author

In addition, my last couple of runs have thrown errors during the de novo signature command. Is this due to small sample size?

Error: NMF::nmf - invalid argument 'rank': must be a single numeric value
In addition: Warning messages:
1: In (function (...) :
NAs were produced due to errors in some of the runs:
-#4[r=5] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
-#5[r=6] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
-#6[r=7] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
-#7[r=8] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
-#8[r=9] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
-#9[r=10] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
2: Removed 25 rows containing missing values (geom_path).
3: Removed 62 rows containing missing values (geom_point).
4: In max(abs(diff(z))) : no non-missing arguments to max; returning -Inf

@FunGeST
Owner

FunGeST commented Feb 28, 2020

Hi,

Thanks for getting in touch, and I apologise for not getting back to you sooner.

Regarding your first issue, I think it may just be caused by SBS_OSCE1 and rownames(SBS_OSCE1_sigs) being different lengths, although without your data I can't be 100% sure.
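
As a rough illustration of that length mismatch, here is a minimal sketch on made-up data. The gene names, signature names and probabilities are all hypothetical stand-ins for `drivers`, `rownames(SBS_OSCE1_sigs)` and `vcf.cod`. It sizes the matrix from `sig.cols` itself, so the number of columns always matches the replacement vector, and it swaps `apply(..., 2, sum)` for `colSums()` with `drop = FALSE`. A quick check on the real objects would be `length(SBS_OSCE1) == length(sig.cols)`.

```r
# Toy stand-ins for the objects in the issue; every value below is made up.
drivers   <- c("TP53", "RB1")
sig_names <- c("SBS1", "SBS5", "SBS13")   # plays the role of rownames(SBS_OSCE1_sigs)
sig.cols  <- paste0(sig_names, ".prob")

vcf.cod <- data.frame(
  gene_name  = c("TP53", "TP53", "RB1"),
  SBS1.prob  = c(0.2, 0.5, 0.1),
  SBS5.prob  = c(0.7, 0.4, NA),
  SBS13.prob = c(0.1, 0.1, 0.9)
)

# Fail early with a readable error if any expected column is missing.
stopifnot(all(sig.cols %in% colnames(vcf.cod)))

# Size the matrix from sig.cols so rows and replacement vectors always agree.
matprob <- matrix(NA_real_, nrow = length(drivers), ncol = length(sig.cols),
                  dimnames = list(drivers, sig.cols))

for (g in drivers) {
  ind <- which(vcf.cod$gene_name == g)
  # drop = FALSE keeps a data frame even when only one row (or none) matches.
  matprob[g, ] <- colSums(vcf.cod[ind, sig.cols, drop = FALSE], na.rm = TRUE)
}

print(matprob)
```

If the check on the real objects returns `FALSE`, the two name vectors disagree and the assignment into `matprob[i, ]` would fail in the way reported.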

I'm less sure about your second issue - out of interest how small is your sample size? NMF extractions should work for smaller sample sizes. To get around this error you could try specifying the num_of_sigs = argument in the NMF_Extraction() function as an integer.
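
The `cutree` warnings in the log hint at one possible cause: "must be between 1 and 4" is what base R reports when a hierarchical clustering of four items is asked for more than four clusters. The sketch below reproduces that message on made-up data. It assumes, without having seen the real input, that the run contained four samples. The cap on candidate ranks at the end is a hypothetical guard, not something Palimpsest is known to do internally.

```r
# Four toy "samples": hierarchical clustering can yield at most four clusters.
set.seed(1)
toy  <- matrix(runif(4 * 6), nrow = 4)   # 4 samples x 6 features, made-up data
tree <- hclust(dist(toy))

cutree(tree, k = 4)                       # fine: one cluster per sample

# Asking for more clusters than samples reproduces the message from the log.
msg <- tryCatch(cutree(tree, k = 5), error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: only try ranks that do not exceed the number of samples.
n_samples       <- nrow(toy)
candidate_ranks <- 2:10
usable_ranks    <- candidate_ranks[candidate_ranks <= n_samples]
print(usable_ranks)
```

If that is what is happening, ranks 5 to 10 can never succeed on this dataset, which would be consistent with runs #4 to #9 failing in the log.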

If the file NMF_Rank_Estimates.pdf has been written to your results directory, the x-axis value at the first minimum of the cophenetic plot is a good estimate of the optimal number of signatures in your input data. In this example, Palimpsest would choose 5 as the optimal number of signatures in the data.
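
To make the "first minimum" rule concrete, here is a small sketch that reads it off a vector of cophenetic coefficients. The values are invented for illustration and are not taken from any real run, and `first_local_min()` is a hypothetical helper, not a Palimpsest function.

```r
# Made-up cophenetic coefficients for ranks 2..8; not taken from the issue.
ranks <- 2:8
coph  <- c(0.99, 0.97, 0.95, 0.93, 0.96, 0.94, 0.90)

# Index of the first point that is lower than both of its neighbours,
# or NA if the curve has no interior local minimum.
first_local_min <- function(y) {
  if (length(y) < 3) return(NA_integer_)
  inner  <- 2:(length(y) - 1)
  is_min <- y[inner] < y[inner - 1] & y[inner] < y[inner + 1]
  if (!any(is_min)) return(NA_integer_)
  inner[which(is_min)[1]]
}

best_rank <- ranks[first_local_min(coph)]
print(best_rank)
```

The resulting value is the kind of integer that could then be passed as num_of_sigs.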

Please let us know if this doesn't work and I'll try to find another solution!

Best wishes,
Benedict
