I performed the enrichment analysis using my background set, then repeated it with the background set restricted to genes that have at least one annotation term (based on the annotation file that the analysis uses).
I realized that GOATOOLS takes into account all background set genes without considering whether or not each gene has an annotation term. As a result, the findings of these two analyses had different P-values and GO term significance levels.
I'm wondering why GOATOOLS does not apply this filter by default in order to do a more accurate enrichment study.
Thanks!
Maryam-Haghani changed the title from "Different GO enrichments when filtering background set to those genes having at least one annotation term" to "Different results when restricting background set to only include genes with at least one annotation term" on Jan 27, 2023
Changing the background population genes will most likely produce different p-values than using the original background population genes. This is correct behavior.
If the background population genes are reduced by removing unannotated genes, the same should be done with the study genes.
Even when both the population and the study set are reduced, the p-values will still likely differ from those obtained without removing any genes, because the fraction of unannotated genes in the total population and the fraction of unannotated genes in the study set differ by random chance from gene set to gene set. This is expected behavior.
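To make the point above concrete, here is a minimal sketch using a one-sided hypergeometric tail as a stand-in for the enrichment test. It is not the GOATOOLS implementation, and the gene counts are hypothetical numbers chosen only for illustration. The study set loses a different fraction of unannotated genes than the population does, so the p-value for the same GO term changes even though the filter was applied to both sets.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical counts (assumptions for illustration, not taken from the issue).
pop_all = 1000          # population genes, annotated or not
pop_unannotated = 400   # population genes with no annotation term
study_all = 50          # study genes, annotated or not
study_unannotated = 10  # study genes with no annotation term
term_in_pop = 60        # population genes annotated with the GO term
term_in_study = 8       # study genes annotated with the GO term

# Unannotated genes never carry the term, so the term counts stay the same
# and only the set sizes shrink when the filter is applied.
p_full = hypergeom_upper_tail(term_in_study, study_all, term_in_pop, pop_all)
p_filtered = hypergeom_upper_tail(
    term_in_study,
    study_all - study_unannotated,
    term_in_pop,
    pop_all - pop_unannotated,
)

print(f"all genes kept:      p = {p_full:.4g}")
print(f"unannotated removed: p = {p_filtered:.4g}")
```

In this toy case 40% of the population is unannotated but only 20% of the study set is, so the expected number of term hits in the study set shifts after filtering and the two p-values are not equal.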
GOATOOLS keeps all study genes and population genes by default. However, researchers who wish to develop criteria for removing population genes can do so, because the GOATOOLS architecture separates managing the databases (GO ontology DAG and annotations) from running the statistical tests.
Please feel free to apply any filtering functions on the population genes, but also ensure the same filter is applied to the study gene sets.
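A minimal sketch of such a filter, in plain Python. The helper `keep_annotated`, the gene names, and the `gene2gos` mapping (gene ID to a set of GO IDs) are hypothetical placeholders; in practice the mapping would come from whatever annotation file the analysis loads. The key point is that the same function is applied to both the population and the study set before the enrichment test is run.

```python
def keep_annotated(genes, gene2gos):
    """Return only the genes that have at least one annotation term."""
    return {gene for gene in genes if gene2gos.get(gene)}


# Hypothetical toy data: geneC has an empty annotation set and
# geneD is missing from the annotation mapping entirely.
gene2gos = {
    "geneA": {"GO:0000001"},
    "geneB": {"GO:0000002", "GO:0000003"},
    "geneC": set(),
}
population = {"geneA", "geneB", "geneC", "geneD"}
study = {"geneA", "geneC"}

# Apply the identical filter to both sets.
population_filtered = keep_annotated(population, gene2gos)
study_filtered = keep_annotated(study, gene2gos)

print(sorted(population_filtered))
print(sorted(study_filtered))
```

The filtered sets can then be passed to the enrichment run in place of the original study and population gene lists.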