Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 1 changed file with 2 additions and 13 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,8 @@ | ||
# Gene Expression Experiment Quality (Geeq) | ||
|
||
Geeq is a method of measuring the quality and suitability of a dataset. | ||
Geeq is Gemma's method of giving users an indicator of dataset quality. Our definition of "quality" refers to data quality, wherein the same study could have been done twice with the same technical parameters and in one case yield bad quality data, and in another high quality data. | ||
|
||
**Quality** refers to data quality, wherein the same study could have been done twice with the same technical parameters and in one case yield bad quality data, and in another high quality data. | ||
|
||
**Suitability** mostly refers to technical aspects which, if we were doing the study ourselves, we would have altered to make it optimal for analyses of the sort used in Gemma. | ||
|
||
## Mechanism | ||
|
||
The suitability and quality scores are calculated based on several factors. Different factors contribute to suitability and to quality. Each factor is evaluated separately, but some depend on each other (e.g. batch effect cannot be evaluated if there is no batch information, and will have a default value). The final score is an arithmetic average of all the factors. | ||
|
||
The scores of **datasets in curation** can change significantly as the curators fill in some missing pieces or improve
some of the measured factors. We make the score public for these datasets, but it should be taken into account that the score is not final and not fully representative of the dataset. | ||
|
||
### Visual representation | ||
The quality scores are calculated based on several factors deemed reflective of dataset quality, such as the presence of batch confounds. | ||
|
||
On the Gemma website, we use colored emoticons to give an intuitive visual representation of both the quality and suitability scores. In most places, hovering over the emoticon will reveal the numerical value of the score. | ||
|
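
The Mechanism text in this diff describes the final score as an arithmetic average of the individual factors, with a default value used when a factor cannot be evaluated (for example, batch effect without batch information). The following is a minimal Python sketch of that idea only, not Gemma's implementation. The function name `geeq_style_score`, the factor names other than batch effect, the score values, and the default of `0.0` are all assumptions made for illustration.

```python
from statistics import mean
from typing import Mapping, Optional

# Assumed default for a factor that cannot be evaluated.
# The actual default used by Gemma is not stated in the text.
DEFAULT_FACTOR_SCORE = 0.0


def geeq_style_score(
    factors: Mapping[str, Optional[float]],
    default: float = DEFAULT_FACTOR_SCORE,
) -> float:
    """Return the arithmetic mean of the factor scores.

    A value of None marks a factor that could not be evaluated
    (e.g. batch effect when no batch information exists); it is
    replaced by `default` before averaging.
    """
    if not factors:
        raise ValueError("at least one factor is required")
    resolved = [default if value is None else value for value in factors.values()]
    return mean(resolved)


# Hypothetical factor names and values, for illustration only.
example_factors = {
    "batch_effect": None,  # no batch information, so the default applies
    "replicates": 1.0,
    "outliers": 0.5,
}
score = geeq_style_score(example_factors)
```

Because every factor carries equal weight in a plain average, a factor that falls back to its default can shift the result noticeably, which is consistent with the note that scores of datasets in curation may change as missing information is filled in.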