HWE output issues: need to accommodate longer colon-delimited allele names #19
Comments
This is actually a more pressing issue than Issue #18, as the first allele is left-truncated, so that the allele-family and protein fields are being truncated.
Also, allele-lengths appear to be properly accommodated by the Chen/Diff test results:
Although these Chen/diff results would be easier to read if they were structured more like the other HW results:
However, this latter would be a "nice to have".
Got it, so we'll make this dynamic too. How often is the plain text output used these days? The TSV/CSV output obviously doesn't have this problem since there is no attempt to format things. Everything is always generated from the complete "output of record" (which is the XML), i.e. both the TSV/CSV files and the plain text are derived from the XML, and so can be regenerated at any time without having to redo the analysis. In case you want to try this, you just need the
…use this as the column width (#19)
Common Genotypes look good, but as I commented in Issue #18, the common genotypes are right-aligned, which looks weird. I can't really say how frequently PyPop users are relying on the .txt output as opposed to the .xml output. Personally, I always look at the .txt output first, because I want to know if I need to rerun the data because I set something wrong, or need to change a setting. I use popmeta to generate .dat files from the .xml output, but then I have to look through each .dat file to see what's what. I do know that I get questions from the community asking how to get the d'ij values out of PyPop; I direct the questioners to the .xml and popmeta, so that suggests to me that many others are relying primarily on the .txt outputs.
Great, if this is fixed from your POV, please close. I opened up a new issue on the Chen/diff stat alignment issue in #20. There is also the
Okay. I'll close. I hadn't been using --generate-tsv, but it is good to be reminded of it.
… the labels even if the allele names are short (#19).
Hi @sjmack, in testing the new code with dynamic columns based on the allele lengths, I noticed that if the allele names were short it could sometimes cut off the labels (using the USAFEL example). Try running the USAFEL example before you update to see what I mean. (I'm going to move this to the test suite now so we catch this.) In any case, I just committed a fix to master that should address these; try running again. You might want to look at the output to make sure it looks OK. I had to hardcode the dashed row separator as being 90 characters. It could be made dynamic, but it doesn't seem to me to be a big priority.
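For readers following along, the fix described above (a column sized from the allele names, but never narrower than its label) can be sketched roughly as follows. This is only an illustration, not PyPop's actual code: `column_width`, `format_table`, the `padding` parameter, the "Observed" column, and the row layout are all hypothetical names and choices; only the 90-character dashed separator comes from the comment above.

```python
SEPARATOR_WIDTH = 90  # hard-coded dashed separator length mentioned in the thread


def column_width(label, genotypes, padding=2):
    """Return a width that fits both the label and the longest genotype.

    Hypothetical helper: taking the max with len(label) is what prevents
    short allele names from cutting off the column label.
    """
    longest = max((len(g) for g in genotypes), default=0)
    return max(len(label), longest) + padding


def format_table(label, rows):
    """Format (genotype, observed_count) rows; the layout is an assumption."""
    width = column_width(label, [genotype for genotype, _ in rows])
    lines = [f"{label:<{width}}{'Observed':>10}", "-" * SEPARATOR_WIDTH]
    for genotype, observed in rows:
        # Left-align the genotype so long and short names line up on the left.
        lines.append(f"{genotype:<{width}}{observed:>10d}")
    return "\n".join(lines)


print(format_table("Genotype", [("104:01:01:01+06:127:01:01", 12), ("1+2", 3)]))
```

With only short names such as `1+2`, the width falls back to the label length plus padding, so the header is never truncated.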
Okay. I see.
Generated results like this before a pull:
and generated these results with the current master branch:
I think that those fixes generate acceptable presentations. Nice catch!
Great.
:pypop sjmack$ ./bin/pypop.py -c WS_BDCtrl_Test_HW.ini BIGDAWG_SynthControl_Data.pop
is generating the following output for the Hardy Weinberg common genotypes test:
As with Issue #18, the longer colon-delimited allele names exceed the hard-coded 18 characters allowed for a +-delimited allele pair.
With digit-delimited allele names, the maximum length for a pair would have been 20 characters (but in retrospect, I think that PyPop is stripping expression variant suffixes like N and L, as those are optional), so the maximum length would have been 17 characters (01010101+01010101).
As with Issue #18, the size of the Common genotypes allele pair field needs to be increased to at least 25 characters (e.g., 104:01:01:01+06:127:01:01), and may possibly need to be made flexible to accommodate longer allele names.
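The lengths quoted above can be checked with a few lines of Python. The sketch below also shows one way a fixed 18-character field could drop the start of the first allele; `left_truncate` is a hypothetical helper used purely for illustration and is an assumption about the mechanism, not a description of how PyPop formats this field.

```python
HARD_CODED_WIDTH = 18  # field width cited in the issue

old_pair = "01010101+01010101"          # digit-delimited example from the issue
new_pair = "104:01:01:01+06:127:01:01"  # colon-delimited example from the issue


def left_truncate(text, width):
    """Keep only the last `width` characters (illustrative assumption)."""
    return text[-width:] if len(text) > width else text


print(len(old_pair))  # 17, fits within 18 characters
print(len(new_pair))  # 25, does not fit within 18 characters
print(left_truncate(new_pair, HARD_CODED_WIDTH))  # 01:01+06:127:01:01
```

Under this assumption the leading `104:01:` is lost, which matches the report that the allele-family and protein fields of the first allele go missing.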