IBS clustering constraint: missing genotype data (--ibm) #12

Open
davidonlaptop opened this issue Apr 21, 2015 · 0 comments

Description

This feature adds the --ibm constraint to the --cluster option described in issue #7. For more information, see: http://pngu.mgh.harvard.edu/~purcell/plink/strat.shtml#options

The input file is the model created in #3.

Analysis

Add a comment to this issue with:

  • plink version used as reference (1.07 or 1.90 beta 3)
  • relevant C++ function name(s) and the file name(s) where they appear
  • the structure of the .hh file generated by plink (not always present)
  • the algorithm and/or mathematical formula used to compute:
    • --ibm option
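
As a starting point for the analysis, one plausible formulation of identity-by-missingness (IBM) is sketched below. It is an assumption to be verified against the plink source and documentation, not a confirmed description of what plink computes. The symbols $L$ (number of SNPs), $m_{il}$ (1 if the genotype of individual $i$ at SNP $l$ is missing, 0 otherwise) and $t$ (the value given to --ibm) are notation introduced here.

$$
\mathrm{IBM}(i,j) \;=\; 1 - \frac{1}{L}\sum_{l=1}^{L} \mathbf{1}\left[\, m_{il} \neq m_{jl} \,\right]
$$

Under this reading, two individuals $i$ and $j$ may be merged into the same cluster only if $\mathrm{IBM}(i,j) \geq t$. For example, with $L = 4$ and missingness patterns that differ at 3 SNPs, $\mathrm{IBM} = 1 - 3/4 = 0.25$.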

Design

Add a comment to this issue describing how this will be implemented in Spark, and how it differs from plink.

Implementation

The implementation should use:

  • Scala
  • Spark RDD
  • Spark MLlib / GraphX (if appropriate)
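
To make the discussion concrete, here is a minimal sketch in plain Scala collections of how a missingness-based merge constraint could be computed. It assumes that IBM is one minus the fraction of SNPs at which exactly one of the two samples is missing, which still has to be checked against plink. The names `IbmConstraint`, `Sample`, `ibm` and `forbiddenPairs` are hypothetical and do not come from plink or from this project. In Spark, the pairwise step would presumably run over an RDD of samples rather than a local `Seq`.

```scala
// Hypothetical sketch: names and the IBM definition are assumptions, not plink's code.
object IbmConstraint {
  // One sample: an ID plus a per-SNP missingness mask (true = genotype missing).
  final case class Sample(id: String, missing: Vector[Boolean])

  // Assumed definition: 1 - (fraction of SNPs where exactly one sample is missing).
  def ibm(a: Sample, b: Sample): Double = {
    require(a.missing.length == b.missing.length, "samples must cover the same SNPs")
    require(a.missing.nonEmpty, "at least one SNP is required")
    val discordant = a.missing.zip(b.missing).count { case (x, y) => x != y }
    1.0 - discordant.toDouble / a.missing.length
  }

  // Unordered pairs whose IBM falls below the threshold and so must not be merged.
  def forbiddenPairs(samples: Seq[Sample], threshold: Double): Set[(String, String)] =
    (for {
      (a, i) <- samples.zipWithIndex
      (b, j) <- samples.zipWithIndex
      if i < j && ibm(a, b) < threshold
    } yield (a.id, b.id)).toSet
}
```

For instance, with samples s1 and s2 both missing only the third of four SNPs, and s3 missing only the first two, `ibm(s1, s2)` is 1.0 and `ibm(s1, s3)` is 0.25, so a threshold of 0.5 forbids merging s3 with either of the others. The clustering step from issue #7 would then skip any merge that joins a forbidden pair.
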
davidonlaptop modified the milestone: 0.5 (Apr 23, 2015)
naceurMh changed the title from "IBS clustering constraint: missing genotype data (--ibm)" to "missinIBS clustering constraint: g genotype data (--ibm)" (Jun 23, 2015)