Skip to content

Eigen tools by Nick Patterson and Alkes Price lab

License

Notifications You must be signed in to change notification settings

300bps/Eigensoft-EIG-

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

61 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

## -- Forked Branch Modifications  --
Modification Version: M1.0.0

This is a fork of the original EIGENSOFT package to implement the modifications listed below. The motivation for these modifications is to produce text-based output that can be easily parsed to allow ease-of-access to the data for custom programs.
PROGRAM: convertf
This program converts between the following formats: ANCESTRYMAP, PACKEDANCESTRYMAP, EIGENSTRAT, PED, PACKEDPED. The modifications introduced only affect the ANCESTRYMAP output format.
    * Added build flag to prevent ANCESTRYMAP output from automatically switching to PACKEDANCESTRYMAP for large files.
        - This will force the output to be the text ANCESTRYMAP format, but may result in HUGE file sizes.
    * Added build flag to enable tab-separated ANCESTRYMAP geno file output instead of whitespace-separated data.
        - This is to reduce ANCESTRYMAP output file size and to simplify parsing by custom programs.
    * Added build flag to enable tab-separated ANCESTRYMAP snp file output instead of whitespace-separated data.
        - This is to reduce output file size and to simplify parsing by custom programs.
    * Added build flag to enable tab-separated ANCESTRYMAP ind file output instead of whitespace-separated data.
        - This is to reduce output file size and to simplify parsing by custom programs.
    

Build Notes:
    * See the build information in the pre-fork project's comments below for specifics.
    * Summary: 
        - Install gcc compiler
        - Install the following development librarys:
            - build-essential
            - libopenblas-dev
            - liblapacke-dev
            - libgsl-dev
        - To compile, switch to the 'src' directory and enter 'make'.
            - If there are errors mentioning missing lapacke symbols (see details below), use this command instead: make LDLIBS="-llapacke"
        - After a successful make, type 'make install'. The compiled programs will be placed in the '../bin' directory.


## -- ORIGINAL (PRE-FORK) REPOSITORY/CREATOR COMMENTS ARE BELOW THIS LINE -- ##

## shrinkmode added.  3/15
EIGENSOFT version 8.0.0, 03/30/21 (for Linux only)

The EIGENSOFT package implements methods from the following 2 papers:
Patterson et al. 2006 PLoS Genet 2:e190 (population structure)
Price et al. 2006 Nat Genet 38:904-9 (EIGENSTRAT stratification correction)

NEW features of EIGENSOFT version 7.2.0
-- shrinkmode

NEW features of EIGENSOFT version 6.1.4 include:
-- pcaselection was omitted from 6.1.3 by accident
-- Statically linked GSL/openblas
-- Fixed memory allocation bug in pcaselection
-- Some routines moved into nicklib
-- Error message on allocate failure now prints length as "%ld"
   supporting long values.

NEW features of EIGENSOFT version 6.1.3 include:
-- Restored script file extensions to .perl instead of .pl
-- Added updated ploteig script that disappeared from the repository

NEW features of EIGENSOFT version 6.1.2 include:
-- Updated license info to be GPL compliant required by linking the GSL

NEW features of EIGENSOFT version 6.1.1 include:
-- Minor bug fix to correctly merge version 6.0.2 and version 6.1 changes.
-- pcaselection operates on evec files. Added examples. 
-- Backported twtable.c/h from EIGENSOFT 7alpha

NEW features of EIGENSOFT version 6.1 include:
-- The range finding step of PCA fastmode only scales the multiplied matrix,
   as orthogonalization is unnecessary. This appears to improve accuracy. 

NEW features of EIGENSOFT version 6.0.2 include:
-- Fixed Makefile and documentation to build eigenstrat properly
-- Moved Tracy-Widom table into a header file for easier building

NEW features of EIGENSOFT version 6.0.1 include:
-- Minor bug fix which prevents smartpca from trying to print out eigenvalues
  if fastmode is set.

NEW features of EIGENSOFT version 6.0.0beta included:
-- New option fastmode which implements a very fast pca approximation.
   See POPGEN/README and Galinsky 2014 ASHG talk.
-- Changes to external packages required.  EIGENSOFT version 5.0.2 required
   lapack + blas.  On the other hand, EIGENSOFT version 6.0beta requires 
   GSL + lapack + OpenBLAS (but does not require the native version of blas).
   The Makefile has been changed accordingly.
-- EIGENSOFT version 6.0beta supports multi-threading.  See POPGEN/README.
-- Bug fix for ldregress option.

See CONVERTF/README for documentation of programs for converting file formats.
See POPGEN/README for documentation of population structure programs.
See EIGENSTRAT/README for documentation of EIGENSTRAT programs.

Questions?
See https://www.hsph.harvard.edu/alkes-price/eigensoft-frequently-asked-questions/
https://github.com/DReichLab/EIG

For questions about building this software: 
Matthew Mah <matthew_mah@hms.harvard.edu>

For questions about smartpca:
Nick Patterson <nickp@broadinstitute.org>

For questions about eigenstrat:
Alkes Price <aprice@hsph.harvard.edu>

Executables and source code:
----------------------------
All C executables are in the bin/ directory.

We have placed source code for all C executables in the src/ directory, 
for users who wish to modify and recompile our programs.
"cd src"
"make"      
"make install"

Note that some of our software will only compile if your system has the
GSL + lapack + OpenBLAS packages installed.

On Mac OSX, you can install gsl and OpenBLAS with lapack using homebrew:
"brew install gsl"
"brew install openblas"

If these packages are not in standard directories, you can specify the 
appropriate include and library directories with the CFLAGS and LDFLAGS 
make variables. 
For example, on the Harvard Medical School O2 cluster, the command is:
'make CFLAGS="-I/n/app/openblas/0.2.19/include -I/n/app/gsl/2.3/include" LDFLAGS="-L/n/app/openblas/0.2.19/lib -L/n/app/gsl/2.3/lib/"'
On Mac OSX:
'make CFLAGS="-I/usr/local/opt/openblas/include -I/usr/local/opt/gsl/include" LDFLAGS="-L/usr/local/opt/openblas/lib -L/usr/local/opt/gsl/lib"'

If you have issues with missing lapacke symbols, for example "undefined reference to `LAPACKE_dlange'", run make with the corresponding additional libraries linked:
'make LDLIBS="-llapacke"'
This has been encountered on Linux Mint 18. 

If you have trouble compiling and running our code, try compiling and
running the pcatoy program in the src directory:
"cd src"
"make pcatoy"
"./pcatoy"
If you are unable to run the pcatoy program successfully, please contact
your system administrator for help, as this is a systems issue which is
beyond our scope.  Your system administrator will be able to troubleshoot
your systems issue using this trivial program.  [You can also try running
the pcatoy program in the bin directory, which we have already compiled.]

To remake the entire package:
"cd src"
"make clobber"
"make install"

To remake EIG7.2 it is necessary to link to the OpenBLAS library. On orchestra, 
the path is /opt/openblas and should work automatically. On Broad institute machines,
the user should execute "use .openblas-0.2.8" and "use GCC-4.9" at the command 
prompt before attempting to remake. All other users should install OpenBLAS and 
set the variable OPENBLAS to the path at the make command line, 
e.g. "make install OPENBLAS=/usr/local/openblas"

----------------------------
Acknowledgements:
EIGENSOFT was written by Nick Patterson, Alkes Price, Samuela Pollack,
Kevin Galinsky, Chris Chang, and Sasha Gusev.

We thank John Novembre and Mike Boursnell for code improvements, Matt Hanna 
for the first implementation of multi-threading, and Angela Yu for a bugfix. 

----------------------------
SOFTWARE COPYRIGHT NOTICE AGREEMENT
This software and its documentation are copyright (2010) by Harvard University 
and The Broad Institute. All rights are reserved. This software is supplied 
without any warranty or guaranteed support whatsoever. Neither Harvard 
University nor The Broad Institute can be responsible for its use, misuse, or 
functionality. The software may be freely copied for non-commercial purposes, 
provided this copyright notice is retained.

About

Eigen tools by Nick Patterson and Alkes Price lab

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C 97.4%
  • Perl 2.4%
  • Makefile 0.2%