updated build process to pyproject.toml (#103)
* v1.6.6 updated README

* Update main.yml

* Added pyproject.toml

* Updated pyproject.toml and setup.py
kcleal authored Jul 16, 2024
1 parent c2ed41e commit 9e3479f
Showing 4 changed files with 151 additions and 158 deletions.
10 changes: 6 additions & 4 deletions .github/workflows/main.yml
@@ -15,6 +15,8 @@ jobs:

steps:
- uses: actions/checkout@v4
- name: Install project dependencies
run: which python; python -m pip install -r requirements.txt
- name: Build wheels
uses: pypa/cibuildwheel@v2.19.2
env:
@@ -26,15 +28,15 @@ jobs:
CIBW_BEFORE_ALL_LINUX: bash ci/manylinux-deps
CIBW_BEFORE_BUILD_MACOS: |
ln -s /Library/Frameworks/Python.framework/Versions/3.11/include/python3.11/cpython/longintrepr.h /Library/Frameworks/Python.framework/Versions/3.11/include/python3.11
pip install -r requirements.txt
which python3
python3 -m pip install -r requirements.txt
CIBW_BEFORE_BUILD_LINUX: pip install -r requirements.txt
CIBW_REPAIR_WHEEL_COMMAND_MACOS: delocate-wheel --require-archs x86_64 -w {dest_dir} -v {wheel}
CIBW_REPAIR_WHEEL_COMMAND_MACOS: delocate-wheel --require-archs x86_64 -w {dest_dir} -v {wheel} --require-target-macos-version 13.0
CIBW_TEST_SKIP: "*-macosx_arm64"
CIBW_TEST_REQUIRES: cython click>=8.0 numpy scipy pandas pysam>=0.22.0 networkx>=2.4 scikit-learn>=0.22 sortedcontainers lightgbm
CIBW_TEST_COMMAND: dysgu test --verbose



- uses: actions/upload-artifact@v4
with:
name: wheelhouse-${{ matrix.os }}-${{ github.run_id }}
path: ./wheelhouse/*.whl
26 changes: 25 additions & 1 deletion README.rst
@@ -153,6 +153,31 @@ uses < 6 GB memory. Also note that when `fetch` is utilized (or using run comman

🚦Filtering SVs
----------------
The filtering command is quite flexible and can be used to filter a single sample,
or filter against a normal sample, or a panel of normals/cohort.
If the filter command is used with a single input vcf (no normals or cohort), filtering will remove lower quality events::

dysgu filter input.vcf > output.vcf

Filtering is recommended after any merging has been performed, or if you are analysing only a single sample.

If a normal vcf is supplied, then input calls will be removed if they overlap with events in the normal vcf::

dysgu filter --normal-vcf normal.vcf input.vcf > output.vcf

Additionally, you can provide bam files to filter against. This will make the filtering much more stringent as each
alignment file you provide will be checked for reads that match your input calls. If supporting reads are found then
the input call will be removed. Note, this also makes filtering much slower. For large cohorts a random sample of
bams can be used for filtering using the `--random-bam-sample Int` option::

dysgu filter input.vcf normal.bam > output.vcf # normal bam only
dysgu filter --normal-vcf normal.vcf input.vcf normal.bam > output.vcf

Dysgu will understand the sample-name in vcf and bam files, so if you use "*.bam" syntax, then the input sample will
not be used for filtering.
Other filtering options are detailed below.

Remove events with low probability::

dysgu filter --min-prob 0.2 input.vcf > output.vcf
@@ -167,7 +192,6 @@ Re-label events with probability >= 0.3 as PASS::

Use normal bams to filter common/germline structural variants::

dysgu filter input.vcf normal.bam > output.vcf
dysgu filter input.vcf normals/*.bam > output.vcf
dysgu filter input.vcf list_of_normals.txt > output.vcf
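
For scripted pipelines, the command lines shown in this README section could be assembled programmatically. The following is a minimal sketch and not part of dysgu: `build_filter_cmd` is a hypothetical helper, and only the `dysgu filter` subcommand and the `--normal-vcf` and `--min-prob` options are taken from the README text in this diff. Whether every combination of options is accepted together is an assumption that should be checked against `dysgu filter --help`.

```python
from typing import List, Optional, Sequence


def build_filter_cmd(
    input_vcf: str,
    normal_vcf: Optional[str] = None,
    normal_bams: Sequence[str] = (),
    min_prob: Optional[float] = None,
) -> List[str]:
    """Assemble an argument list for `dysgu filter` (hypothetical helper).

    Options come first, then the input vcf, then any normal bam files,
    mirroring the ordering used in the README examples.
    """
    cmd = ["dysgu", "filter"]
    if normal_vcf is not None:
        cmd += ["--normal-vcf", normal_vcf]
    if min_prob is not None:
        cmd += ["--min-prob", str(min_prob)]
    cmd.append(input_vcf)
    cmd.extend(normal_bams)
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # without dysgu installed.
    print(" ".join(build_filter_cmd("input.vcf",
                                    normal_vcf="normal.vcf",
                                    normal_bams=["normal.bam"])))
```

The returned list can be handed to `subprocess.run` with `stdout` pointed at an open output file, since the README examples redirect the filtered calls from standard output.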

51 changes: 51 additions & 0 deletions pyproject.toml
@@ -0,0 +1,51 @@
[build-system]
requires = [
"setuptools >= 61.0",
"wheel",
"cython",
"pysam",
"numpy < 2",
]
build-backend = "setuptools.build_meta"

[project]
name = "dysgu"
version = "1.6.6"
description = "Structural variant calling"
authors = [
{ name = "Kez Cleal", email = "clealk@cardiff.ac.uk" }
]
license = { text = "MIT" }
requires-python = ">=3.10"
dependencies = [
"setuptools >= 61.0",
"cython",
"click >= 8.0",
"numpy < 2",
"scipy",
"pandas",
"pysam >= 0.22",
"networkx >= 2.4",
"scikit-learn >= 0.22",
"sortedcontainers",
"lightgbm"
]

[project.urls]
Homepage = "https://github.com/kcleal/dysgu"

[project.optional-dependencies]
test = [
"pytest",
"pytest-cov",
"pytest-mock",
]

[project.scripts]
dysgu = "dysgu.main:cli"
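
The `[project.scripts]` entry above replaces the `entry_points` string previously passed to `setup()`. It maps the `dysgu` command to the object named by `dysgu.main:cli`. The sketch below illustrates roughly how a `module:attribute` string of this kind can be resolved. `load_entry_point` is a hypothetical name used for illustration, and this is a simplification, not the code that pip or setuptools actually runs.

```python
import importlib
from typing import Any


def load_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:attr' string to the object it names.

    Hypothetical helper: imports the module part, then walks any dotted
    attribute path after the colon.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):
        if part:  # an empty attribute path returns the module itself
            obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # Uses a stdlib target so the sketch runs without dysgu installed;
    # for this project the spec would be "dysgu.main:cli".
    dumps = load_entry_point("json:dumps")
    print(dumps({"tool": "dysgu"}))
```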

[tool.setuptools]
packages = ["dysgu", "dysgu.tests", "dysgu.scikitbio", "dysgu.edlib", "dysgu.sortedintersect"]

[tool.setuptools.package-data]
"dysgu" = ["*.pxd", "*.pyx"]
222 changes: 69 additions & 153 deletions setup.py
@@ -1,26 +1,19 @@
from setuptools import setup, find_packages
from setuptools import setup, Extension
import setuptools
from setuptools.extension import Extension
from Cython.Build import cythonize
import numpy
from distutils import ccompiler
import os
import pysam
import sys
import glob
from sys import argv
import pysam
import sysconfig


cfg_vars = sysconfig.get_config_vars()
for key, value in cfg_vars.items():
if type(value) == str:
if isinstance(value, str):
cfg_vars[key] = value.replace("-Wstrict-prototypes", "")


def has_flag(compiler, flagname):
"""Return a boolean indicating whether a flag name is supported on
the specified compiler.
"""
import tempfile
with tempfile.NamedTemporaryFile('w', suffix='.cpp') as f:
f.write('int main (int argc, char **argv) { return 0; }')
@@ -30,17 +23,13 @@ def has_flag(compiler, flagname):
return False
return True


def cpp_flag(compiler, flags):
"""Return the -std=c++[11/14/17,20] compiler flag.
The newer version is prefered over c++11 (when it is available).
"""
for flag in flags:
if has_flag(compiler, flag):
return flag


def get_extra_args():
from distutils import ccompiler
compiler = ccompiler.new_compiler()
extra_compile_args = []
flags = ['-std=c++17', '-std=c++14', '-std=c++11']
@@ -52,159 +41,86 @@ def get_extra_args():
f = cpp_flag(compiler, flags)
if f:
extra_compile_args.append(f)

return extra_compile_args


extras = get_extra_args() + ["-Wno-sign-compare", "-Wno-unused-function",
"-Wno-unused-result", '-Wno-ignored-qualifiers',
"-Wno-deprecated-declarations", "-fpermissive",
"-Wno-unreachable-code-fallthrough",
]

ext_modules = list()

root = os.path.abspath(os.path.dirname(__file__))

if "--conda-prefix" in argv or os.getenv('PREFIX'):
prefix = None
if "--conda-prefix" in argv:
idx = argv.index("--conda-prefix")
h = argv[idx + 1]
argv.remove("--conda-prefix")
argv.remove(h)
else:
h = os.getenv('PREFIX')

if h and os.path.exists(h):
if any("libhts" in i for i in glob.glob(h + "/lib/*")):
print("Using htslib at {}".format(h))
prefix = h
if prefix[-1] == "/":
htslib = prefix[:-1]
def get_extension_modules():
ext_modules = []

root = os.path.abspath(os.path.dirname(__file__))
libraries, library_dirs, include_dirs, runtime_dirs = [], [], [], []

if "--conda-prefix" in sys.argv or os.getenv('PREFIX'):
prefix = os.getenv('PREFIX') if not "--conda-prefix" in sys.argv else sys.argv[sys.argv.index("--conda-prefix") + 1]
if prefix and os.path.exists(prefix):
if any("libhts" in i for i in glob.glob(prefix + "/lib/*")):
print(f"Using htslib at {prefix}")
if prefix[-1] == "/":
prefix = prefix[:-1]
else:
raise ValueError(f"libhts not found at {prefix}/lib/*")
else:
raise ValueError("libhts not found at ", h + "/lib/*")
raise ValueError("prefix path does not exists")
libraries = ["hts"]
library_dirs = [f"{prefix}/lib", numpy.get_include()] + pysam.get_include()
include_dirs = [numpy.get_include(), root,
f"{prefix}/include/htslib", f"{prefix}/include"] + pysam.get_include()
runtime_dirs = [f"{prefix}/lib"]
else:
raise ValueError("prefix path does not exists")

libraries = ["hts"]
library_dirs = [f"{prefix}/lib", numpy.get_include()] + pysam.get_include()
include_dirs = [numpy.get_include(), root,
f"{prefix}/include/htslib", f"{prefix}/include"] + pysam.get_include()
runtime_dirs = [f"{prefix}/lib"]

else:
# Try and link dynamically to htslib
htslib = None
if "--htslib" in argv:
idx = argv.index("--htslib")
h = argv[idx + 1]
if h and os.path.exists(h):
if any("libhts" in i for i in glob.glob(h + "/*")):
print("Using --htslib at {}".format(h))
htslib = h
htslib = os.getenv('HTSLIB_DIR') if not "--htslib" in sys.argv else sys.argv[sys.argv.index("--htslib") + 1]
if htslib and os.path.exists(htslib):
if any("libhts" in i for i in glob.glob(htslib + "/*")):
print(f"Using --htslib at {htslib}")
if htslib[-1] == "/":
htslib = htslib[:-1]
argv.remove("--htslib")
argv.remove(h)
else:
raise ValueError("--htslib path does not exists")
else:
raise ValueError("--htslib path does not exists")

if htslib is None:
print("Using packaged htslib")
htslib = os.path.join(root, "dysgu/htslib")

libraries = [f"{htslib}/hts"]
library_dirs = [htslib, numpy.get_include(), f"{htslib}/htslib"] + pysam.get_include()
include_dirs = [numpy.get_include(), root,
f"{htslib}/htslib", f"{htslib}/cram"] + pysam.get_include()
runtime_dirs = [htslib]


print("Libs", libraries)
print("Library dirs", library_dirs)
print("Include dirs", include_dirs)
print("Runtime dirs", runtime_dirs)
print("Extras compiler args", extras)

# Scikit-bio module
ssw_extra_compile_args = ["-Wno-deprecated-declarations", '-std=c99', '-I.']


ext_modules.append(Extension(f"dysgu.scikitbio._ssw_wrapper",
[f"dysgu/scikitbio/_ssw_wrapper.pyx", f"dysgu/scikitbio/ssw.c"],
include_dirs=[f"{root}/dysgu/scikitbio", numpy.get_include()],
extra_compile_args=ssw_extra_compile_args,
language="c"))

ext_modules.append(Extension(f"dysgu.edlib.edlib",
[f"dysgu/edlib/edlib.pyx", f"dysgu/edlib/src/edlib.cpp"],
include_dirs=[f"{root}/dysgu/edlib", numpy.get_include()],
extra_compile_args=["-O3", "-std=c++11"],
language="c++"))

ext_modules.append(Extension(f"dysgu.sortedintersect.sintersect",
[f"dysgu/sortedintersect/sintersect.pyx"],
extra_compile_args=["-O3", "-std=c++11"],
language="c++"))

# Dysgu modules
for item in ["sv2bam", "io_funcs", "graph", "coverage", "assembler", "call_component",
"map_set_utils", "cluster", "sv_category", "extra_metrics"]:
print("Using packaged htslib")
htslib = os.path.join(root, "dysgu/htslib")
libraries = ["hts"]
library_dirs = [htslib, numpy.get_include(), f"{htslib}/htslib"] + pysam.get_include()
include_dirs = [numpy.get_include(), root,
f"{htslib}/htslib", f"{htslib}/cram"] + pysam.get_include()
runtime_dirs = [htslib]

ext_modules.append(Extension("dysgu.scikitbio._ssw_wrapper",
["dysgu/scikitbio/_ssw_wrapper.pyx", "dysgu/scikitbio/ssw.c"],
include_dirs=["dysgu/scikitbio", numpy.get_include()],
extra_compile_args=["-Wno-deprecated-declarations", '-std=c99', '-I.'],
language="c"))

ext_modules.append(Extension("dysgu.edlib.edlib",
["dysgu/edlib/edlib.pyx", "dysgu/edlib/src/edlib.cpp"],
include_dirs=["dysgu/edlib", numpy.get_include()],
extra_compile_args=["-O3", "-std=c++11"],
language="c++"))

ext_modules.append(Extension(f"dysgu.{item}",
[f"dysgu/{item}.pyx"],
libraries=libraries,
library_dirs=library_dirs,
include_dirs=include_dirs,
runtime_library_dirs=runtime_dirs,
extra_compile_args=extras,
define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
ext_modules.append(Extension("dysgu.sortedintersect.sintersect",
["dysgu/sortedintersect/sintersect.pyx"],
extra_compile_args=["-O3", "-std=c++11"],
language="c++"))

for item in ["sv2bam", "io_funcs", "graph", "coverage", "assembler", "call_component",
"map_set_utils", "cluster", "sv_category", "extra_metrics"]:
ext_modules.append(Extension(f"dysgu.{item}",
[f"dysgu/{item}.pyx"],
libraries=libraries,
library_dirs=library_dirs,
include_dirs=include_dirs,
runtime_library_dirs=runtime_dirs,
extra_compile_args=extras,
define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
language="c++"))

return cythonize(ext_modules)

print("Found packages", find_packages(where="."))
setup(
name="dysgu",
author="Kez Cleal",
author_email="clealk@cardiff.ac.uk",
url="https://github.com/kcleal/dysgu",
description="Structural variant calling",
license="MIT",
version='1.6.6',
python_requires='>=3.10',
install_requires=[ # runtime requires
'setuptools>=63.0',
'cython',
'click>=8.0',
'numpy>=1.18',
'scipy',
'pandas',
'pysam>=0.22',
'networkx>=2.4',
'scikit-learn>=0.22',
'sortedcontainers',
'lightgbm',
],
setup_requires=[
'setuptools>=63.0',
'cython',
'click>=8.0',
'numpy>=1.18',
'scipy',
'pandas',
'pysam>=0.22',
'networkx>=2.4',
'scikit-learn>=0.22',
'sortedcontainers',
'lightgbm',
],
packages=["dysgu", "dysgu.tests", "dysgu.scikitbio", "dysgu.edlib", "dysgu.sortedintersect"],
ext_modules=cythonize(ext_modules),
include_package_data=True,
zip_safe=False,
entry_points='''
[console_scripts]
dysgu=dysgu.main:cli
''',
)
ext_modules=get_extension_modules(),
)
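
The rewritten `get_extension_modules()` chooses the htslib location from either a command-line flag (`--conda-prefix`, `--htslib`) or an environment variable (`PREFIX`, `HTSLIB_DIR`), with the flag taking precedence. The lookup can be sketched in isolation as below. `resolve_option` is a hypothetical helper, not a function in this commit, and the explicit check for a missing value after the flag is an addition for illustration.

```python
import os
import sys
from typing import Mapping, Optional, Sequence


def resolve_option(flag: str, env_var: str,
                   argv: Sequence[str],
                   env: Mapping[str, str]) -> Optional[str]:
    """Return the value following `flag` in argv, else env[env_var], else None.

    Hypothetical helper mirroring the flag-over-environment precedence
    used in setup.py.
    """
    if flag in argv:
        idx = argv.index(flag)
        if idx + 1 >= len(argv):
            # Illustrative guard: the flag was given without a path.
            raise ValueError(f"{flag} requires a path argument")
        return argv[idx + 1]
    return env.get(env_var) or None


if __name__ == "__main__":
    htslib = resolve_option("--htslib", "HTSLIB_DIR", sys.argv, os.environ)
    print("htslib location:", htslib if htslib else "packaged htslib")
```

Passing `argv` and `env` as parameters, instead of reading `sys.argv` and `os.environ` inside the function, keeps the lookup easy to exercise in a test.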
