Update H. flu database to new sklearn version #276

Closed
conmeehan opened this issue Jul 18, 2023 · 3 comments
@conmeehan

Versions
poppunk 2.6.0
zsh: command not found: poppunk_sketch
poppunk_assign 2.6.0

Conda list:

# packages in environment at /Users/cmeehan/opt/miniconda3/envs/poppunk:
#
# Name Version Build Channel

aom 3.5.0 hf0c8a7f_0 conda-forge
atk-1.0 2.38.0 h1d18e73_1 conda-forge
biopython 1.81 py310h90acd4f_0 conda-forge
boost 1.78.0 py310h3e792ce_4 conda-forge
boost-cpp 1.78.0 hf5ba120_3 conda-forge
brotli 1.0.9 hb7f2c08_9 conda-forge
brotli-bin 1.0.9 hb7f2c08_9 conda-forge
brotli-python 1.0.9 py310h7a76584_9 conda-forge
bzip2 1.0.8 h0d85af4_4 conda-forge
c-ares 1.19.1 h0dc2134_0 conda-forge
ca-certificates 2023.5.7 h8857fd0_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cairo 1.16.0 h09dd18c_1016 conda-forge
cairomm-1.0 1.14.4 h5b44118_1 conda-forge
certifi 2023.5.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py310ha78151a_3 conda-forge
charset-normalizer 3.2.0 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
contourpy 1.1.0 py310h88cfcbd_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cython 3.0.0 py310h9e9d8ca_0 conda-forge
dav1d 1.2.1 h0dc2134_0 conda-forge
dendropy 4.6.1 pyhdfd78af_0 bioconda
docopt 0.6.2 py_1 conda-forge
epoxy 1.5.10 h5eb16cf_1 conda-forge
expat 2.5.0 hf0c8a7f_1 conda-forge
ffmpeg 6.0.0 gpl_h74aebd8_103 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.2 h5bb23bf_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.41.0 py310h6729b98_0 conda-forge
freetype 2.12.1 h3f81eb7_1 conda-forge
fribidi 1.0.10 hbcb3906_0 conda-forge
gdk-pixbuf 2.42.10 hff535ac_2 conda-forge
gettext 0.21.1 h8a4c099_0 conda-forge
gfortran_impl_osx-64 12.2.0 h158f68b_31 conda-forge
glib-tools 2.76.4 h7d26f99_0 conda-forge
gmp 6.2.1 h2e338ed_0 conda-forge
gnutls 3.7.8 h207c4f0_0 conda-forge
graph-tool 2.48 py310h6327fc9_0 conda-forge
graph-tool-base 2.48 py310h65f7fc8_0 conda-forge
graphite2 1.3.13 h2e338ed_1001 conda-forge
gtk3 3.24.38 h5a9695a_0 conda-forge
h5py 3.8.0 nompi_py310h5555e59_100 conda-forge
harfbuzz 7.3.0 h413ba03_0 conda-forge
hdbscan 0.8.29 py310h936d966_2 conda-forge
hdf5 1.12.2 nompi_h48135f9_101 conda-forge
hicolor-icon-theme 0.17 h694c41f_2 conda-forge
icu 72.1 h7336db1_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
isl 0.25 hb486fe8_0 conda-forge
joblib 1.3.0 pyhd8ed1ab_1 conda-forge
kiwisolver 1.4.4 py310ha23aa8a_1 conda-forge
krb5 1.21.1 hb884880_0 conda-forge
lame 3.100 hb7f2c08_1003 conda-forge
lcms2 2.15 h2dcdeff_1 conda-forge
lerc 4.0.0 hb486fe8_0 conda-forge
libaec 1.0.6 hf0c8a7f_1 conda-forge
libass 0.17.1 h66d2fa1_0 conda-forge
libblas 3.9.0 17_osx64_openblas conda-forge
libbrotlicommon 1.0.9 hb7f2c08_9 conda-forge
libbrotlidec 1.0.9 hb7f2c08_9 conda-forge
libbrotlienc 1.0.9 hb7f2c08_9 conda-forge
libcblas 3.9.0 17_osx64_openblas conda-forge
libcurl 8.1.2 h5f667d7_1 conda-forge
libcxx 16.0.6 hd57cbcb_0 conda-forge
libdeflate 1.18 hac1461d_0 conda-forge
libedit 3.1.20191231 h0678c8f_2 conda-forge
libev 4.33 haf1e3a3_1 conda-forge
libexpat 2.5.0 hf0c8a7f_1 conda-forge
libffi 3.4.2 h0d85af4_5 conda-forge
libgfortran 5.0.0 11_3_0_h97931a8_31 conda-forge
libgfortran-devel_osx-64 12.2.0 hf0fd499_31 conda-forge
libgfortran5 12.2.0 he409387_31 conda-forge
libgirepository 1.76.1 he30e17e_0 conda-forge
libglib 2.76.4 hc62aa5d_0 conda-forge
libiconv 1.17 hac89ed1_0 conda-forge
libidn2 2.3.4 hb7f2c08_0 conda-forge
libjpeg-turbo 2.1.5.1 hb7f2c08_0 conda-forge
liblapack 3.9.0 17_osx64_openblas conda-forge
libnghttp2 1.52.0 he2ab024_0 conda-forge
libopenblas 0.3.23 openmp_h429af6e_0 conda-forge
libopus 1.3.1 hc929b4f_1 conda-forge
libpng 1.6.39 ha978bb4_0 conda-forge
librsvg 2.56.1 hec3db73_0 conda-forge
libsqlite 3.42.0 h58db7d2_0 conda-forge
libssh2 1.11.0 hd019ec5_0 conda-forge
libtasn1 4.19.0 hb7f2c08_0 conda-forge
libtiff 4.5.1 hf955e92_0 conda-forge
libunistring 0.9.10 h0d85af4_0 conda-forge
libvpx 1.13.0 hf0c8a7f_0 conda-forge
libwebp-base 1.3.1 h0dc2134_0 conda-forge
libxcb 1.15 hb7f2c08_0 conda-forge
libxml2 2.11.4 hd95e348_0 conda-forge
libzlib 1.2.13 h8a1eda9_5 conda-forge
llvm-openmp 16.0.6 hff08bdf_0 conda-forge
mandrake 1.2.2 py310heea2105_2 conda-forge
matplotlib-base 3.7.2 py310h475a17b_0 conda-forge
mpc 1.3.1 h81bd1dd_0 conda-forge
mpfr 4.2.0 h4f9bd69_0 conda-forge
munkres 1.0.7 py_1 bioconda
ncurses 6.4 hf0c8a7f_0 conda-forge
nettle 3.8.1 h96f3785_1 conda-forge
networkx 3.1 pyhd8ed1ab_0 conda-forge
numpy 1.25.1 py310h7451ae0_0 conda-forge
openblas 0.3.23 openmp_hbefa662_0 conda-forge
openh264 2.3.1 hf0c8a7f_2 conda-forge
openjpeg 2.5.0 h13ac156_2 conda-forge
openssl 3.1.1 h8a1eda9_1 conda-forge
p11-kit 0.24.1 h65f8906_0 conda-forge
packaging 23.1 pyhd8ed1ab_0 conda-forge
pandas 2.0.3 py310h5e4fcda_1 conda-forge
pango 1.50.14 hbce5e75_1 conda-forge
pcre2 10.40 h1c4e4bc_0 conda-forge
pillow 10.0.0 py310hd63a8c7_0 conda-forge
pip 23.2 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 hbcb3906_0 conda-forge
platformdirs 3.9.1 pyhd8ed1ab_0 conda-forge
plotly 5.15.0 pyhd8ed1ab_0 conda-forge
pooch 1.7.0 pyha770c72_3 conda-forge
poppunk 2.6.0 py310h4862987_1 bioconda
pp-sketchlib 2.1.1 py310hda06942_1 conda-forge
pthread-stubs 0.4 hc929b4f_1001 conda-forge
pycairo 1.24.0 py310h0b97775_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pygobject 3.44.1 py310ha8dcd3d_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.10.12 had23ca6_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-tzdata 2023.3 pyhd8ed1ab_0 conda-forge
python_abi 3.10 3_cp310 conda-forge
pytz 2023.3 pyhd8ed1ab_0 conda-forge
rapidnj 2.3.2 h85dcccf_4 bioconda
readline 8.2 h9e318b2_1 conda-forge
requests 2.31.0 pyhd8ed1ab_0 conda-forge
scikit-learn 1.3.0 py310hd2c063c_0 conda-forge
scipy 1.11.1 py310h3900cf1_0 conda-forge
setuptools 68.0.0 pyhd8ed1ab_0 conda-forge
sigcpp-2.0 2.10.8 hf0c8a7f_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sparsehash 2.0.4 hf0c8a7f_1 conda-forge
svt-av1 1.6.0 he965462_0 conda-forge
tenacity 8.2.2 pyhd8ed1ab_0 conda-forge
threadpoolctl 3.2.0 pyha21a80b_0 conda-forge
tk 8.6.12 h5dbffcc_0 conda-forge
tqdm 4.65.0 pyhd8ed1ab_1 conda-forge
treeswift 1.1.37 pyh7cba7a3_0 bioconda
typing-extensions 4.7.1 hd8ed1ab_0 conda-forge
typing_extensions 4.7.1 pyha770c72_0 conda-forge
tzdata 2023c h71feb2d_0 conda-forge
unicodedata2 15.0.0 py310h90acd4f_0 conda-forge
urllib3 2.0.3 pyhd8ed1ab_1 conda-forge
wheel 0.40.0 pyhd8ed1ab_1 conda-forge
x264 1!164.3095 h775f41a_2 conda-forge
x265 3.5 hbb4e6a2_3 conda-forge
xorg-compositeproto 0.4.2 h0d85af4_1001 conda-forge
xorg-damageproto 1.2.1 h0d85af4_1002 conda-forge
xorg-fixesproto 5.0 h0d85af4_1002 conda-forge
xorg-inputproto 2.3.2 h35c211d_1002 conda-forge
xorg-kbproto 1.0.7 h35c211d_1002 conda-forge
xorg-libice 1.0.10 h0d85af4_0 conda-forge
xorg-libsm 1.2.3 h0d85af4_1000 conda-forge
xorg-libx11 1.8.6 hbd0b022_0 conda-forge
xorg-libxau 1.0.11 h0dc2134_0 conda-forge
xorg-libxaw 1.0.14 h0d85af4_1 conda-forge
xorg-libxcomposite 0.4.6 hb7f2c08_1 conda-forge
xorg-libxcursor 1.2.0 hb7f2c08_1 conda-forge
xorg-libxdamage 1.1.5 h0d85af4_1 conda-forge
xorg-libxdmcp 1.1.3 h35c211d_0 conda-forge
xorg-libxext 1.3.4 hb7f2c08_2 conda-forge
xorg-libxfixes 5.0.3 h0d85af4_1004 conda-forge
xorg-libxi 1.7.10 h0d85af4_0 conda-forge
xorg-libxinerama 1.1.5 hf0c8a7f_0 conda-forge
xorg-libxmu 1.1.3 h0d85af4_0 conda-forge
xorg-libxpm 3.5.16 h0dc2134_0 conda-forge
xorg-libxrandr 1.5.2 h0d85af4_1 conda-forge
xorg-libxrender 0.9.11 h0dc2134_0 conda-forge
xorg-libxt 1.3.0 h0dc2134_0 conda-forge
xorg-randrproto 1.5.0 h0d85af4_1001 conda-forge
xorg-renderproto 0.11.1 h0d85af4_1002 conda-forge
xorg-util-macros 1.19.3 h35c211d_0 conda-forge
xorg-xextproto 7.3.0 hb7f2c08_1003 conda-forge
xorg-xproto 7.0.31 h35c211d_1007 conda-forge
xz 5.2.6 h775f41a_0 conda-forge
zlib 1.2.13 h8a1eda9_5 conda-forge
zstandard 0.19.0 py310h151724a_2 conda-forge
zstd 1.5.2 h829000d_7 conda-forge

Command used and output returned
poppunk_assign --db Haemophilus_influenzae_v1_refs --query input.txt --output poppunk_clusters --threads 7
Input.txt:
AP022846 AP022846.1.fa
SRR11108932 SRR11108932_1.fastq.gz SRR11108932_1.fastq.gz
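
A quick sanity check of the query list can be sketched as follows. This is a minimal illustration, assuming the query file is whitespace- or tab-separated with the sample name first and its sequence files after it; `check_query_rows` is a hypothetical helper, not part of PopPUNK. It only flags rows for review. Whether listing the same read file twice is intended is not stated in this issue.

```python
def check_query_rows(text):
    """Flag rows with no files or with the same file listed twice.

    Assumption: each non-empty row is 'name file [file ...]' separated by
    whitespace. This helper is hypothetical and not part of PopPUNK.
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, files = fields[0], fields[1:]
        if not files:
            problems.append((lineno, name, "no sequence files listed"))
        elif len(set(files)) != len(files):
            problems.append((lineno, name, "same file listed more than once"))
    return problems


# The two rows shown above, copied as-is.
example = (
    "AP022846 AP022846.1.fa\n"
    "SRR11108932 SRR11108932_1.fastq.gz SRR11108932_1.fastq.gz\n"
)
problems = check_query_rows(example)
for lineno, name, message in problems:
    print(f"row {lineno} ({name}): {message}")
```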

Describe the bug
I get the following error when running on an Apple M1 laptop (macOS 13.4.1, 16 GB memory):

PopPUNK: assign
(with backend: sketchlib v2.1.1
sketchlib: /Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/pp_sketchlib.cpython-310-darwin.so)
Mode: Assigning clusters of query sequences

Graph-tools OpenMP parallelisation enabled: with 7 threads
Sketching 1 genomes using 1 thread(s)
Progress (CPU): 1 / 1
Writing sketches to file
Traceback (most recent call last):
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/bin/poppunk_assign", line 11, in <module>
    sys.exit(main())
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/PopPUNK/assign.py", line 211, in main
    assign_query(dbFuncs,
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/PopPUNK/assign.py", line 307, in assign_query
    isolateClustering = assign_query_hdf5(dbFuncs,
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/PopPUNK/assign.py", line 357, in assign_query_hdf5
    from .models import loadClusterFit
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/PopPUNK/models.py", line 19, in <module>
    import hdbscan
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/hdbscan/__init__.py", line 1, in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
  File "/Users/cmeehan/opt/miniconda3/envs/poppunk/lib/python3.10/site-packages/hdbscan/hdbscan_.py", line 40, in <module>
    FAST_METRICS = KDTree.valid_metrics + BallTree.valid_metrics + ["cosine", "arccos"]
TypeError: unsupported operand type(s) for +: 'builtin_function_or_method' and 'builtin_function_or_method'

Note: I ran the same command on an Ubuntu server and did not get this error.
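
The final `TypeError` says that both `valid_metrics` operands are callables rather than lists in this environment, so they cannot be concatenated with `+`. The sketch below reproduces that failure mode with stand-in classes and shows one defensive way to read such an attribute. It is an illustration only: `ListStyleTree`, `CallableStyleTree` and `metrics_as_list` are hypothetical names, and this is not the actual hdbscan or scikit-learn code.

```python
class ListStyleTree:
    # Stand-in for a class that exposes valid_metrics as a plain list.
    valid_metrics = ["euclidean", "manhattan"]


class CallableStyleTree:
    # Stand-in for a class that exposes valid_metrics as a callable.
    @staticmethod
    def valid_metrics():
        return ["euclidean", "chebyshev"]


def metrics_as_list(tree_cls):
    """Return valid_metrics as a list, whether it is a list or a callable."""
    vm = tree_cls.valid_metrics
    return list(vm() if callable(vm) else vm)


# Adding two callables fails in the same way as the traceback above.
try:
    CallableStyleTree.valid_metrics + CallableStyleTree.valid_metrics
    raised = False
except TypeError:
    raised = True

# Normalising first makes the concatenation work for either style.
FAST_METRICS = (
    metrics_as_list(ListStyleTree)
    + metrics_as_list(CallableStyleTree)
    + ["cosine", "arccos"]
)
print(raised, FAST_METRICS)
```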

@conmeehan conmeehan changed the title from "PopPUNK assign error on M1 Applce laptop" to "PopPUNK assign error on M1 Apple laptop" Jul 18, 2023
@johnlees
Member

Sorry about this. I think this is due to scikit-learn changing its API, which I couldn't make backwards compatible; see: https://github.com/bacpop/PopPUNK#2022-08-04

The change in scikit-learn's API in v1.0.0 and above means that HDBSCAN models fitted with sklearn <=v0.24 will give an error when loaded. If you run into this, the solution is one of:

  • Downgrade sklearn to v0.24.
  • Run model refinement to turn your model into a boundary model instead (this will change clusters).
  • Refit your model in an environment with sklearn >=v1.0.

If this is a common problem let us know, as we could write a script to 'upgrade' HDBSCAN models. See issue #213 for more details.
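
To decide which of these options applies, it can help to compare the scikit-learn version a model was fitted with against the one currently installed. The following is a minimal sketch under stated assumptions: `major_minor`, `needs_refit` and `installed_sklearn_version` are hypothetical helpers (not PopPUNK functions), and the version a database was fitted with has to be known from elsewhere, since this issue does not say where it is recorded.

```python
import re
from importlib import metadata


def major_minor(version):
    """Parse the leading 'X.Y' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_refit(fitted_with, installed):
    """True when a model fitted before sklearn 1.0 is loaded under sklearn >= 1.0.

    Encodes only the rule quoted above; other combinations return False.
    """
    return major_minor(fitted_with) < (1, 0) and major_minor(installed) >= (1, 0)


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


# Example with the versions mentioned in this thread (0.24 model, 1.3.0 installed).
print(needs_refit("0.24", "1.3.0"))
```

If `needs_refit` returns `True`, one of the three options listed above would apply.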

Was the Haemophilus_influenzae_v1_refs database from our website? If so, I should update it to fix this.

@conmeehan
Author

Ah sorry, I didn't see that bit in the README. I'm surprised that it worked on the Ubuntu box; it must have used a different scikit-learn version and I just didn't notice.

The database was from your website, yes. I didn't try any other ones, just that one.

C

@johnlees johnlees changed the title from "PopPUNK assign error on M1 Apple laptop" to "Update H. flu database to new sklearn version" Jul 21, 2023
@johnlees johnlees self-assigned this Jul 21, 2023
@johnlees johnlees added the model Changes to the model label Jul 21, 2023
@johnlees
Member

I'm really sorry @conmeehan but I totally forgot about this!!

I remembered because I'd seen this published: https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.001281
I've just made a compatible PopPUNK scheme (without the error reported here) and uploaded it as v2.
