Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert kover to python3 #60

Open
wants to merge 14 commits into
base: kover2
Choose a base branch
from
Open

Convert kover to python3 #60

wants to merge 14 commits into from

Conversation

goepp
Copy link
Collaborator

@goepp goepp commented May 4, 2021

Working notes:

  • I ran 2to3 on the repo and copied the files popcount.pyx and popcount.c as is.
  • Right now, compilation works, but building wheel for kover returns an error.

@aldro61
Copy link
Owner

aldro61 commented May 4, 2021

Thanks @goepp!

I was able to install Kover completely in a docker image with Ubuntu 20.04 and Python 3.8. I then tried to run this tutorial: http://aldro61.github.io/kover/doc_tut_scm.html, which failed. I think the code is using methods that are no longer supported in Python 3.

I propose that we work with this tutorial and fix any issues that come up along the way.

@aldro61
Copy link
Owner

aldro61 commented May 5, 2021

@goepp I did a bit of progress. The kover dataset create command in the tutorial now works. It fails at the dataset splitting phase now. I'm done working on this for today.

@goepp
Copy link
Collaborator Author

goepp commented May 6, 2021

I am sorry, I don't manage to reproduce your results. I am new to docker, so I may be missing something simple. Running

docker build -t kover-python3 . # create image
docker run kover-python3 sh -c 'kover --version' # run container

returns the error

cli-2.0.1
Traceback (most recent call last):
  File "/kover/bin/kover", line 1192, in <module>
    CommandLineInterface()
  File "/kover/bin/kover", line 1099, in __init__
    print("core-%s" % get_distribution('kover').version)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 482, in get_distribution
    dist = get_provider(dist)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 358, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 901, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 787, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'kover' distribution was not found and is required by the application

It seems like the installation of kover using pip has failed. To check that, I run the installation of kover while running the container:

docker run kover-python3 sh -c 'cd kover/ && ./install.sh'

which returns Building wheel for kover (setup.py): finished with status 'error', confirming that I must have a problem with installing of the module.
Am I missing something obvious?

@aldro61
Copy link
Owner

aldro61 commented May 7, 2021

No worries! Also, you don't necessarily need to use docker. I'm using it to avoid any issues due to the fact that I'm on macOS (for now). Here are the commands I use:

Build the image docker build . -t aldro61/kover -f Dockerfile.py3

Run bash to start an interactive terminal in the container: docker run -it aldro61/kover bash

Then I run the tutorial commands. Whenever a command fails, I open up the python file with nano and try to fix it. Whenever it works, I update the file on my local machine and push the fix to github. It's not the most efficient way to do things, but it works :D

You could completely skip docker if the installation works on your local machine.

@goepp
Copy link
Collaborator Author

goepp commented May 10, 2021

Thanks!

  1. af0238e allows me to pip install kover
  2. 43d0094 converts integer division in the format of python3. The dataset splitting from the tutorial now works.
  3. I somehow have some indentation errors. I converted tabs as 4 spaces where relevant (e347102).

For reference: I am now at the point where kover learn scm returns TypeError: Simple selection can't process <HDF5 dataset "train_genome_idx": shape (75,), type "|u1">
I will look into it soon.


EDIT:
Full error traceback is:

Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.8/dist-packages/kover/learning/experiments/experiment_scm.py", line 118, in _cv_score_hp
    positive_example_idx = train_example_idx[dataset.phenotype.metadata[train_example_idx] == 1].reshape(-1)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/dataset.py", line 780, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 272, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 213, in h5py._selector.Selector.apply_args
TypeError: Simple selection can't process <HDF5 dataset "train_genome_idx": shape (75,), type "|u1">
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../../kover/bin/kover", line 1192, in <module>
    CommandLineInterface()
  File "../../kover/bin/kover", line 1150, in __init__
    getattr(self, args.command)()
  File "../../kover/bin/kover", line 1189, in learn
    getattr(learning_tool, args.command)()
  File "../../kover/bin/kover", line 561, in scm
    classifications = learn_SCM(dataset_file=args.dataset,
  File "/usr/local/lib/python3.8/dist-packages/kover/learning/experiments/experiment_scm.py", line 507, in learn_SCM
    best_hp_score, best_hp = _cross_validation(dataset_file=dataset_file,
  File "/usr/local/lib/python3.8/dist-packages/kover/learning/experiments/experiment_scm.py", line 172, in _cross_validation
    for hp, score in pool.imap_unordered(hp_eval_func, product(model_types, p_values)):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
TypeError: Simple selection can't process <HDF5 dataset "train_genome_idx": shape (75,), type "|u1">

So the indexing of h5py dataset does not work. I must admit I feel lost here. @aldro61 any clue?

@aldro61
Copy link
Owner

aldro61 commented May 10, 2021

Thanks a lot @goepp! The inconsistent indentation issues are due to an old IDE that I was using at the time. It gave me a lot of headaches! If you want a quick and easy way to fix this, you can install Black and simply run black file.py. It will reformat the whole file to meet its standards. Just make sure you run the command from the root of the repo, since I'll add a config file there.

@goepp
Copy link
Collaborator Author

goepp commented May 17, 2021

Thanks! Using black would ruin the git blaming, shall I carry on, or tell git to ignore the revisions (as explained here)? As explained in the link, this does not work for the GitHub UI.

@aldro61
Copy link
Owner

aldro61 commented May 17, 2021

Good point. I don't really use git blame, especially here, since I was working alone in the past. If you don't find it too tedious to do manual reformatting, I have no issues doing that. As long as we get rid of these mixed tabs and space issues when we encounter them.

@goepp
Copy link
Collaborator Author

goepp commented May 21, 2021

Done. Running git blame <file.py> --ignore-revs-file .git-blame-ignore-revs will print a git blame where the black formatting modifications (commit aaedc6f) are ignored.
For reference:

  • Note that this does not work perfectly, for instance the latter command does not ignore the modifications of aaedc6f at line 26 of file command_line.py.
  • Moreover, the GitHub GUI of git blame does not support the --ignore-revs-file option.

@goepp
Copy link
Collaborator Author

goepp commented May 21, 2021

Still stuck at the latter point (#60 (comment)) though.

@aldro61
Copy link
Owner

aldro61 commented May 21, 2021

Thanks @goepp! I will take some time today to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants