Update from gudhi #20
Linked with GUDHI/gudhi-devel#976. Should we wait until that PR is merged before merging this one, to avoid unnecessary work? I'll test this update soon. I'll also deal with the notebooks, as I'd like to deprecate or change some of the interface.
Updating Multipers while working on GUDHI/gudhi-devel#976 helps me find bugs, so I will do both in parallel in any case. You can therefore merge either step by step or all at once, whichever you think is less work. My guess is that it does not make a big difference as long as you don't make big changes in Multipers for the time being; otherwise, it would be better to merge first, to avoid having to update the new code again. The most important thing is that the update is properly tested: some behaviour changes, so the update is not trivial. That is the main reason why this PR is a draft. The good news is that the notebooks cover most methods. The problem is that they usually cover them with just one set of parameters, so many parameter combinations and backend versions are not tested at all...
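A minimal sketch of how the single-parameter-set coverage could be widened: call one function once per combination of keyword values and collect the combinations that raise. The helper name `run_grid` is hypothetical and not part of Multipers; the keywords in the usage comment (`threshold`, `verbose`) are simply the ones that appear in this thread.

```python
from itertools import product


def run_grid(func, grid):
    """Call `func` once per combination of keyword values in `grid`.

    `grid` maps a keyword name to the list of values to try.
    Returns a list of (kwargs, exception) pairs for the combinations that raised.
    """
    names = list(grid)
    failures = []
    for values in product(*(grid[name] for name in names)):
        kwargs = dict(zip(names, values))
        try:
            func(**kwargs)
        except Exception as exc:  # record the failing combination and keep going
            failures.append((kwargs, exc))
    return failures


# Possible usage (assumes multipers is installed and `st` is a filtered complex):
#   import multipers as mp
#   failures = run_grid(
#       lambda **kw: mp.module_approximation(st, **kw),
#       {"threshold": [True, False], "verbose": [True, False]},
#   )
```

This only catches Python exceptions; a crash inside the C++ backend would still terminate the interpreter.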
Ok!
Quick comments
multipers/mma_structures.pyx.tp
Outdated
if len(births) == 1 and births[0].size == 1 and isinf(births[0][0]):
if len(deaths) == 1 and deaths[0].size == 1 and isinf(deaths[0][0]):
return []
These checks should be done in C++ (we have access to v). Each One_critical_filtration has the is inf method.
So, should I add is_plus/minus_inf() to the Cython interface of One_critical_filtration? It is not there for now.
? box.get_lower_corner()[i]
: negInf;
value_type t_death = death.is_plus_inf() ? max_i : (death.is_minus_inf() ? -inf : std::min(death[i], max_i));
value_type t_birth = birth.is_plus_inf() ? inf : (birth.is_minus_inf() ? min_i : std::max(birth[i], min_i));
s = std::min(s, t_death - t_birth);
}
} else {
unsigned int dim = std::max(birth.size(), death.size());
for (unsigned int i = 0; i < dim; i++){
//if they don't have the same size, then one of them has to (+/-)infinite.
value_type t_death = death.size() > i ? death[i] : death[0]; //assumes death is never empty
value_type t_birth = birth.size() > i ? birth[i] : birth[0]; //assumes birth is never empty
I'm not a big fan of all of this.
Can you be more precise? It is more or less the same as before, with just some extra tests for infinite values...
This assumes inf is of the form {inf}, and there are too many nested ..?..: .. (ternary operators), which is not really readable. I haven't thought about an alternative, though.
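One possible alternative to the nested ternaries, sketched in Python for brevity rather than in the C++ of the PR: move each clamp into a small named helper with early returns. Everything here is an assumption for illustration. The helper names are hypothetical, infinity is modelled as a one-element point `[±inf]` (the very representation questioned above), and `min_i` / `max_i` are taken to be the lower and upper box corners.

```python
import math


def is_plus_inf(point):
    # Assumption: +infinity is stored as the one-element point [inf].
    return len(point) == 1 and point[0] == math.inf


def is_minus_inf(point):
    # Assumption: -infinity is stored as the one-element point [-inf].
    return len(point) == 1 and point[0] == -math.inf


def clamped_death(death, i, max_i):
    """Coordinate i of the death point, clamped from above by max_i."""
    if is_plus_inf(death):
        return max_i
    if is_minus_inf(death):
        return -math.inf
    return min(death[i], max_i)


def clamped_birth(birth, i, min_i):
    """Coordinate i of the birth point, clamped from below by min_i."""
    if is_plus_inf(birth):
        return math.inf
    if is_minus_inf(birth):
        return min_i
    return max(birth[i], min_i)


def min_side_length(birth, death, lower, upper):
    """Smallest clamped (death - birth) difference over all coordinates."""
    s = math.inf
    for i in range(len(lower)):
        s = min(s, clamped_death(death, i, upper[i]) - clamped_birth(birth, i, lower[i]))
    return s
```

The branch order matches the quoted one-liners, so the behaviour should be unchanged; only the three cases become readable one at a time.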
ok
ok
From the time series notebook, if we take
import multipers as mp
from gudhi.point_cloud.timedelay import TimeDelayEmbedding
from os.path import expanduser
import pandas as pd
import numpy as np
import multipers.ml.point_clouds as mmp
DATASET_PATH=expanduser("~/Datasets/UCR/")
dataset_path = DATASET_PATH + "Coffee/Coffee"
xtrain = np.array(pd.read_csv(dataset_path+"_TRAIN.tsv", delimiter='\t', header=None, index_col=None))
TDE = TimeDelayEmbedding(dim=3, delay=1, skip=1)
xtrain = TDE.transform(xtrain)
sts = mmp.PointCloud2FilteredComplex(bandwidths=[-.1], num_collapses=-2, expand_dim=2).fit_transform(xtrain)
Then according to
mp.module_approximation(sts[0][0], threshold=True, box=[[0,1],[1,3]]).plot(box=[[0,1],[1,4]])
The summands whose death curve are
mp.module_approximation(sts[0][0], verbose=True)
OK!
ok
Great! This fixes some nice stuff. I still get a segfault with this, though:
import multipers as mp
st = mp.SimplexTreeMulti(num_parameters=2)
st.insert([0], [0, 1])
st.insert([1], [1, 0])
st.insert([0, 1], [1, 1])
mp.module_approximation(st,verbose=True)
I'll analyze the core dump when I've got time.
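Until the core dump is analyzed, a reproducer like the one above can be run in a separate interpreter so that a crash does not take the calling session down with it. The sketch below uses only the standard library (`subprocess`, plus the `-X faulthandler` interpreter option, which prints the Python traceback when a fatal signal arrives). The helper names are hypothetical, and the negative return code convention holds on POSIX systems.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative return code means the
    child was killed by that signal number.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a return code into a short human-readable status."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with code {returncode}"


# Possible usage, with the reproducer from this comment stored in a string REPRO:
#   code, err = run_isolated(REPRO)
#   print(describe(code))
#   print(err)  # faulthandler's traceback, if the child crashed
```

Running the same script on both machines would also make it easier to compare environments when the crash shows up on only one of them.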
bool allInf = true;
for (std::size_t i = 0U; i < birth_container.num_parameters(); i++) {
auto t = box_.get_lower_corner()[i];
if (birth_container[i] < t - 1e-10) birth_container[i] = threshold_in ? t : -filtration_type::T_inf;
Reminder for later, the 1e-10 is funny
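One possible reading of why the constant stands out, as a sketch only: a hard-coded absolute tolerance does not scale with the magnitude of the coordinates. The Python function below mirrors just the visible part of the quoted loop (the `allInf` bookkeeping is cut off in the excerpt and is not reproduced). The function name and the `tol` parameter are hypothetical.

```python
import math


def clamp_birth(birth, lower_corner, threshold_in, tol=1e-10):
    """Mirror of the quoted loop: coordinates below the box by more than `tol`
    are either clamped to the box or sent to -inf."""
    out = []
    for b, t in zip(birth, lower_corner):
        if b < t - tol:
            out.append(t if threshold_in else -math.inf)
        else:
            out.append(b)
    return out


# With double precision, the tolerance vanishes for large coordinates:
# near 1e8 the spacing between consecutive floats is about 1.5e-8,
# so subtracting 1e-10 changes nothing.
tolerance_is_lost = (1e8 - 1e-10 == 1e8)
```

If the filtration values can be large, or are stored in single precision, the comparison then degenerates to a plain `b < t`.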
I can't reproduce the segfault...
I think it is because of my environment; I don't have this issue on my personal laptop.
LGTM
Hmm, this is not compiling on macOS; it seems that my "PR" workflow doesn't properly test the PR folder?
As Multipers is starting to get integrated into Gudhi, I will start updating the C++ files so that they match those in Gudhi.
I will make sure that the unit tests (run with pytest) and all notebooks (in docs/notebooks) work before pushing. But as they are not exhaustive, it would be good to run your own notebooks too, @DavidLapous, to make sure that I did not forget anything (I did not go through all the Python/Cython code line by line...)