Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update from gudhi #20

Merged
merged 10 commits into from
Sep 23, 2024
Merged

Conversation

hschreiber
Copy link
Contributor

As Multipers is starting to get integrated into Gudhi, I will start to update the C++ files such that they match those in Gudhi.

I will make sure that the unit tests (run with pytest) and all notebooks (in docs/notebooks) work before pushing. But as they are not exhaustive, it would be good to run your own notebooks too @DavidLapous to ensure that I forgot nothing (I did not went through all the python/cython code line by line...)

@DavidLapous
Copy link
Owner

DavidLapous commented Sep 17, 2024

Linked with GUDHI/gudhi-devel#976. Should we wait that this PR is merged before merging this one to avoid unnecessary work? I'll test this update soon. I'll also deal with the notebooks, as I'd like to deprecate/change some interface.

@hschreiber
Copy link
Contributor Author

Linked with GUDHI/gudhi-devel#976. Should we wait that this PR is merged before merging this one to avoid unnecessary work?

Updating Multipers while working on GUDHI/gudhi-devel#976 helps me to find some bugs, so I will do both in parallel in any case. So you can either merge it step by step or all at once, whatever you think is less work (my guess is that if you don't do big changes in Multipers for the time being, it does not make a big difference. Otherwise, it would be better to merge things before to avoid having to update the new code again).

The most important thing is that the update is properly tested as some behaviour changes, so the update is not trivial. That is the main reason why this PR is a draft.

The good news is that the notebooks cover most methods. The problem is that they cover them usually with just one set of parameters, so a lot of combination and backend versions are not tested at all...

I'll test this update soon. I'll also deal with the notebooks, as I'd like to deprecate/change some interface.

Ok!

Copy link
Owner

@DavidLapous DavidLapous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick comments

_tempita_grid_gen.py Outdated Show resolved Hide resolved
multipers/filtration_conversions.pxd.tp Outdated Show resolved Hide resolved
multipers/filtrations.pxd Outdated Show resolved Hide resolved
multipers/filtrations.pxd Show resolved Hide resolved
multipers/filtrations.pxd Show resolved Hide resolved
Comment on lines 107 to 109
if len(births) == 1 and births[0].size == 1 and isinf(births[0][0]):
if len(deaths) == 1 and deaths[0].size == 1 and isinf(deaths[0][0]):
return []
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks should be done in c++ (we have access to v). Each One_critical_filtration has the is inf method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, should I add is_plus/minus_inf() to the Cython interface of One_critical_filtration? Is not there for now.

Comment on lines 2343 to 2354
? box.get_lower_corner()[i]
: negInf;
value_type t_death = death.is_plus_inf() ? max_i : (death.is_minus_inf() ? -inf : std::min(death[i], max_i));
value_type t_birth = birth.is_plus_inf() ? inf : (birth.is_minus_inf() ? min_i : std::max(birth[i], min_i));
s = std::min(s, t_death - t_birth);
}
} else {
unsigned int dim = std::max(birth.size(), death.size());
for (unsigned int i = 0; i < dim; i++){
//if they don't have the same size, then one of them has to (+/-)infinite.
value_type t_death = death.size() > i ? death[i] : death[0]; //assumes death is never empty
value_type t_birth = birth.size() > i ? birth[i] : birth[0]; //assumes birth is never empty
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of all of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more precise? Is it more or less the same than before with just an extra tests for infinity values...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes inf is of the form {inf} + there is too many ..?..: .. imbriqué, which is not really readable. I haven't though about an alternative though

multipers/tests/test_mma.py Outdated Show resolved Hide resolved
Copy link
Owner

@DavidLapous DavidLapous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@DavidLapous
Copy link
Owner

DavidLapous commented Sep 19, 2024

From the time series notebook, if we take

import multipers as mp
from gudhi.point_cloud.timedelay import TimeDelayEmbedding
from os.path import expanduser
import pandas as pd
import numpy as np
import multipers.ml.point_clouds as mmp
DATASET_PATH=expanduser("~/Datasets/UCR/")
dataset_path = DATASET_PATH + "Coffee/Coffee"
xtrain = np.array(pd.read_csv(dataset_path+"_TRAIN.tsv", delimiter='\t', header=None, index_col=None))
TDE = TimeDelayEmbedding(dim=3, delay=1, skip=1)
xtrain = TDE.transform(xtrain)
sts = mmp.PointCloud2FilteredComplex(bandwidths=[-.1], num_collapses=-2, expand_dim=2).fit_transform(xtrain)

Then according to

mp.module_approximation(sts[0][0], threshold=True, box=[[0,1],[1,3]]).plot(box=[[0,1],[1,4]])

The summands whose death curve are [inf] are not thresholded properly.
Also, the verbose seems to lead to a segmentation fault

mp.module_approximation(sts[0][0], verbose=True)

Copy link
Owner

@DavidLapous DavidLapous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Owner

@DavidLapous DavidLapous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! This fixes some nice stuff, I still have this segfaulting though:

import multipers as mp
st = mp.SimplexTreeMulti(num_parameters=2)
st.insert([0], [0, 1])
st.insert([1], [1, 0])
st.insert([0, 1], [1, 1])
mp.module_approximation(st,verbose=True)

I'll analyze the core dump when I've got time.

bool allInf = true;
for (std::size_t i = 0U; i < birth_container.num_parameters(); i++) {
auto t = box_.get_lower_corner()[i];
if (birth_container[i] < t - 1e-10) birth_container[i] = threshold_in ? t : -filtration_type::T_inf;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder for later, the 1e-10 is funny

@hschreiber
Copy link
Contributor Author

Great! This fixes some nice stuff, I still have this segfaulting though:

import multipers as mp
st = mp.SimplexTreeMulti(num_parameters=2)
st.insert([0], [0, 1])
st.insert([1], [1, 0])
st.insert([0, 1], [1, 1])
mp.module_approximation(st,verbose=True)

I'll analyze the core dump when I've got time.

I can't reproduce the segfault...

@DavidLapous
Copy link
Owner

DavidLapous commented Sep 23, 2024

I can't reproduce the segfault...

I think it is because of my environment; I don't have this issue on my personal laptop

@DavidLapous
Copy link
Owner

LGTM

@DavidLapous DavidLapous marked this pull request as ready for review September 23, 2024 14:18
@DavidLapous DavidLapous merged commit 28c25ad into DavidLapous:main Sep 23, 2024
9 checks passed
@DavidLapous
Copy link
Owner

DavidLapous commented Sep 23, 2024

Hmm this is not compiling on macOS, it seems that my "PR" workflow doesn't properly test the PR folder ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants