Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PersistenceLengths and its unitary tests #1117

Conversation

VincentRouvreau
Copy link
Contributor

No description provided.

@VincentRouvreau VincentRouvreau marked this pull request as ready for review August 14, 2024 14:31
pers_lengths = []
for pd in X:
# Sort in reverse order persistence lengths (where length = death - birth)
lengths = np.flip(np.sort(pd[:,1] - pd[:,0]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get the top k elements (with k significantly smaller than n), a slightly more efficient algorithm is to use partition(*, -k)[-k:] (all the numbers I give are ±1, to be checked) and then you only need to sort those k elements. But that can come later if you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did so on 768774a, but I had to assert on num_lengths <= 0 in the constructor. To be discussed, cf. this discussion in Ripser PR

@@ -912,28 +918,29 @@ def fit(self, X, y=None):

def transform(self, X):
"""
Compute the persistence landscape for each persistence diagram individually and concatenate the results.
Compute the persistence lengths for each persistence diagram individually and concatenate the results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I read "concatenate", I expect the output to be a 1d array of length N*k.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the "concatenate" on 3bbd3af


Parameters:
X (list of n x 2 numpy arrays): input persistence diagrams.

Returns:
numpy array with shape (number of diagrams) x (num_lengths): output persistence lengths.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nice, but isn't the current output a list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, so I took the opportunity to rewrite it with a first result array filled with zeros, that is instantiated rows by rows on 3bbd3af

Comment on lines 930 to 931
idx = 0
for pd in X:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the usual way to do that in Python is

for idx, pd in enumerate(X):

(so you don't need idx = idx + 1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice ! I did it on af41754

useful when PersistenceLengths is included in a scikit-learn Pipeline).

Parameters:
X (list of n x 2 or n x 1 numpy arrays): input persistence diagrams.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nx1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I missed this copy /paste that has to be changed. I fixed it on e984496


Parameters:
X (list of n x 2 or n x 1 numpy arrays): input persistence diagrams.
y (n x 1 array): persistence diagram lengths (unused).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to be that specific? Sklearn seems to say just

y (None): Ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I changed it on 009748c

Copy link
Member

@mglisse mglisse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me, give @MathieuCarriere a bit of time to comment.

@VincentRouvreau VincentRouvreau merged commit 1670055 into GUDHI:master Sep 23, 2024
7 checks passed
@VincentRouvreau VincentRouvreau deleted the persistence_lengths_representation branch September 23, 2024 07:28
@VincentRouvreau VincentRouvreau added the 3.11.0 GUDHI version 3.11.0 label Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11.0 GUDHI version 3.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants