PersistenceLengths and its unitary tests #1117
Conversation
pers_lengths = []
for pd in X:
    # Sort the persistence lengths (length = death - birth) in decreasing order.
    lengths = np.flip(np.sort(pd[:,1] - pd[:,0]))
To get the top k elements (with k significantly smaller than n), a slightly more efficient algorithm is to use partition(*, -k)[-k:] (all the numbers I give are ±1, to be checked) and then you only need to sort those k elements. But that can come later if you prefer.
I did so on 768774a, but I had to add an assertion on num_lengths <= 0 in the constructor. To be discussed, cf. this discussion in the Ripser PR.
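For reference, a minimal sketch of the partition-based top-k selection (my own illustration with placeholder names, not the code from 768774a; num_lengths plays the role of k):

    import numpy as np

    def top_k_lengths(pd, num_lengths):
        # Persistence lengths (death - birth) of one diagram.
        lengths = pd[:, 1] - pd[:, 0]
        if num_lengths >= len(lengths):
            # Fewer points than requested: a full sort is all that is needed.
            return np.flip(np.sort(lengths))
        # np.partition moves the num_lengths largest values into the last
        # num_lengths slots (in arbitrary order), so only those need sorting.
        top = np.partition(lengths, -num_lengths)[-num_lengths:]
        return np.flip(np.sort(top))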
@@ -912,28 +918,29 @@ def fit(self, X, y=None):

    def transform(self, X):
        """
-       Compute the persistence landscape for each persistence diagram individually and concatenate the results.
+       Compute the persistence lengths for each persistence diagram individually and concatenate the results.
When I read "concatenate", I expect the output to be a 1d array of length N*k.
I removed the "concatenate" on 3bbd3af
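A tiny illustration of the shape distinction raised above (toy values, just to show the shapes):

    import numpy as np

    rows = [np.array([3., 2., 1.]), np.array([6., 5., 4.])]  # k = 3 lengths for N = 2 diagrams
    print(np.concatenate(rows).shape)  # (6,)   -> "concatenate" suggests a 1d array of length N*k
    print(np.vstack(rows).shape)       # (2, 3) -> one row of lengths per diagram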
        Parameters:
            X (list of n x 2 numpy arrays): input persistence diagrams.

        Returns:
            numpy array with shape (number of diagrams) x (num_lengths): output persistence lengths.
That would be nice, but isn't the current output a list?
You are right, so I took the opportunity to rewrite it on 3bbd3af: the result array is first filled with zeros and then filled in row by row.
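For reference, a minimal sketch of a transform with a preallocated result array (an assumed reconstruction with placeholder names such as Xfit, not the exact code from 3bbd3af; it assumes self.num_lengths is set and every diagram has at least that many points):

    import numpy as np

    def transform(self, X):
        # Preallocate the result: one row per diagram, self.num_lengths columns of zeros.
        Xfit = np.zeros((len(X), self.num_lengths))
        idx = 0
        for pd in X:
            # Persistence lengths (death - birth), largest first.
            lengths = np.flip(np.sort(pd[:, 1] - pd[:, 0]))
            # Fill one row per diagram (assumes at least self.num_lengths points).
            Xfit[idx, :] = lengths[:self.num_lengths]
            idx = idx + 1
        return Xfit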
        idx = 0
        for pd in X:
I think the usual way to do that in Python is
    for idx, pd in enumerate(X):
(so you don't need idx = idx + 1).
Oh, nice! I did it on af41754.
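For illustration, a self-contained toy example of the enumerate pattern (assumed values, not the code from af41754):

    import numpy as np

    X = [np.array([[0.0, 1.0], [0.5, 2.0]]), np.array([[0.0, 3.0], [1.0, 1.5]])]
    num_lengths = 2
    result = np.zeros((len(X), num_lengths))
    for idx, pd in enumerate(X):  # enumerate yields the row index, no manual counter needed
        result[idx, :] = np.flip(np.sort(pd[:, 1] - pd[:, 0]))[:num_lengths]
    print(result)  # rows [1.5, 1.0] and [3.0, 0.5]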
        useful when PersistenceLengths is included in a scikit-learn Pipeline).

        Parameters:
            X (list of n x 2 or n x 1 numpy arrays): input persistence diagrams.
Why nx1?
Thanks, I missed this copy/paste that had to be changed. I fixed it on e984496.
        Parameters:
            X (list of n x 2 or n x 1 numpy arrays): input persistence diagrams.
            y (n x 1 array): persistence diagram lengths (unused).
Do we have to be that specific? Sklearn seems to say just
y (None): Ignored.
Sounds good to me. I changed it on 009748c
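For illustration, a sketch of how the fit docstring might read with that convention (assumed wording pieced together from this thread; the no-op body returning self is the usual scikit-learn pattern, not necessarily the exact code):

    def fit(self, X, y=None):
        """
        Fit the PersistenceLengths class on a list of persistence diagrams
        (actually does nothing but is useful when PersistenceLengths is included
        in a scikit-learn Pipeline).

        Parameters:
            X (list of n x 2 numpy arrays): input persistence diagrams.
            y (None): Ignored.
        """
        return self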
Looks ok to me, give @MathieuCarriere a bit of time to comment.