[Simplex tree] Generalization of filtration values in Simplex_tree #1122

hschreiber · 2024-08-19T14:50:19Z

Enables Filtration_value to be something else than double / float in the simplex tree. In particular, is thought as a preparation for the multi simplex tree of PR #976, where the filtration value will be entirely new classes.

mglisse

Some initial comments.

src/Simplex_tree/concept/FiltrationValue.h

mglisse · 2024-08-21T19:21:16Z

src/Simplex_tree/concept/FiltrationValue.h

+   * If the filtration is 1-critical and totally ordered (as in one parameter persistence), the union is simply
+   * the minimum of the two values. If the filtration is 1-critical, but not totally ordered (possible for
+   * multi-persistence), than the union is also the minium if the two given values are comparable and the
+   * method should throw an error if they are not, as a same simplex should not exist at those two values.
+   * Finally, if the filtration is k-critical, FiltrationValue should be able to store an union of values and this
+   * method adds the values of @p f2 in @p f1 and removes the values from @p f1 which are comparable and greater than
+   * other values.


I think I would prefer describing the general case (k-critical), and then say that if the type cannot represent this result (limited to 1-critical), it should throw.

Is the formulation better now?

src/Simplex_tree/concept/FiltrationValue.h

src/Simplex_tree/include/gudhi/Simplex_tree.h

…ound

mglisse · 2024-08-23T18:07:39Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

@@ -1948,8 +2024,8 @@ class Simplex_tree {
   * than it was before. However, `upper_bound_dimension()` will return the old value, which remains a valid upper
   * bound. If you care, you can call `dimension()` to recompute the exact dimension.
   */
-  bool prune_above_filtration(Filtration_value filtration) {
-    if (std::numeric_limits<Filtration_value>::has_infinity && filtration == std::numeric_limits<Filtration_value>::infinity())
+  bool prune_above_filtration(const Filtration_value_rep filtration) {


In the multiparameter case (partial order), I wonder what makes the most sense: removing everything that is larger than something, or removing everything that is not smaller than something? (In 1d the difference only matters for NaN)

The less risky would be to remove everything larger than something as those have to be pruned for sure.

A compromise would be to remove everything larger and to set to infinity all parameters of a filtration value whose value is larger.

…lization

src/Simplex_tree/include/gudhi/Simplex_tree/Simplex_tree_siblings.h

src/Simplex_tree/concept/FiltrationValue.h

…b.com:hschreiber/gudhi-devel into simplex_tree_filtration_value_generalization

…lization

mglisse

Some comments (on everything but the crucial issue of naming the 2 functions 😉 )

src/Simplex_tree/include/gudhi/Simplex_tree.h

mglisse · 2024-09-09T14:45:07Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

-      if (!(has_children(res_insert.first))) {
-        res_insert.first->second.assign_children(new Siblings(curr_sib, *vi));
-      }
+      res_insert = insert_node_<false, true, false>(curr_sib, *vi, filtration);


My first reaction is that insert_node_<false, true, false> is not very readable. But after looking at it for a while, I'd say ok, we can always revisit this later if we have a better idea.

I agree... But I found it better than having 4 methods doing more or less the same with just small variations. The other solution would have been to just keep the try_emplace and the update_simplex_tree_after_node_insertion in the method and letting the rest in the calling methods, but this would have lessen the point of insert_node_. Methods like insert_simplex_raw are so much more readable now.
If someone has a nicer solution, they are welcome!

src/Simplex_tree/concept/FiltrationValue.h

mglisse · 2024-09-12T15:32:13Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

-      // dimension may need to be lowered
-      dimension_to_be_lowered_ = true;
-      return true;
+      //if filt and simplex.second.filtration() are non comparable, should return false.


For 1-parameter, the old behavior is to remove (return true) a simplex with filtration value NaN. Do we really want to change that?
What is the main motivation

matching the concept ?

keeping NaN

preparing for multiparameter (but the version David described requires more work, at least in the k-critical case, so we will likely disable this function in the beginning)

It was to match the multi-parameter case and I forgot somehow about the NaN case... If we disable this for multi-parameter instead, I would roll back the changes.

I added a condition to includes the NaNs also. If you prefer that we disable this methods for any filtration value which is not an arithmetic native type, I will just put back the old conditions. Otherwise, I would add in the doc of prune_above_filtration that operator< is used to judge if a simplex should be removed or not.

"arithmetic native type" is not the right condition. It is perfectly fine to use this function when the filtration value type is CGAL::Lazy_exact_nt or mpq_class or any other non-native scalar type.
The problematic case is multiparameter (which in the other branch can be tested with something like Options::is_multi_parameter), where the code will need heavier modifications (according to David's description of what the function should do, for instance), and thus it is safer to disable the function for now.
The main drawback of the old code is that it uses <= which isn't currently part of the concept (discussed in another comment).

True, "arithmetic native type" is not the right condition, but I don't see how to discriminate the "right" classes from the "wrong" classes otherwise (I can imagine that someone can come up with other weird cases for filtration other than multi-peristence). In particular, if the reason we want to forbid multi-persistence is that the condition "<" is not always the right choice. "<" is not wrong per-se for multi-persistence, we are just not sure if it is the usual use case for it, if there is one.
So, I was just suggesting that we either explicitly forbid it for everything non trivial or that we allow it for everyone, but we highlight in the documentation, that "above" means "operator<" and if it is not what the user wants, he shouldn't use it.

Forbidding non-builtin Filtration_value would be a regression. It should remain possible to use for instance CGAL::Gmpq, it represents a subset of reals and is perfectly fine for 1-parameter filtrations. Trying to list of options

We could explain in the changelog that people need to implement unify/intersect functions for their code to keep working (the doc already says it).

We could remove the restriction to arithmetic from the default unify/intersect functions. Now only users who want something different from min/max (typically for multipersistence) need to provide their own implementation (which C++ will automatically pick instead of the generic one if it is more specialized).

if !Options::is_multi_parameter, do not call unify/intersect but the default min/max instead (we can have a wrapper in Simplex_tree so we don't have to repeat that test for every use).

I like option 2, I think that's what you were suggesting in the comment above? If not, can you clarify?

I also like solution 2 the best. Even though I was referring to rec_prune_above_filtration in the comment above which does not make use of intersect/unify. But the spirit is the same.

src/Simplex_tree/concept/FiltrationValue.h

src/Simplex_tree/include/gudhi/Simplex_tree/simplex_tree_options.h

src/Simplex_tree/include/gudhi/Simplex_tree.h

src/Simplex_tree/include/gudhi/Simplex_tree/simplex_tree_options.h

mglisse · 2024-09-12T20:38:05Z

src/Simplex_tree/include/gudhi/Simplex_tree/simplex_tree_options.h

+/**
+ * @private
+ * @brief Given two filtration values, stores in the first value the greatest common upper bound of the two values.
+ * If a filtration value has value `NaN`, it should be considered as the lowest value possible.


Why is that? Ah, that's what make_filtration_non_decreasing expects (IIRC we only added a little bit of support for NaN in Simplex_tree to help Alpha_complex). Well actually, NaN means "not set" there (so it should be overridden by anything), which happens to be (almost?) equivalent to -inf in this context, but if we wanted to handle it in unify_births it might correspond to +inf in that context.
I wonder if we should try to formalize a bit what NaN means in the Simplex_tree before extending it to more than a hack in make_filtration_non_decreasing? On the other hand, as long as we don't document the behavior publicly, we can still change it later.
I think in the current uses only f1 can be a NaN (in case we decide to give an asymmetric definition...).

Yes, it was just to comply to what make_filtration_non_decreasing expected.
NaN meaning "not set" makes sense to me. Otherwise, I am not sure what other use NaN could have when not "there is a problem here".

But you are right, we should decide, if we want to handle NaN in general or not and how (except if we keep everything in the backend as for now and in general, we just pretend that NaN is not possible). For example, unify_birth does not handle NaN, because we didn't before, but if we introduce NaN to push_to_smallest_common_upper_bound because needed in make_filtration_non_decreasing, it looks a bit inconsistent for an "outside use".

mglisse · 2024-09-24T20:04:15Z

src/Simplex_tree/concept/FiltrationValue.h

+ * its subsimplices of same filtration value) provides an indexing scheme
+ * (see IndexingTag).
+ */
+struct FiltrationValue {


I sometimes wonder if we should rename it to FiltrationValueForSimplexTree. But the name of concepts is not that important, it can always be changed later since it does not appear in users' code.

If the concepts of FiltrationValue are not the same for other classes, it would make sense to rename it for the long run. But I have the feeling that for now, everything is more or less complying with the needs of the simplex tree, as it is mostly used as double.

src/Simplex_tree/concept/FiltrationValue.h

src/Simplex_tree/include/gudhi/Simplex_tree/simplex_tree_options.h

mglisse · 2024-10-19T12:48:45Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

-      // dimension may need to be lowered
-      dimension_to_be_lowered_ = true;
-      return true;
+      //if filt and simplex.second.filtration() are non comparable, should return false.


Forbidding non-builtin Filtration_value would be a regression. It should remain possible to use for instance CGAL::Gmpq, it represents a subset of reals and is perfectly fine for 1-parameter filtrations. Trying to list of options

We could explain in the changelog that people need to implement unify/intersect functions for their code to keep working (the doc already says it).

We could remove the restriction to arithmetic from the default unify/intersect functions. Now only users who want something different from min/max (typically for multipersistence) need to provide their own implementation (which C++ will automatically pick instead of the generic one if it is more specialized).

if !Options::is_multi_parameter, do not call unify/intersect but the default min/max instead (we can have a wrapper in Simplex_tree so we don't have to repeat that test for every use).

I like option 2, I think that's what you were suggesting in the comment above? If not, can you clarify?

src/Simplex_tree/include/gudhi/Simplex_tree/simplex_tree_options.h

src/Simplex_tree/include/gudhi/Simplex_tree.h

mglisse · 2024-10-19T15:37:28Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

+   * @param ignore_infinite_values If true, simplices with filtration values at infinity are not cached.
+   */
+  template<typename Comparator>
+  void initialize_filtration(Comparator&& is_before_in_filtration, bool ignore_infinite_values = false) {


Is "ignore_infinite_values" enough, or do we want to take the chance to allow some arbitrary predicate?

Sure, we could also use some arbitrary predicate. Some method telling us if a simplex should be ignored or not. Would it replace the ignore_infinite_values version or add another overload with it?

I don't know. initialize_filtration(bool ignore_infinite_values=false) is already in a release, so we shouldn't break it. But the new overload that takes a comparator is for advanced users and already takes a functor, so it shouldn't hurt much to pass it a lambda as second argument. So I would tend to say "replace", not "add". But I am open to other choices.

Ok. I replaced the second parameter with a functor in the new overload of initialize_filtration and also added some unit tests to make sure it works as intended.

src/Simplex_tree/include/gudhi/Simplex_tree.h

mglisse · 2024-10-19T16:17:36Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

@@ -2595,17 +2676,15 @@ class Simplex_tree {
    if (members_size > 0) {
      if constexpr (!Options::stable_simplex_handles) sib->members_.reserve(members_size);
      Vertex_handle vertex;
-      Filtration_value filtration;
+      Filtration_value filtration(0);


The default value of the multi-parameter filtration values is {-inf} not {}. It seemed more appropriate here to have an empty vector. But I also didn't really verified what deserialize_trivial does, or more precisely, how memcpy works for vectors.

I see. Yes, an empty vector is good, since the only thing we will do with it is overwrite it. That constructor is confusing though. When I see Filtration_value(n), I don't really expect {-inf,-inf,...,-inf} n times. For plain double, Filtration_value filtration; (without (0)) is nicer because it gives sanitizers a chance to warn if there is a case where we forget to assign a value.
But ok for now, we can rediscuss that in the multiparameter branch.
Actually, this seems to be in the deserialization function, does that work in the multiparameter branch?

I remember now another reason why I did this. Sometimes filtration does not get initialized by default with 0 exactly, but with just a very small value close to 0. And then, the constructor of Filtration_simplex_base_dummy complains and throws in Debug mode.

Actually, this seems to be in the deserialization function, does that work in the multiparameter branch?

I naively though that it does as the methods looked quite general, but it does not... So I added some methods to the FiltrationValue concept (saying that here are only necessary in the case of de/serialization) and also some tests in simplex_tree_serialization_unit_test.cpp to make it work.

…lization

mglisse · 2024-10-27T23:32:39Z

src/Simplex_tree/concept/FiltrationValue.h

+   * @brief Given two filtration values, stores in the first value the lowest common upper bound of the two values.
+   * The overload for arithmetic types like `double` or `int` is already implemented as the maximum of the two
+   * given values and can also be used for non native arithmetic types like `CGAL::Gmpq` as long as it has an
+   * `operator<`. The overload is available with @ref Gudhi::intersect in @ref simplex_tree_options.h "".


I don't think we should mention simplex_tree_options.h here

the name of the file does not match the content (the 2 functions look like they could have their own header, although it doesn't matter if we don't advertise their location too much)

users should #include <Gudhi/Simplex_tree.h> if they want this function, they should not directly include simplex_tree_options.h which is an implementation detail. Doxygen "forces" us to show more details than desirable, but let's limit the damage where possible.

Making the doc of the functions public means that the NaN behavior is documented now. I hope this won't come back to bite us in the [rear end]...

I removed the mention of the file. For the NaN behaviour, I could take it out of the description and put it aside in a simple // comment which will be ignored by doxygen. Otherwise, I did not found a way to mix public and private with doxygen for a same description.

…version of initialize_filtration + change is_floating_point to has_quit_NaN for more generality + additional unit tests

src/Simplex_tree/include/gudhi/Simplex_tree.h

mglisse · 2024-10-28T15:49:47Z

src/Simplex_tree/include/gudhi/Simplex_tree.h

+   * @param ignore_simplex Method taking a simplex handle as input and returns true if and only if the corresponding
+   * simplex should be ignored in the filtration and therefore not be cached.


Maybe we could remind users here that it is their responsibility to ensure that the result is a filtration, i.e. when a face of a simplex is ignored, the simplex should be ignored as well.

src/Simplex_tree/include/gudhi/Simplex_tree.h

Co-authored-by: Marc Glisse <marc.glisse@inria.fr>

…value type

hschreiber added 6 commits August 14, 2024 18:53

add references for non arithemtic filtration values

1966df2

generalization of filtration value operations

b760df2

doc

b7269a8

small fixes

70549af

include fix

0416788

include fix

c4e04ca

mglisse reviewed Aug 21, 2024

View reviewed changes

hschreiber added 2 commits August 23, 2024 14:53

doc + bool return for unify_birth and push_to_smallest_common_upper_b…

290de10

…ound

minor fix for prone_above_filtration + Filtration_value concept

4d80e13

mglisse reviewed Aug 23, 2024

View reviewed changes

hschreiber mentioned this pull request Aug 26, 2024

[Simplex tree] Const Simplex_tree #1119

Closed

hschreiber and others added 2 commits August 26, 2024 16:36

add references in simplex tree even for arithmetic filtration values

ddb7d20

Merge branch 'GUDHI:master' into simplex_tree_filtration_value_genera…

35778b2

…lization

mglisse reviewed Aug 30, 2024

View reviewed changes

src/Simplex_tree/include/gudhi/Simplex_tree/Simplex_tree_siblings.h Outdated Show resolved Hide resolved

VincentRouvreau reviewed Sep 2, 2024

View reviewed changes

src/Simplex_tree/concept/FiltrationValue.h Show resolved Hide resolved

hschreiber and others added 4 commits September 3, 2024 12:27

update of SimplexTreeOptions.h

9cc2e76

XMerge branch 'simplex_tree_filtration_value_generalization' of githu…

df5a510

…b.com:hschreiber/gudhi-devel into simplex_tree_filtration_value_generalization

Merge branch 'GUDHI:master' into simplex_tree_filtration_value_genera…

7555436

…lization

move 'insert' of Sibling in Simplex_tree and use it

d07a6f1

mglisse reviewed Sep 12, 2024

View reviewed changes

hschreiber added 3 commits September 16, 2024 11:51

minor fixes

78470f5

merge upstream

6d837af

NaN fix for prune_above_filtration

413b6c1

hschreiber mentioned this pull request Sep 18, 2024

Update from gudhi DavidLapous/multipers#20

Merged

filtration value doc

da79b21

mglisse reviewed Sep 24, 2024

View reviewed changes

mglisse reviewed Oct 19, 2024

View reviewed changes

hschreiber and others added 3 commits October 22, 2024 15:02

Merge branch 'GUDHI:master' into simplex_tree_filtration_value_genera…

1495403

…lization

doc + minor changes

aec50c9

doc

369a965

Merge branch 'GUDHI:master' into simplex_tree_filtration_value_genera…

c77ca28

…lization

hschreiber mentioned this pull request Oct 25, 2024

simplex tree update DavidLapous/multipers#29

Merged

mglisse reviewed Oct 27, 2024

View reviewed changes

hschreiber added 2 commits October 28, 2024 13:55

change unify/intersect name + replaces bool with predicate in second …

1cf4ec7

…version of initialize_filtration + change is_floating_point to has_quit_NaN for more generality + additional unit tests

doc

95210dc

mglisse reviewed Oct 28, 2024

View reviewed changes

hschreiber and others added 4 commits October 29, 2024 13:38

fix (de)serialization for general filtration values

af820c5

Update src/Simplex_tree/include/gudhi/Simplex_tree.h

d92d698

Co-authored-by: Marc Glisse <marc.glisse@inria.fr>

doc

9c2047f

enable deserialization from a simplex tree with different filtration …

e46325e

…value type

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Simplex tree] Generalization of filtration values in Simplex_tree #1122

[Simplex tree] Generalization of filtration values in Simplex_tree #1122

hschreiber commented Aug 19, 2024

mglisse left a comment

mglisse Aug 21, 2024

hschreiber Aug 23, 2024

mglisse Aug 23, 2024

hschreiber Aug 26, 2024

mglisse left a comment

mglisse Sep 9, 2024

hschreiber Sep 16, 2024

mglisse Sep 12, 2024

hschreiber Sep 16, 2024

hschreiber Sep 16, 2024

mglisse Sep 16, 2024

hschreiber Sep 17, 2024 •

edited

Loading

mglisse Oct 19, 2024

hschreiber Oct 22, 2024

mglisse Sep 12, 2024

hschreiber Sep 16, 2024

mglisse Sep 24, 2024

hschreiber Oct 22, 2024

mglisse Oct 19, 2024

mglisse Oct 19, 2024

hschreiber Oct 22, 2024

mglisse Oct 27, 2024

hschreiber Oct 28, 2024

mglisse Oct 19, 2024

hschreiber Oct 22, 2024

mglisse Oct 27, 2024

hschreiber Oct 29, 2024

mglisse Oct 27, 2024

hschreiber Oct 29, 2024

mglisse Oct 28, 2024

		* @param ignore_simplex Method taking a simplex handle as input and returns true if and only if the corresponding
		* simplex should be ignored in the filtration and therefore not be cached.

[Simplex tree] Generalization of filtration values in Simplex_tree #1122

Are you sure you want to change the base?

[Simplex tree] Generalization of filtration values in Simplex_tree #1122

Conversation

hschreiber commented Aug 19, 2024

mglisse left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mglisse left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hschreiber Sep 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hschreiber Sep 17, 2024 •

edited

Loading