Persistence matrix module #669

hschreiber · 2022-08-09T10:30:06Z

The matrix module is not finished yet, but I would like to have feedback on the structure, in particular on the way the different options are handled.
The options are listed in options.h and they are processed in matrix.h.
The basic functions of a matrix can also be found in matrix.h, and the more specialized functions are in their respective headers (*_pairing, *_rep_cycles, *_vine_swap).

There are three different types of matrices:

the basic boundary matrix, which can be reduced to R,
the boundary matrix decomposed as RU (R is the usual reduced boundary matrix and U is such that R times U is the original boundary matrix),
and the matrix representing the base of a chain complex (as defined in Clement's and Steve's paper about zigzag persistence).
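To make the RU description concrete, here is a minimal sketch over Z2 using dense matrices. It is not the module's implementation: the names `Z2_matrix`, `low`, `reduce_RU`, `multiply`, `identity` and `triangle_boundary` are hypothetical, and the reduction is the standard left-to-right column reduction. Each time column i is added to column j of R, row j is added to row i of U, which keeps R times U equal to the original boundary matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense 0/1 matrix over Z2, stored row by row. Illustration only.
using Z2_matrix = std::vector<std::vector<int>>;

// Row index of the lowest non-zero entry of column j, or -1 if the column is zero.
int low(const Z2_matrix& m, std::size_t j) {
  for (std::size_t i = m.size(); i-- > 0;)
    if (m[i][j]) return static_cast<int>(i);
  return -1;
}

// Reduces r in place with left-to-right column additions and updates u so that
// the product r * u stays equal to the matrix r held on entry.
// Precondition: r is square and u is the identity matrix of the same size.
void reduce_RU(Z2_matrix& r, Z2_matrix& u) {
  const std::size_t n = r.size();
  assert(u.size() == n);
  for (std::size_t j = 0; j < n; ++j) {
    bool changed = true;
    while (changed && low(r, j) != -1) {
      changed = false;
      for (std::size_t i = 0; i < j; ++i) {
        if (low(r, i) == low(r, j)) {
          for (std::size_t k = 0; k < n; ++k) {
            r[k][j] ^= r[k][i];  // column j of R += column i of R
            u[i][k] ^= u[j][k];  // row i of U += row j of U
          }
          changed = true;
          break;
        }
      }
    }
  }
}

// Product of two square Z2 matrices of the same size.
Z2_matrix multiply(const Z2_matrix& a, const Z2_matrix& b) {
  const std::size_t n = a.size();
  Z2_matrix c(n, std::vector<int>(n, 0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < n; ++k)
      if (a[i][k])
        for (std::size_t j = 0; j < n; ++j) c[i][j] ^= b[k][j];
  return c;
}

Z2_matrix identity(std::size_t n) {
  Z2_matrix id(n, std::vector<int>(n, 0));
  for (std::size_t i = 0; i < n; ++i) id[i][i] = 1;
  return id;
}

// Boundary matrix of a filled triangle; simplices ordered a, b, c, ab, bc, ac, abc.
Z2_matrix triangle_boundary() {
  Z2_matrix d(7, std::vector<int>(7, 0));
  d[0][3] = d[1][3] = 1;            // ab = a + b
  d[1][4] = d[2][4] = 1;            // bc = b + c
  d[0][5] = d[2][5] = 1;            // ac = a + c
  d[3][6] = d[4][6] = d[5][6] = 1;  // abc = ab + bc + ac
  return d;
}
```

Starting from `r = triangle_boundary()` and `u = identity(7)`, calling `reduce_RU(r, u)` zeroes the column of `ac` (it closes a cycle) and `multiply(r, u)` gives back the original boundary matrix.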

example/comp_test.cpp contains a temporary example of how to call all the options implemented so far. The idea was to call every possible function with trivial parameters to verify that everything compiles correctly, not to test the functions.

mglisse

I started looking at this PR. I don't have a global view of the design yet (there are many files!), but here are minor details I noted while looking at the code.

mglisse · 2022-10-04T17:52:44Z

src/Persistence_matrix/include/gudhi/utilities/Zp_field.h

+}
+
+template<unsigned int characteristic>
+inline void Zp_field_element<characteristic>::_multiply(unsigned int v)


That doesn't look very efficient. Good thing it is a negligible part of the computation anyway...

It is the most efficient version I could find that computes the modular multiplication without any overflow. But I agree that this class will probably never be used with numbers so high that an overflow is possible, so it is kind of useless. I will test at some point whether this creates a real overhead.
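For readers who do not have the diff at hand, the trade-off discussed here can be sketched as follows. This is not the code from the PR: `mul_mod_wide`, `add_mod` and `mul_mod_no_overflow` are hypothetical names, and fixed 32-bit operands are an assumption. The first function avoids overflow by widening to 64 bits; the second uses the classical double-and-add loop, which never needs a wider type but performs one step per bit of `b`.

```cpp
#include <cassert>
#include <cstdint>

// Overflow-free because the product of two 32-bit values fits in 64 bits.
inline std::uint32_t mul_mod_wide(std::uint32_t a, std::uint32_t b, std::uint32_t p) {
  assert(p != 0);
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) % p);
}

// (a + b) mod p without ever computing a + b, which could wrap around.
// Preconditions: a < p and b < p.
inline std::uint32_t add_mod(std::uint32_t a, std::uint32_t b, std::uint32_t p) {
  return (a >= p - b) ? a - (p - b) : a + b;
}

// Double-and-add: only uses additions modulo p, so no intermediate value
// exceeds p - 1, at the cost of up to 32 loop iterations.
inline std::uint32_t mul_mod_no_overflow(std::uint32_t a, std::uint32_t b, std::uint32_t p) {
  assert(p != 0);
  a %= p;
  std::uint32_t result = 0;
  while (b != 0) {
    if (b & 1u) result = add_mod(result, a, p);
    a = add_mod(a, a, p);  // a becomes 2a mod p
    b >>= 1;
  }
  return result;
}
```

Both functions agree on every input; for small characteristics a plain `(a * b) % p` would already be safe, which is the point made in the reply above.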

mglisse · 2024-06-22T04:36:48Z

src/Persistence_matrix/include/gudhi/Persistence_matrix/base_matrix_with_column_compression.h

+   * @warning As the member is static, they can eventually be problems if the matrix is duplicated in several threads.
+   * If this become necessary, the static should be removed in the future.
+   */
+  inline static Simple_object_pool<Column_type> columnPool_;


It seems like a big problem if we cannot even have 2 threads that both compute persistence on completely independent filtrations.

Ah, I forgot about that. I think we can just remove the static from it and it should work just fine, as it is only used inside the compressed matrix.

IIRC you have a comment about that elsewhere in the file (near swap maybe?) saying that it may be a problem.

True, but it is not that much of a problem. It was just to remind me that if I remove the static, I should also swap the pools in swap as the pointers won't belong to the right pool otherwise.

So the right thing will happen if I swap 2 non-empty matrices, or if I swap columns that don't belong to the same matrix? Good then.

If two non-empty matrices are swapped, you just have to be careful that the matrices own the right pool at the end. If two columns from different matrices are swapped, their contents are swapped but not their addresses, so it is also fine. Where a problem can arise when swapping two columns from different matrices is at the row access, as the cells in the rows will not be swapped (you should first unlink a column and relink it after the swap). But this has nothing to do with this pool. Nevertheless, I realized that I never mentioned this in the doc. I could also just add the linking/unlinking in the swap operator.
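The pool part of this exchange can be illustrated with a toy model. Everything below is hypothetical (`Column`, `Column_pool`, `Toy_matrix`); it does not use the real `Simple_object_pool` or the compressed matrix, and it ignores row access entirely. It only shows that once the pool is a per-instance member, swapping two matrices has to swap the pools together with the column pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Column {
  std::vector<int> rows;
};

// Toy pool that owns every column it hands out.
class Column_pool {
 public:
  Column* construct() {
    storage_.push_back(std::make_unique<Column>());
    return storage_.back().get();
  }
  bool owns(const Column* c) const {
    for (const auto& p : storage_)
      if (p.get() == c) return true;
    return false;
  }
  friend void swap(Column_pool& a, Column_pool& b) noexcept { a.storage_.swap(b.storage_); }

 private:
  std::vector<std::unique_ptr<Column>> storage_;
};

class Toy_matrix {
 public:
  void add_column(std::vector<int> rows) {
    Column* c = pool_.construct();
    assert(pool_.owns(c));
    c->rows = std::move(rows);
    columns_.push_back(c);
  }
  std::size_t size() const { return columns_.size(); }
  const Column& column(std::size_t i) const { return *columns_[i]; }
  bool owns_all_columns() const {
    for (const Column* c : columns_)
      if (!pool_.owns(c)) return false;
    return true;
  }
  friend void swap(Toy_matrix& a, Toy_matrix& b) noexcept {
    using std::swap;
    swap(a.columns_, b.columns_);
    // Without this line each matrix would hold pointers into the other's pool.
    swap(a.pool_, b.pool_);
  }

 private:
  Column_pool pool_;  // one pool per matrix, not static
  std::vector<Column*> columns_;
};
```

With a per-instance pool, two threads working on independent matrices no longer share any state through the pool, which is the concern raised at the start of this thread.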

mglisse · 2024-07-15T19:05:25Z

src/Persistence_matrix/include/gudhi/matrix.h

+   * @ref set_characteristic before calling for the first time a method needing it.
+   * Ignored if @ref PersistenceMatrixOptions::is_z2 is true.
+   */
+  Matrix(unsigned int numberOfColumns, 


Should all the unsigned int used for columns be index_type or some related typedef?

I wanted to use a native type here to indicate that numberOfColumns is not something stored (and therefore doesn't really need to be optimized in size). And the chances are not very high that someone wants to use more than max unsigned int columns. Otherwise, index would be the right type to use.
I am more bothered by the fact that I use unsigned int in one constructor and int in another. This makes no sense.

> I wanted to use a native type here to indicate that numberOfColumns is not something stored (and therefore doesn't really need to be optimized in size).

As a user, that's not the message I understand when I read unsigned int, almost the opposite. std::size_t or std::intmax_t (or maybe ptrdiff_t) would be closer.

> the chances are not very high that someone wants to use more than max unsigned int columns.

"640K [of RAM] ought to be enough for anybody" 😉
A few years after starting Gudhi, we received a bug report because the code failed on a simplicial complex with 3 billion simplices. Sure, that required an awful lot of memory to reproduce, but I don't see a good reason to hardcode a limit of 4 billion (the default options can still limit the computation to 4G, that's fine).
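The orders of magnitude in this exchange can be checked at compile time. The helper `fits_in` below is hypothetical, and fixed-width types are used instead of `int` and `unsigned int` so that the result does not depend on the platform (on common platforms both are 32 bits wide).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if `count` columns can be counted with type Index.
template <class Index>
constexpr bool fits_in(std::uintmax_t count) {
  return count <= static_cast<std::uintmax_t>(std::numeric_limits<Index>::max());
}

// 3 billion simplices: too many for a 32-bit signed type, fine for 32-bit unsigned.
static_assert(!fits_in<std::int32_t>(3'000'000'000ULL));
static_assert(fits_in<std::uint32_t>(3'000'000'000ULL));

// 5 billion simplices: above the 32-bit unsigned limit, fine for a 64-bit type.
static_assert(!fits_in<std::uint32_t>(5'000'000'000ULL));
static_assert(fits_in<std::uint64_t>(5'000'000'000ULL));
```

On the usual 64-bit platforms `std::size_t` is 64 bits wide, so a constructor taking `std::size_t` would not hardcode the 4 billion limit, while the option-dependent index type could still default to a 32-bit type.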

mglisse · 2024-07-17T21:36:18Z

src/Persistence_matrix/include/gudhi/matrix.h

+  if constexpr (isNonBasic && !PersistenceMatrixOptions::is_of_boundary_type &&
+                PersistenceMatrixOptions::column_indexation_type == Column_indexation_types::CONTAINER)
+    return matrix_.insert_boundary(boundary, dim);
+  else
+    matrix_.insert_boundary(boundary, dim);


If matrix_.insert_boundary(boundary, dim); returns void exactly in the right cases, we could just write return matrix_.insert_boundary(boundary, dim); without need for 2 cases.
(if sometimes matrix_.insert_boundary(boundary, dim); returns a vector but we don't want to return that vector, then forget this comment)
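The language rule behind this suggestion: in C++, `return f();` is valid even when `f()` has type `void`, and a deduced return type then becomes `void`. A minimal sketch, with hypothetical stand-ins (`Returns_void`, `Returns_vector`, `Wrapper`) for the underlying matrix classes and the `Matrix` overlay:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

using Boundary = std::vector<int>;

// Stand-in for an underlying matrix whose insert_boundary returns nothing.
struct Returns_void {
  int calls = 0;
  void insert_boundary(const Boundary&) { ++calls; }
};

// Stand-in for an underlying matrix whose insert_boundary returns a vector.
struct Returns_vector {
  Boundary insert_boundary(const Boundary& b) { return b; }
};

template <class Underlying>
struct Wrapper {
  Underlying matrix_;
  // One body for both cases: returning a void expression is allowed,
  // and decltype(auto) deduces void or Boundary accordingly.
  decltype(auto) insert_boundary(const Boundary& boundary) {
    return matrix_.insert_boundary(boundary);
  }
};

static_assert(std::is_void_v<decltype(
    std::declval<Wrapper<Returns_void>&>().insert_boundary(std::declval<const Boundary&>()))>);
static_assert(std::is_same_v<Boundary, decltype(
    std::declval<Wrapper<Returns_vector>&>().insert_boundary(std::declval<const Boundary&>()))>);

// Usage: both instantiations go through the same forwarding function.
inline void usage() {
  Wrapper<Returns_void> a;
  a.insert_boundary({0, 1});  // nothing to return
  assert(a.matrix_.calls == 1);
  Wrapper<Returns_vector> b;
  assert(b.insert_boundary({0, 1}).size() == 2);
}
```

As the comment notes, this only applies if the underlying method returns `void` exactly when the wrapper should; if a returned vector must sometimes be discarded, the two-branch `if constexpr` is still needed.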

mglisse · 2024-07-17T21:55:30Z

src/Persistence_matrix/include/gudhi/Persistence_matrix/chain_matrix.h

+  using tmp_column_type = typename std::conditional<
+      Master_matrix::Option_list::is_z2, 
+      std::set<id_index>,
+      std::map<id_index, Field_element_type>
+    >::type;


It is surprising that this is not one of the types defined in Persistence_matrix/columns.
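For context, here is a sketch of what such a conditional temporary column could be used for. It is an assumption about the intent, not code from chain_matrix.h: `Tmp_column` and `add_entry` are hypothetical names. In Z2 a column is just a set of row indices and adding an entry toggles it; in Zp each row index carries a coefficient, and entries that become zero are dropped.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <type_traits>

// Same shape as the alias in the diff, with concrete types for illustration.
template <bool is_z2>
using Tmp_column =
    std::conditional_t<is_z2, std::set<unsigned int>, std::map<unsigned int, unsigned int>>;

// Z2: adding 1 at `row` toggles the presence of the index.
inline void add_entry(std::set<unsigned int>& column, unsigned int row) {
  auto [it, inserted] = column.insert(row);
  if (!inserted) column.erase(it);
}

// Zp: adds `value` to the coefficient at `row` modulo p and drops zero entries.
inline void add_entry(std::map<unsigned int, unsigned int>& column, unsigned int row,
                      unsigned int value, unsigned int p) {
  assert(p != 0);
  unsigned int& coefficient = column[row];  // created with value 0 if absent
  // The sum is computed in a wider type so that it cannot wrap around.
  coefficient = static_cast<unsigned int>((static_cast<unsigned long long>(coefficient) + value) % p);
  if (coefficient == 0) column.erase(row);
}
```

Both containers keep their entries sorted by row index, which is presumably why ordered standard containers are convenient for a temporary column.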

mglisse · 2024-07-18T22:43:02Z

src/Persistence_matrix/include/gudhi/persistence_matrix_options.h

+ *
+ * @brief List of column types.
+ */
+enum Column_types { 


Is it possible for a user to use their own column type? There is a concept of what such a column type must look like, but from the use of this enum, it doesn't look like a user has any way of telling matrix to use it.

hschreiber and others added 20 commits June 22, 2022 15:58

initialization

e9f962e

initialization

0183294

options

9a9f9d2

options

4097390

options

b78092f

base matrix

33f866d

matrices

dcde95a

matrices

cc39397

matrices

9a4fb46

matrices

f04174c

debug

7172a77

debug compilation

e006851

cleanup

e956f40

compilation error correction + simple compilation test in examples

8c5f2e3

Merge branch 'GUDHI:master' into persistence_matrix

7dfe4c2

Merge branch 'GUDHI:master' into persistence_matrix

94f5932

fixed shadowing names

be3e43d

made optional method accessible + skeleton of overlay classes

5cf74ff

overlay for indexation

ddb1fc6

unit test for field classes and common methods of matrices

5ea389f

mglisse reviewed Oct 5, 2022

VincentRouvreau mentioned this pull request Oct 5, 2022

c++17 as a new standard to compile the library #697

Merged

hschreiber and others added 8 commits October 6, 2022 15:51

unit test for specialized matrix methods

0edbec4

deleting temporary compilation test

af6f889

Merge branch 'GUDHI:master' into persistence_matrix

bebe370

Merge branch 'persistence_matrix' of github.com:hschreiber/gudhi-deve…

b08b513

…l into persistence_matrix

Merge branch 'GUDHI:master' into persistence_matrix

f9b65e2

Merge branch 'GUDHI:master' into persistence_matrix

46c9125

Merge branch 'persistence_matrix' of github.com:hschreiber/gudhi-deve…

6b0dc90

…l into persistence_matrix

hide friends and correction of swap calls

3390def

hschreiber added 3 commits June 5, 2024 15:19

few optimisations for zp

abb93a6

factorization for better readability

55d17e5

correction for column hash method

5bf1b6d

mglisse reviewed Jun 12, 2024

src/Persistence_matrix/include/gudhi/Persistence_matrix/columns/heap_column.h (outdated)

hschreiber and others added 17 commits June 12, 2024 15:58

heap hash method correction

f131e31

doc correction

6e886a3

change of default dummy value for field characteristic and simplifica…

c207e6e

…tion of operations in Multifields

change of heap column == and > operators

8b10f70

test cleanup

d3dc1b4

fix CI compilation problem?

2249aca

reduces default unit tests to less than 30s each

1cbec06

include multifields only when gmpxx gound

28ef360

reduces the number of unit tests done for the matrices by default

79d2172

fix CI compilation problem?

0c8e7f5

fix is_non_zero()

b9424b7

fix gmp link error

cfee83d

fix memory leaks in heap_column

d802887

Merge branch 'GUDHI:master' into persistence_matrix

8d77ab7

fix windows compilation and test error

656c70c

Merge branch 'GUDHI:master' into persistence_matrix

a971ddd

Rollback merge issue with submodule

b2adaeb

VincentRouvreau merged commit e4419b6 into GUDHI:master Jun 20, 2024
6 of 7 checks passed

mglisse reviewed Jun 22, 2024

VincentRouvreau mentioned this pull request Jun 24, 2024

Cannot have 2 threads that both compute persistence on completely independent filtrations. #1082

Closed

hschreiber mentioned this pull request Jun 25, 2024

[Persistence_matrix] Rows not updated when two columns from different matrices are swapped. #1086

Open

VincentRouvreau added the 3.10.0 GUDHI version 3.10.0 label Jun 25, 2024

mglisse reviewed Jul 15, 2024

mglisse reviewed Jul 17, 2024

mglisse reviewed Jul 18, 2024

hschreiber commented Aug 9, 2022

mglisse left a comment

mglisse Oct 4, 2022

hschreiber Oct 12, 2022

mglisse Jun 22, 2024

hschreiber Jun 25, 2024

mglisse Jun 25, 2024

hschreiber Jun 25, 2024

mglisse Jun 25, 2024

hschreiber Jun 25, 2024

mglisse Jul 15, 2024

hschreiber Jul 16, 2024

mglisse Jul 17, 2024

mglisse Jul 17, 2024

mglisse Jul 17, 2024

mglisse Jul 18, 2024
