Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up the linear algebra part of character table calculations for (large) groups #4773

Merged
merged 6 commits into from
Mar 2, 2022

Conversation

hulpke
Copy link
Contributor

@hulpke hulpke commented Feb 16, 2022

Added code for matrix objects that represent matrices over (Integers mod m) for m nonprime or >2^16.
By keeping entries as integers and reducing in the arithmetic we gain a significant amount over lists of lists of ZmodnZObj, by avoiding packing/unpacking at each arithmetic step.
Use this in the Dixon-Schneider Algorithm. In examples (e.g. a group of order ~10^9 with ~550 classes) this leads to an overall speedup of 50% or more.

Text for release notes

The linear algebra part of character table calculations for groups has been sped up significantly.

@hulpke hulpke added kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes labels Feb 16, 2022
@hulpke hulpke force-pushed the additions branch 4 times, most recently from 3bd5add to a62a90e Compare February 17, 2022 18:52
Comment on lines +300 to +303
# rank higher than a method in the semigroups package, which otherwise jumps
# in and causes an error when testing
# line 318 of semigroups-3.4.0/gap/elements/semiringmat.gi
20,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll have to sort it out together with the Semigroups folks, though, don't want to break that package (paging @james-d-mitchell).

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I left some remarks.

@@ -1345,6 +1353,8 @@ InstallMethod( Characteristic,
[ IsMatrixOrMatrixObj ],
M -> Characteristic( BaseDomain( M ) ) );

InstallOtherMethod(DefaultFieldOfMatrix,[IsMatrixOrMatrixObj],BaseDomain);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for DefaultFieldOfMatrix explicitly states that either a field or fail is returned; this may return other things.

Is it really necessary to make this overload?

Copy link
Contributor Author

@hulpke hulpke Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been using DefaultFiledOfMatrix in my code to get the definition field of an (ordinary) matrix. How would we get it in a uniform way, regardless whether an object is IsMatrix or IsMatrixObj. Will BaseDomain work for the "classical" matrices?

As for "necessary", it is used in the standard matrix routines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes BaseDomain works for "classical" matrices (but there was a bug: it only looked .

But since it has different semantics, it is not so clear what the "correct" answer should be for e.g. an identity matrix with integer entries: should it be Integers or Rationals? This is of course clear with DefaultFiledOfMatrix. But BaseDomain currently returns Integers. We could of course change that, but in the end, it'll always be a guessing game; which is after all one of the primary reasons why we want MatrixObj: so that we can stop guessing and start specifying what we want.

lib/matobjnz.gi Outdated
# Representation preserving constructors:
############################################################################

InstallMethod( ZeroVector, "for an integer and a zmodnz vector",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of defining all these ZeroVector methods, it should (in theory! unless we have screwed up!) be sufficient to define a single NewZeroVector constructor; then the existing ZeroVector methods dispatch to it as needed. At least that's the theory.

Same for Matrix -> NewMatrix and so on.

I.e., implementors should install methods for the New* constructors, while user code should use the operations without the New.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With these installations I have been going parallel to matobjplist.gi -- indeed some code is almost verbatim copied. I'll see whether it works without the IsZero methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that: Unfortunately that file is not up-to-date. My PR #2973 to update it hasn't been finished but should be closer to how things ought to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So probably one should look over it again, once #2973 is included. In any case it is worth having the curretn version of the file already now as it really speeds up arithmetic.

lib/matobjnz.gi Outdated Show resolved Hide resolved
lib/matobjnz.gi Show resolved Hide resolved
lib/matobjnz.gi Outdated Show resolved Hide resolved
lib/matobjnz.gi Outdated Show resolved Hide resolved
lib/matrix.gi Outdated
"generic method for matrices",
[ IsMatrix and IsMutable],
[ IsMatrixOrMatrixObj and IsMutable],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not appropriate: this method requires mat to be in filter IsRowListMatrix, i.e., to be row-based and allow access to matrix objects via mat[i]. It also uses DefaultFieldOfMatrix, and indeed, it expects non-zero entries to be invertible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue about not working at all, or just being slow? This patch makes no caim to be a solve-all, but just to provide some functionality that is otherwise missing.

I'm happy to separate the IsMatrixObj version and add restrictions, but I did not want to duplicate too much code.

The issue with non-zero entries being invertible also applies to the existing IsMatrix version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general mat[i] won't work on general MatrixObj instances. We debated about providing a default implementation which uses proxy objects to "fake it", but there are all kinds of pitfalls with this, and results in super inefficient code. So in the end, we agreed that it's better to get an error and be able to fix the affected code.

Ideally, it shouldn't be necessary to have two versions of the code, though: if AddMatrixRows etc. are used then the resulting code should work both for new and old style matrices, i.e. for everything in the filter IsMatrixOrMatrixObj. The main question is performance: using AddMatrixRows might be slower initially when used on "old style matrices". However, there is still quite some optimization potential there. E.g. one source of slow down is method dispatch is slow for lists, and thus for old-style matrices; this could be dealt with using InstallEarlyMethod and/or some kernel methods; I'd be happy to help work on tuning this.

@fingolfin
Copy link
Member

This looks pretty good now; @ThomasBreuer could you please also have a look?

One concern is that a lot of the new code is not being tested at all. But this is also something that could (and should) be addressed on a generic level: We need to write a (set of) generic test routine(s) that take a matrix object (or a function construction some, or whatever) and performs a bunch of generic tests with them, to verify that the APIs work as intended for that implementation. That could probably borrow from (and replace) the limited tests in tst/testinstall/MatrixObj/ and elsewhere. Then authors of new implementations could simply invoke those existing generic tests, and already get substantial coverage of their new implementation (and confidence that things work as they should)

r:=rec(base:=pcomp);
else
r:=rec(base:=SumIntersectionMat(pcomp,
Matrix(BaseDomain(r.base[1]),r.base))[2]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it perhaps make sense to make sure that the base component is a MatrixObj from the beginning, then calling Matrix here would not be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be. In the old setup a matrix is just a list of vectors, and there was no need to distinguish the concepts. Here, one has to, and it seemed naively that a basis should be a list of vectors rather than a matrix.

lib/ctblgrp.gi Outdated Show resolved Hide resolved
od;
od;
return n;
end);
Copy link
Contributor

@ThomasBreuer ThomasBreuer Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to define what SemiEchelonMatTransformation returns if the argument is a MatrixObj.
The above function assumes that relations contains a list of lists; is this reasonable?
(And in case that the nullspace is trivial, the result should be a MatrixObj.)

And wouldn't Matrix( <filt>, <R>, <M> ) be the function to be called for creating a MatrixObj from a prescribed matrix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs cleanup, and I felt I did not yet have sufficient experience with matrix objects to make this definition.

"generic method for matrices",
[ IsMatrix and IsMutable ],
[ IsMatrixOrMatrixObj and IsMutable ],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rather the declaration of SemiEchelonMatDestructive should be changed to expect IsMatrixOrMatrixObj and IsMutable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy with that.

@ThomasBreuer
Copy link
Contributor

@fingolfin I had a look and left some comments.
@hulpke Thanks a lot. Concerning my more general comments on the setup of matrix objects (and the documentation), perhaps I can deal with these issues after your pull request got merged.

@hulpke hulpke force-pushed the additions branch 3 times, most recently from b9f7665 to fe7fc73 Compare February 23, 2022 23:42
@hulpke
Copy link
Contributor Author

hulpke commented Feb 24, 2022

One concern is that a lot of the new code is not being tested at all.

I have added a (relatively basic, but quick ) test example -- character table of M24.

I have further code #4797 that makes ImmutableMatrix over ZmodnZ rings use the new matrices (so groups of such matrices use the compact form) and (for fields) enable the MeatAxe. This makes it easy to get lots of testing through calling higher level routines.

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How shall we proceed with this? There are some issues, but it still is an improvement, and the issues listed are not more severe than other issues that still exist in the MatrixObj code, sooo... we could merge it and improve later, together with other things; perhaps even organize another MatrixObj coding sprint.

The only "danger" I see in merging is that someone might start to rely on some specific (but undocumented) code behavior and then if we change it, things break for them. But that's already the case with the rest of MatrixObj...

Thoughts?

@@ -336,6 +309,7 @@ DxRegisterModularChar := function(D,c)
local d,p;
# it may happen,that an irreducible character will be registered twice:
# 2-dim space,1 Orbit,combinatoric split. Avoid this!
if IsPlistRep(c) then c:=Vector(c); fi;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the entries of D.modulars modified later on, or will they be immutable? In the latter case, you could also try

Suggested change
if IsPlistRep(c) then c:=Vector(c); fi;
if IsPlistRep(c) then c:=ImmutableVector(SOME_RING, c); fi;

provided SOME_RING can be derived from D or can be passed in some other way to this function.

Even if it is mutable, it might be a good idea to pass the basering and/or filter: The problem with Vector(c) is that if one passes a list-of-GF(4) elements to it, then sometimes one will get a GF(4) vector yet sometimes a GF(2) vector, which usually is not desirable

Comment on lines +711 to +715
Row:=CompatibleVector(M);
for col in [1..Length(Row)] do
Row[col]:=base[row,col];
od;
Row:=Row*M;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: I think we need a function doing this; i.e. which takes a matrix M, and a list of co/a vector v, and returns a new vector in "compatible" representation.

Though in this case, perhaps base could be turned into a list of vectors of the right type from the start, i.e. as part of its construction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fingolfin What about the syntax Vector( <M>, <v> )?

@@ -1345,6 +1353,8 @@ InstallMethod( Characteristic,
[ IsMatrixOrMatrixObj ],
M -> Characteristic( BaseDomain( M ) ) );

InstallOtherMethod(DefaultFieldOfMatrix,[IsMatrixOrMatrixObj],BaseDomain);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes BaseDomain works for "classical" matrices (but there was a bug: it only looked .

But since it has different semantics, it is not so clear what the "correct" answer should be for e.g. an identity matrix with integer entries: should it be Integers or Rationals? This is of course clear with DefaultFiledOfMatrix. But BaseDomain currently returns Integers. We could of course change that, but in the end, it'll always be a guessing game; which is after all one of the primary reasons why we want MatrixObj: so that we can stop guessing and start specifying what we want.

@hulpke
Copy link
Contributor Author

hulpke commented Mar 1, 2022

How shall we proceed with this? There are some issues, but it still is an improvement, and the issues listed are not more severe than other issues that still exist in the MatrixObj code, sooo... we could merge it and improve later, together with other things; perhaps even organize another MatrixObj coding sprint.

I am very much in favor of merging. As far as I can see , while not perfect, it does not break (or slow down) existing functionality, but gives a substantial speedup to character table calculations.

If there are changes that must be done before merging, please let me know and I will be happy to make them, but lets not hold these improvements back just because some concepts for matrix objects are not yet perfect -- only if we use them we will find out what the right objects are.

@ThomasBreuer
Copy link
Contributor

@hulpke

I am very much in favor of merging.

I would also propose to merge the current state.
My comment from last week was intended to propose this, and to open a new pull request in order to deal with the general changes that are motivated by the current pull request.

@hulpke
Copy link
Contributor Author

hulpke commented Mar 2, 2022

So, @fingolfin @ThomasBreuer , is one of you wiling to approve, so we can merge?

Relaxed some method installations, respectively almost duplicated some
methods, so that basic linear algebra becomes available for matrix objects
(at least over fields).

This might be far from optimal and is just a first approximation.
Create matrix/vector objects for (Integers mod m) (as ZmodnZ) that store the
entries as list of integers and modulo-reduce in the arithmetic. This is
significantly faster than to keep lists of lists of ZmodnZObj, since then
there is a permanent packing/unpacking of ZmodnZObj.
By using matrix objects (in particular for GF(p), p>65536), matrix
arithmetic becomes significantly faster and can speed up calculation cost
for larger groups substantially.

This resolves gap-system#4716
The semigroups package has a method for `Matrix` that causes problems.
@hulpke hulpke merged commit 3e9fb2a into gap-system:master Mar 2, 2022
@james-d-mitchell
Copy link
Contributor

It's a shame that this was merged before the impact on packages was fully realised, e689844 completely breaks the Semigroups package. Please see Issue #4814. @hulpke @fingolfin

@fingolfin
Copy link
Member

As far as I know, use of this enhancement has been effectively disabled by PR #4983 (hopefully only temporarily, see #4988), so I am not going to add this to the release notes. Should I be mistaken in this assessment, we can still add it retroactively in GAP 4.12.1.

@fingolfin fingolfin changed the title Matrix Objects for matrices of ZmodnzObj, Use in Dixon-Schneider Speed up the linear algebra part of character table calculations for (large) groups Aug 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants