-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up the linear algebra part of character table calculations for (large) groups #4773
Conversation
3bd5add
to
a62a90e
Compare
# rank higher than a method in the semigroups package, which otherwise jumps | ||
# in and causes an error when testing | ||
# line 318 of semigroups-3.4.0/gap/elements/semiringmat.gi | ||
20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll have to sort it out together with the Semigroups folks, though, don't want to break that package (paging @james-d-mitchell).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I left some remarks.
@@ -1345,6 +1353,8 @@ InstallMethod( Characteristic, | |||
[ IsMatrixOrMatrixObj ], | |||
M -> Characteristic( BaseDomain( M ) ) ); | |||
|
|||
InstallOtherMethod(DefaultFieldOfMatrix,[IsMatrixOrMatrixObj],BaseDomain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation for DefaultFieldOfMatrix
explicitly states that either a field or fail
is returned; this may return other things.
Is it really necessary to make this overload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been using DefaultFiledOfMatrix
in my code to get the definition field of an (ordinary) matrix. How would we get it in a uniform way, regardless whether an object is IsMatrix
or IsMatrixObj
. Will BaseDomain
work for the "classical" matrices?
As for "necessary", it is used in the standard matrix routines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes BaseDomain
works for "classical" matrices (but there was a bug: it only looked .
But since it has different semantics, it is not so clear what the "correct" answer should be for e.g. an identity matrix with integer entries: should it be Integers
or Rationals
? This is of course clear with DefaultFiledOfMatrix
. But BaseDomain
currently returns Integers
. We could of course change that, but in the end, it'll always be a guessing game; which is after all one of the primary reasons why we want MatrixObj: so that we can stop guessing and start specifying what we want.
lib/matobjnz.gi
Outdated
# Representation preserving constructors: | ||
############################################################################ | ||
|
||
InstallMethod( ZeroVector, "for an integer and a zmodnz vector", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of defining all these ZeroVector
methods, it should (in theory! unless we have screwed up!) be sufficient to define a single NewZeroVector
constructor; then the existing ZeroVector
methods dispatch to it as needed. At least that's the theory.
Same for Matrix
-> NewMatrix
and so on.
I.e., implementors should install methods for the New*
constructors, while user code should use the operations without the New
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With these installations I have been going parallel to matobjplist.gi
-- indeed some code is almost verbatim copied. I'll see whether it works without the IsZero
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that: Unfortunately that file is not up-to-date. My PR #2973 to update it hasn't been finished but should be closer to how things ought to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So probably one should look over it again, once #2973 is included. In any case it is worth having the curretn version of the file already now as it really speeds up arithmetic.
lib/matrix.gi
Outdated
"generic method for matrices", | ||
[ IsMatrix and IsMutable], | ||
[ IsMatrixOrMatrixObj and IsMutable], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not appropriate: this method requires mat
to be in filter IsRowListMatrix
, i.e., to be row-based and allow access to matrix objects via mat[i]
. It also uses DefaultFieldOfMatrix
, and indeed, it expects non-zero entries to be invertible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the issue about not working at all, or just being slow? This patch makes no caim to be a solve-all, but just to provide some functionality that is otherwise missing.
I'm happy to separate the IsMatrixObj
version and add restrictions, but I did not want to duplicate too much code.
The issue with non-zero entries being invertible also applies to the existing IsMatrix
version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general mat[i]
won't work on general MatrixObj instances. We debated about providing a default implementation which uses proxy objects to "fake it", but there are all kinds of pitfalls with this, and results in super inefficient code. So in the end, we agreed that it's better to get an error and be able to fix the affected code.
Ideally, it shouldn't be necessary to have two versions of the code, though: if AddMatrixRows
etc. are used then the resulting code should work both for new and old style matrices, i.e. for everything in the filter IsMatrixOrMatrixObj
. The main question is performance: using AddMatrixRows
might be slower initially when used on "old style matrices". However, there is still quite some optimization potential there. E.g. one source of slow down is method dispatch is slow for lists, and thus for old-style matrices; this could be dealt with using InstallEarlyMethod
and/or some kernel methods; I'd be happy to help work on tuning this.
3e47272
to
82a4a0e
Compare
This looks pretty good now; @ThomasBreuer could you please also have a look? One concern is that a lot of the new code is not being tested at all. But this is also something that could (and should) be addressed on a generic level: We need to write a (set of) generic test routine(s) that take a matrix object (or a function construction some, or whatever) and performs a bunch of generic tests with them, to verify that the APIs work as intended for that implementation. That could probably borrow from (and replace) the limited tests in |
r:=rec(base:=pcomp); | ||
else | ||
r:=rec(base:=SumIntersectionMat(pcomp, | ||
Matrix(BaseDomain(r.base[1]),r.base))[2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it perhaps make sense to make sure that the base
component is a MatrixObj
from the beginning, then calling Matrix
here would not be necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be. In the old setup a matrix is just a list of vectors, and there was no need to distinguish the concepts. Here, one has to, and it seemed naively that a basis should be a list of vectors rather than a matrix.
od; | ||
od; | ||
return n; | ||
end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to define what SemiEchelonMatTransformation
returns if the argument is a MatrixObj.
The above function assumes that relations
contains a list of lists; is this reasonable?
(And in case that the nullspace is trivial, the result should be a MatrixObj.)
And wouldn't Matrix( <filt>, <R>, <M> )
be the function to be called for creating a MatrixObj from a prescribed matrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this needs cleanup, and I felt I did not yet have sufficient experience with matrix objects to make this definition.
"generic method for matrices", | ||
[ IsMatrix and IsMutable ], | ||
[ IsMatrixOrMatrixObj and IsMutable ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think rather the declaration of SemiEchelonMatDestructive
should be changed to expect IsMatrixOrMatrixObj and IsMutable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be happy with that.
@fingolfin I had a look and left some comments. |
b9f7665
to
fe7fc73
Compare
I have added a (relatively basic, but quick ) test example -- character table of M24. I have further code #4797 that makes |
f4f5acc
to
1fa64be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How shall we proceed with this? There are some issues, but it still is an improvement, and the issues listed are not more severe than other issues that still exist in the MatrixObj code, sooo... we could merge it and improve later, together with other things; perhaps even organize another MatrixObj coding sprint.
The only "danger" I see in merging is that someone might start to rely on some specific (but undocumented) code behavior and then if we change it, things break for them. But that's already the case with the rest of MatrixObj...
Thoughts?
@@ -336,6 +309,7 @@ DxRegisterModularChar := function(D,c) | |||
local d,p; | |||
# it may happen,that an irreducible character will be registered twice: | |||
# 2-dim space,1 Orbit,combinatoric split. Avoid this! | |||
if IsPlistRep(c) then c:=Vector(c); fi; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the entries of D.modulars
modified later on, or will they be immutable? In the latter case, you could also try
if IsPlistRep(c) then c:=Vector(c); fi; | |
if IsPlistRep(c) then c:=ImmutableVector(SOME_RING, c); fi; |
provided SOME_RING
can be derived from D
or can be passed in some other way to this function.
Even if it is mutable, it might be a good idea to pass the basering and/or filter: The problem with Vector(c)
is that if one passes a list-of-GF(4) elements to it, then sometimes one will get a GF(4) vector yet sometimes a GF(2) vector, which usually is not desirable
Row:=CompatibleVector(M); | ||
for col in [1..Length(Row)] do | ||
Row[col]:=base[row,col]; | ||
od; | ||
Row:=Row*M; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: I think we need a function doing this; i.e. which takes a matrix M
, and a list of co/a vector v
, and returns a new vector in "compatible" representation.
Though in this case, perhaps base
could be turned into a list of vectors of the right type from the start, i.e. as part of its construction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fingolfin What about the syntax Vector( <M>, <v> )
?
@@ -1345,6 +1353,8 @@ InstallMethod( Characteristic, | |||
[ IsMatrixOrMatrixObj ], | |||
M -> Characteristic( BaseDomain( M ) ) ); | |||
|
|||
InstallOtherMethod(DefaultFieldOfMatrix,[IsMatrixOrMatrixObj],BaseDomain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes BaseDomain
works for "classical" matrices (but there was a bug: it only looked .
But since it has different semantics, it is not so clear what the "correct" answer should be for e.g. an identity matrix with integer entries: should it be Integers
or Rationals
? This is of course clear with DefaultFiledOfMatrix
. But BaseDomain
currently returns Integers
. We could of course change that, but in the end, it'll always be a guessing game; which is after all one of the primary reasons why we want MatrixObj: so that we can stop guessing and start specifying what we want.
I am very much in favor of merging. As far as I can see , while not perfect, it does not break (or slow down) existing functionality, but gives a substantial speedup to character table calculations. If there are changes that must be done before merging, please let me know and I will be happy to make them, but lets not hold these improvements back just because some concepts for matrix objects are not yet perfect -- only if we use them we will find out what the right objects are. |
I would also propose to merge the current state. |
So, @fingolfin @ThomasBreuer , is one of you wiling to approve, so we can merge? |
with start value
Relaxed some method installations, respectively almost duplicated some methods, so that basic linear algebra becomes available for matrix objects (at least over fields). This might be far from optimal and is just a first approximation.
Create matrix/vector objects for (Integers mod m) (as ZmodnZ) that store the entries as list of integers and modulo-reduce in the arithmetic. This is significantly faster than to keep lists of lists of ZmodnZObj, since then there is a permanent packing/unpacking of ZmodnZObj.
By using matrix objects (in particular for GF(p), p>65536), matrix arithmetic becomes significantly faster and can speed up calculation cost for larger groups substantially. This resolves gap-system#4716
The semigroups package has a method for `Matrix` that causes problems.
It's a shame that this was merged before the impact on packages was fully realised, e689844 completely breaks the Semigroups package. Please see Issue #4814. @hulpke @fingolfin |
Added code for matrix objects that represent matrices over (Integers mod m) for m nonprime or >2^16.
By keeping entries as integers and reducing in the arithmetic we gain a significant amount over lists of lists of ZmodnZObj, by avoiding packing/unpacking at each arithmetic step.
Use this in the Dixon-Schneider Algorithm. In examples (e.g. a group of order ~10^9 with ~550 classes) this leads to an overall speedup of 50% or more.
Text for release notes
The linear algebra part of character table calculations for groups has been sped up significantly.