-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance DirectFactorsOfGroup for faster StructureDescription #379
Enhance DirectFactorsOfGroup for faster StructureDescription #379
Conversation
Thanks for your submission, appreciated. I'll add comments to your code changes next, as part of reviewing this PR. One immediate comment: You mention tests you performed. It would be good if you could add some tests to this PR, too. I am also somewhat concerned about the method being drastically slower in certain cases. Can you provide a few concrete examples were things get noticably slower / faster? |
## </Description> | ||
## </ManSection> | ||
## | ||
DeclareOperation( "IsTrivialNormalIntersection",[IsGroup, IsGroup, IsGroup]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting here is off, lots of spaces are missing. E.g. this one should be
DeclareOperation( "IsTrivialNormalIntersection", [ IsGroup, IsGroup, IsGroup ] )
in order to fit in with the rest of the GAP library code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I did not know about the GAP formatting protocol. Could you please point me to it? Thank you.
I have not yet had the chance to check GAPDoc, and I have no idea about how to make manuals, so if it explained there, then I certainly have not yet read it. I will gladly comply with it and make the code into the proper formatting after I know exactly what to do.
I am not sure how to write replies, and how to use the github facilities properly, but I will try to do my best. In any case, thank you very much for all the comments and feedback. About the algorithm: there are two separate things here. The KN-method is implemented, but I found that some parts are simply do not seem efficient. So I only incorporated those parts into the main algorithm that indeed seemed to compute the direct factors faster. The main example here is to look for a direct factor in the center. Further, I use some of my own heuristics in the main algorithm which seemed reasonable. Most of these are cheap but can drastically reduce computing time. In particular, the main method should_not be much slower than the other in any cases, because all checks that are done should be cheap (in some sense). Tests: I have no idea how you would want me to put the tests here. I have a "work" branch in which I still have the original DirectFactorsOfGroup, and I simply ran through groups calculating the output and time the calculation took. Then with another set of commands I simply compared them. For the comparison I decided to consider the time difference significant if the difference was at least 200ms and at least 10%. On my own machine I usually only tested SmallGroups up to size 128, but on a supercomputer I did the rest. There are shell scripts that generate the input files, runs the commands, etc. If you tell me how you want me to upload them, I will gladly do it. (I want to rerun them now with Hulpke's fix in #376 but somehow jobs are not starting currently on the supercomputer I am using. (BTW, I use 1-core GAP, and do not use HPC-GAP at all) Results of tests: (only some examples for now, can be made more precise if you want) gap> DirectFactorsOfGroupOld(SmallGroup(64,266));; time; gap> DirectFactorsOfGroupOld(SmallGroup(128,2300));; time; gap> DirectFactorsOfGroupOld(SmallGroup(128,2326));; time; gap> DirectFactorsOfGroupOld(SmallGroup(128,2327));; time; For SmallGroups of size 1-511 the new method was faster for 1 group with difference at least 1000 sec (not msec !) In comparison, there was 45 (!) groups where the old method was faster by at As for % faster, among these groups there were 2603 examples where the SmallGroup(128,5) is a typical example where the new method is slower than the old: gap> DirectFactorsOfGroup(SmallGroup(128, 5));; time; For big permuation groups, Stefan Kohl has a list. (BTW, he helped me a lot with all these, thank you.) The 18th of his examples was one where NormalSubgroups did not compute, because of the many complement problem. With the new method: gap> tpg := ReadAsFunction("testpermgroups.g")();; For the old method I have this after several minutes: gap> DirectFactorsOfGroupOld(G); time; Again, as I said, I will rerun the tests now, that Hulpke's fix is merged, and I will gladly run any other tests you want. I will try to reply to the other comments within the comment if I can. |
Just added some documentation to the commands to make it more clear which one does what. I will apply the other suggestions later. Can I ask for some help on how to separate the trailing whitespace removals from the other changes into a separate commit? Thank you. |
Commit 2ae9c90 enhances the main algorithm further. There is no theoretical enhancement, only coding. I did testing with commit 2ae9c90 and compared to the old DirectFactorsOfGroup. All SmallGroups were tested except for orders 512, 768, 1024, 1280, 1536, 1792, 1920 and 873 test permutation groups were tested, as well. I checked whether any of the two methods was at least 200ms and 10% faster than the other. The old method for order 1152 groups have not yet finished. On the SmallGroups, there were (altogether) 64 instances when the old method was faster, each of them is by less than a second. As a comparison, the current DirectFactorsOfGroup was faster Here, in 30 instances the improvement was more than 1000 sec (!), e.g. SmallGroup(256, 56091) is a huge improvement by 2 full hours. In about 800 cases the improvement is between 100-1000 sec, in about 16000 cases the improvement is between 10-100 sec. For the 873 test permutation groups the new method was faster 75 many times, the old method was faster 17 many times, 6 times by less than 1 second, 10 times by 1-10 seconds, and for one group the old method is faster by 80 seconds (this was a 30% decrease in speed, probably due to a bad choice of chief series). The new method was faster 3 times by about full 2 minutes, 23 times by 10-100 seconds, 20 times by 1-10 seconds, 29 times by less than 1 second. Using DirectFactorsOfGroupKN did not yet finish for most of these groups. I still have some improvements (for example breaking the loop in the KN method) which I would like to implement, and I should get the whitespace changes into a separate commit if somebody helps me with that. |
Thanks for all that work! Out of curiosity: Did you also compare the results of the old DirectFactorsOfGroup to the new one (or verify the structure of the results otherwise)? If you have that testing code available, though having a long runtime, it'd be a good test to include somewhere. If you are keeping (detailed) runtimes, I'd be interested in them, too. |
@markuspf Yes, for the SmallGroups I compute the IdGroup of the results, sort it and compare the two lists (and they were identical for all groups tested). For the bigger permutation groups, I compute the list of the sizes of the factors and compare them. These lists were identical, as well. I have two files: testFunction.g for computing the direct factors, and then testCompare.g for actually comparing them. On my own laptop I usually check for SmallGroups of size 1-255, and only if that succeeds do I commit my changes and do the checks on the supercomputer. If you tell me where I should look for format of the test files and some manual on how the test system works, then I can write some code that will check for several things. Then I could add that file to the pull request, as well. |
I also have the running times stored in files, just let me know how you want to get them. |
|
We usually store files in tst/testinstall (if they take < 10 seconds) or tst/teststandard (if they take longer). If your .g files are quite long, turning them into .tst files might be annoying, and also not necessary. My advice is to pop your files somewhere, with a guide as to how to run them, and we'll figure out how to put them into GAP. We need to get better at handling bigger tests, and making it easier ot add them, but we aren't there yet! |
We have directory |
You asked how to get rid of the spaces in the initial commit. You can do so using history rewriting. Beware, this will kill any uncommitted changes you have in your working tree, so make sure to first commit them, and/or stash them away using So here is what I just did to get rid of the spaces in my copy of your work:
Voila, the history is clean. |
We should collect this git toolbox that Max is explaining in issues on the |
The default DirectFactorsOfGroup attribute has undergone a serious enhancement, without having proper documentation, yet. - The Kayal-Nezhmetdinov algorithm is implemented with some tweaks under the attribute DirectFactorsOfGroupKN. From now on, it is referred to as the KN method. Some parts of the code is reused in the main DirectFactorsOfGroup attribute, but there are no direct calls to DirectFactorsOfGroupKN. - The KN method has an algorithm for computing a direct complement to a normal subgroup. This is implemented in the operation ComplementNormalSubgroup. This particular operation (in theory) should work for infinite groups, as well. - The main DirectFactorsOfGroup algorithm works as follows: - for Abelian groups computes the decomposition to cyclic groups of prime-power order, (for infinite Abelian groups the output is only a list of the direct factors instead of a set, because calling Set on the factors might not finish running in reasonable time; infinite non-Abelian groups are not handled, at all) - for nilpotent groups it calls itself on the Sylow subgroups, - checks several sufficient conditions for the group being direct indecomposable, - looks for Abelian cyclic components from the center of the group, - checks for more sufficient conditions for the group being direct indecomposable, - computes the normal subgroups and the minimal normal subgroups, and calls DirectFactorsOfGroupFromList. - DirectFactorsOfGroupFromList is a more efficient version of the old factorization method. It essentially searches through the normal subgroups (2nd argument, which is a list) by size and looks for a complement from the same list. If it finds a complement then the first factor is direct indecomposable, the second is decomposed further using the same list filtered from the clearly noncomplemented normal subgroups. Trivial intersection is checked by IsTrivialNormalIntersectionInList, which checks if any of the minimal subgroups (3rd argument, which is a list) is contained in both normal subgroups. This is faster than computing the normal intersection and checking if that is trivial. - IsTrivialNormalIntersection is implemented in cases where one knows more about the structure of the group. - Cases where the group is already a direct product or already has NormalSubgroups computed are handled in separate methods. Tested on SmallGroups of order different than 512, 768, 1024, 1280, 1536, 1792, and on a couple hundred test permutation groups. More times than not the new method was significantly (>200ms, >10%) faster than the old method, in many cases because it immediately found the group is direct indecomposable, rather than spending much time computing the normal subgroups. The KN method seems to be drastically slower if the direct factors are non-Abelian. Computing normal subgrups in such cases is much faster. However, the implementation is kept for possible future enhancements.
Added documentation to the following commands: - IsTrivialNormalIntersection, - IsTrivialNormalIntersectionInList, - ComplementNormalSubgroup (added GAPDoc label), - DirectFactorsOfGroup (added GAPDoc label), - DirectFactorsOfGroupFromList (mainly merged from old documentation), - DirectFactorsOfGroupKN.
- Abelian -> abelian - IsIdenticalObj put into ComplementNormalSubgroup - comment about Set or List is slightly expanded - Nilpotent -> nilpotent for unity's sake
- IsTrivialNormalIntersectionInList is now a function. - IsFamFamFam is added where applicable. - List creations are replaced by corresponding loops. - Trivial generators are excluded. - Special methods for Socle and nilpotent groups are removed for now. (They seemed to perform slower than the basic method, may reinvent them later.) - The idea behind the MinimalNormalSubgroups method is that the list of minimal normal subgroups is generally small, whereas there could be many normal subgroups and computing the intersection of any two would be slower than to check if each contains a generator of a minimal normal subgroup.
If a group is a nonabelian p-group, then it cannot have a unique maximal normal subgroup, therefore it is useless to check for this property.
Since I've been explicitly asked :-) I did not follow the initial discussion as this is not really an issue for the groups I typically work with. Looking at the code I'm puzzled at the inclusion of an operation |
@hulpke I can imagine, that the ComplementClassesRepresentative algorithm could be rewritten to only compute the normal ones, but I do not understand it well enough to even try. The current algorithm for Further, my initial testing was that ComplementClasses was slower for bigger groups. Now that I checked it again, the current version of ComplementNormalSubgroup is not always faster, and in fact, it sometimes is very slow. I found out the reason and will explain it in a separate post. |
So, I will need to tweak the ComplementNormalSubgroup algorithm, because it became slower in some situations than it was originally, when I did the very thorough testing. The reason is that at some point (grpnames.gi:156-157) you need to take the intersection of the Center and a rightcoset of N, and find an element of a particular order in it. However, However, for groups where N is big but the center is small, this algorithm becomes very slow. This did not seem to affect I would like to ask some advice for this. My first idea would be to compute the
So maybe Another question: is there any faster way to find an element of a subgroup/right coset (their intersection) with a particular order? Thank you. |
One more thing. In #552 it occurred that it might not be useful at all to have the direct factors as a |
The reason is that at some point (grpnames.gi:156-157) you need to take the
intersection of the Center and a rightcoset of N, and find an element of a
particular order in it. However, Intersection for a group and a coset takes a
loooong time, so I decided to loop over the right coset. This was a huge
increase for groups with relatively big centers (nilpotent groups with class 2)
in the DirectFactorsOfGroup algorithm. For such groups it is important to try
everything before computing NormalSubgroups, because there will be many normal
subgroups....
Just to point out: Could you try benchmarking with @ChrisJefferson's "ferret"
package loaded?
https://github.com/gap-packages/ferret
It features a vastly improved intersection method for permutation groups. (I
assume you are actually using permutation representations here, which might be
wrong and/or stupid).
Cheers,
Markus
|
@markuspf On my home machine I ran the same commands as above with or wihout ferret 0.5.1 loaded, all of them ran for about 5 seconds. |
It looks like Chris' methods are not picked up. Of course the intersection of the coset of the A_n and A_n in S_n is the pathological case for coset intersection, but one can test for this case separately (by detecting whether one is intersecting with a (coset of an) A_n, I suppose). I suggest we debug this separately though as to not delay the merging of this PR too much longer. I will try to find out why the ferret methods are (apparently) not used. I'll also ask Chris whether ferret even does Coset intersection yet (it should, and it should be orders of magnitude faster than the current GAP implementation). |
@hungaborhorvath As there currently is no user-accessible implementation the existing code by definition is useful. However the intersection of group with coset, or the searching through a coset do not look efficient. A variant based on cohomology (as the existing code) is likely to perform better. |
@hulpke How about BTW, if one complement is computed, then all the others can be computed by computing all possible homomorphisms from the normal subgroup to the center. About efficiency, how about this: I will experiment a bit on how to find elements in the intersection of the center and a rightcoset with a particular order (it just occurred to me that because the center is abelian, it is easy to determine the elements of a particular order, so maybe it is faster to look through these whether they are in the right coset. Or I could use RationalClasses, as well.) And then after merging this, we could check if some othe method may perform faster? |
If G/N is abelian, then check whether groups knows its Center and how big it is compared to the rightcoset to determine which to choose to loop through the elements.
Added some extra tests to direct_factors.tst, as well.
@hulpke I renamed Also rewritten parts of it to have it faster than it previously was. Applied some heuristic about whether to loop through elements of the Center or the right coset. After merging, a better Intersection method could be figured out. In any case, it should now run faster than I also added some more tests to check all lines of the new code. @fingolfin @markuspf @hulpke Is there anything else I should do? |
I am for merging this now into |
Enhance DirectFactorsOfGroup for faster StructureDescription
Thank you @hungaborhorvath for sticking in there with this quite long process. I hope we'll get more high quality contributions from you in the future! |
Thank you very much for all your help. I learned a lot during these months. |
@hungaborhorvath thank you, glad to see this being merged! Now the next phase of testing starts ;-) There is a new diff in the manual example:
and I am not sure if the new output is better than the old one. |
@alex-konovalov No, this is a bug. I just dicovered it about 20 minutes ago (along with a completely different bug with something else). I will soon issue a PR to remedy this. Thank you. |
@alex-konovalov Should I put this into the changes from 4.8 to 4.9, as well? Or might this go into one of the smaller releases? |
@hungaborhorvath this went into the master branch after stable-4.8 was started, so it will go into 4.9, so yes, please document it as well. |
@alex-konovalov I have written two sentences about this into the wiki page. |
The default DirectFactorsOfGroup attribute has undergone a serious
enhancement, without having proper documentation, yet.
attribute DirectFactorsOfGroupKN. From now on, it is referred to as the
KN method.
Some parts of the code is reused in the main DirectFactorsOfGroup
attribute, but there are no direct calls to DirectFactorsOfGroupKN.
normal subgroup. This is implemented in the operation
ComplementNormalSubgroup. This particular operation (in theory) should
work for infinite groups, as well.
prime-power order,
(for infinite Abelian groups the output is only a list of the direct
factors instead of a set, because calling Set on the factors might
not finish running in reasonable time;
infinite non-Abelian groups are not handled, at all)
indecomposable,
indecomposable,
calls DirectFactorsOfGroupFromList.
factorization method. It essentially searches through the normal subgroups
(2nd argument, which is a list) by size and looks for a complement from
the same list. If it finds a complement then the first factor is direct
indecomposable, the second is decomposed further using the same list
filtered from the clearly noncomplemented normal subgroups.
Trivial intersection is checked by IsTrivialNormalIntersectionInList,
which checks if any of the minimal subgroups (3rd argument, which is a
list) is contained in both normal subgroups. This is faster than computing
the normal intersection and checking if that is trivial.
about the structure of the group.
NormalSubgroups computed are handled in separate methods.
automatically by the text editor.
Tested on SmallGroups of order different than 512, 768, 1024, 1280, 1536,
1792, and on a couple hundred test permutation groups.
More times than not the new method was significantly (>200ms, >10%) faster
than the old method, in many cases because it immediately found the group
is direct indecomposable, rather than spending much time computing the
normal subgroups. In particular, the groups specifically mentioned in
issue #160 are factored almost instantly now.
The KN method seems to be drastically slower if the direct factors are
non-Abelian. Computing normal subgrups in such cases is much faster.
However, the implementation is kept for possible future enhancements.