Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficiencies in StructureDescription #160

Closed
olexandr-konovalov opened this issue Apr 20, 2015 · 23 comments
Closed

Inefficiencies in StructureDescription #160

olexandr-konovalov opened this issue Apr 20, 2015 · 23 comments

Comments

@olexandr-konovalov
Copy link
Member

I've just came across two examples: SmallGroup(256, 56091) and SmallGroup(512,10494180) where StructureDescription takes enormously long time. It was running long enough without returning anything until I've interrupted the calculation - while it works for many other groups of these orders, even though it may take a while for some of them.

@stevelinton
Copy link
Contributor

I had a look at the order 256 example. This group has almost 30000 normal subgroups, and once it has found those (takes about 11 seconds for me) it has to consider all pairs of appropriate orders to see if they are direct factors. The code for this in DirectFactorsOfGroup is not efficient. In particular it contains the line if IsTrivial( Intersection( N[i], N[j] ) ) which computes the exact intersection when all you want to know is whether it is trivial. Something like ForAll(N[i], IsOne(x) or not x in N[j]) would work better for these relatively small groups.

Actually an operations IntersectTrivially would be generally useful and often quicker than Intersection.

I think though that StructureDescription really needs a completely different approach for p-groups. It makes much more sense to describe them via one of the natural series. If you wanted a DirectFactorsOfGroup method I suspect a faster one could also be developed for these groups.

@stevelinton
Copy link
Contributor

This group also has a large automorphism group, under which the normal subgroups fall into quite large orbits. There is no point in checking more than one subgroup from an orbit as a direct factor. ComplementClassesReps might also provide a faster way to do the check. In fact, using these two tricks I can show that it is not a direct product in a few seconds.

@hungaborhorvath
Copy link
Contributor

Some weeks ago I have forked grpnames.g? to rewrite SemidirectFactorsOfGroup, because it was also very inefficient. I will look into the code of DriectFactorsOfGroup, as well, and make it more efficient. Thank you for the ideas.

BTW, what machine did you run it on so that it only took 11 seconds? It took my 2 years old laptop four and a half minutes just to compute NormalSubgroups for the 256 element example. The same calculation took slightly more than two minutes on a supercomputer (only used one node, though).

@stevelinton
Copy link
Contributor

That's just a nearly three year old MacBookPro. I just rechecked and it's about 11.5 seconds. I do have CRISP loaded and I think some of the methods from that package make a difference.

In working on functions like these, you need to beware of changes that make them much faster for one type of group and much slower for another -- for instance computing the automorphism group may help a lot if there are many normal subgroups, but might be very slow to little purpose in other groups.

@hungaborhorvath
Copy link
Contributor

Ah, yes, CRISP indeed made a difference, now it only takes 20 seconds. :-)

@olexandr-konovalov
Copy link
Member Author

I've run some parallel checks of the performance of StructureDescription using 48 SCSCP servers.

I am going to post details of the setup elsewhere; for now, the essential command was

res:=ParListWithSCSCP( a, "StructureDescriptionById": noretry, timeout:=limit);

which permits to specify the upper limit for the number of seconds to waiting for the result of the remote procedure call and do not resubmit failed RPCs. This option is very useful for finding examples demonstrating the worst-case behaviour of the implementation.

That's what I've observed so far:

  • order 64: each group takes less than a second, except [64,266] taking 6 seconds
  • order 128: groups taking more than 10 seconds are those with numbers 2300, 2326, 2327

@stevelinton
Copy link
Contributor

The documentation of StructureDescription specifically warns that it can be slow for p-groups.
I also think that it is simply not very useful or appropriate for p-groups — there are better ways to see their structure.

On the other hand DirectFactorsOfGroup is a natural enough operation, so speeding that up (and documenting it) could be useful.

Steve

On 21 Apr 2015, at 14:37, Alexander Konovalov notifications@github.com wrote:

I've run some parallel checks of the performance of StructureDescription using 48 SCSCP servers.

I am going to post details of the setup elsewhere; for now, the essential command was

res:=ParListWithSCSCP( a, "StructureDescriptionById": noretry, timeout:=limit);

which permits to specify the upper limit for the number of seconds to waiting for the result of the remote procedure call and do not resubmit failed RPCs. This option is very useful for finding examples demonstrating the worst-case behaviour of the implementation.

That's what I've observed so far:

• order 64: each group takes less than a second, except [64,266] taking 6 seconds
• order 128: groups taking more than 10 seconds are those with numbers 2300, 2326, 2327

Reply to this email directly or view it on GitHub.

@hungaborhorvath
Copy link
Contributor

I am thinking that there should be a separate method for Abelian groups (maybe calculating from AbelianInvariants, or simply from G^p, G^{p^2}, whatever), and then for a general nonperfect group one could try to lift a decomposition of G/G' into a possible decomposition of G using (A \times B)'=(A') \times (B'). I will think about it and try to code it. I think it should be faster than the current decomposition.

@hungaborhorvath
Copy link
Contributor

On second thought, it might be better to do this with Z(G) instead of G'. I will think it through.

@hungaborhorvath
Copy link
Contributor

Neeraj Kayal and Timur Nezhmetdinov gave a fast algorithm for direct product decomposition:

Neeraj Kayal and Timur Nezhmetdinov, Factoring Groups Efficiently, in International Colloquium on Automata, Languages and Programming (ICALP) , Springer Verlag, 2009.
http://research.microsoft.com/apps/pubs/default.aspx?id=102435

I am implementing the code into grpnames.g?

@stevelinton
Copy link
Contributor

I notice that the input to the algorithm in that paper is the group given in the form of a multiplication table.

For many groups, simply writing that down will be infeasibly slow or memory consuming, although it might be useful for p-groups for small p such as the ones that started this discussion.

Steve

On 25 Apr 2015, at 18:54, hungaborhorvath notifications@github.com wrote:

Neeraj Kayal and Timur Nezhmetdinov gave a fast algorithm for direct product decomposition:

Neeraj Kayal and Timur Nezhmetdinov, Factoring Groups Efficiently, in International Colloquium on Automata, Languages and Programming (ICALP) , Springer Verlag, 2009.
http://research.microsoft.com/apps/pubs/default.aspx?id=102435

I am implementing the code into grpnames.g?


Reply to this email directly or view it on GitHub.

@hungaborhorvath
Copy link
Contributor

I read through the algorithm. For centerless groups I think it is basically polynomial (probably quadratic) in the number of conjugacy classes of G (after they are computed). For groups having Abelian factors it is pretty fast. For non-centerless groups having only nonabelian factors you actually need to go through 2^s subsets, where s is some bound on the number of direct factor components (and they can only bound s by log |G|). So It is not that bad.

Of course, if there are very few normal subgroups, then it is faster to check them. After I implemented the algorithm, I will compare the old and the new method and see which should be used when.

There are some obvious possible enhancements. For example if the group is nilpotent, then we know that it is the direct product of its Sylow subgroups, so we really need to decompose those. Or if the group is given as a direct product, then again, we already have a decomposition to start with. I also think that the bound on s can be decreased by checking e.g. the number of minimal normal subgroups.

In fact, all of the groups mentioned in the thread have only one minimal normal subgroup, therefore they cannot be decomposed as a direct product. I think for solvable group it is cheap to calculate minimal normal subgroups (for p-groups they are contained in the center), so this could be a very quick way to decide for some p-groups that they are direct indecomposable.

@olexandr-konovalov
Copy link
Member Author

On 27 Apr 2015, at 16:04, james-d-mitchell notifications@github.com wrote:

I'm not sure if this is related to changes made here or not, but currently at the tip of the master branch doing StructureDescription(Group(())) returns fail where as in GAP 4.7.7 it returns "1".

Thanks, but I can't reproduce this (tip of the master branch, GAP started with -r -A).

@james-d-mitchell
Copy link
Contributor

I deleted the comment almost immediately after I made it, it was my
mistake, nothing to do with anything in this issue.

On Wed, 29 Apr 2015 at 21:46 Alexander Konovalov notifications@github.com
wrote:

On 27 Apr 2015, at 16:04, james-d-mitchell notifications@github.com
wrote:

I'm not sure if this is related to changes made here or not, but
currently at the tip of the master branch doing
StructureDescription(Group(())) returns fail where as in GAP 4.7.7 it
returns "1".

Thanks, but I can't reproduce this (tip of the master branch, GAP started
with -r -A).


Reply to this email directly or view it on GitHub
https://github.com/gap-system/gap/issues/160#issuecomment-97580508.[image:
Web Bug from
https://github.com/notifications/beacon/AGYFdxgxvebM86Zjet--_sAyvEre2DaOks5oETptgaJpZM4EEHgq.gif]

@hungaborhorvath
Copy link
Contributor

After a busy summer break and an even more busy September, I came back to optimizing the DirectFactorsOfGroup code. During the testing of several different algorithms, I came across a problem, which is probably some stupid coding mistake on my part, but somehow I cannot rectify this on my own.

The following code is a rewriting of the original DirectFactorsOfGroup code as

  1. to only check direct factors from a given list. This is supposed to enhance speed because after a direct factor is obtained, one does not need to recompute normal subgroups of the factor, but rather pull them from the list;
  2. it checks trivial intersection of two normal subgroups by checking if a (generator of a) minimal normal subgroup (again, given by a list) is in both normal subgroups. Again, this should be significantly faster if MinimalNormalSubgroups are computed at the beginning, compared to always intersecting the two normal subgroups.

However, testing shows that on a 96-element group it gives only two direct factors, rather than the correct three, if_CRISP_is_used. Without CRISP it gives the correct result.

The code:

DeclareOperation( "IsTrivialNormalIntersectionInList",[IsList, IsGroup, IsGroup]);
DeclareGlobalFunction( "DirectFactorsOfGroupFromList", [IsGroup, IsList, IsList]);

#############################################################################

#M IsTrivialNormalIntersectionInList( , , ) . . . . generic method

InstallMethod( IsTrivialNormalIntersectionInList,
"if minimal normal subgroups are computed",
[ IsList, IsGroup, IsGroup ],

function( L, U, V )
local gs;

gs:= List(L, N -> GeneratorsOfGroup(N)[1]);
return ForAll(gs, g -> not g in U or not g in V);

end);

InstallGlobalFunction( DirectFactorsOfGroupFromList,

function ( G, NList, MinList )

local  gs, Ns, MinNs, NNs, MinNNs, facts, sizes, i, j, s1, s2;

if not IsFinite(G) then TryNextMethod(); fi;
Ns := ShallowCopy(NList);
MinNs := ShallowCopy(MinList);
gs := GeneratorsOfGroup(Socle(G));
Ns := Filtered(Ns, N -> not IsSubset(N, gs));
sizes := List(Ns,Size);
SortParallel(sizes,Ns);
for s1 in Difference(Set(sizes),[Size(G),1]) do
  i := PositionSet(sizes,s1);
  s2 := Size(G)/s1;
  if s1 <= s2 then 
    repeat
      if s2 > s1 then
        j := PositionSet(sizes,s2);
        if j = fail then break; fi;
      else 
        j := i + 1;
      fi;
      while j <= Size(sizes) and sizes[j] = s2 do
        if IsTrivialNormalIntersectionInList(MinList,Ns[i],Ns[j]) then
        # Ns[i] is the smallest direct factor, hence direct irreducible
        # we keep from Ns only the subgroups of Ns[j] having size min. s1
          NNs := Filtered(Ns, N -> Size(N) >= s1 and s2 mod Size(N) = 0 
                                            and IsSubset(Ns[j], N)); 
          MinNNs := Filtered(MinNs, N -> s2 mod Size(N) = 0 
                                            and IsSubset(Ns[j], N)); 
          return Union([ Ns[i] ], DirectFactorsOfGroupFromList(Ns[j], 
                    NNs, MinNNs));
        fi;
        j := j + 1;
      od;
      i := i + 1;
    until i>=Size(sizes) or sizes[i] <> s1;
  fi;
od;
return [ G ];

end );

With CRISP I have only two direct factors.

gap> G := Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ]);
Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)
(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ])
gap> DirectFactorsOfGroupFromList(G, NormalSubgroups(G), MinimalNormalSubgroups(G));
[ <permutation group of size 48 with 5 generators>, Group([ (3,4)(20,21) ]) ]
gap>

Without CRISP I have the correct three direct factors.

gap> G := Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ]);
Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)
(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ])
gap> DirectFactorsOfGroupFromList(G, NormalSubgroups(G), MinimalNormalSubgroups(G));
[ <permutation group of size 24 with 4 generators>, Group([ (3,4)(20,21) ]),
Group([ (3,4)(10,11) ]) ]
gap>

Could someone tell me what I am doing wrong in the code?

Edit: somehow some comments of the code came out in a very ugly formatting, so I just removed them.

@bh11
Copy link
Contributor

bh11 commented Oct 11, 2015

@hungaborhorvath
Right now, I can’t reproduce your problem – I do get three factors with CRISP as well. However, there is a bug in CRISP 1.3.8, so you might try CRISP 1.3.9 instead (currently only avaliable from http://www.icm.tu-bs.de/~bhoeflin/crisp/index.html, should be on the GAP server shortly).

@hungaborhorvath
Copy link
Contributor

My results are with CRISP 1.3.9. I downloaded them from the page you suggested, as well.

Edit: I tried it again with a completely new gap on a different machine. I cloned https://github.com/gap-system/gap, compiled it, put only GAPDoc 1.5.1 and CRISP 1.3.9 into pkg, loaded the above code and obtained only 2 direct factors. If I removed the CRISP folder from pkg, then it gave me 3 direct factors.

Edit2: Another interesting thing I noticed is that if there are some previous other computations in the same GAP session, then it produces the correct answer with CRISP, as well. E.g. if I start with the code

H := Group([ (2,3)(4,5)(7,8)(9,10), (4,5)(9,10), (3,11)(6,12)(9,13) ]);
DirectFactorsOfGroupFromList(H, NormalSubgroups(H), MinimalNormalSubgroups(H));

then afterwards the direct factors of above G are computed as they should. I am not entirely sure what other previous computations will make this difference, but I have now reproduced this on two different machines, both are Debian Jessie based systems with Bunsenlabs netinstall if matters, and the gap is a cloned version of the github one, freshly compiled, only GAPDoc 1.5.1 and CRISP 1.3.9.

@bh11
Copy link
Contributor

bh11 commented Oct 11, 2015

@hungaborhorvath
Seems to be related to the GAP master branch. Using that branch, I can reproduce the bug, at least after repeating the command

DirectFactorsOfGroupFromList(G, NormalSubgroups(G), MinimalNormalSubgroups(G));

a number of times. About one in three attempts, it fails. (Also tried switching back to gmp 5, but as expected, this doesn't change anything.)

I cannot reproduce the problem with the stable-4.7 branch – even when running the above code 1000 times.

In fact, if one sets the assertion level to 1, a CRISP routine detects an inconsistency: a modulo pcgs returns wrong exponents, e.g., at the break caused by the assertion, enter

ExponentsOfPcElement (gpcgs, gpcgs[1]);

which will return [0,0] instead of [1,0]. However, I don't know the perm pcgs code well enough to further diagnose this.

Cheers

Burkhard.

@hungaborhorvath
Copy link
Contributor

Something is very fishy here.....

I tried it with v4.7.5 (this is the gap version bundled with debian jessie). With CRISP it seems to give the right result all the time. Without CRISP, it still gives 3 direct factors, but one of the factors is changing if I call the code again.....

┌───────┐ GAP, Version 4.7.5 of 24-May-2014 (free software, GPL)
│ GAP │ http://www.gap-system.org
└───────┘ Architecture: x86_64-pc-linux-gnu-gcc-default64
Libs used: gmp, readline
Loading the library and packages ...
Components: small 2.1, small2 2.0, id2 3.0, trans 1.0, prim 2.1
Packages: GAPDoc 1.5.1
Try '?help' for help. See also '?copyright' and '?authors'
gap> G := Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ]);
Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)
(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ])
gap> DirectFactorsOfGroupFromList(G, NormalSubgroups(G), MinimalNormalSubgroups(G));
[ <permutation group of size 24 with 4 generators>, Group([ (3,4)(20,21) ]),
Group([ (3,4)(10,11) ]) ]
gap> G := Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ]);
Group([ (6,7)(8,9)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)(22,23)
(24,25), (3,4)(6,8)(13,15)(16,18)(23,25), (8,9)(12,13)(16,17)(20,21)(24,25) ])
gap> DirectFactorsOfGroupFromList(G, NormalSubgroups(G), MinimalNormalSubgroups(G));
[ Group([ (8,9)(10,11)(12,13)(16,17)(24,25), (7,9,8)(12,13,14)(16,19,17)
(22,24,25), (6,7)(8,9)(12,13)(14,15)(16,17)(18,19)(22,23)(24,25) ]),
Group([ (3,4)(20,21) ]), Group([ (3,4)(10,11) ]) ]
gap>

The first factors in the two separate runs are even setwise different (but of course they are isomorphic).

BTW, is this the right place to report this problem, or should I make a separate issue?

@bh11
Copy link
Contributor

bh11 commented Nov 4, 2015

The issue with the GAP master branch seems to be resolved now (see #327).

@fingolfin
Copy link
Member

What is the status of this issue? It seems that the initial computations @alex-konovalov tried still are slow, as it still tries to find all normal subgroups, and there are just too many

Personally, I'd argue that this is not really an issue, but just life... Even if we improve the algorithm further, I am sure we'll be able to find new cases where it does not perform well.

@hungaborhorvath
Copy link
Contributor

hungaborhorvath commented Dec 16, 2016

These are all timings on my own laptop with current master:

gap> StructureDescription(SmallGroup(256,56091)); time; 
"(C2 x ((C2 x ((C4 x C2) : C2)) : C2)) : C2"
17528
gap> StructureDescription(SmallGroup(512,10494180)); time;
"(C2 x ((C2 x ((C8 x C2) : C2)) : C2)) : C2"
49012
gap> StructureDescription(SmallGroup(64,266)); time;
"(C2 x ((C4 x C2) : C2)) : C2"
312
gap> StructureDescription(SmallGroup(128,2300)); time;
"((C4 x D8) : C2) : C2"
572
gap> StructureDescription(SmallGroup(128,2326)); time;
"(C2 x ((C2 x C2 x C2) : (C2 x C2))) : C2"
1768
gap> StructureDescription(SmallGroup(128,2327)); time;
"(C2 x ((C2 x Q8) : C2)) : C2"
1760

Of course I have no idea how these compare to the timings of @alex-konovalov , but the first two groups did not even give a result in half an hour, so I would not call this slow....

Edit: I just ran the last four groups on my laptop using the old StructureDescription code (but on current master):

gap> Read("old.g");
gap> StructureDescriptionOld(SmallGroup(64,266)); time;
"(C2 x ((C4 x C2) : C2)) : C2"
1956
gap> StructureDescriptionOld(SmallGroup(128,2300)); time;
"((C4 x Q8) : C2) : C2"
26828
gap> StructureDescriptionOld(SmallGroup(128,2326)); time;
"(C2 x ((C2 x C2 x C2) : (C2 x C2))) : C2"
12652
gap> StructureDescriptionOld(SmallGroup(128,2327)); time;
"(C2 x ((C2 x Q8) : C2)) : C2"
10880

@olexandr-konovalov
Copy link
Member Author

@hungaborhorvath thanks for the edit - that's exactly what I was going to ask. I think this may now be closed. It lead to some improvements, and I agree with @fingolfin that the corner cases are "just life".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants