Large(r) Internally represented FFE's #2846

jdebeule · 2018-09-20T09:34:45Z

The changes in the pull request allow internally represented finite field of order up to 2^24. Changes in the .h and .c (and ffe.g) files are commented.

Some tests are broken (obviously): those where an just above the old limit (2^16) are chosen and the field of that order is constructed: the results are not expected to be in the internal representation.

…t fields of order up to 2^24 can be internally represented. An internal FFE is now UInt4 (32 bit). The changes in the code are actually minimal. Only the computation of a primitive root modulo p has been changed a bit. So far, primitive roots modulo p with p < 2^16 are still computed as before. Primitive roots modulo p for 2^16 < p < 2^24 are now computed using the Library function PrimitiveRootMod. This requires btw the use of InitFopyGVar in finfield.c. Thanks to Markus Pfeiffer and Max Horn for pointing this out. The restriction 2^24 seems arbitray. Theoretically, the adaptions made could allow fields of order up to 2^32. But then the Zech Log table will blow up, and the amount of data in ffdata.c as well. So far, we kept the limit to 2^24.

markuspf · 2018-09-20T10:41:04Z

The size of ffdata.c (26MB) is defninitely a problem now, in my opinion.

We can make pre-computation of finite fields data part of the compilation process (which makes compilation take longer...). Additionally would it be sensible to make the size of finite fields whose elemetns are represented as immediates a compile-time-switch.
We'll have to make sure tests are adapted as well.

Or we allow pre-computation of some of these values on the fly at runtime...

jdebeule · 2018-09-20T16:23:34Z

ffdata.c contains a precomputed list of characteristics, degrees, and sizes of all internal FFs. Up to order 2^16, these lists contain only 6542 entries. Up to 2^24, this number becomes 1077871. The FF code of GAP 4r8 did not rely on these precomputed lists. Upon construction of a FF in GAP, it was first checked whether the required FF was not yet constructed (by looking at each entry of the list of already constructed FFs, and checking whether characteristic and degree match with characteristic and degree in a dynamical lists CharFF and DegrFF. By using the precomputed lists, an interpolated search is possible. However, this approach makes the precomputed lists necessary, and the amount of data explodes of course when increasing the MAXSIZE_GF_INTERNAL to 2^24. So maybe we should consider going back to the implementation of internal FFs of GAP4r8.

markuspf · 2018-09-20T16:52:22Z

The pre-computed data was added for HPC-GAP to avoid awkward problems. I'll have to look again whether we can solve this problem slightly more elegantly.

hulpke · 2018-09-23T15:09:43Z

As for the running GAP job this seems to be <10MB (which does not worry me). The source file seems to be large mainly because of the inefficiency of storing decimals. Can one improve this (e..g gzip reduces to 6MB)?
Also, it would be helpful to understand why this table (which presumably is created by program) could not be created on the fly when starting GAP. @markuspf pointed to HPCGAP, but there should be no parallelity issue if the data is created before calculations start.

stevelinton · 2018-09-25T11:08:11Z

As an experiment I tried a very simplistic, single-threaded C programme to populate these arrays. It takes about 200ms on my laptop. I'm pretty sure that could be enhanced by a significant factor with a little care. So I can't see any reason not to do this during startup.

markuspf · 2018-09-26T12:59:20Z

In HPC-GAP we could even be daring and do the coputation in a separate thread at startup ;)

stevelinton · 2018-10-01T09:26:07Z

So I think we're agreed on initialising these data structures programmatically at startup, rather than having such a huge additional C file. @jdebeule would you mind doing that, or should someone else produce a new PR?

jdebeule · 2018-10-02T07:26:06Z

Hi all,

thanks for the interesting comments. Indeed I think initialising these data structures at start up is a good solution. Steve, would you mind sharing the code/files of your experiment?

Best,

Jan

stevelinton · 2018-10-02T10:26:17Z

Here's the test file I ran. It uses a sieve of Erastothenes to find the primes, and adds the prime powers as it finds each prime. Depending on the exact specification of the arrays, you might need to sort it afterwards, or mark the prime powers for adding when you get to them, or something.

The most obvious speedup is to deal with a few small primes systematically. So rather than using the sieve to mark and then skip over all the multiples of 2, 3 and 5, you could just look a numbers that are congruent to 1, 7, 11, 13,17, 19, 23 and 29 mod 30, for instance.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int degrff[1100000];
int sizeff[1100000];
int charff[1100000];
int numff = 0;
#define MAX_FF (1<<24)
#define SQR_FF (1<<12)
int main() {
    unsigned char *sieve = malloc(MAX_FF);    
    memset(sieve, 0, MAX_FF);
    for (int i = 2; i <= SQR_FF; i++) {
        if (!sieve[i]) {
            int p = i;
            int q = p;
            int d = 1;
            while (q <= MAX_FF) {
                sizeff[numff] = q;
                charff[numff] = p;
                degrff[numff] = d;
                numff++;
                q *= p;
                d++;
            }
            int np = p+p;
            while (np < MAX_FF) {
                sieve[np] = 1;
                np += p;
            }
        }
    }
    for (int i = SQR_FF+1; i < MAX_FF; i++) {
        if (!sieve[i]) {
            sizeff[numff] = i;
            charff[numff] = i;
            degrff[numff] = 1;
            numff++;
        }
    }
    printf("%i\n",numff);
}

fingolfin

Some remarks.

BTW, 200 ms extra startup time would be very bad from my perspective. But perhaps it can be optimized; or delayed until it is actually needed (but then we'd have to be very careful with e.g. HPC-GAP in mind)

fingolfin · 2018-09-20T22:30:03Z

lib/ffe.g

@@ -10,14 +10,16 @@
 ##
 ##  This file deals with internal finite field elements.
 ##
+##  Update September 2018: MAXSIZE_GF_INTERNAL changed to 2^24.


That doesn't really belong here, that kind of information is available in the repository.

fingolfin · 2018-09-20T22:30:17Z

lib/ffe.g



 #############################################################################
 ##

 #V  MAXSIZE_GF_INTERNAL . . . . . . . . . . . . maximal size of internal ffes
 ##
-BIND_GLOBAL( "MAXSIZE_GF_INTERNAL", 2^16 );
+BIND_GLOBAL( "MAXSIZE_GF_INTERNAL", 2^24 ); #jdebeule 16/09/18 2^24 was 2^16.


Drop the #jdebeule comment

This applies many times again later on, so I'll not repeat this each time.

fingolfin · 2018-09-20T22:31:21Z

src/finfield.c

@@ -48,6 +48,26 @@
 **  is the  logarithm  of $z^{a-1} +   1$.  This list  is  usually called the
 **  Zech-Logarithm  table.  The zeroth  entry in the  finite field bag is the
 **  order of the finite field minus one.
+**
+**  Note of September 2018. A finite field element (FFE) uses 32 bit (cfr supra).


What does "cfr supra" mean?

fingolfin · 2018-10-02T10:31:06Z

src/finfield.c

+**  Therefore, currently the largest allowed size for internatlly represented fields
+**  is set to 2^24. A second hurdle to deal with larger fields is the data stored in ffdata.c.
+**  The interpolation search requires these lists to be available, but increasing the largest
+**  possible order, increases the amount of data.


Minor nitpick (feel free to ignore): Ideally, kernel code comments should wrap at about 78 columns, to match existing code.

fingolfin · 2018-10-02T10:32:06Z

src/finfield.c

@@ -83,103 +103,14 @@ static Obj TYPE_FFE0;
 **  'PolsFF' is a  list of  Conway  polynomials for finite fields.   The even
 **  entries are the  proper prime powers,  odd entries are the  corresponding
 **  conway polynomials.
+*
+**  Update September 2018: the 6948 Conway polynomials for fields of prime power
+**  order at most 2^32 are stored in the file finfield_conway.h


Drop this comment

fingolfin · 2018-10-02T10:33:45Z

src/finfield.c

@@ -1623,13 +1565,14 @@ Obj FuncZ2 ( Obj self, Obj p, Obj d)
    {
      ip = INT_INTOBJ(p);
      id = INT_INTOBJ(d);
-      if (ip > 1 && id > 0 && id <= 16 && ip < 65536)
+      /* jdebeule 19/9/2018 '24' was '16', 65536 became new constant from ffdata.h */
+      if (ip > 1 && id > 0 && id <= 24 && ip <= SIZE_LARGEST_INTERNAL_FF)


I think the 24 should also be made a constant. Say BITS_INTERNAL_FF; then define SIZE_LARGEST_INTERNAL_FF as 1 << BITS_INTERNAL_FF

fingolfin · 2018-10-02T10:36:35Z

src/finfield.h

@@ -157,8 +187,11 @@ extern  Obj             TypeFF0;
 **  which  will only work if  the product of integers  of type 'FFV' does not
 **  cause an overflow.  And of course the successor table stored for a finite
 **  field will become quite large for fields with more than 65536 elements.
+**
+**  Update of September 2018: we fully agree with the comments above. See
+**  the necessary changes in POW1_FFV, VAL_FFE, NEW_FFE, and FLD_FFE


Instead of agreeing, the remark before should be removed, and of course parts of the comment above also are not valid anymore, as they talk about UInt2 and 65536 elements etc. In general, it'd be good to go through all comments here and make sure they are up-to-date (but of course it's perfectly sensible that you did not do this so far, as it was (is?) not 100% clear this PR gets merged. In other words: I am happy if adjusting the comments is delayed until the functional part of this PR is complete and generally accepted; I merely point this (and other) nitpicks out right now as part of general reviewing.

fingolfin · 2018-10-02T10:41:29Z

src/finfield.h

 */
-#define FLD_FFE(ffe)            ((FF)((((UInt)(ffe)) & 0xFFFF) >> 3))
+/*#define FLD_FFE(ffe)            ((FF)((((UInt)(ffe)) & 0xFFFF) >> 3))*/
+#define FLD_FFE(ffe)            ((FF)((((UInt)(ffe)) & 0xFFFFFFFF) >> 3)) /*jdebeule (16/9/18): see line above for old version */


While we are at it, I'd suggest to change this to a static inline function. Moreover, I'd replace 0xFFFF resp. 0xFFFFFFFF by a constant, too, e.g. MASK_INTERNAL_FF = SIZE_LARGEST_INTERNAL_FF-1

Then:

static inline FF FLD_FFE(Obj ffe) { return (FF)(((UInt)ffe) & SIZE_LARGEST_INTERNAL_FF) >> 3); }

Of course we could also extract the changes turning this (and other) macros into static inline functions, as well as the new constants MASK_INTERNAL_FF, BITS_INTERNAL_FF, into a separate PR, which could be merged quickly, and then this PR just would have to be rebased. I'd be happy to volunteer to create such a PR, too (in order to avoid causing lots of extra work for you). But only if you'd consider that helpful, of course.

To be clear: I really think this (or some other PR) should introduce MASK_INTERNAL_FF; the change to static inline, however, could also be made after this PR here is merged.

fingolfin · 2018-10-02T10:42:21Z

src/finfield.h

+/*#define NEW_FFE(fld,val)        ((Obj)(((UInt)(val) << 16) + \
+                                ((UInt)(fld) << 3) + (UInt)0x02))*/
+#define NEW_FFE(fld,val)        ((Obj)(((UInt)(val) << 32) + \
+                                ((UInt)(fld) << 3) + (UInt)0x02)) /* jdebeule (16/9/18): see above for old version */


I think we should turn all the macros in this header file to static inline (and I'd do it right now, but again, I don't want to cause too much work for you by forcing you to rebase this PR, so we can delay this to later).

fingolfin · 2018-10-02T10:45:12Z

src/finfield.c


+#include "finfield_conway.h" /* jdebeule 16/09/18 */


Is there a script that generated finfield_conway.h? If so, can we please add it to the repository, too? I don't really like having opaque data tables that then can't easily be edited, updated or understood.

In that regard, it is sad how e.g. 63001, 6 +242*251 turned into 63001, 60748. If I had the script, I am sure I could adjust it to produce the latter again...

Of course if you drop this part of the change again (keeping the old table, and generating the rest dynamically), this concern goes away.

stevelinton · 2018-11-06T12:00:16Z

I'd hate to see this change lost. I think it's a really good idea.

It occurs to me that the main issue we are having comes from the fact that we want to number the finite fields consecutively in a thread-safe way, which, itself, comes from the fact that Martin wanted to do fields up to 2^16 in 32 bit ints and that didn't leave enough bits to just put the field order in the representation of the elements. (16 (field size)+16 (Zech log) +2 tag bits > 32). We on the other hand don't have this problem in a 64 bit world. (24+24+3 < 64). So we can adopt a much simpler solution.

If we put q itself in the immediate FF representation and then have a thread-safe write-only hash table giving the characteristics and degrees (or something of the kind) then we should avoid all our race condition problems without having to precompute big data structures.

stevelinton · 2018-11-08T21:31:02Z

On reflection, the best solution seems to be to leave the PR more or less exactly as it is, except to
actually construct ffdata.c during the build process. There should be no problem including a small C utility to construct it, and simply building the file in gen when needed. There is scope to save some memory by reducing two big arrays from unsigned long to UInt4 saving half their memory.

stevelinton · 2018-11-09T16:35:54Z

Turns out we already have etc/ffgen.c. So I think all that needs to be done is to make that a bit more flexible and include it in the build process by default, rather than having it as an optional task that can be performed by developers

jdebeule · 2018-11-14T08:07:10Z

Hi all,

thanks a lot so far for the feedback, suggestions and improvements. Past weeks were too busy for me to work on this. I will make time for this again after the Aachen workshop next week.

Jan

fingolfin · 2018-11-14T20:07:30Z

@jdebeule note that @stevelinton by now made PR #2997 and you might want to look there, to avoid duplicate work

fingolfin · 2019-01-18T13:25:44Z

Closing this in favor of PR #2997

jdebeule added 3 commits September 19, 2018 17:51

no chagnes, except addition of comments to explain changes

01ea22b

a comment in ffe.g as well

31dbc26

fingolfin added the gapdays2018-fall Issues and PRs that arose at https://www.gapdays.de/gapdays2018-fall label Sep 20, 2018

fingolfin reviewed Oct 2, 2018

View reviewed changes

stevelinton mentioned this pull request Nov 13, 2018

Larger Finite Fields, take 2 #2997

Open

fingolfin closed this Jan 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Large(r) Internally represented FFE's #2846

Large(r) Internally represented FFE's #2846

jdebeule commented Sep 20, 2018

markuspf commented Sep 20, 2018

jdebeule commented Sep 20, 2018

markuspf commented Sep 20, 2018 •

edited

Loading

hulpke commented Sep 23, 2018

stevelinton commented Sep 25, 2018

markuspf commented Sep 26, 2018

stevelinton commented Oct 1, 2018

jdebeule commented Oct 2, 2018

stevelinton commented Oct 2, 2018

fingolfin left a comment

fingolfin Sep 20, 2018

fingolfin Sep 20, 2018

fingolfin Sep 20, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

fingolfin Oct 2, 2018

stevelinton commented Nov 6, 2018

stevelinton commented Nov 8, 2018

stevelinton commented Nov 9, 2018

jdebeule commented Nov 14, 2018

fingolfin commented Nov 14, 2018

fingolfin commented Jan 18, 2019

Large(r) Internally represented FFE's #2846

Large(r) Internally represented FFE's #2846

Conversation

jdebeule commented Sep 20, 2018

markuspf commented Sep 20, 2018

jdebeule commented Sep 20, 2018

markuspf commented Sep 20, 2018 • edited Loading

hulpke commented Sep 23, 2018

stevelinton commented Sep 25, 2018

markuspf commented Sep 26, 2018

stevelinton commented Oct 1, 2018

jdebeule commented Oct 2, 2018

stevelinton commented Oct 2, 2018

fingolfin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stevelinton commented Nov 6, 2018

stevelinton commented Nov 8, 2018

stevelinton commented Nov 9, 2018

jdebeule commented Nov 14, 2018

fingolfin commented Nov 14, 2018

fingolfin commented Jan 18, 2019

markuspf commented Sep 20, 2018 •

edited

Loading