Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FLOAT_LITERAL_CACHE is dangerous and a source of leaks #1993

Open
fingolfin opened this issue Dec 5, 2017 · 3 comments
Open

FLOAT_LITERAL_CACHE is dangerous and a source of leaks #1993

fingolfin opened this issue Dec 5, 2017 · 3 comments
Labels
kind: bug Issues describing general bugs, and PRs fixing them topic: kernel

Comments

@fingolfin
Copy link
Member

fingolfin commented Dec 5, 2017

When coding GAP functions that contain float constants, these constants are coded either as "lazy" or "eager".

Eager constants are using a specific type of floats (e.g. the built-in machine floats). These are coded by evaluating the constant into float object, which is stored in EAGER_FLOAT_LITERAL_CACHE, and then the index where it was stored is coded into the function. Example:

gap> f:=x->42.0_;;
gap> EAGER_FLOAT_LITERAL_CACHE;
[ 42. ]

Lazy constants are re-evaluated if the user sets a different default floats type using SetFloats. These, too, are assigned a unique global index; when the float constant expression is evaluated, the result is cached in FLOAT_LITERAL_CACHE; this cache is emptied each time SetFloats is called. Example:

gap> g:=x->23.0;;
gap> FLOAT_LITERAL_CACHE;
Error, Variable: 'FLOAT_LITERAL_CACHE' must have a value
not in any function at *stdin*:2
gap> g(1);
23.
gap> FLOAT_LITERAL_CACHE;
[ ,,,, 23. ]

There are several problems with this: First off, exposing these caches allows for crashes and inconsistent computations. While exposing them may have been helpful when developing this feature, it provides a major risk. E.g. we can trivially crash GAP:

gap> EAGER_FLOAT_LITERAL_CACHE:="house";; f:=x->1.0_;;
Assertion failed: (IS_PLIST(EAGER_FLOAT_LITERAL_CACHE)), function CodeEagerFloatExpr, file src/code.c, line 1907.
Abort trap: 6

Moreover, in HPC-GAP, EAGER_FLOAT_LITERAL_CACHE is an atomic list and not immutable, so wonderful things like this are possible:

gap> f:=x->42.0_;;
gap> EAGER_FLOAT_LITERAL_CACHE;
[ 42. ]
gap> EAGER_FLOAT_LITERAL_CACHE[1]:=23.;
23.
gap> Display(f);
function ( x )
    return 42.0_;
end
gap> f(1);
23.

The same is possible with FLOAT_LITERAL_CACHE even in regular GAP, although there we could at least in principle mark it as immutable.

Another problem is that GAP functions are objects like any other, and hence can be garbage collected. But they contain no reference to the lazy or eager floats they contain. Hence the need for us to use the caches. But that also means that if a function is GCed, then any float constant it referenced will stay around for ever. I.e. we have a leak.

The first part of this issue can be resolved by not exposing these caches to the user. That's rather easy to do. To solve the second issue, here are some thoughts:

For the eager float constants, we could do the following: first, add a new entry to struct BodyHeader, say Obj floatcache. Then when coding a body (T_BODY), any eager floats get assigned to that cache. Drawback is that this causes some overhead for every function which has a body, even if they don't use floats.

For the lazy ones, we need a way to store them in a way that still makes them flushable. So perhaps FLOAT_LITERAL_CACHE could become an (atomic) record (which we abuse as a "sparse list" here). Then we add a custom free function for T_BODY, which iterates over the syntax tree stored in the T_BODY, and for each float cache index i in it, it unbinds FLOAT_LITERAL_CACHE.(i).
(This is somewhat related to issue #1889, I guess). Or, to have it a bit simpler, we could also store the indices for the lazy constants separately, perhaps even in the same floatcache. I.e. each floatcache entry will either be an eagerly evaluated floatean, or else a a positive immediate integer which is an "index" into FLOAT_LITERAL_CACHE.

@fingolfin fingolfin added kind: bug Issues describing general bugs, and PRs fixing them topic: kernel kind: bug: wrong result Issues describing bugs that result in mathematically or otherwise wrong results, and PRs fixing them labels Dec 5, 2017
@fingolfin
Copy link
Member Author

To clarify: not exposing EAGER_FLOAT_LITERAL_CACHE is trivial. For FLOAT_LITERAL_CACHE we also need to add a replacement way which allows SetFloats to flush it, but that, too, is easy.

@stevelinton
Copy link
Contributor

It's not really a surprise that you can break GAP by modifying all-caps data structures. I imagine the method selection data offers lots of ways to do it too. Doesn't mean that fixing them isn't worth doing, but I'm not sure it's a priority. To my mind, the memory leak is a worse problem, but every function body corresponds to a function()...end in some code that was parsed, so it's hard to visualize a program that would GC really huge numbers of function bodies. Maybe one using ReadAsFunction really heavily. On the other hand having the float cache in the function body is probably nicer overall. For the lazy case, would it be easier, and perhaps more useful anyway to maintain somewhere a list of all T_BODY objects in the workspace (typically a few 10s of thousands) and use a custom free function to remove dead body bags from it. That might be useful for other purposes.

fingolfin added a commit to fingolfin/gap that referenced this issue Jan 11, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue gap-system#1993).

These changes have several advantages:
* we avoid a possible GC crash when compiling (large) string expressions
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
fingolfin added a commit to fingolfin/gap that referenced this issue Jan 11, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue gap-system#1993).

These changes have several advantages:
* we avoid a possible GC crash when compiling (large) string expressions
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
fingolfin added a commit to fingolfin/gap that referenced this issue Jan 11, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue gap-system#1993).

These changes have several advantages:
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
fingolfin added a commit to fingolfin/gap that referenced this issue Jan 11, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue gap-system#1993).

These changes have several advantages:
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
fingolfin added a commit that referenced this issue Jan 14, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue #1993).

These changes have several advantages:
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
@fingolfin
Copy link
Member Author

As an update: EAGER_FLOAT_LITERAL_CACHE has been removed.

@fingolfin fingolfin changed the title EAGER_FLOAT_LITERAL_CACHE and FLOAT_LITERAL_CACHE are dangerous and source of leaks FLOAT_LITERAL_CACHE is dangerous and a source of leaks Mar 26, 2019
@fingolfin fingolfin removed the kind: bug: wrong result Issues describing bugs that result in mathematically or otherwise wrong results, and PRs fixing them label Mar 26, 2019
ssiccha pushed a commit to ssiccha/gap that referenced this issue Mar 27, 2019
When coding GAP function bodies, we now have an (optional) plist in the
function body of (immutable) values. Instead of storing the raw data
for an integer or string within the respective expression, we instead
store the integer resp. string in that values plist, and retrieve it
from there when printing or evaluation such expressions (for strings,
evaluation returns a shallow copy of the stored string constant, to
preserve semantics). For eager floats, the values plist is used instead
of `EAGER_FLOAT_LITERAL_CACHE`, which is removed by this patch (thus
resolving part of issue gap-system#1993).

These changes have several advantages:
* a function like `f := x -> 12345678901234567890` used to return a new
  integer object with each call; now it always returns the same integer,
  i.e., `IsIdenticalObj(f(1), f(2));` now returns true
* floats benefit from avoiding use of a global "cache" which never is
  emptied (thus these eager float literals now are garbage collected if
  the function referencing them is garbage collected)
* the cache simplifies future optimizations: e.g. we could convert
  expressions like `-2^100` by evaluating them and storing the result
  in the values list; we could also store permutations precomputed, that
  is, replace a `T_PERM_EXPR` and the `T_PERM_CYCLE` expressions nested
  in it by a single "constant expression" that references the immutable
  permutation described by these expressions (at least in the case these
  expressions were explicitly defined, i.e., `(1,2)` would work but not
  `(i,j)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: bug Issues describing general bugs, and PRs fixing them topic: kernel
Projects
None yet
Development

No branches or pull requests

2 participants