Extra pool in Code objects prevents S4 objects from collection

Consider the following code (simplified version of the R methods package [test](https://github.com/reactorlabs/gnur/blob/6483fffd7edb49bac2e98479ac534cb7e8448475/src/library/methods/tests/basicRefClass.R#L333)):

```r
A <- setRefClass("A")
A$methods(f = function(i) { print(i) })

bug <- TRUE

# class B inherits from A
B <- setRefClass("B", contains = "A")
B$methods(
  # destructor that sets the global flag `bug` to FALSE
  finalize = function() { bug <<- FALSE },

  # method that calls `f` from the superclass
  g = function() {
    # R at its best - first pull out the method `f` from super class into the current environment
    usingMethods("f")
    # call it using lapply - the match.fun in lapply will resolve the string to a closure
    lapply(1, "f")
})

# new instance of class B
b <- B()

# call B$g which in turn calls A$f printing 1
b$g()

# removes b
rm(b)

# since b is not referenced from anywhere, it should be collected
# and have its finalizer called
gc()

# check that the finalized has run
stopifnot(!bug)
```

In vanilla R it works well.
On master it works well as long as the number of type feedback slots for the observed callees is less than 13.

Here is what happens:

- The `lapply` function gets compiled to RIR.
- It contains a call to `.Internal(lapply, xs, FUN)` which we special-case (inline) in the RIR compiler turning it into a loop with one call to `FUN(xs[[i]])`. This call will create a type feedback slot for the observed callees.
- Once we call it from the `B$g` the interpreter will record the callee, i.e. the closure `A$f` into the type feedback.
- The closure is S4 method, in its environment, it contains reference to `.self` which in turn will prevent it from collection.

The way the observed callees recording works is that first the closure gets stored in the extra pool of the corresponding caller code object, i.e. the 'lapply' function, and then the index of this pool entry gets stored in the type feedback.
Normally, we keep just 3 callees per call site. Since lapply is used for a whole lot of things before we exhaust the available slots way before we call the `b$g()`. However, when we increase the number of slots, the bug manifests. It even manifests when PIR is disabled (PIR_ENABLE=off)

This happened when @stepam38 added the support for the contextualized type feedback.

IMHO: It is a fairly weird test: relying on a call to gc() to collect an object is wrong. Having finalized together with gc also wrong.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extra pool in Code objects prevents S4 objects from collection #1288

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development