Skip to content

PreserveStorage doesn't clean up ref counters, leading to unnecessary object copies #1203

Closed
@mb147

Description

@mb147

Brief background

Rcpp keeps a precious pairlist when PreserveStorage is used.
Releasing an element of that list leaves a ref behind, creating a later memory leak.

As of v4 (certainly) (but possibly as early as Nov 2019), R moved from the NAMED mechanism to ref counting.
This has exposed an issue that likely went undetected before then, because object copies were more common.
https://stat.ethz.ch/pipermail/r-devel/2019-November/078728.html

Relates to my recent question on SO: https://stackoverflow.com/questions/71417422/

Proposed fix

In Rcpp, barrier.cpp, Line 122, add:
SET_TAG(token, R_NilValue);

Use case description

  1. Write a simple Rcpp function that just needs to read data from a vector, do some calcs, then let it go and return, e.g. mean(x) would be a good analogy.
  2. Call that function from R via .Call().
  3. Expected behaviour is that x should have the same REF count in R before and after that call.
  4. If signature of function is FUN(NumericVector x) then previous bullet (3) is violated.
  5. Any further modification of x causes R to copy it, leaving a memory leak equal in size to original object.

Reproducible example

See below, which does the following:

  • Demonstrate the current code in Rcpp(1.0.8)
  • Narrow down the problem to PreserveStorage
  • Replicate with own editable support functions that mirror barrier.cpp
  • Fix it and demonstrate it works

System detail

R: 4.1.2
Windows 11
Rcpp: 1.0.8

Runnable code


library(Rcpp)

# ************************
# Rcpp support functions for investigation
# ************************
sourceCpp(code='
  #define R_NO_REMAP
  #include 
  #include 
  #include 
  using namespace Rcpp;

  // ***********************
  // Lifted from barrier.cpp
  // ***********************

  static SEXP my_precious = R_NilValue;

  // [[Rcpp::export]]
  void my_precious_init() {
      my_precious = Rf_cons(R_NilValue, R_NilValue);
      R_PreserveObject(my_precious);
  }

  // [[Rcpp::export]]
  void my_precious_teardown() {
      R_ReleaseObject(my_precious);
  }

  // [[Rcpp::export]]
  SEXP my_PreciousPreserve(SEXP object) {
    if (object == R_NilValue) {
       return R_NilValue;
    }
    PROTECT(object);
    SEXP cell = PROTECT(Rf_cons(my_precious, CDR(my_precious)));
    SET_TAG(cell, object);                  // <<<<<------ STUFF THAT INCREMENTS REFCNT
    SETCDR(my_precious, cell);
    if (CDR(cell) != R_NilValue) {
       SETCAR(CDR(cell), cell);
    }
    UNPROTECT(2);
    return cell;
  }

  // ***********************
  // Lifted from barrier.cpp and fix added
  // ***********************

  // [[Rcpp::export]]
  void my_PreciousRelease(SEXP token, bool withFix = false) {
     // Lifted from barrier.cpp, only withFix added
     if (token == R_NilValue || TYPEOF(token) != LISTSXP) {
         return;
     }
     if (withFix) {
       SET_TAG(token, R_NilValue);          // <<<<<------ FIX THAT DECREMENTS REFCNT
     }
     SEXP before = CAR(token);
     SEXP after = CDR(token);
     SETCDR(before, after);
     if (after != R_NilValue) {
         SETCAR(after, before);
     }
  }

  // ***********************
  // Investigation helpers, mirroring how Vector and PreserveStorage use barrier functions
  // ***********************

  // [[Rcpp::export]]
  void currentRcppProblem(NumericVector x) {}

  // [[Rcpp::export]]
  void narrowDownToPreserveStorage(SEXP x) {
    PreserveStorage> s;

    // From Vector(SEXP x) : Vector.h, Line 73
    Shield safe(x);
    s.set__( r_cast(safe) );
  }

  // [[Rcpp::export]]
  void replicateWithOwnCode(SEXP x) {
    // Mirroring PreserveStorage.h using content from barrier.cpp

    my_precious_init();                    // replaces Rcpp_precious_init
    SEXP token = my_PreciousPreserve(x);   // replaces Rcpp_PreciousPreserve

    my_PreciousRelease(token, false);      // replaces Rcpp_PreciousRelease
                                           // fix turned off (false)
    my_precious_teardown();                // replaces Rcpp_precious_teardown
  }

  // [[Rcpp::export]]
  void fixPreserveStorage(SEXP x) {
    // Mirroring PreserveStorage.h using content from barrier.cpp

    my_precious_init();                    // replaces Rcpp_precious_init
    SEXP token = my_PreciousPreserve(x);   // replaces Rcpp_PreciousPreserve

    my_PreciousRelease(token, true);      // replaces Rcpp_PreciousRelease
                                           // fix turned on (true)
    my_precious_teardown();                // replaces Rcpp_precious_teardown
  }

  // [[Rcpp::export]]
  void noProtectWorkaround(Vector x) {}

')

# ************************
# Wrap that in some R code that demonstrates the ref count on return
# ************************

rcppDigging <- function() {
# Narrow down the problem in stages (using a fresh vector each time).

  cat("Intial\n")
  initial_vec <- c(1,2) # Simple numeric vector. Verify has initial REF(1) as expected.
  .Internal(inspect(initial_vec))

  cat("\ncurrentRcppProblem\n")
  currentRcppProblem(y <- c(1,2)) # Mem leak: Should still have REF(1), but has REF(3)
  .Internal(inspect(y))

  cat("\nCopy on modify because REF > 1 --> memory leak\n")
  y[1] <- 3 # Memory leak. Original y left with a ref somewhere.
  .Internal(inspect(y))

  # Expectation: calling currentRcppProblem() should temporarily increment REF, but then
  #              it should return to REF(1) when scope of function ends.

  # Problem:     REF(3) means that any modification of y after returning to R from Rcpp generates
  #              a copy instead of a modify-in-place.
  #              The original still has REFs, so hangs around inside Rcpp. Hence memory leak.



  # Investigation: PreserveStorage is one culprit.
  # Running this code just makes a storage class, and gives REF(2)
  cat("\nnarrowDownToPreserveStorage\n")
  narrowDownToPreserveStorage(y <- c(1,2))
  .Internal(inspect(y))

  # Replicate this by reproducing the internals of PreserveStorage with replacement functions.
  # Result: REF(2)
  cat("\nreplicateWithOwnCode\n")
  replicateWithOwnCode(y <- c(1,2))
  .Internal(inspect(y))

  # Turn on the fix that decrements the ref counter correctly.
  # Result: REF(1)
  cat("\nfixPreserveStorage\n")
  fixPreserveStorage(y <- c(1,2))
  .Internal(inspect(y))

  # A workaround exists by using NoProtectStorage, but that seems pretty unsafe.
  # That presumably keeps no record that Rcpp is referencing an object while it's in scope.
  # Maybe not an issue if single-threaded, but not thread-safe, so assume use should be discouraged.
  cat("\nnoProtectWorkaround\n")
  noProtectWorkaround(y <- c(1,2))
  .Internal(inspect(y))
}
rcppDigging()

Code output

Intial
@0x0000019f605cbc18 14 REALSXP g0c2 [REF(1)] (len=2, tl=0) 1,2

currentRcppProblem
@0x0000019f605cbb98 14 REALSXP g0c2 [REF(3)] (len=2, tl=0) 1,2

Copy on modify because REF > 1 --> memory leak
@0x0000019f605d1858 14 REALSXP g0c2 [REF(1)] (len=2, tl=0) 3,4

narrowDownToPreserveStorage
@0x0000019f605d17d8 14 REALSXP g0c2 [REF(2)] (len=2, tl=0) 1,2

replicateWithOwnCode
@0x0000019f605d1758 14 REALSXP g0c2 [REF(2)] (len=2, tl=0) 1,2

fixPreserveStorage
@0x0000019f605d16d8 14 REALSXP g0c2 [REF(1)] (len=2, tl=0) 1,2

noProtectWorkaround
@0x0000019f605d1658 14 REALSXP g0c2 [REF(1)] (len=2, tl=0) 1,2

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions