Skip to content

Commit

Permalink
Fix memory leak in fread (#4710)
Browse files Browse the repository at this point in the history
  • Loading branch information
jimhester authored and mattdowle committed Nov 16, 2022
1 parent 16cf294 commit 263958f
Show file tree
Hide file tree
Showing 6 changed files with 37 additions and 3 deletions.
5 changes: 5 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,15 @@

# data.table [v1.14.6](https://github.com/Rdatatable/data.table/milestone/27?closed=1)

## BUG FIXES

1. `fread()` could leak memory, [#3292](https://github.com/Rdatatable/data.table/issues/3292). Thanks to @patrickhowerter for reporting, and Jim Hester for the fix. The fix requires R 3.4.0 or later. Loading `data.table` in earlier versions now highlights this issue on startup, asks users to upgrade R, and warns that we intend to upgrade `data.table`'s dependency from 8 year old R 3.1.0 (April 2014) to 5 year old R 3.4.0 (April 2017).

## NOTES

1. Test 1962.098 has been modified to pass latest changes to `POSIXt` in R-devel.


# data.table [v1.14.4](https://github.com/Rdatatable/data.table/milestone/26?closed=1) (17 Oct 2022)

## NOTES
Expand Down
4 changes: 4 additions & 0 deletions R/onAttach.R
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,10 @@
else
paste0("This is ", Sys.info()["sysname"], ". This warning should not normally occur on Windows or Linux where OpenMP is turned on by data.table's configure script by passing -fopenmp to the compiler. If you see this warning on Windows or Linux, please file a GitHub issue."),
"\n**********")
if (.Call(CbeforeR340)) {
# not base::getRversion()<"3.4.0" in case the user upgrades R but does not reinstall data.table; a reasonable mistake since data.table would seem to be the latest version
packageStartupMessage("**********\nThis data.table installation was compiled for R < 3.4.0 (Apr 2017) where fread() is known to leak memory. Please upgrade R and reinstall data.table to fix the leak. Maintaining and testing code branches to support very old versions increases development time so please do upgrade R. We intend to bump data.table's dependency from 8 year old R 3.1.0 (Apr 2014) to 5 year old R 3.4.0 (Apr 2017).\n**********")
}
}
}

Expand Down
8 changes: 8 additions & 0 deletions inst/tests/benchmark.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -168,3 +168,11 @@ test(1742.5, substring(x,nchar(x)-10,nchar(x)), c("50,28,95,76","62,87,23,40"))

# Add scaled-up non-ASCII forder test 1896

# fread leak, #3292
dummy = rep("1\t2\t3\t4\t5", 10000000)
writeLines(dummy, "out.tsv")
start = gc()["Vcells",2]
for (i in 1:10) data.table::fread("out.tsv")
end = gc()["Vcells",2]
test(, end/start < 1.05)

4 changes: 4 additions & 0 deletions src/data.table.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,9 @@
# define USE_RINTERNALS // #3301
# define DATAPTR_RO(x) ((const void *)DATAPTR(x))
#endif
#if !defined(R_VERSION) || R_VERSION < R_Version(3, 4, 0)
# define SET_GROWABLE_BIT(x) // #3292
#endif
#include <Rinternals.h>
#define SEXPPTR_RO(x) ((const SEXP *)DATAPTR_RO(x)) // to avoid overhead of looped STRING_ELT and VECTOR_ELT
#include <stdint.h> // for uint64_t rather than unsigned long long
Expand Down Expand Up @@ -314,6 +317,7 @@ SEXP nqRecreateIndices(SEXP, SEXP, SEXP, SEXP, SEXP);
SEXP fsort(SEXP, SEXP);
SEXP inrange(SEXP, SEXP, SEXP, SEXP);
SEXP hasOpenMP(void);
SEXP beforeR340(void);
SEXP uniqueNlogical(SEXP, SEXP);
SEXP dllVersion(void);
SEXP initLastUpdated(SEXP);
Expand Down
8 changes: 5 additions & 3 deletions src/freadR.c
Original file line number Diff line number Diff line change
Expand Up @@ -526,9 +526,11 @@ void setFinalNrow(size_t nrow) {
if (length(DT)) {
if (nrow == dtnrows)
return;
for (int i=0; i<LENGTH(DT); i++) {
SETLENGTH(VECTOR_ELT(DT,i), nrow); // TODO: realloc
SET_TRUELENGTH(VECTOR_ELT(DT,i), nrow);
const int ncol=LENGTH(DT);
for (int i=0; i<ncol; i++) {
SETLENGTH(VECTOR_ELT(DT,i), nrow);
SET_TRUELENGTH(VECTOR_ELT(DT,i), dtnrows);
SET_GROWABLE_BIT(VECTOR_ELT(DT,i)); // #3292
}
}
R_FlushConsole(); // # 2481. Just a convenient place; nothing per se to do with setFinalNrow()
Expand Down
11 changes: 11 additions & 0 deletions src/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -111,6 +111,7 @@ R_CallMethodDef callMethods[] = {
{"Cinrange", (DL_FUNC) &inrange, -1},
{"Cbetween", (DL_FUNC) &between, -1},
{"ChasOpenMP", (DL_FUNC) &hasOpenMP, -1},
{"CbeforeR340", (DL_FUNC) &beforeR340, -1},
{"CuniqueNlogical", (DL_FUNC) &uniqueNlogical, -1},
{"CfrollfunR", (DL_FUNC) &frollfunR, -1},
{"CdllVersion", (DL_FUNC) &dllVersion, -1},
Expand Down Expand Up @@ -314,6 +315,16 @@ SEXP hasOpenMP(void) {
}
// # nocov end

SEXP beforeR340(void) {
// used in onAttach.R for message about fread memory leak fix needing R 3.4.0
// at C level to catch if user upgrades R but does not reinstall data.table
#if defined(R_VERSION) && R_VERSION<R_Version(3,4,0)
return ScalarLogical(true);
#else
return ScalarLogical(false);
#endif
}

extern int *_Last_updated; // assign.c

SEXP initLastUpdated(SEXP var) {
Expand Down

0 comments on commit 263958f

Please sign in to comment.