Reproducible segfault on setkey() in data.table 1.9.7 development version #1753

Open
peterlittlejohn opened this issue Jun 23, 2016 · 4 comments

peterlittlejohn commented Jun 23, 2016

Dear data.table team,

Here's a reproducible segmentation fault in the current development version (1.9.7):

library(data.table)
test <- data.table(A=rep(letters[1:20],5e07))
setkey(test,A) #  segfault here

## data.table info:
# data.table 1.9.7 IN DEVELOPMENT built 2016-06-22 23:36:56 UTC; travis

## system info:
# R version 3.2.3 (2015-12-10)
# Platform: x86_64-redhat-linux-gnu (64-bit)
# Running under: CentOS Linux 7 (Core)

# locale:
#  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base 
@peterlittlejohn
Author

Following up on this:
The segfault only occurs with the development version, not with CRAN version 1.9.6.

This is on a different machine with 16 GB of RAM; more system info is shown below.
library(data.table)
data.table 1.9.6 For help type ?data.table or https://github.com/Rdatatable/data.table/wiki

test <- data.table(A=rep(letters[1:20],5e07))
setkey(test,A)
tables()

     NAME          NROW NCOL    MB COLS KEY
[1,] test 1,000,000,000    1 7,630 A    A
Total: 7,630MB

Same commands issued after installing current development version of data.table:
library(data.table)
data.table 1.9.7 IN DEVELOPMENT built 2016-06-24 16:35:29 UTC; travis

test <- data.table(A=rep(letters[1:20],5e07))
setkey(test,A) 

*** caught segfault ***
address 0x7f6ede7e5810, cause 'memory not mapped'

*** caught segfault ***
address 0x7f6ea2e38e10, cause 'memory not mapped'

*** caught segfault ***
address 0x7f6e914eb608, cause 'memory not mapped'

Traceback:
1: setkeyv(x, cols, verbose = verbose, physical = physical)
2: setkey(test, A)

Sysinfo
sysname release
"Linux" "4.5.7-200.fc23.x86_64"
machine login
"x86_64" "ap"

ChristK commented Jun 25, 2016

@arunsrinivasan I can reproduce it on 2 different machines running Scientific Linux release 6.2 (Carbon) x64, one with 256 GB of RAM and one with 2 TB of RAM. However, unlike the OP, I get segfaults with both the CRAN version and the latest GitHub version of data.table.

sessionInfo()
#R version 3.3.0 (2016-05-03)
#Platform: x86_64-pc-linux-gnu (64-bit)
#Running under: Scientific Linux release 6.2 (Carbon)

#locale:
# [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
# [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
# [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
# [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
# [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

#attached base packages:
#[1] stats     graphics  grDevices utils     datasets  methods   base     

#other attached packages:
#[1] data.table_1.9.7

@jangorecki
Member

Reproduced on:

R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.7

@mainguyenanhvu

Reproduced on:

R version 4.1.3 (2022-03-10) -- "One Push-Up"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-conda-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(data.table)
data.table 1.14.2 using 128 threads (see ?getDTthreads).  Latest news: r-datatable.com

@jangorecki jangorecki reopened this Jun 19, 2022
@MichaelChirico MichaelChirico modified the milestones: v1.9.8, 1.17.0 Jul 12, 2024