fread/fwrite default options for NA mismatch #2281

Open
BenoitLondon opened this issue Aug 3, 2017 · 7 comments

Comments

@BenoitLondon

Hi,

First, thank you for this package. :)
I ran into a problem when writing a CSV with fwrite and reading it back with fread:

library(data.table)
dt <- data.table(a = c(NA_integer_, 1L, 2L), b = c("a", "b", NA_character_))
tmp <- tempfile()
fwrite(dt, tmp)
dt2 <- fread(tmp)
dt
#    a  b
#1: NA  a
#2:  1  b
#3:  2 NA
dt2
#    a b
#1: NA a
#2:  1 b
#3:  2  
all.equal(dt,dt2)
#[1] "Column 'b': 'is.NA' value mismatch: 0 in current 1 in target"

So I realised that fread and fwrite have different defaults for their NA strings (na in fwrite, na.strings in fread). I checked read.csv and write.csv, and they share the same default.
As a result, the round trip mixes up empty strings and NA, which is not the default behaviour I would expect, imho.
Shouldn't fwrite use the same default as write.csv, i.e. na = "NA", so that calling fread after fwrite recovers the same object?
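
Until the defaults agree, one way to sidestep the mismatch described above is to state the NA representation explicitly on both sides. The following is a minimal sketch, assuming the `na` argument of fwrite and the `na.strings` argument of fread behave as documented in the installed data.table version; it is a workaround, not the fix adopted by the maintainers.

```r
library(data.table)

dt <- data.table(a = c(NA_integer_, 1L, 2L), b = c("a", "b", NA_character_))
tmp <- tempfile()

# Write missing values as the literal string "NA" (the write.csv convention)
fwrite(dt, tmp, na = "NA")

# Tell fread to treat that same string as missing
dt2 <- fread(tmp, na.strings = "NA")

# Compare the round-tripped table with the original
all.equal(dt, dt2)
```

Note that under this convention a genuine character value "NA" may be indistinguishable from a missing value after the round trip, so it only suits data where that string does not occur as real content.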

Cheers,
Benoit

@st-pasha
Contributor

st-pasha commented Aug 3, 2017

Appears to be the same issue as #2214

@BenoitLondon
Author

Thanks. I saw the issue you're referring to, but I thought it was a similar but different problem.

@MichaelChirico
Member

MichaelChirico commented Aug 3, 2017 via email

@BenoitLondon
Author

It just looked like the other issue is more involved and requires more changes than this one, but all fine ;)

@mattdowle mattdowle added this to the v1.10.6 milestone Mar 3, 2018
@mattdowle
Member

Thanks @BenoitLondon. Agree.
I've just checked that this is now fixed by PR #2652, which is currently under discussion. I'll link from there back to here, and this issue will be closed if and when that PR is merged.

> dt <- data.table(a = c(NA_integer_, 1L, 2L), b = c("a", "b", NA_character_))
> tmp <- tempfile()
> fwrite(dt, tmp)
> dt2 <- fread(tmp)
> dt
    a    b
1: NA    a
2:  1    b
3:  2 <NA>
> dt2
    a    b
1: NA    a
2:  1    b
3:  2 <NA>
> identical(dt,dt2)
[1] TRUE

@Fablepongiste

I think this is still not fixed? Or has the issue come back?

Shouldn't the default for NA in fwrite be ... NA?

> dt <- data.table(a = c(NA_integer_, 1L, 2L), b = c("a", "b", NA_character_))
> tmp <- tempfile()
> fwrite(dt, tmp)
> dt2 <- fread(tmp)
> dt
    a    b
1: NA    a
2:  1    b
3:  2 <NA>
> dt2
    a b
1: NA a
2:  1 b
3:  2  

@mattdowle mattdowle reopened this Nov 29, 2018
@mattdowle mattdowle modified the milestones: v1.11.0, 1.12.0 Nov 29, 2018
@mattdowle mattdowle modified the milestones: 1.12.0, 1.12.2 Jan 6, 2019
@mattdowle mattdowle modified the milestones: 1.12.2, 1.12.4 Feb 26, 2019
@jangorecki jangorecki modified the milestones: 1.12.4, 1.13.0 Sep 17, 2019
@mattdowle mattdowle modified the milestones: 1.12.7, 1.12.9 Dec 8, 2019
@mattdowle mattdowle modified the milestones: 1.13.1, 1.13.3 Oct 17, 2020
@ben-schwen
Member

Cannot reproduce on Ubuntu 20.04 with data.table 1.4.1

@jangorecki jangorecki removed this from the 1.14.3 milestone Jul 19, 2022
@jangorecki jangorecki added this to the 1.14.5 milestone Jul 19, 2022
@jangorecki jangorecki modified the milestones: 1.14.11, 1.15.1 Oct 29, 2023
@jangorecki jangorecki removed this from the 1.16.0 milestone Nov 6, 2023