Closed
opened on Jul 5, 2013
Working with real-world data, it is common to encounter poorly formatted files. One thing I frequently find myself doing in DataFrames is something like this:
julia> int(df["Age"])
ERROR: ArgumentError("'F' is not a valid digit (in \"F\")")
in parseint at string.jl:1209
in int at string.jl:1242
in map_to2 at abstractarray.jl:1450
in map at abstractarray.jl:1459
in int at /Users/viral/.julia/DataFrames/src/dataarray.jl:746
The Age column should have been integers, but there is some bad data in the column, as a result of which readtable left the column as strings. When I try to use int, it is obvious that the data is corrupt, but I have no idea where it is corrupt.
I often find myself doing something like:
[ try int(df["Age"][i]) catch end for i in 1:nrow(df) ]
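A minimal sketch of locating the offending rows, assuming a newer Julia in which tryparse returns nothing on failure (this is not the int API used above). The sample column is made up and only stands in for df["Age"]:

```julia
# Hypothetical sample column standing in for df["Age"]; the values are invented.
ages = ["23", "41", "F", "35", ""]

# tryparse returns nothing instead of throwing when a string is not a valid integer.
parsed = [tryparse(Int, s) for s in ages]

# Indices of the rows that failed to parse.
bad_rows = findall(isnothing, parsed)
```

Here bad_rows holds the positions of "F" and the empty string, so the corrupt entries can be inspected directly instead of being silently dropped.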
I think that an isconvertible function would be generally useful, which takes the same arguments as convert, but returns a boolean based on whether the conversion is possible or not.
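Roughly what I have in mind, as a sketch only. Neither function exists in Base or DataFrames; isparsable is a hypothetical string counterpart, added on the assumption of a newer Julia where convert no longer parses strings and tryparse is available:

```julia
# Hypothetical helper: true if convert(T, x) succeeds, false if it throws.
function isconvertible(::Type{T}, x) where {T}
    try
        convert(T, x)
        return true
    catch
        return false
    end
end

# Hypothetical string counterpart built on tryparse (assumes a newer Julia).
isparsable(::Type{T}, s::AbstractString) where {T} = tryparse(T, s) !== nothing

# Positions of the entries that cannot be read as integers (sample data is invented).
bad = findall(s -> !isparsable(Int, s), ["23", "F", "35"])
```

Wrapping convert in try/catch is the simplest way to get the boolean, but it still pays the cost of attempting the conversion; a real implementation might want a cheaper check.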