DataFrame might accept string with getindex #959

Closed
@femtotrader


Hello,

There is a strange API difference between DataFrame (from DataFrames.jl) and TimeArray (from TimeSeries.jl): a DataFrame column is selected with a Symbol, while a TimeArray column is selected with a string.

with DataFrames

julia> filename = "test/ford_2012.csv"
julia> dfOHLCV = readtable(filename);

julia> dfOHLCV[:Date] = Date(dfOHLCV[:Date]);

julia> dfOHLCV
250x6 DataFrames.DataFrame
│ Row │ Date       │ Open  │ High  │ Low   │ Close │ Volume    │
┝━━━━━┿━━━━━━━━━━━━┿━━━━━━━┿━━━━━━━┿━━━━━━━┿━━━━━━━┿━━━━━━━━━━━┥
│ 1   │ 2012-01-03 │ 11.0  │ 11.25 │ 10.99 │ 11.13 │ 45709900  │
│ 2   │ 2012-01-04 │ 11.15 │ 11.53 │ 11.07 │ 11.3  │ 79725200  │
│ 3   │ 2012-01-05 │ 11.33 │ 11.63 │ 11.24 │ 11.59 │ 67877500  │
│ 4   │ 2012-01-06 │ 11.74 │ 11.8  │ 11.52 │ 11.71 │ 59840700  │
│ 5   │ 2012-01-09 │ 11.83 │ 11.95 │ 11.7  │ 11.8  │ 53981500  │
│ 6   │ 2012-01-10 │ 12.0  │ 12.05 │ 11.63 │ 11.8  │ 121750600 │
│ 7   │ 2012-01-11 │ 11.74 │ 12.18 │ 11.65 │ 12.07 │ 63806000  │
│ 8   │ 2012-01-12 │ 12.16 │ 12.18 │ 11.89 │ 12.14 │ 48687700  │
│ 9   │ 2012-01-13 │ 12.01 │ 12.08 │ 11.84 │ 12.04 │ 46366700  │
│ 10  │ 2012-01-17 │ 12.2  │ 12.26 │ 11.96 │ 12.02 │ 44398400  │
│ 11  │ 2012-01-18 │ 12.03 │ 12.37 │ 12.0  │ 12.34 │ 47102700  │
│ 12  │ 2012-01-19 │ 12.48 │ 12.72 │ 12.43 │ 12.61 │ 70894200  │
│ 13  │ 2012-01-20 │ 12.55 │ 12.64 │ 12.45 │ 12.59 │ 43705700  │
│ 14  │ 2012-01-23 │ 12.69 │ 12.84 │ 12.55 │ 12.66 │ 49379700  │
│ 15  │ 2012-01-24 │ 12.56 │ 12.86 │ 12.46 │ 12.82 │ 45768400  │
│ 16  │ 2012-01-25 │ 12.8  │ 12.98 │ 12.7  │ 12.93 │ 54021600  │
│ 17  │ 2012-01-26 │ 13.03 │ 13.05 │ 12.66 │ 12.79 │ 75470700  │
│ 18  │ 2012-01-27 │ 11.96 │ 12.53 │ 11.79 │ 12.21 │ 142155300 │
⋮
│ 232 │ 2012-12-04 │ 11.4  │ 11.44 │ 11.23 │ 11.31 │ 37760200  │
│ 233 │ 2012-12-05 │ 11.32 │ 11.4  │ 11.18 │ 11.31 │ 33152400  │
│ 234 │ 2012-12-06 │ 11.26 │ 11.31 │ 11.19 │ 11.24 │ 31065800  │
│ 235 │ 2012-12-07 │ 11.27 │ 11.5  │ 11.26 │ 11.48 │ 38404500  │
│ 236 │ 2012-12-10 │ 11.41 │ 11.53 │ 11.41 │ 11.47 │ 26025200  │
│ 237 │ 2012-12-11 │ 11.51 │ 11.58 │ 11.4  │ 11.49 │ 36326900  │
│ 238 │ 2012-12-12 │ 11.52 │ 11.56 │ 11.43 │ 11.47 │ 31099900  │
│ 239 │ 2012-12-13 │ 11.46 │ 11.5  │ 11.21 │ 11.27 │ 35443200  │
│ 240 │ 2012-12-14 │ 11.27 │ 11.27 │ 11.03 │ 11.1  │ 36933500  │
│ 241 │ 2012-12-17 │ 11.16 │ 11.41 │ 11.14 │ 11.39 │ 46983300  │
│ 242 │ 2012-12-18 │ 11.48 │ 11.68 │ 11.4  │ 11.67 │ 61810400  │
│ 243 │ 2012-12-19 │ 11.79 │ 11.85 │ 11.62 │ 11.73 │ 54884700  │
│ 244 │ 2012-12-20 │ 11.74 │ 11.8  │ 11.58 │ 11.77 │ 47750100  │
│ 245 │ 2012-12-21 │ 11.55 │ 11.86 │ 11.47 │ 11.86 │ 94489300  │
│ 246 │ 2012-12-24 │ 11.67 │ 12.4  │ 11.67 │ 12.4  │ 91734900  │
│ 247 │ 2012-12-26 │ 12.31 │ 12.79 │ 12.31 │ 12.79 │ 140331900 │
│ 248 │ 2012-12-27 │ 12.79 │ 12.81 │ 12.36 │ 12.76 │ 108315100 │
│ 249 │ 2012-12-28 │ 12.55 │ 12.88 │ 12.52 │ 12.87 │ 95668600  │
│ 250 │ 2012-12-31 │ 12.88 │ 13.08 │ 12.76 │ 12.95 │ 106908900 │

julia> dfOHLCV[:Open]
250-element DataArrays.DataArray{Float64,1}:
 11.0
 11.15
 11.33
 11.74
 11.83
 12.0
 11.74
 12.16
 12.01
 12.2
 12.03
 12.48
 12.55
 12.69
 12.56
 12.8
 13.03
 11.96
 12.06
  ⋮
 11.4
 11.32
 11.26
 11.27
 11.41
 11.51
 11.52
 11.46
 11.27
 11.16
 11.48
 11.79
 11.74
 11.55
 11.67
 12.31
 12.79
 12.55
 12.88

with TimeArray

julia> ohlcv = readtimearray(filename)
250x5 TimeSeries.TimeArray{Float64,2,Date,Array{Float64,2}} 2012-01-03 to 2012-12-31

             Open     High     Low      Close    Volume
2012-01-03 | 11.0     11.25    10.99    11.13    45709900
2012-01-04 | 11.15    11.53    11.07    11.3     79725200
2012-01-05 | 11.33    11.63    11.24    11.59    67877500
2012-01-06 | 11.74    11.8     11.52    11.71    59840700

2012-12-26 | 12.31    12.79    12.31    12.79    140331900
2012-12-27 | 12.79    12.81    12.36    12.76    108315100
2012-12-28 | 12.55    12.88    12.52    12.87    95668600
2012-12-31 | 12.88    13.08    12.76    12.95    106908900

julia> ohlcv["Close"]
250x1 TimeSeries.TimeArray{Float64,1,Date,Array{Float64,1}} 2012-01-03 to 2012-12-31

             Close
2012-01-03 | 11.13
2012-01-04 | 11.3
2012-01-05 | 11.59
2012-01-06 | 11.71

2012-12-26 | 12.79
2012-12-27 | 12.76
2012-12-28 | 12.87
2012-12-31 | 12.95

but neither type accepts the other's index type:

julia> dfOHLCV["Close"]
ERROR: MethodError: `getindex` has no method matching getindex(::DataFrames.DataFrame, ::ASCIIString)
Closest candidates are:
  getindex(::DataFrames.DataFrame, ::Real, ::Union{Real,Symbol})
  getindex{T<:Union{Real,Symbol}}(::DataFrames.DataFrame, ::Real, ::AbstractArray{T<:Union{Real,Symbol},1})
  getindex(::DataFrames.DataFrame, ::Real, ::Colon)
  ...

julia> ohlcv[:Close]
ERROR: MethodError: `getindex` has no method matching getindex(::TimeSeries.TimeArray{Float64,2,Date,Array{Float64,2}}, ::Symbol)
Closest candidates are:
  getindex{T,N,D}(::TimeSeries.TimeArray{T,N,D,A<:AbstractArray{T,N}}, ::Int64)
  getindex{T,N,D}(::TimeSeries.TimeArray{T,N,D,A<:AbstractArray{T,N}}, ::UnitRange{Int64})
  getindex{T,N,D}(::TimeSeries.TimeArray{T,N,D,A<:AbstractArray{T,N}}, ::Array{Int64,1})
  ...

It would be nice if getindex could accept an ASCIIString (or, more generally, an AbstractString) as its second argument when a DataFrame is given as the first argument.
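
To illustrate the idea, here is a minimal sketch of string indexing that forwards to Symbol indexing. It uses a toy type, `MiniFrame`, which is hypothetical and is not the DataFrames.jl type or its API. It also assumes a recent Julia 1.x (`struct`, `Symbol(...)`), not the older version used in the session above. The values are the first two Open and Close rows shown above.

```julia
# Toy stand-in for a column-oriented table (hypothetical; NOT DataFrames.DataFrame).
struct MiniFrame
    names::Vector{Symbol}
    columns::Vector{Vector{Float64}}
end

# Symbol lookup, in the style of dfOHLCV[:Open].
function Base.getindex(f::MiniFrame, col::Symbol)
    i = findfirst(isequal(col), f.names)
    i === nothing && throw(KeyError(col))
    return f.columns[i]
end

# The requested addition: accept any AbstractString by converting it to a Symbol.
Base.getindex(f::MiniFrame, col::AbstractString) = f[Symbol(col)]

mf = MiniFrame([:Open, :Close], [[11.0, 11.15], [11.13, 11.3]])
same = mf["Close"] == mf[:Close]  # both index styles reach the same column
```

With a method like this, the string form and the Symbol form would be interchangeable for column selection. Whether the real package should define it is a design decision for its maintainers.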

Kind regards

PS: see JuliaStats/TimeSeries.jl#262
