Add transformation and renaming to select and select!
bkamins authored Mar 19, 2020
1 parent 20353f4 commit d98b9be
Showing 13 changed files with 1,459 additions and 737 deletions.
4 changes: 4 additions & 0 deletions docs/src/lib/types.md
@@ -49,6 +49,9 @@ The `RepeatedVector` and `StackedVector` types are subtypes of `AbstractVector`
with the exception that they are read only. Note that they are not exported and should not be constructed directly,
but they are columns of a `DataFrame` returned by `stack` with `view=true`.

+The `ByRow` type is a special type used for selection operations to signal that the wrapped function should be applied
+to each element (row) of the selection.
+
## [The design of handling of columns of a `DataFrame`](@id man-columnhandling)

When a `DataFrame` is constructed columns are copied by default. You can disable
@@ -103,6 +106,7 @@ without caution because:

```@docs
AbstractDataFrame
+ByRow
DataFrame
DataFrameRow
GroupedDataFrame
48 changes: 39 additions & 9 deletions docs/src/man/getting_started.md
@@ -522,34 +522,64 @@ julia> df[in.(df.A, Ref([1, 5, 601])), :]
│ 3 │ 601 │ 7 │ 301 │
```

-Equivalently, the `in` function can be called with a single argument to create a function object that tests whether each value belongs to the subset (partial application of `in`): `df[in([1, 5, 601]).(df.A), :]`.
+Equivalently, the `in` function can be called with a single argument to create
+a function object that tests whether each value belongs to the subset
+(partial application of `in`): `df[in([1, 5, 601]).(df.A), :]`.
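
As an illustration only (not part of this commit), the two spellings of the membership test can be compared on a plain vector. The names `A`, `subset`, `mask_broadcast` and `mask_partial` are assumed sample names and data chosen to echo the values used above; this is a minimal sketch, not the documentation's own example.

```julia
# Hypothetical sample data, chosen to resemble the values in the docs.
A = [1, 2, 5, 601, 7]
subset = [1, 5, 601]

# Two-argument form: broadcast `in` over A, protecting `subset` with Ref
# so that it is treated as a single collection rather than broadcast over.
mask_broadcast = in.(A, Ref(subset))

# One-argument form: in(subset) returns a function x -> x in subset,
# which is then broadcast over A.
mask_partial = in(subset).(A)

# Both masks select the same elements.
selected = A[mask_partial]
```

Either mask can be used for row indexing; the partial-application form avoids the explicit `Ref`.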

#### Column selection using `select` and `select!`

-You can also use the [`select`](@ref) and [`select!`](@ref) functions to select columns in a data frame.
+You can also use the [`select`](@ref) and [`select!`](@ref) functions to select,
+rename and transform columns in a data frame.

The `select` function creates a new data frame:
```jldoctest dataframe
-julia> df = DataFrame(x1=1, x2=2, y=3)
-1×3 DataFrame
+julia> df = DataFrame(x1=[1, 2], x2=[3, 4], y=[5, 6])
+2×3 DataFrame
│ Row │ x1 │ x2 │ y │
│ │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┤
-│ 1 │ 1 │ 2 │ 3 │
+│ 1 │ 1 │ 3 │ 5 │
+│ 2 │ 2 │ 4 │ 6 │
julia> select(df, Not(:x1)) # drop column :x1 in a new data frame
-1×2 DataFrame
+2×2 DataFrame
│ Row │ x2 │ y │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
-│ 1 │ 2 │ 3 │
+│ 1 │ 3 │ 5 │
+│ 2 │ 4 │ 6 │
julia> select(df, r"x") # select columns containing 'x' character
-1×2 DataFrame
+2×2 DataFrame
│ Row │ x1 │ x2 │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
-│ 1 │ 1 │ 2 │
+│ 1 │ 1 │ 3 │
+│ 2 │ 2 │ 4 │
+julia> select(df, :x1 => :a1, :x2 => :a2) # rename columns
+2×2 DataFrame
+│ Row │ a1 │ a2 │
+│ │ Int64 │ Int64 │
+├─────┼───────┼───────┤
+│ 1 │ 1 │ 3 │
+│ 2 │ 2 │ 4 │
+julia> select(df, :x1, :x2 => (x -> x .- minimum(x)) => :x2) # transform columns
+2×2 DataFrame
+│ Row │ x1 │ x2 │
+│ │ Int64 │ Int64 │
+├─────┼───────┼───────┤
+│ 1 │ 1 │ 0 │
+│ 2 │ 2 │ 1 │
+julia> select(df, :x2, :x2 => ByRow(sqrt)) # transform columns by row
+2×2 DataFrame
+│ Row │ x2 │ x2_sqrt │
+│ │ Int64 │ Float64 │
+├─────┼───────┼─────────┤
+│ 1 │ 3 │ 1.73205 │
+│ 2 │ 4 │ 2.0 │
```
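
As a reader's sketch (not part of this commit), the difference between a whole-column transformation and a `ByRow` transformation can be checked directly. It assumes DataFrames.jl at or after this commit is installed; the output column names `x2_centered` and `x2_sqrt` and the variable names are chosen here for illustration.

```julia
using DataFrames

# Same sample data as in the doctest above.
df = DataFrame(x1=[1, 2], x2=[3, 4], y=[5, 6])

# Whole-column transformation: the anonymous function receives the
# entire :x2 vector, so it can use column-level values such as minimum(x).
centered = select(df, :x2 => (x -> x .- minimum(x)) => :x2_centered)

# Row-wise transformation: ByRow(sqrt) applies sqrt to each element of :x2.
rowwise = select(df, :x2 => ByRow(sqrt) => :x2_sqrt)

# The same result written with explicit broadcasting inside the function.
manual = select(df, :x2 => (x -> sqrt.(x)) => :x2_sqrt)
```

Under these assumptions `rowwise` and `manual` hold the same values, which suggests reading `ByRow(f)` as shorthand for broadcasting `f` over the selected column.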

It is important to note that `select` always returns a data frame,
2 changes: 2 additions & 0 deletions src/DataFrames.jl
@@ -17,6 +17,7 @@ import DataAPI,
export AbstractDataFrame,
All,
Between,
+ByRow,
DataFrame,
DataFrame!,
DataFrameRow,
@@ -83,6 +84,7 @@ include("dataframerow/utils.jl")

include("other/broadcasting.jl")

+include("abstractdataframe/selection.jl")
include("abstractdataframe/iteration.jl")
include("abstractdataframe/join.jl")
include("abstractdataframe/reshape.jl")