Rename ignoring not found #2964

Open
chris-b1 opened this issue Dec 13, 2021 · 2 comments


For the mapping overloads of rename / rename! - I've at times wanted to ignore columns that aren't found.

  rename(df::AbstractDataFrame, (from => to)::Pair...)
  rename(df::AbstractDataFrame, d::AbstractDict)
  rename(df::AbstractDataFrame, d::AbstractVector{<:Pair})

I think an optional kwarg would be a reasonable extension of the API (I'm not sure about the name):

julia> df = DataFrame(; a=[1, 2, 3], b=[4, 5, 6])
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6


julia> rename(df, :a => :column_1, :c => :column_2)
ERROR: ArgumentError: Tried renaming :c to :column_2, when :c does not exist in the Index.

julia> rename(df, :a => :column_1, :c => :column_2; ignorenotfound=true)
3×2 DataFrame
 Row │ column_1  b
     │ Int64     Int64
─────┼─────────────────
   1 │        1      4
   2 │        2      5
   3 │        3      6
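
Until such a kwarg exists, the same effect can be approximated by dropping the pairs whose source column is missing before calling `rename`. The following is only a minimal sketch: `rename_existing` is a hypothetical helper name and is not part of DataFrames.jl.

```julia
using DataFrames

# Hypothetical helper (not a DataFrames.jl function): keep only the pairs
# whose source column is present in `df`, then delegate to `rename`.
function rename_existing(df::AbstractDataFrame, pairs::Pair...)
    present = Set(names(df))   # column names as Strings
    kept = [p for p in pairs if string(first(p)) in present]
    # Nothing to rename: return a copy, since `rename` also returns a new object.
    isempty(kept) && return copy(df)
    return rename(df, kept)    # uses the AbstractVector{<:Pair} overload
end

df = DataFrame(; a=[1, 2, 3], b=[4, 5, 6])

# :c is not a column of df, so that pair is skipped instead of raising an error.
renamed = rename_existing(df, :a => :column_1, :c => :column_2)
```

Here `names(renamed)` is `["column_1", "b"]` and `df` itself is left unchanged.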

bkamins added this to the 1.x milestone Dec 13, 2021
bkamins (Member) commented Dec 13, 2021

This request comes up occasionally for several functions (I would need to think a bit to put together a list of them). We can consider adding it, as I can see the rationale. If I understand correctly, you have many data frames with different column names and a fixed set of operations you want to apply, and sometimes an operation does not apply because the data frame lacks some column. Is that right?

chris-b1 (Author)

Thanks. Yep, in particular for me it is a messy data / evolving schema situation.

E.g., data from Jan-Feb has columns [:a, :b, :c], data from Mar has [:a, :b, :c, :d] and so on - I'd like one data processing pipeline that can handle any of the historical files.
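
For a pipeline like this, one possible workaround with the existing API is the function form `rename(f, df)`, which calls `f` on every column name (passed as a `String`), so names without a mapping can simply be returned unchanged. This is a sketch under assumptions: the mapping and the `normalize` helper are illustrative names, not something proposed in this issue.

```julia
using DataFrames

# Illustrative mapping; "d" only exists in the newer files.
mapping = Dict("a" => "column_1", "d" => "column_4")

# Hypothetical helper: rename every column that has an entry in `mapping`
# and leave all other columns as they are.
normalize(df::AbstractDataFrame, mapping::AbstractDict) =
    rename(name -> get(mapping, name, name), df)

jan = DataFrame(; a=[1, 2], b=[3, 4], c=[5, 6])           # older schema
mar = DataFrame(; a=[1, 2], b=[3, 4], c=[5, 6], d=[7, 8]) # newer schema

jan_out = normalize(jan, mapping)
mar_out = normalize(mar, mapping)
```

The same call handles both schemas: `jan_out` has columns `column_1`, `b`, `c`, and `mar_out` additionally has `column_4`. The trade-off is that a typo in the mapping is silently ignored as well, which is the behavior an explicit kwarg would make opt-in.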
