Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate get and haskey for AbstractDataFrame #1836

Merged
merged 5 commits into from
Jun 3, 2019

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented May 31, 2019

I propose to deprecate get method for AbstractDataFrame. I think it is not used and in general it is confusing to have it since we moved from AbstractDict interpretation.

@nalimilan
Copy link
Member

Shouldn't we also deprecate haskey? It's useful, but it's quite inconsistent since we don't define keys? Or maybe we should define keys too. That could make sense if we decide to keep allowing df[col], but it could also be confusing since a data frame is two-dimensional in most situations.

@bkamins
Copy link
Member Author

bkamins commented Jun 1, 2019

This was exactly my thinking that haskey is useful (as opposed to get).

We could rename haskey to hascol to make it more explicit and restrict its signature to allow only Symbol or Integer. Then we could decide if we go back to #1712 and decide if we want columnindex (or colindex if we prefer) as this would be a complementary function to hascol.

@nalimilan
Copy link
Member

Could we just deprecate haskey(n, df) in favor of n in names(df) for now? Adding hascol is an interesting possibility, but that would only make sense if we also add a function to access columns like getcol or column. Better not rush that decision.

@bkamins
Copy link
Member Author

bkamins commented Jun 1, 2019

I could do it, but the problem is that n in names(df) will be much slower for two reasons:

  • names makes a copy
  • it will be a linear search, while haskey uses a dict inbuilt into Index

In short - if we do it, then it should be only a temporary decision with the plan to provide a first class function that checks if some column exists in a data frame.

How do you see this?

@nalimilan
Copy link
Member

Yes, it's much slower, but do we expect it to be used in performance-critical paths?

@bkamins
Copy link
Member Author

bkamins commented Jun 1, 2019

Yes - I just wanted to make sure that we note down the consequences.

@bkamins
Copy link
Member Author

bkamins commented Jun 1, 2019

I have deprecated haskey.

@bkamins bkamins merged commit 41acf5d into JuliaData:master Jun 3, 2019
@bkamins bkamins deleted the deprecate_get branch June 3, 2019 10:02
@nalimilan nalimilan changed the title Deprecate get for AbstractDataFrame Deprecate get and haskey for AbstractDataFrame Jun 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants