-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
REF: Separate out DataFrame/Series Construction Helpers #24100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hello @jbrockmendel! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #24100 +/- ##
===========================================
- Coverage 92.21% 42.53% -49.68%
===========================================
Files 161 162 +1
Lines 51684 51713 +29
===========================================
- Hits 47658 21998 -25660
- Misses 4026 29715 +25689
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #24100 +/- ##
==========================================
- Coverage 92.21% 92.2% -0.01%
==========================================
Files 161 162 +1
Lines 51684 51700 +16
==========================================
+ Hits 47658 47672 +14
- Misses 4026 4028 +2
Continue to review full report at Codecov.
|
jreback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these can be done in a followup along with renaming the functions to be more consistent with what they do, e.g. init_ndarray, should really be: create_block_manager_from_ndarray (though we have a slight conflict on that name). but for sure need to separate out public / private things.
|
|
||
| return create_block_manager_from_blocks([values], [columns, index]) | ||
| return init_ndarray(values, index, columns, dtype=dtype, copy=copy) | ||
| # TODO: can we just get rid of this as a method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes i would just call these directly (your new routines)
| arrays = [data[k] for k in keys] | ||
|
|
||
| return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype) | ||
| return init_dict(data, index, columns, dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as below, yes)
| @@ -0,0 +1,699 @@ | |||
| """ | |||
| Functions for preparing various inputs passed to the DataFrame or Series | |||
| constructors before passing them to aBlockManager. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space before BlockManager
|
thanks. prob should open an issue with a todo list for followups. |
commit 28c61d770f6dfca6857fd0fa6979d4119a31129e
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 12:18:19 2018 -0600
uncomment
commit bae2e322523efc73a1344464f51611e2dc555ccb
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 12:17:09 2018 -0600
maybe fixes
commit 6cb4db05c9d6ceba3794096f0172cae5ed5f6019
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 09:57:37 2018 -0600
we back
commit d97ab57fb32cb23371169d9ed659ccfac34cfe45
Merge: a117de4 b78aa8d
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 09:51:51 2018 -0600
Merge remote-tracking branch 'upstream/master' into disown-tz-only-rebased2
commit b78aa8d
Author: gfyoung <gfyoung17+GitHub@gmail.com>
Date: Thu Dec 6 07:18:44 2018 -0500
REF/TST: Add pytest idiom to reshape/test_tile (pandas-dev#24107)
commit 2993b8e
Author: gfyoung <gfyoung17+GitHub@gmail.com>
Date: Thu Dec 6 07:17:55 2018 -0500
REF/TST: Add more pytest idiom to scalar/test_nat (pandas-dev#24120)
commit b841374
Author: evangelineliu <hsiyinliu@gmail.com>
Date: Wed Dec 5 18:21:46 2018 -0500
BUG: Fix concat series loss of timezone (pandas-dev#24027)
commit 4ae63aa
Author: jbrockmendel <jbrockmendel@gmail.com>
Date: Wed Dec 5 14:44:50 2018 -0800
Implement DatetimeArray._from_sequence (pandas-dev#24074)
commit 2643721
Author: jbrockmendel <jbrockmendel@gmail.com>
Date: Wed Dec 5 14:43:45 2018 -0800
CLN: Follow-up to pandas-dev#24100 (pandas-dev#24116)
commit 8ea7744
Author: chris-b1 <cbartak@gmail.com>
Date: Wed Dec 5 14:21:23 2018 -0600
PERF: ascii c string functions (pandas-dev#23981)
commit cb862e4
Author: jbrockmendel <jbrockmendel@gmail.com>
Date: Wed Dec 5 12:19:46 2018 -0800
BUG: fix mutation of DTI backing Series/DataFrame (pandas-dev#24096)
commit aead29b
Author: topper-123 <contribute@tensortable.com>
Date: Wed Dec 5 19:06:00 2018 +0000
API: rename MultiIndex.labels to MultiIndex.codes (pandas-dev#23752)
Why? In implementing #24096 I found it tough to tell all the paths by which a DatetimeIndex get passed to a DataFrame or Series. Collecting all these helper functions is a step towards reducing the number of paths available so these things can be caught in one place.
The main thing this PR does is move helper functions from core.series and core.frame. A few ancillary things it does
_try_castand_get_axesare de-nested