New array conversion methods #9236

Merged: 24 commits, Oct 1, 2021
2922dd1
Add DataFrame.values_host and remove uses of DataFrame.as_matrix from…
vyasr Sep 10, 2021
853085e
Initial implementation of to_numpy and to_cupy.
vyasr Sep 10, 2021
baa873f
Add warning and fix comments.
vyasr Sep 14, 2021
05db299
Move all standard functions to frame.
vyasr Sep 14, 2021
59f944d
Add warning to Series.to_array.
vyasr Sep 14, 2021
f7054d7
Remove as_gpu_matrix from tests and fix some uncovered bugs.
vyasr Sep 14, 2021
77ae420
Centralize to_numpy and to_cupy logic.
vyasr Sep 14, 2021
da49be6
Delete docstrings for deprecated methods.
vyasr Sep 14, 2021
f70ab9e
Revert to calling the column methods directly for SingleColumnFrame.
vyasr Sep 14, 2021
3e6d6da
Replace to_array with to_numpy wherever easy in tests.
vyasr Sep 14, 2021
e1de7f9
Add support for null replacement and fix a few bugs.
vyasr Sep 15, 2021
4d1ac2a
Remove usage of default_na_value everywhere possible.
vyasr Sep 17, 2021
944704b
Rename default_na_value to _default_na_value.
vyasr Sep 17, 2021
047c2ca
Add comment for additional methods to be removed.
vyasr Sep 17, 2021
6fc2bbf
Add back accidentally remove default na and remove all possible refs …
vyasr Sep 17, 2021
b228483
Replace numerous uses of to_array with to_numpy.
vyasr Sep 17, 2021
46423f5
Address PR comments.
vyasr Sep 17, 2021
e808c90
Add more deprecation comments and fix issue in testing function.
vyasr Sep 21, 2021
603a8b6
Change exception thrown by StringColumn.values.
vyasr Sep 21, 2021
24724d9
Replace more instances of to_array in tests.
vyasr Sep 21, 2021
8ed6d00
Remove all remaining references to to_array in code.
vyasr Sep 21, 2021
bf898b2
Add proper implementation of find_common_type for categoricals.
vyasr Sep 21, 2021
22b961e
Convert columns to index before setting in _init_from_series_list.
vyasr Sep 22, 2021
8a62620
Address review comments.
vyasr Sep 30, 2021
Add DataFrame.values_host and remove uses of DataFrame.as_matrix from tests.
vyasr committed Sep 21, 2021
commit 2922dd1a7faedf876e652f47595d1955744c8906
12 changes: 12 additions & 0 deletions python/cudf/cudf/core/dataframe.py
@@ -1005,6 +1005,18 @@ def values(self):
         """
         return cupy.asarray(self.as_gpu_matrix())
 
+    @property
+    def values_host(self):
+        """
+        Return a NumPy representation of the data.
+
+        Returns
+        -------
+        out : numpy.ndarray
+            A host representation of the underlying data.
+        """
+        return self.values.get()
+
     def __array__(self, dtype=None):
         raise TypeError(
             "Implicit conversion to a host NumPy array via __array__ is not "
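The new property is a thin wrapper: `values` builds a device array and `values_host` copies it to the host by calling `.get()` on it. A minimal sketch of that delegation pattern follows. It uses hypothetical stand-in classes (`FakeDeviceArray`, `FrameSketch`) rather than cuDF or CuPy, so it runs without a GPU; nothing here is the actual cuDF implementation.

```python
class FakeDeviceArray:
    """Hypothetical stand-in for a device array; not part of cuDF or CuPy."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def get(self):
        # Mimics a device-to-host copy by returning a fresh copy of the rows.
        return [list(r) for r in self._rows]


class FrameSketch:
    """Hypothetical frame exposing the same two properties as the diff."""

    def __init__(self, columns):
        # columns: mapping of name -> list of values, all of equal length
        self._columns = columns

    @property
    def values(self):
        # Row-major "device" matrix: one row per record, one entry per column.
        return FakeDeviceArray(zip(*self._columns.values()))

    @property
    def values_host(self):
        # Same delegation as in the diff: host copy of whatever `values` returns.
        return self.values.get()


frame = FrameSketch({"keys": [1, 2, 3], "vals": [10, 20, 30]})
host = frame.values_host
```

Because `values_host` only forwards to `values`, any restriction on `values` (for example, columns that cannot be packed into a single array) would apply to the host variant as well.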
4 changes: 2 additions & 2 deletions python/cudf/cudf/tests/test_dataframe.py
@@ -291,7 +291,7 @@ def test_dataframe_basic():
     np.testing.assert_equal(df["vals"].to_array(), hvals)
 
     # As matrix
-    mat = df.as_matrix()
+    mat = df.values_host
 
     expect = np.vstack([hkeys, hvals]).T
 
@@ -1150,7 +1150,7 @@ def test_dataframe_hash_partition(nrows, nparts, nkeys):
     for p in got:
         if len(p):
             # Take rows of the keycolumns and build a set of the key-values
-            unique_keys = set(map(tuple, p.as_matrix(columns=keycols)))
+            unique_keys = set(map(tuple, p[keycols].values_host))
             # Ensure that none of the key-values have occurred in other groups
             assert not (unique_keys & part_unique_keys)
             part_unique_keys |= unique_keys
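The hash-partition test checks that no key tuple appears in more than one partition. A sketch of that check in plain Python is shown here, with nested lists standing in for the host arrays that `p[keycols].values_host` would return. The helper name `partitions_are_disjoint` and the sample data are illustrative assumptions, not taken from the PR.

```python
def partitions_are_disjoint(partitions):
    """Return True if no key row appears in more than one partition.

    `partitions` is a list of 2-D row lists, a stand-in for the host
    arrays produced per partition in the real test.
    """
    seen = set()
    for rows in partitions:
        if not len(rows):
            continue  # empty partitions carry no keys
        # Rows must be hashable to go into a set, hence the tuple conversion.
        unique_keys = set(map(tuple, rows))
        if unique_keys & seen:
            return False
        seen |= unique_keys
    return True


good = [[[1, 2], [1, 2]], [[3, 4]], []]
bad = [[[1, 2]], [[3, 4], [1, 2]]]
```

Duplicate keys inside a single partition are allowed; only keys shared across partitions make the check fail.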
4 changes: 2 additions & 2 deletions python/cudf/cudf/tests/test_onehot.py
@@ -31,7 +31,7 @@ def test_onehot_simple():
     assert df2.columns[0] == "vals"
     for i in range(1, len(df2.columns)):
         assert df2.columns[i] == "vals_%s" % (i - 1)
-    got = df2.as_matrix(columns=df2.columns[1:])
+    got = df2[df2.columns[1:]].values_host
     expect = np.identity(got.shape[0])
     np.testing.assert_equal(got, expect)
 
@@ -45,7 +45,7 @@ def test_onehot_random():
     df2 = df.one_hot_encoding(
         column="src", prefix="out_", cats=tuple(range(10, 17))
     )
-    mat = df2.as_matrix(columns=df2.columns[1:])
+    mat = df2[df2.columns[1:]].values_host
 
     for val in range(low, high):
         colidx = val - low
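`test_onehot_simple` compares the one-hot columns against an identity matrix. That expectation holds whenever row i contains exactly the i-th category, which the following plain-Python sketch illustrates. The helpers `one_hot` and `identity` are hypothetical and only mirror the shape of the comparison; they are not cuDF functions.

```python
def one_hot(values, cats):
    """Hypothetical helper: one row per value, one 0/1 column per category."""
    return [[1.0 if v == c else 0.0 for c in cats] for v in values]


def identity(n):
    """n x n identity matrix as nested lists (plain-Python analogue of np.identity)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


cats = list(range(5))
got = one_hot(cats, cats)  # row i holds category i
expect = identity(len(got))
```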