-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DataFrame] Sample implement #1954
[DataFrame] Sample implement #1954
Conversation
a45fcf8
to
0519e3a
Compare
Test FAILed. |
Jenkins, retest this please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing I would add is some extra comments when the logic gets a bit hard to follow. Looks great!
python/ray/dataframe/dataframe.py
Outdated
for sample in samples] | ||
|
||
if axis == 1: | ||
new_cols = [_deploy_func.remote(lambda df: df.iloc[:, [tup[1]]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here about the logic. What is tup[0]
and tup[1]
, etc.
Test PASSed. |
Test FAILed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't put it everywhere, but in a few of the comments, there is only a description of what is happening, but that should be evident from the code itself. I would prefer the comments to say why something is the way that it is.
python/ray/dataframe/dataframe.py
Outdated
|
||
if weights is not None: | ||
|
||
# If a series, align with frame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer for the comments to say why something was done, rather than what. I think that if you say why it's important to align with frame, that would be more helpful in reading the code..
python/ray/dataframe/dataframe.py
Outdated
if isinstance(weights, pd.Series): | ||
weights = weights.reindex(self.axes[axis]) | ||
|
||
# Strings acceptable if a dataframe and axis = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
python/ray/dataframe/dataframe.py
Outdated
raise ValueError("weight vector many not include negative " | ||
"values") | ||
|
||
# If has nan, set to zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
python/ray/dataframe/dataframe.py
Outdated
# If has nan, set to zero. | ||
weights = weights.fillna(0) | ||
|
||
# Renormalize if don't sum to 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
python/ray/dataframe/dataframe.py
Outdated
weights = weights.values | ||
|
||
if n is None and frac is None: | ||
# default to n = 1 if n and frac are None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
python/ray/dataframe/dataframe.py
Outdated
weights = weights.fillna(0) | ||
|
||
# Renormalize if don't sum to 1 | ||
if weights.sum() != 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sum weights once and use the variable for these two if statements and rebalance
Test PASSed. |
python/ray/dataframe/dataframe.py
Outdated
weights_sum = weights.sum() | ||
if weights_sum != 1: | ||
if weights_sum != 0: | ||
weights = weights / weights.sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weights.sum()
-> weights_sum
Test PASSed. |
* 'master' of https://github.com/ray-project/ray: [rllib] Fix broken link in docs (ray-project#1967) [DataFrame] Sample implement (ray-project#1954) [DataFrame] Implement Inter-DataFrame operations (ray-project#1937) remove UniqueIDHasher (ray-project#1957) [rllib] Add DDPG documentation, rename DDPG2 <=> DDPG (ray-project#1946) updates (ray-project#1958) Pin Cython in autoscaler development example. (ray-project#1951) Incorporate C++ Buffer management and Seal global threadpool fix from arrow (ray-project#1950) [XRay] Add consistency check for protocol between node_manager and local_scheduler_client (ray-project#1944) Remove smart_open install. (ray-project#1943) [DataFrame] Fully implement append, concat and join (ray-project#1932) [DataFrame] Fix for __getitem__ string indexing (ray-project#1939) [DataFrame] Implementing write methods (ray-project#1918) [rllib] arr[end] was excluded when end is not None (ray-project#1931) [DataFrame] Implementing API correct groupby with aggregation methods (ray-project#1914)
Implements
sample()
function for ray dataframes