Add UniformEncoder (and its ordered version) #681

Merged (10 commits) on Aug 14, 2023
3 changes: 3 additions & 0 deletions rdt/transformers/categorical.py
@@ -264,7 +264,10 @@ def _fit(self, data):
else:
freq = data.value_counts(normalize=True, dropna=False)

nan_value = freq[np.nan] if np.nan in freq.index else None
Member

I'm not sure this works with other kinds of NaN, like float('nan').

Contributor Author

Maybe, but this shouldn't happen because freq is defined by data.value_counts(), no?

Member

I was concerned about freq.index containing NaNs that are not np.nan. I'm not sure that's impossible.
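
For reference, one way to sidestep the question is to look up the missing-value frequency with a mask instead of by the np.nan label. This is only a sketch: nan_frequency is a hypothetical helper, not part of RDT, and it assumes that every missing-like index label (np.nan, float('nan'), None) should count toward the same NaN frequency.

```python
import numpy as np
import pandas as pd


def nan_frequency(freq):
    """Return the summed frequency of all missing index labels, or None if there are none.

    Hypothetical helper: it relies on pd.isna rather than on which NaN
    object happens to be stored in freq.index.
    """
    mask = pd.isna(freq.index)  # boolean array, True for NaN/None labels
    if not mask.any():
        return None  # same fallback as the original one-liner
    return float(freq[mask].sum())


# Object column mixing None and float('nan') as missing values.
data = pd.Series(['a', 'b', None, float('nan')], dtype=object)
freq = data.value_counts(normalize=True, dropna=False)
nan_value = nan_frequency(freq)

# Float column where the missing values are distinct NaN objects.
float_data = pd.Series([1.0, np.nan, float('nan'), 2.0])
float_nan_value = nan_frequency(float_data.value_counts(normalize=True, dropna=False))

# Column with no missing values at all.
no_nan_value = nan_frequency(pd.Series(['a', 'b']).value_counts(normalize=True, dropna=False))
```

Whether value_counts keeps None and NaN as separate index entries or merges them, the mask sums them, so the result doesn't depend on that detail.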

freq = freq.reindex(self.order).array
amontanez24 marked this conversation as resolved.
freq[np.isnan(freq)] = nan_value

self.frequencies, self.intervals = self._compute_frequencies_intervals(self.order, freq)

def _transform(self, data):