within_range generates a long output, possibly it could be much shorter #25

Open
ianozsvald opened this issue Aug 8, 2015 · 1 comment

Comments

@ianozsvald

within_range reports the truth value of the range test for every row in the column it checks. For a longer DataFrame (e.g. the 891-row Titanic dataset) with a single violating row, you get a long list of False values that hides the one True row. Could the report instead summarise only the rows that violate the constraint?

Current:

import numpy as np
import pandas as pd
import engarde.checks as ck
df = pd.DataFrame(np.random.randn(4, 2))
ck.within_range(df, {0: (0, 10)})
AssertionError: ('Outside range', 0    False
1    False
2    False
3     True
Name: 0, dtype: bool)

Suggested:

AssertionError: ('Outside range', 
3     True
Name: 0, dtype: bool)

Possibly the .sum() of the result column could also be included to report the number of violations, in case that number is very large?
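To make the suggestion concrete, here is a minimal sketch of what a summarising check might look like. It is not engarde's actual implementation: `within_range_summary` is a hypothetical name, and the choice to put the violation count into the exception arguments is an assumption made for illustration. It uses only standard pandas boolean indexing.

```python
import pandas as pd


def within_range_summary(df, items):
    """Hypothetical range check that reports only the violating rows.

    `items` maps a column label to a (lower, upper) tuple, mirroring the
    call shown above. This is a sketch, not engarde's own code.
    """
    for col, (lower, upper) in items.items():
        # True marks a row whose value falls outside [lower, upper].
        outside = (df[col] < lower) | (df[col] > upper)
        if outside.any():
            # Keep only the True entries and report how many there are.
            raise AssertionError(
                "Outside range", int(outside.sum()), outside[outside]
            )
    return df


# Fixed values instead of random ones, so the violating row is known.
df = pd.DataFrame({0: [1.0, 2.0, 3.0, -0.5]})
try:
    within_range_summary(df, {0: (0, 10)})
except AssertionError as err:
    message, n_violations, violating_rows = err.args
```

Filtering the mask with `outside[outside]` keeps the report short however long the DataFrame is, and the count still conveys the scale of the problem when many rows fail.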

@ianozsvald

(sidenote - I'm not using this in my video series, I'm noting this just as a possible-future-tweak)
