Warn & replace dataframes with non-unique indexes #691

Merged
63 commits
5b90244
Add unittest for issue #686
dagardner-nv Feb 9, 2023
1f168a3
wip
dagardner-nv Feb 10, 2023
f38a07f
wip
dagardner-nv Feb 10, 2023
eca479a
Add 'has_unique_index' helper method to MessageMeta
dagardner-nv Feb 10, 2023
d824440
Add integration test for desrialization stage, along with test for is…
dagardner-nv Feb 10, 2023
5bf99bf
Test for has_unique_index method
dagardner-nv Feb 10, 2023
3add91b
Remove parametrize variables not needed for this test
dagardner-nv Feb 10, 2023
e3be4cf
First pass at replacing a non-unique index
dagardner-nv Feb 10, 2023
e43ac89
Add cpp impl for has_unique_index
dagardner-nv Feb 10, 2023
e60742c
wip
dagardner-nv Feb 10, 2023
53ee170
Move index reset to MutableTableInfo so that the column & index names…
dagardner-nv Feb 10, 2023
3fd0ea3
use logger.warning instead of logger.warn
dagardner-nv Feb 10, 2023
c651744
Update multi-segment test
dagardner-nv Feb 10, 2023
1d41fd6
Select only the columns in the view when writing json
dagardner-nv Feb 10, 2023
f9396be
Log and ignore include_index_col=false, otherwise cudf will throw an …
dagardner-nv Feb 11, 2023
fb141c9
wip
dagardner-nv Feb 11, 2023
c77360f
Document work-around
dagardner-nv Feb 11, 2023
ef3eb30
Fix casing for cuDF
dagardner-nv Feb 13, 2023
0554785
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 13, 2023
ae7d4af
Change fatal log to an error log
dagardner-nv Feb 13, 2023
ccc6e6c
Only set include_index_col=False when writing CSV
dagardner-nv Feb 13, 2023
d9669e2
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 14, 2023
bf4d4e6
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 15, 2023
7acff20
wip
dagardner-nv Feb 15, 2023
2a14392
Move index reset logic to a method on MessageMeta
dagardner-nv Feb 15, 2023
123d216
Merge branch 'david-warn-non-unique-686' of github.com:dagardner-nv/M…
dagardner-nv Feb 15, 2023
aad70ca
Repeat test with dup id occurring at the front and the end of the df
dagardner-nv Feb 15, 2023
d63bdcc
Only use the index for slicing if the index is unique, otherwise use …
dagardner-nv Feb 15, 2023
a46d43e
Add test for replace_non_unique_index method
dagardner-nv Feb 15, 2023
4715a5d
rename reset_index to replace_non_unique_index
dagardner-nv Feb 15, 2023
829cc3d
Remove unused import
dagardner-nv Feb 15, 2023
50dcce7
Add missing docstring
dagardner-nv Feb 15, 2023
a0f1388
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 16, 2023
09c1e0f
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 22, 2023
7c2f770
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 22, 2023
6ad14c8
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 23, 2023
ec0ed04
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 24, 2023
22daccd
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 25, 2023
1f35d00
Add missing includes
dagardner-nv Feb 25, 2023
7ad8d2d
Cleanup includes
dagardner-nv Feb 25, 2023
e4976db
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Feb 27, 2023
0b2d13c
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Mar 7, 2023
4b99d1e
Merge branch 'branch-23.03' into david-warn-non-unique-686
dagardner-nv Mar 7, 2023
482fd45
Adding additional tests to MultiMessage and fixing the bugs it discovers
mdemoret-nv Mar 10, 2023
bb54dad
All multi message tests passing
mdemoret-nv Mar 14, 2023
d4b8761
Most tests now passing
mdemoret-nv Mar 14, 2023
c002a93
Merge branch 'branch-23.03' into david-warn-non-unique-686
mdemoret-nv Mar 15, 2023
f4fb726
Removing files that should not have been committed
mdemoret-nv Mar 15, 2023
51e4e71
Removing stub generation
mdemoret-nv Mar 15, 2023
76921d3
Fixing up post merge failures
mdemoret-nv Mar 15, 2023
65e7edb
Large cleanup and added multi tensor tests
mdemoret-nv Mar 16, 2023
b55f50d
Merge branch 'branch-23.03' into david-warn-non-unique-686
mdemoret-nv Mar 16, 2023
4e92c8b
Style cleanup
mdemoret-nv Mar 16, 2023
68ff815
Merge branch 'branch-23.03' into david-warn-non-unique-686
mdemoret-nv Mar 16, 2023
77e2db0
Cleaning up the code
mdemoret-nv Mar 16, 2023
1ac0c6a
Large cleanup
mdemoret-nv Mar 16, 2023
39beb1f
Non-slow tests passing
mdemoret-nv Mar 17, 2023
42a70b9
Large cleanup. All tests passing locally
mdemoret-nv Mar 17, 2023
1cfa57d
Merge branch 'branch-23.03' into david-warn-non-unique-686
mdemoret-nv Mar 17, 2023
5bf02e9
Removing stubs from the build in CI
mdemoret-nv Mar 17, 2023
345fa78
IWYU fixes
mdemoret-nv Mar 17, 2023
365f583
Final changes to get CI to pass
mdemoret-nv Mar 17, 2023
1d9fe36
Style fixes
mdemoret-nv Mar 17, 2023
use logger.warning instead of logger.warn
dagardner-nv committed Feb 10, 2023
commit 3fd0ea3a83a6fdd4cca85865f75efb30a2d7893c
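For context on this commit: in the Python standard library, `Logger.warn` is a deprecated alias of `Logger.warning` and raises a `DeprecationWarning` when called. The snippet below is a small standalone sketch of that behavior, not code from this PR; the logger name `example` and the helper `emits_deprecation` are made up for illustration.

```python
import logging
import warnings

# Hypothetical logger used only for this illustration.
logger = logging.getLogger("example")
logger.addHandler(logging.NullHandler())
logger.propagate = False  # keep the demo quiet


def emits_deprecation(log_call) -> bool:
    """Return True if calling log_call triggers a DeprecationWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        log_call("Non unique index found in dataframe, generating new index.")
    return any(issubclass(w.category, DeprecationWarning) for w in caught)


# The supported spelling does not warn.
warning_is_deprecated = emits_deprecation(logger.warning)

# The old alias may be absent on some Python versions, so look it up defensively.
legacy_warn = getattr(logger, "warn", None)
warn_is_deprecated = emits_deprecation(legacy_warn) if legacy_warn else None
```

Switching to `logger.warning` keeps the log output identical while avoiding the deprecation notice.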
2 changes: 1 addition & 1 deletion morpheus/stages/preprocess/deserialize_stage.py
@@ -86,7 +86,7 @@ def process_dataframe(x: MessageMeta, batch_size: int) -> typing.List[MultiMessa
 
         """
         if (not x.has_unique_index()):
-            logger.warn("Non unique index found in dataframe, generating new index.")
+            logger.warning("Non unique index found in dataframe, generating new index.")
             # Reset the index preserving the original index in a new column
             with x.mutable_dataframe() as df:
                 df.index.name = "_index_" + (df.index.name or "")
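The remainder of this hunk is collapsed, so the exact reset step is not visible here. As a rough sketch of the idea described by the comment (keep the original index in a new column, then generate a fresh unique index), the following uses plain pandas rather than `MessageMeta`. The function name `replace_non_unique_index_sketch` is hypothetical, and the `reset_index` call is an assumption about what follows the rename.

```python
import pandas as pd


def replace_non_unique_index_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative only: return a frame with a unique index, keeping the old index as a column."""
    if df.index.is_unique:
        return df

    # Same naming rule as the diff: prefix the old index name with "_index_".
    preserved_name = "_index_" + (df.index.name or "")

    # Assumption: the old index is moved into a column and replaced by a new RangeIndex.
    return df.rename_axis(preserved_name).reset_index()


# A frame whose index contains a duplicate label (1 appears twice).
df = pd.DataFrame({"v": [10, 20, 30]}, index=[0, 1, 1])
fixed = replace_non_unique_index_sketch(df)
```

After the call, `fixed` has a unique index and the original labels are still available in the `_index_` column.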