Do type conversions on a copy of the metadata
This fixes the unexpected behavior described in the previous commit.
victorlin committed Aug 12, 2023
1 parent b5a77e1 commit 9f9bc4f
Showing 2 changed files with 11 additions and 6 deletions.
9 changes: 6 additions & 3 deletions augur/filter/include_exclude_rules.py
@@ -184,11 +184,14 @@ def filter_by_query(metadata, query) -> FilterFunctionReturn:
     set()
     """
+    # Create a copy to prevent modification of the original DataFrame.
+    metadata_copy = metadata.copy()
+
     # Try converting all columns to numeric.
-    for column in metadata.columns:
-        metadata[column] = pd.to_numeric(metadata[column], errors='ignore')
+    for column in metadata_copy.columns:
+        metadata_copy[column] = pd.to_numeric(metadata_copy[column], errors='ignore')
 
-    return set(metadata.query(query).index.values)
+    return set(metadata_copy.query(query).index.values)
 
 
 def filter_by_ambiguous_date(metadata, date_column, ambiguity) -> FilterFunctionReturn:
@@ -13,12 +13,14 @@ Create metadata TSV file for testing.
   > ~~
 
 Confirm that `--exclude-ambiguous-dates-by` works for all year only ambiguous dates.
-This currently fails because the metadata DataFrame is modified in-place.
 
   $ ${AUGUR} filter \
   > --metadata metadata.tsv \
   > --query 'region=="Asia"' \
   > --exclude-ambiguous-dates-by any \
   > --empty-output-reporting silent \
-  > --output-strains filtered_strains.txt > /dev/null 2>&1
-  [2]
+  > --output-strains filtered_strains.txt
+  4 strains were dropped during filtering
+  \t1 of these were filtered out by the query: "region=="Asia"" (esc)
+  \t3 of these were dropped because of their ambiguous date in any (esc)
+  0 strains passed all filters
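
The effect of this change can be illustrated outside of augur with a small sketch. Everything below is hypothetical and only for illustration: the sample metadata, the helper `to_numeric_if_possible` (a stand-in for `pd.to_numeric(..., errors='ignore')`, which newer pandas releases deprecate), and the function names `query_in_place` and `query_on_copy`. The likely reason the later date filter failed is that a year-only date such as "2020" had been turned into a number in the shared DataFrame, but that is an inference from the test, not something the commit states.

```python
import pandas as pd


def to_numeric_if_possible(series: pd.Series) -> pd.Series:
    """Return a numeric version of the series, or the series unchanged if conversion fails."""
    try:
        return pd.to_numeric(series)
    except (ValueError, TypeError):
        return series


def query_in_place(metadata: pd.DataFrame, query: str) -> set:
    # Behavior before the commit: converted columns are written back
    # into the caller's DataFrame, so later filters see the new dtypes.
    for column in metadata.columns:
        metadata[column] = to_numeric_if_possible(metadata[column])
    return set(metadata.query(query).index.values)


def query_on_copy(metadata: pd.DataFrame, query: str) -> set:
    # Behavior after the commit: conversions happen on a copy only.
    metadata_copy = metadata.copy()
    for column in metadata_copy.columns:
        metadata_copy[column] = to_numeric_if_possible(metadata_copy[column])
    return set(metadata_copy.query(query).index.values)


def make_metadata() -> pd.DataFrame:
    # Hypothetical sample data with year-only dates stored as strings.
    return pd.DataFrame(
        {"region": ["Asia", "Asia", "Europe"], "date": ["2020", "2019", "2018"]},
        index=pd.Index(["A", "B", "C"], name="strain"),
    )


original = make_metadata()
kept_copy = query_on_copy(original, 'region == "Asia"')
dates_after_copy = list(original["date"])  # still strings

mutated = make_metadata()
kept_in_place = query_in_place(mutated, 'region == "Asia"')
dates_after_in_place = list(mutated["date"])  # now integers
```

Both functions return the same set of strains for the query. The difference is only in what the caller's DataFrame looks like afterwards, which is what matters to any filter that runs next.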
