Skip to content
This repository was archived by the owner on Sep 3, 2025. It is now read-only.

Conversation

@wssheldon
Copy link
Contributor

No description provided.

@wssheldon wssheldon requested a review from mvilanova May 1, 2023 15:44
@wssheldon wssheldon added the enhancement New feature or request label May 1, 2023
Comment on lines 545 to 564
else:
# Apply the default deduplication rule if there's no deduplication rule set on the signal
# and the signal instance is not snoozed
if not has_dedup_filter and not filtered:
default_dedup_window = datetime.now(timezone.utc) - timedelta(hours=1)
default_dedup_query = (
db_session.query(SignalInstance)
.filter(
SignalInstance.signal_id == signal_instance.signal_id,
SignalInstance.created_at >= default_dedup_window,
SignalInstance.id != signal_instance.id,
)
.order_by(asc(SignalInstance.created_at))
.all()
)

if default_dedup_query:
signal_instance.case_id = default_dedup_query[0].case_id
signal_instance.filter_action = SignalFilterAction.deduplicate
filtered = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think indentation for this code block is wrong?


# Check if there's a deduplication rule set on the signal
has_dedup_filter = any(
f.action == SignalFilterAction.deduplicate for f in signal_instance.signal.filters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to loop through all the signal_instance.signal.filters here if we do it again below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Barring any refactor of this function, I believe the check is necessary to handle the case of deduplication filters that don't match the signal instance (they will be filtered by the default rule, which is not intended). For example:

    else:
        # Apply the default deduplication rule if there's no deduplication rule set on the signal
        # and the signal instance is not snoozed
        if not filtered:
            default_dedup_window = datetime.now(timezone.utc) - timedelta(hours=1)
            default_dedup_query = (
                db_session.query(SignalInstance)
                .filter(
                    SignalInstance.signal_id == signal_instance.signal_id,
                    SignalInstance.created_at >= default_dedup_window,
                    SignalInstance.id != signal_instance.id,
                )
                .order_by(asc(SignalInstance.created_at))
                .all()
            )

            if default_dedup_query:
                signal_instance.case_id = default_dedup_query[0].case_id
                signal_instance.filter_action = SignalFilterAction.deduplicate
                filtered = True

Results in:

=============================================================================================================================================================================================== FAILURES ===============================================================================================================================================================================================
__________________________________________________________________________________________________________________________________________________________________________ test_filter_actions_deduplicate_different_entities __________________________________________________________________________________________________________________________________________________________________________
Traceback (most recent call last):
  File "/Users/wshel/Projects/dispatch/tests/signal/test_signal_service.py", line 149, in test_filter_actions_deduplicate_different_entities
    assert not filter_signal(db_session=session, signal_instance=signal_instance_1)
AssertionError: assert not True
 +  where True = <function filter_signal at 0x166560b80>(db_session=<sqlalchemy.orm.session.Session object at 0x1606eeb90>, signal_instance=<SignalInstance #32345795-3f52-4b17-b71f-3d671ad7ce0e>)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/Users/wshel/.pyenv/versions/3.11.2/envs/dispatch-prod/lib/python3.11/site-packages/sqlalchemy/orm/query.py:2575: SAWarning: Pathed join target SignalInstance.entities has already been joined to; skipping
  util.warn(
_______________________________________________________________________________________________________________________________________________________________________ test_filter_actions_deduplicate_different_entities_types _______________________________________________________________________________________________________________________________________________________________________
Traceback (most recent call last):
  File "/Users/wshel/Projects/dispatch/tests/signal/test_signal_service.py", line 207, in test_filter_actions_deduplicate_different_entities_types
    assert not filter_signal(db_session=session, signal_instance=signal_instance_1)
AssertionError: assert not True
 +  where True = <function filter_signal at 0x166560b80>(db_session=<sqlalchemy.orm.session.Session object at 0x1606eeb90>, signal_instance=<SignalInstance #ded5b413-b982-4e3e-99e2-0059988f2b17>)
======================================================================================================================================================================================= short test summary info ========================================================================================================================================================================================
FAILED tests/signal/test_signal_service.py::test_filter_actions_deduplicate_different_entities - assert not True
FAILED tests/signal/test_signal_service.py::test_filter_actions_deduplicate_different_entities_types - assert not True

signal_instance.filter_action = SignalFilterAction.deduplicate
filtered = True
break
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signal_instances = (
db_session.query(SignalInstance)
.filter(SignalInstance.project_id == project.id)
.filter(SignalInstance.filter_action == None) # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter_action gets set to none as string, but we seem to compare it to the None keyword, which SQLAlchemy converts it to NULL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the default value of filter_action in SignalInstanceBase to none?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like filter_action: SignalFilterAction = SignalFilterAction.none?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then update this filter to compare with SignalFilterAction.none?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow it still works with the None here, but it's very confusing, I'll update it to work this way.

Copy link
Contributor Author

@wssheldon wssheldon May 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we can't update this without making things complicated. Since we have set un filtered instances to SignalFilterAction.none here

def filter_signal(*, db_session: Session, signal_instance: SignalInstance) -> bool:
    ...
    if not filtered:
        signal_instance.filter_action = SignalFilterAction.none

All of the existing instances in the database (that were not filtered), without a case_id, will be re-processed.

It's probably smart to add a boundary on time created here know that I see this.

@wssheldon wssheldon merged commit a098a3a into master May 1, 2023
@wssheldon wssheldon deleted the enhancement/default-dedupe branch May 1, 2023 23:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants