Checklist
I have included information about relevant versions
I have verified that the issue persists when using the master branch of Faust.
Steps to reproduce
Pretty much any time you restart a worker, it replays the last message it received. So if a worker processed messages [0,1,2,3,4] and then restarts, it will re-process item 4. This messes up any analytics that are based on stateful counts. With a trivial case of incrementing a counter in a table, this can be consistently reproduced by simply stopping and starting a worker and seeing the count for the last id continue to increment even though no new messages were sent to the underlying topic.
If using the group_by functionality to re-partition a stream, I am finding that it replays ALL of the messages, resulting in many more duplicates than simply +1 to counts.
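To make the effect on counts concrete, here is a minimal sketch in plain Python. It does not use Faust or Kafka, and it is not a claim about where the bug lives: `run_worker`, `MESSAGES`, and the resume offsets are hypothetical names that only model the symptom, on the assumption that a restarted worker resumes at the last processed offset rather than the one after it.

```python
from collections import defaultdict

# Hypothetical stand-in for the topic: message ids already written to it.
MESSAGES = [0, 1, 2, 3, 4]


def run_worker(log, table, resume_offset):
    """Count every message from resume_offset onward; return the last offset seen."""
    last = None
    for offset in range(resume_offset, len(log)):
        table[log[offset]] += 1
        last = offset
    return last


# Observed behavior: the restart resumes AT the last processed offset.
observed_table = defaultdict(int)  # models a table that survives restarts
last = run_worker(MESSAGES, observed_table, 0)
run_worker(MESSAGES, observed_table, last)  # item 4 is counted a second time
observed = dict(observed_table)

# Expected behavior: the restart resumes AFTER the last processed offset.
expected_table = defaultdict(int)
last = run_worker(MESSAGES, expected_table, 0)
run_worker(MESSAGES, expected_table, last + 1)  # nothing left to process
expected = dict(expected_table)

# group_by case as reported: everything is replayed (modeled as resuming at 0).
replayed_table = defaultdict(int)
run_worker(MESSAGES, replayed_table, 0)
run_worker(MESSAGES, replayed_table, 0)
replayed_all = dict(replayed_table)

print(observed)      # count for id 4 is 2
print(expected)      # every count is 1
print(replayed_all)  # every count is 2
```

In this model each restart adds one to the count for the last id, and a full replay doubles every count, which matches the two behaviors described above.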
Expected behavior
Do not replay the most recent message.
Actual behavior
Replays messages on restart.
Full traceback
Paste the full traceback (if there is any)
Versions
Python version: 3.7
Faust version: 0.3.0
Operating system: ubuntu 18.04
Kafka version: latest
RocksDB version (if applicable)