
[SPARK-11681][Streaming] Correctly update state timestamp even when state is not updated #9648

Closed
tdas wants to merge 3 commits

tdas
Contributor

@tdas tdas commented Nov 12, 2015

Bug: The timestamp is not updated if there is data but the corresponding state is not updated. This is wrong: the timeout is defined as "no data for a while", not "no state update for a while".

Fix: Update the timestamp when a timeout is specified; otherwise there is no need.
Also refactored the code for better testability and added unit tests.
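
To illustrate the intended semantics, here is a minimal sketch in plain Scala. It is a toy model, not the code changed in this PR: `Entry`, `ToyStateStore`, `onData`, and `expire` are hypothetical names, and times are plain millisecond values supplied by the caller. It assumes the update function returns `None` when it chooses to leave the state unchanged.

```scala
import scala.collection.mutable

// Hypothetical toy model of keyed state with a "no data for a while" timeout.
// This is NOT Spark's internal state map; all names here are illustrative.
final case class Entry[S](state: S, lastSeenMs: Long)

final class ToyStateStore[K, S](timeoutMs: Option[Long]) {
  private val entries = mutable.Map.empty[K, Entry[S]]

  /** Handle one record for `key`. `update` returns Some(newState) to change
    * the state, or None to leave it as it is. */
  def onData(key: K, nowMs: Long)(update: Option[S] => Option[S]): Unit = {
    val existing = entries.get(key)
    update(existing.map(_.state)) match {
      case Some(newState) =>
        entries(key) = Entry(newState, nowMs)
      case None =>
        // Data arrived but the state was left unchanged. Refresh the timestamp
        // anyway, but only when a timeout is configured; otherwise nothing
        // ever reads the timestamp, so there is no need.
        if (timeoutMs.isDefined) {
          existing.foreach(e => entries(key) = e.copy(lastSeenMs = nowMs))
        }
    }
  }

  /** Remove and return the keys that have seen no data for longer than the timeout. */
  def expire(nowMs: Long): Set[K] = timeoutMs match {
    case Some(t) =>
      val stale = entries.collect { case (k, e) if nowMs - e.lastSeenMs > t => k }.toSet
      stale.foreach(k => entries.remove(k))
      stale
    case None => Set.empty
  }

  /** Current state for `key`, if any. */
  def get(key: K): Option[S] = entries.get(key).map(_.state)
}
```

In this sketch, with a timeout of 1000 ms, a key created at time 0 that receives data at 900 ms without a state change is not expired at 1500 ms, because only 600 ms have passed since data last arrived. Without the timestamp refresh in the `None` branch, the same key would be expired at 1500 ms, which is the behavior the bug report describes.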

@tdas
Contributor Author

tdas commented Nov 12, 2015

@zsxwing Could you please take a look at this PR?

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45702 has finished for PR 9648 at commit 53a7c2c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • case class Pivot(

@zsxwing
Member

zsxwing commented Nov 12, 2015

retest this please

@zsxwing
Member

zsxwing commented Nov 12, 2015

LGTM

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45708 has finished for PR 9648 at commit 53a7c2c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Contributor Author

tdas commented Nov 13, 2015

Thanks for the review @zsxwing, merging this to master and 1.6

@tdas
Contributor Author

tdas commented Nov 13, 2015

Oops there are conflicts. Fixing them and then merging.

@SparkQA

SparkQA commented Nov 13, 2015

Test build #45805 has finished for PR 9648 at commit 7b345cc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class JavaTrackStateDStream[KeyType, ValueType, StateType, EmittedType](

asfgit pushed a commit that referenced this pull request Nov 13, 2015
…tate is not updated

Bug: The timestamp is not updated if there is data but the corresponding state is not updated. This is wrong: the timeout is defined as "no data for a while", not "no state update for a while".

Fix: Update the timestamp when a timeout is specified; otherwise there is no need.
Also refactored the code for better testability and added unit tests.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #9648 from tdas/SPARK-11681.

(cherry picked from commit e4e46b2)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
@asfgit asfgit closed this in e4e46b2 Nov 13, 2015
dskrvk pushed a commit to dskrvk/spark that referenced this pull request Nov 13, 2015