
fix statistics #2578

Merged
merged 8 commits into master on Nov 12, 2022

Conversation

xuanyu66
Collaborator

@xuanyu66 xuanyu66 commented Nov 3, 2022

What problem does this PR solve?

close #2573
In #2259, TiResolutionRule was deleted, which caused statistics to no longer be collected.
As a result, in https://github.com/pingcap/tispark/pull/2300/files, ENABLE_AUTO_LOAD_STATISTICS was considered useless and deprecated.

What is changed and how it works?

  • add a new ResolutionRule to collect statistics when ENABLE_AUTO_LOAD_STATISTICS is true.
  • add documentation to explain ENABLE_AUTO_LOAD_STATISTICS.
  • fix the logicalPlan result.
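The rule described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual TiSpark code: the real rule operates on Spark's LogicalPlan and delegates to StatisticsManager, so TableRelation, loadStatistics, and the placeholder row count here are all invented stand-ins.

```scala
object StatisticsRuleSketch {
  // Toy stand-in for a resolved table relation inside a logical plan.
  final case class TableRelation(dbName: String, tableName: String)

  // Toy stand-in for TiSpark's statistics store.
  val statsCache = scala.collection.mutable.Map.empty[String, Long]

  // Pretend "loading statistics" just records an estimated row count per table.
  def loadStatistics(rel: TableRelation): Unit = {
    statsCache.getOrElseUpdate(s"${rel.dbName}.${rel.tableName}", 1000L)
    ()
  }

  // The rule: when auto-load is enabled, collect statistics for every table
  // relation found in the (toy) plan, then return the plan unchanged.
  def apply(plan: Seq[TableRelation], autoLoadEnabled: Boolean): Seq[TableRelation] = {
    if (autoLoadEnabled) plan.foreach(loadStatistics)
    plan
  }
}
```

The key point the sketch captures is that the rule is a side-effecting pass: it leaves the plan itself untouched and only warms the statistics cache when the flag is on.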

Check List

Tests

  • Unit test

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

@ti-chi-bot
Member

ti-chi-bot commented Nov 3, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Daemonxiao
  • shiyuhang0

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@github-actions github-actions bot left a comment

license-eye has totally checked 485 files.

Valid  Invalid  Ignored  Fixed
484    1        0        0
Click to see the invalid file list
  • core/src/main/scala/org/apache/spark/sql/catalyst/rule/TiStatisticsRuleFactory.scala

@xuanyu66 xuanyu66 added needs-cherry-pick-release-3.0 PR which needs to be cherry-picked to release-2.5 needs-cherry-pick-release-3.1 PR which needs to be cherry-picked to release-3.1 labels Nov 3, 2022
xuanyu66 and others added 2 commits November 3, 2022 14:40
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@xuanyu66
Collaborator Author

xuanyu66 commented Nov 3, 2022

/run-all-tests

1 similar comment
@xuanyu66
Collaborator Author

xuanyu66 commented Nov 3, 2022

/run-all-tests

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 3, 2022

/run-all-tests

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 4, 2022

/run-all-tests

1 similar comment
@xuanyu66
Collaborator Author

xuanyu66 commented Nov 4, 2022

/run-all-tests

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 4, 2022

/run-all-tests tidb=v6.1.0 tikv=v6.1.0

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 4, 2022

/run-all-tests

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 4, 2022

/run-all-tests

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 7, 2022

/run-all-tests tidb=v6.1.0 tikv=v6.1.0

1 similar comment
@xuanyu66
Collaborator Author

xuanyu66 commented Nov 7, 2022

/run-all-tests tidb=v6.1.0 tikv=v6.1.0

@wfxxh
Contributor

wfxxh commented Nov 8, 2022

Is there any cache in TiSpark? When I use this PR to run my process, the TiKV read task number is correct, the same as the table region number. If I only process one table, it succeeds. But if I loop over several tables, it causes an executor OOM.

looping over several tables:

[screenshot]

[screenshot]

processing the OOM table separately, outside the loop:

[screenshot]

my process:

[screenshot]

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 8, 2022

@wfxxh Yes, the cache is in com.pingcap.tispark.statistics.StatisticsManager
[screenshot]
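As a rough illustration of how such a per-table cache can add up when a job loops over many tables, consider the sketch below. This is hypothetical: TableStats, loadIfAbsent, and the byte sizes are invented for illustration, not StatisticsManager's real API.

```scala
object StatisticsCacheSketch {
  final case class TableStats(tableId: Long, histogramBytes: Array[Byte])

  // Statistics cached per table. Nothing here ever evicts entries, so a loop
  // over many tables keeps every table's statistics on the heap at once.
  private val cache = scala.collection.mutable.Map.empty[Long, TableStats]

  def loadIfAbsent(tableId: Long, statsSizeBytes: Int): TableStats =
    cache.getOrElseUpdate(tableId, TableStats(tableId, new Array[Byte](statsSizeBytes)))

  // Total bytes currently held by cached statistics.
  def cachedBytes: Long = cache.valuesIterator.map(_.histogramBytes.length.toLong).sum
}
```

If repeated loads across a long-running job accumulate like this, an eviction-free cache would be one place to look when investigating executor memory growth.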

@wfxxh
Contributor

wfxxh commented Nov 8, 2022

@xuanyu66 But the reason the looped TiKV reads cause OOM has not been found. It may be a bug.

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 8, 2022

@wfxxh Can you check whether OOM happens when using the released version (v3.1.1) and the master version?
If it only happens with this PR, I will hold off on merging.

@wfxxh
Contributor

wfxxh commented Nov 8, 2022

@xuanyu66 I have checked v2.5.1, v3.1.1, and the master version.
With v2.5.1, the read task number is the same as with this PR and equal to the table region number, but looping over several tables causes OOM.
With v3.1.1 and master, the read task number is not correct: it equals the table index region number.
v2.5.1 and this PR are faster than v3.1.1 and master.
I find it very strange: reading a single table and exiting the Spark job does not OOM, but looping over several tables does.

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 9, 2022

@wfxxh So 2.5.1 and this PR will OOM while 3.1.1 and master don't have the OOM problem, am I right?

@wfxxh
Contributor

wfxxh commented Nov 9, 2022

@xuanyu66 Yes, but even when it causes OOM, a new executor pod is started, the Spark job's final status is success, and it is faster than the v3.1.1 version.

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 9, 2022

@wfxxh Can we dump the memory snapshot to analyze it?

@wfxxh
Contributor

wfxxh commented Nov 9, 2022

@xuanyu66 Yes, I am trying to do it, but I am running Spark on Kubernetes, so I need to mount storage.

I am sorry, I cannot do it: when the executor pod is deleted, the volume is deleted too.

@xuanyu66
Collaborator Author

@wfxxh Can you increase your memory, keep the task running for about 10 minutes, and then dump the memory to analyze it?

@wfxxh
Contributor

wfxxh commented Nov 10, 2022

@xuanyu66 How do I hold the process? Sleep the thread? But it may not have OOMed yet at the moment I sleep it.

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 11, 2022

@wfxxh Use "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=c:\temp\dump2.hprof"
and use a hostPath volume for the dump directory.

@wfxxh
Contributor

wfxxh commented Nov 11, 2022

@xuanyu66 Yes, I have tried this method, but when the executor pod OOMs and is deleted, a new pod is started, and this clears the hostPath directory.

[screenshot]

@wfxxh
Contributor

wfxxh commented Nov 11, 2022

@xuanyu66 I think you can merge this PR: even though it may cause an executor OOM, the final status is success, and it is faster.

@xuanyu66
Collaborator Author

@wfxxh You can simply use "-XX:HeapDumpPath=c:\temp-${randomstring}\dump2.hprof" so each pod writes its dump to a unique path.

@wfxxh
Contributor

wfxxh commented Nov 11, 2022

@xuanyu66 I do not know what you mean. I am saying that when the pod OOMs and is deleted, and a new one is started, the hostPath directory on the host machine is deleted too.
I will run the process on a Spark standalone cluster to test whether this is a Kubernetes issue.

@xuanyu66
Collaborator Author

xuanyu66 commented Nov 11, 2022

@wfxxh That's strange; normally a hostPath directory will not be deleted.
Please check persistentVolumeReclaimPolicy; maybe you are using Recycle.

@xuanyu66
Collaborator Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 5176f13

@ti-chi-bot ti-chi-bot merged commit 0fccb5a into master Nov 12, 2022
@ti-srebot
Collaborator

cherry pick to release-release failed

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #2588.

ti-chi-bot pushed a commit to ti-chi-bot/tispark that referenced this pull request Nov 12, 2022
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #2589.

ti-chi-bot pushed a commit to ti-chi-bot/tispark that referenced this pull request Nov 12, 2022
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@xuanyu66
Collaborator Author

@wfxxh I have merged this PR, and we will discuss the OOM problem in #2587

ti-chi-bot added a commit that referenced this pull request Nov 14, 2022
ti-chi-bot added a commit that referenced this pull request Nov 14, 2022
wfxxh pushed a commit to wanfangdata/tispark that referenced this pull request Jun 30, 2023
Labels
needs-cherry-pick-release-3.0 PR which needs to be cherry-picked to release-2.5 needs-cherry-pick-release-3.1 PR which needs to be cherry-picked to release-3.1 status/can-merge status/LGT2
Development

Successfully merging this pull request may close these issues.

tispark 3.x version read speed is slower than tispark 2.5
6 participants