fix statistics #2578
license-eye has totally checked 485 files.
| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 484 | 1 | 0 | 0 |
Invalid file list:
- core/src/main/scala/org/apache/spark/sql/catalyst/rule/TiStatisticsRuleFactory.scala
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests tidb=v6.1.0 tikv=v6.1.0
/run-all-tests
/run-all-tests
/run-all-tests tidb=v6.1.0 tikv=v6.1.0
/run-all-tests tidb=v6.1.0 tikv=v6.1.0
Is there any cache in TiSpark? When I use this PR to run my job, the number of TiKV read tasks is correct, the same as the number of table regions. If I process only one table, it succeeds. But if I process several tables in a loop, it causes an executor OOM. [Screenshots not preserved: processing several tables in a loop; processing the OOM table separately from the loop; my process]
@wfxxh Yes, the cache is in com.pingcap.tispark.statistics.StatisticsManager
@xuanyu66 But the reason why reading TiKV in a loop causes the OOM has not been found. It may be a bug.
@wfxxh Can you check whether the OOM also happens with the released version (v3.1.1) and the master version?
@xuanyu66 I have checked v2.5.1, v3.1.1, and the master version.
@wfxxh So 2.5.1 and this PR hit the OOM, while 3.1.1 and master don't have the OOM problem, am I right?
@xuanyu66 Yes, but even when the OOM happens, a new executor pod is started, the final status of the Spark job is success, and it is faster than the 3.1.1 version.
@wfxxh Can we dump a memory snapshot to analyze it?
@xuanyu66 Yes, I am trying to do that, but I run Spark on k8s, so I need to mount storage. I am so sorry, I cannot do it: when the executor pod is deleted, the volume is deleted too.
@wfxxh Can you increase the memory and hold on for 10 minutes while the task is running, then dump the memory to analyze it?
@xuanyu66 How do I hold the process? By sleeping the thread? But at the moment I sleep, it may not have hit the OOM yet.
@xuanyu66 Yes, I have tried this method, but when the executor pod OOMs and is deleted, a new pod is started, and this clears the hostPath dir.
@xuanyu66 I think you can merge this PR: even though it may cause an executor OOM, the final status is success, and it is faster.
@wfxxh You can simply use -XX:HeapDumpPath="c:\temp-${randomstring}\dump2.hprof"
@xuanyu66 I do not know what you mean. I said that when the pod is deleted after the OOM and a new one is started, the hostPath on the host machine is deleted too.
@wfxxh It's strange; normally the hostPath will not be deleted.
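For readers following the heap-dump discussion above, here is a minimal sketch of how such JVM flags might be assembled for Spark executors. `-XX:+HeapDumpOnOutOfMemoryError`, `-XX:HeapDumpPath`, and `spark.executor.extraJavaOptions` are standard HotSpot and Spark names. The `/mnt/dumps` directory and the helper names are hypothetical, and the sketch assumes that directory sits on a volume that outlives the executor pod.

```scala
// Sketch: build executor JVM options that write a heap dump on OOM.
// ASSUMPTION: dumpDir is a mount that survives executor pod restarts.
object HeapDumpOptions {
  // Standard HotSpot flags; the directory itself is illustrative only.
  def executorJavaOptions(dumpDir: String): String =
    Seq(
      "-XX:+HeapDumpOnOutOfMemoryError",
      s"-XX:HeapDumpPath=$dumpDir"
    ).mkString(" ")

  // Render the options as a spark-submit "--conf key=value" argument pair.
  def asSubmitArgs(dumpDir: String): Seq[String] =
    Seq("--conf", s"spark.executor.extraJavaOptions=${executorJavaOptions(dumpDir)}")

  def main(args: Array[String]): Unit =
    println(asSubmitArgs("/mnt/dumps").mkString(" "))
}
```

With `-XX:+HeapDumpOnOutOfMemoryError` the dump is written at the moment of the OOM, so no sleep or manual trigger is needed. Whether the file survives still depends on how the volume is mounted.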
/merge
This pull request has been accepted and is ready to merge. Commit hash: 5176f13
cherry pick to release-release failed
In response to a cherrypick label: new pull request created: #2588.
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
In response to a cherrypick label: new pull request created: #2589.
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
What problem does this PR solve?
close #2573
In #2259, TiResolutionRule was deleted, which caused statistics to no longer be collected.
Thus in https://github.com/pingcap/tispark/pull/2300/files, ENABLE_AUTO_LOAD_STATISTICS was considered useless and deprecated.
What is changed and how it works?
ENABLE_AUTO_LOAD_STATISTICS is true.
ENABLE_AUTO_LOAD_STATISTICS
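As a hedged illustration of the setting this description refers to, the sketch below produces the Spark configuration entry a job might pass to control automatic statistics loading. It assumes that ENABLE_AUTO_LOAD_STATISTICS resolves to the key `spark.tispark.statistics.auto_load`, which should be verified against `TiConfigConst` in the TiSpark version in use. The object and method names are hypothetical and not part of TiSpark.

```scala
// Sketch: toggle automatic statistics loading via a Spark conf entry.
// ASSUMPTION: the key below is what ENABLE_AUTO_LOAD_STATISTICS resolves to;
// check TiConfigConst in your TiSpark version before relying on it.
object AutoLoadStatisticsConf {
  val AutoLoadKey = "spark.tispark.statistics.auto_load" // assumed key name

  // Key/value pairs suitable for SparkConf.set or SparkSession.builder.config.
  def entries(enabled: Boolean): Map[String, String] =
    Map(AutoLoadKey -> enabled.toString)

  def main(args: Array[String]): Unit =
    entries(enabled = true).foreach { case (k, v) => println(s"--conf $k=$v") }
}
```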
Check List
Tests
Related changes