Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1300]feat(mr): Support combine operation in map stage for mr engine. #1301

Merged
merged 5 commits into from
Nov 14, 2023

Conversation

qijiale76
Copy link
Contributor

@qijiale76 qijiale76 commented Nov 8, 2023

What changes were proposed in this pull request?

Support combine operation in map stage.

Why are the changes needed?

Fix: #1300

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tested by UT, especially WordCountTest

before this PR, Map Output

847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 America 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
847 Chinese 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 Japan 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
645 apple 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 banana 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
746 cherry 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
645 fruit 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001
746 tomato 0001

after this PR, MapOutput

847 America 13
847 Chinese 12
645 Japan 10
645 apple 14
746 banana 14
746 cherry 11
645 fruit 12
746 tomato 14

@qijiale76 qijiale76 changed the title [#1300]Improvement(mr): Support combine operation in map stage for mr engine. [#1300]feat(mr): Support combine operation in map stage for mr engine. Nov 8, 2023
@jerqi
Copy link
Contributor

jerqi commented Nov 8, 2023

@zhengchenyu Could you help me review this pr?

@codecov-commenter
Copy link

codecov-commenter commented Nov 8, 2023

Codecov Report

Merging #1301 (2329a26) into master (23e26fc) will increase coverage by 0.75%.
Report is 3 commits behind head on master.
The diff coverage is 76.92%.

@@             Coverage Diff              @@
##             master    #1301      +/-   ##
============================================
+ Coverage     53.65%   54.41%   +0.75%     
- Complexity     2690     2696       +6     
============================================
  Files           403      391      -12     
  Lines         23456    21280    -2176     
  Branches       1992     2006      +14     
============================================
- Hits          12586    11579    -1007     
+ Misses        10096     8997    -1099     
+ Partials        774      704      -70     
Files Coverage Δ
...pache/hadoop/mapred/RssCombineOutputCollector.java 100.00% <100.00%> (ø)
...java/org/apache/hadoop/mapred/SortWriteBuffer.java 87.92% <94.44%> (+0.49%) ⬆️
...rg/apache/hadoop/mapred/RssMapOutputCollector.java 0.00% <0.00%> (ø)
...g/apache/hadoop/mapred/SortWriteBufferManager.java 88.18% <61.11%> (-2.51%) ⬇️

... and 31 files with indirect coverage changes

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

@zhengchenyu
Copy link
Collaborator

@qijiale76 Can we add a new UT?

@jerqi
Copy link
Contributor

jerqi commented Nov 11, 2023

@zhengchenyu Committers have the privilege to rerun the failure jobs. You can merge this pr refer to other commits.

@zhengchenyu
Copy link
Collaborator

@zhengchenyu Committers have the privilege to rerun the failure jobs. You can merge this pr refer to other commits.

@jerqi OK! And do you have any comments on this PR?

@jerqi
Copy link
Contributor

jerqi commented Nov 13, 2023

@zhengchenyu Committers have the privilege to rerun the failure jobs. You can merge this pr refer to other commits.

@jerqi OK! And do you have any comments on this PR?

Could we have a config option to control whether to combine?

@qijiale76
Copy link
Contributor Author

Could we have a config option to control whether to combine?

Now combine is skipped if combinerRunner is null when conf.setCombinerClass(xxx.class) is not used.

I think it's not necessary to set up additional config options.

@jerqi
Copy link
Contributor

jerqi commented Nov 13, 2023

Could we have a config option to control whether to combine?

Now combine is skipped if combinerRunner is null when conf.setCombinerClass(xxx.class) is not used.

I think it's not necessary to set up additional config options.

OK.

@zhengchenyu zhengchenyu merged commit 45ffa5f into apache:master Nov 14, 2023
36 checks passed
@qijiale76
Copy link
Contributor Author

Thank you. @zhengchenyu @jerqi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement][mr] Support combine operation in map stage.
4 participants