Skip to content

Conversation

@mgaido91
Copy link
Contributor

@mgaido91 mgaido91 commented Dec 4, 2018

What changes were proposed in this pull request?

The PR starts from the comment in the main one and it aims at:

  • simplifying the code for MapConcat;
  • be more precise in checking the limit size.

How was this patch tested?

existing tests

@mgaido91
Copy link
Contributor Author

mgaido91 commented Dec 4, 2018

cc @cloud-fan


val index = keyToIndex.getOrDefault(key, -1)
if (index == -1) {
if (withSizeCheck && size >= ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
Copy link
Contributor

@cloud-fan cloud-fan Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should always check the size. Such a big map is very likely to cause problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this flag is just for perf reasons, we can skip the check in some conditions and I didn't want to introduce perf overhead if not needed. If we remove the flag we would do the comparison for each item, also when it is not needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, I'd like to avoid premature optimization. Actually how much perf this can save? This code block is already doing some heavy work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let me remove it then, thanks.

val index = keyToIndex.getOrDefault(key, -1)
if (index == -1) {
if (withSizeCheck && size >= ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
throw new RuntimeException(s"Unsuccessful attempt to concat maps with $size elements " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: concat -> build

@cloud-fan
Copy link
Contributor

thanks for the cleanup!

@SparkQA
Copy link

SparkQA commented Dec 4, 2018

Test build #99664 has finished for PR 23217 at commit 54f0f31.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 4, 2018

Test build #99670 has finished for PR 23217 at commit 38f3bfa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor Author

mgaido91 commented Dec 4, 2018

retest this please

@SparkQA
Copy link

SparkQA commented Dec 4, 2018

Test build #99668 has finished for PR 23217 at commit 724db5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 4, 2018

Test build #99680 has finished for PR 23217 at commit 38f3bfa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 7143e9d Dec 5, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…perly the limit size

## What changes were proposed in this pull request?

The PR starts from the [comment](apache#23124 (comment)) in the main one and it aims at:
 - simplifying the code for `MapConcat`;
 - be more precise in checking the limit size.

## How was this patch tested?

existing tests

Closes apache#23217 from mgaido91/SPARK-25829_followup.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants