Skip to content

[SPARK-45688][SPARK-45693][CORE] Clean up the deprecated API usage related to MapOps & Fix method += in trait Growable is deprecated #43578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Oct 30, 2023

What changes were proposed in this pull request?

The pr aims to:

  • clean up the deprecated API usage related to MapOps.
  • fix method += in trait Growable is deprecated.

Why are the changes needed?

Eliminate warnings and no longer use deprecated scala APIs.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Pass GA.
  • Manually test:
    build/sbt -Phadoop-3 -Pdocker-integration-tests -Pspark-ganglia-lgpl -Pkinesis-asl -Pkubernetes -Phive-thriftserver -Pconnect -Pyarn -Phive -Phadoop-cloud -Pvolcano -Pkubernetes-integration-tests Test/package streaming-kinesis-asl-assembly/assembly connect/assembly
    

Was this patch authored or co-authored using generative AI tooling?

No.

…lated to `MapOps` & Fix `method += in trait Growable is deprecated`
@github-actions github-actions bot added the CORE label Oct 30, 2023
@panbingkun panbingkun marked this pull request as draft October 30, 2023 06:41
@panbingkun panbingkun marked this pull request as ready for review October 30, 2023 08:35
assert(testMap4.size === 1)
assert(testMap4.get("k1").isDefined)
assert(testMap4("k1") === "v1")
testMap3.remove("k0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this modification aligns with the original test objective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that TimeStampedHashMap is now a utility class that is no longer used by Spark, and only test cases are still using it.

I personally have the following optional suggestions:

  1. Since it is private[spark] visibility, perhaps we can directly delete TimeStampedHashMap.
  2. Or we can mark it as deprecated in Spark 4.0, suppress the compilation warnings with @nowarn, and then remove it in Spark 4.1.
  3. Refactor TimeStampedHashMap into an immutable implementation to maintain the current usage.

cc @dongjoon-hyun FYI

Copy link
Contributor

@LuciferYang LuciferYang Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b18d708#diff-77b12178a7036c71135074c6ddf7d659e5a69906264d5e3061087e4352e304ed introduced this data structure

After #22339, this data structure is only being used in unit tests. So, after Spark 3.0, this data structure has not been used by any production code of Spark.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for (1) direct deletion.

leftKey -> joinStat.copy(histogram = leftKeyStat.histogram),
rightKey -> joinStat.copy(histogram = rightKeyStat.histogram)
)
// Histograms are propagated as unchanged. During future estimation, they should be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible not to modify keyStatsAfterJoin += and the relative position of the comments?

@@ -46,7 +46,8 @@ case class DescribeNamespaceExec(
}

if (isExtended) {
val properties = metadata.asScala -- CatalogV2Util.NAMESPACE_RESERVED_PROPERTIES
val properties = metadata.asScala.filterNot(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about metadata.asScala.toMap --, then the subsequent properties don't need to call toMap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion!

val toRemove =
CatalogV2Util.NAMESPACE_RESERVED_PROPERTIES.filter(expected.contains)
assert(expected -- toRemove === actual)
val toRemove = CatalogV2Util.NAMESPACE_RESERVED_PROPERTIES.toSet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about change to use immutable.Map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

LuciferYang pushed a commit that referenced this pull request Nov 2, 2023
### What changes were proposed in this pull request?
The pr aims to delete `TimeStampedHashMap` and its UT.

### Why are the changes needed?
During Pr #43578, we found that the class `TimeStampedHashMap` is no longer in use. Based on the suggestion, we have removed it. #43578 (comment)

- First time this class `TimeStampedHashMap` be introduced:
b18d708#diff-77b12178a7036c71135074c6ddf7d659e5a69906264d5e3061087e4352e304ed introduced this data structure
- After #22339, this class `TimeStampedHashMap` is only being used in UT `TimeStampedHashMapSuite`.
So, after Spark 3.0, this data structure has not been used by any production code of Spark.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43633 from panbingkun/remove_TimeStampedHashMap.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
@LuciferYang
Copy link
Contributor

let's rebase this one @panbingkun

@panbingkun
Copy link
Contributor Author

let's rebase this one @panbingkun

Okay, thank you very much.

…ated to MapOps & Fix method += in trait Growable is deprecated
@panbingkun
Copy link
Contributor Author

let's rebase this one @panbingkun

Done.

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM if test pass

@panbingkun
Copy link
Contributor Author

+1, LGTM if test pass

GA has passed.
The issue of appveyor is not related to this PR.
#43631

@LuciferYang
Copy link
Contributor

friendly ping @dongjoon-hyun do you need to take another look?

@LuciferYang
Copy link
Contributor

Merged into master for Spark 4.0. Thanks @panbingkun and @dongjoon-hyun ~

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, late LGTM. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants