Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ConcurrentHashMap for DataType SerDe cache #1327

Merged
merged 2 commits into from
Apr 24, 2023
Merged

Conversation

pan3793
Copy link
Contributor

@pan3793 pan3793 commented Apr 24, 2023

Summary

The cache should be thread-safe since it is shared to multi-threads, otherwise may fail w/ the following errors

(fv-az617-691.33ka01qfponuxnhx0mx0ps1wwf.jx.internal.cloudapp.net executor driver): java.util.ConcurrentModificationException
	at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1135)
	at com.clickhouse.data.format.BinaryDataProcessor$DateTime32SerDe.of(BinaryDataProcessor.java:219)
	at com.clickhouse.data.format.ClickHouseRowBinaryProcessor.getDeserializer(ClickHouseRowBinaryProcessor.java:341)
	at com.clickhouse.data.ClickHouseDataProcessor$DefaultSerDe.<init>(ClickHouseDataProcessor.java:89)
	at com.clickhouse.data.ClickHouseDataProcessor.getInitializedSerDe(ClickHouseDataProcessor.java:286)
	at com.clickhouse.data.ClickHouseDataProcessor.lambda$records$0(ClickHouseDataProcessor.java:462)

https://github.com/housepower/spark-clickhouse-connector/actions/runs/4782401614/jobs/8501708204

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

@pan3793 pan3793 changed the title Use ConcurrentModificationException for DataType SerDe cache Use ConcurrentHashMap for DataType SerDe cache Apr 24, 2023
@pan3793
Copy link
Contributor Author

pan3793 commented Apr 24, 2023

HashMap allows null as a key but ConcurrentHashMap does not, the CI fails because ClickHouseConfig#timeZoneForDate may null


public static final Date32SerDe of(ClickHouseDataConfig config) {
TimeZone tz = ClickHouseChecker.nonNull(config, ClickHouseDataConfig.TYPE_NAME).getTimeZoneForDate();
return cache.computeIfAbsent(tz, Date32SerDe::new);
return cache.computeIfAbsent(tz == null ? ClickHouseValues.SYS_TIMEZONE : tz, Date32SerDe::new);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is consistent w/

public Date32SerDe(TimeZone tz) {
    this.zoneId = tz != null ? tz.toZoneId() : ClickHouseValues.SYS_ZONE;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants