Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API/Core: Make ScanReport and its related classes Immutable #5780

Merged
merged 1 commit into from
Sep 20, 2022

Conversation

nastra
Copy link
Contributor

@nastra nastra commented Sep 16, 2022

  • Use true immutable objects that are type-safe, thread-safe, null-safe
  • Get builder classes for free

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.

  • Immutable objects are serialization ready (including JSON and its binary forms)
  • Supports lazy, derived and optional attributes
  • Immutable objects are constructed once, in a consistent state, and can be safely shared
    • Will fail if mandatory attributes are missing
    • Cannot be sneakily modified when passed to other code
  • Immutable objects are naturally thread-safe and can therefore be safely shared among threads
    • No excessive copying
    • No excessive synchronization
  • Object definitions are pleasant to write and read
    • No boilerplate setter and getters
    • No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.

Note that we are specifically preventing people from using Jackson-related annotations (@JsonSerialize & @JsonDeserialize) in order to avoid potential runtime library dependency issues where a clashing jackson lib would prevent jackson-related annotations to be used/processed.

build.gradle Outdated Show resolved Hide resolved
public Timer totalPlanningDuration() {
return totalPlanningDuration;
return metricsContext().timer(TOTAL_PLANNING_DURATION, TimeUnit.NANOSECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler just to use a ScanMetrics.from(MetricsContext) method so that ScanMetrics doesn't need to keep track of the context? The from method could create all of the metrics and pass them into the builder and we wouldn't need all the generated code for initialization and making sure MetricsContext is only called once per counter/timer. If we also make these required, then there should be no problem with them being null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the MetricsContext parameter here less as a "this class needs to keep track of this" but rather "this is what's required to create an instance of this class". Your comment is perfectly valid here and I initially had the exact same thought.
Therefore it make sense to have a ScanMetrics.from(MetricsContext). In fact we have a ScanMetrics.of(MetricsContext) that does exactly that.

We could go ahead and change ScanMetrics.of(...) to:

public static ScanMetrics of(MetricsContext metricsContext) {
      return ImmutableScanMetrics.builder()
          .metricsContext(metricsContext)
          .totalPlanningDuration(
              metricsContext.timer(TOTAL_PLANNING_DURATION, TimeUnit.NANOSECONDS))
          .resultDataFiles(metricsContext.counter(RESULT_DATA_FILES, MetricsContext.Unit.COUNT))
          .resultDeleteFiles(metricsContext.counter(RESULT_DELETE_FILES, MetricsContext.Unit.COUNT))
          .scannedDataManifests(
              metricsContext.counter(SCANNED_DATA_MANIFESTS, MetricsContext.Unit.COUNT))
          .totalDataManifests(
              metricsContext.counter(TOTAL_DATA_MANIFESTS, MetricsContext.Unit.COUNT))
          .totalDeleteManifests(
              metricsContext.counter(TOTAL_DELETE_MANIFESTS, MetricsContext.Unit.COUNT))
          .totalFileSizeInBytes(
              metricsContext.counter(TOTAL_FILE_SIZE_IN_BYTES, MetricsContext.Unit.BYTES))
          .totalDeleteFileSizeInBytes(
              metricsContext.counter(TOTAL_DELETE_FILE_SIZE_IN_BYTES, MetricsContext.Unit.BYTES))
          .skippedDataManifests(
              metricsContext.counter(SKIPPED_DATA_MANIFESTS, MetricsContext.Unit.COUNT))
          .build();
    }

It would produce the exact same results. The only reason I went with @Value.Derived is because it does not allow manual creation of metric instances.
ScanMetrics.of(...) is just a utility method and it doesn't prevent anyone from using the Builder directly. Using @Value.Derived in this context means essentially that all those metric instances are derived from MetricsContext but cannot be manually set, thus providing a builder where only ImmutableScanMetrics.builder() .metricsContext(metricsContext).build() can be set.
The lazy initialization of metrics that @Value.Derived provides is just a nice thing to have that we get here, but my main motivation was to prevent the different counters/timers to be manually set. This gives us the exact same semantics we had in the previous ScanMetrics API, since all the metrics were initialized in the constructor.

versions.props Outdated Show resolved Hide resolved
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.

Note that we are specifically preventing people from using Jackson-related annotations (`@JsonSerialize` & `@JsonDeserialize`) in order to avoid potential runtime library dependency issues where a clashing jackson lib would prevent jackson-related annotations to be used/processed.
@rdblue rdblue merged commit fe83ff5 into apache:master Sep 20, 2022
@rdblue
Copy link
Contributor

rdblue commented Sep 20, 2022

Thanks, @nastra! Looks great.

@nastra nastra deleted the immutable-scan-report branch September 21, 2022 05:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants