Felix19350 2nd submission. Further improved performance by improving the parsing logic #323

Merged (1 commit) on Jan 14, 2024

@felix19350 (Contributor) commented Jan 11, 2024:

Further improved performance by reworking the parsing logic so that city-name strings are no longer allocated for each row.
Removed the JVM options that controlled heap memory, as they are no longer necessary.

Check List:

  • Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
  • All formatting changes by the build are committed
  • Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
  • Output matches that of calculate_average_baseline.sh
  • Execution time: 00:22.813
  • Execution time of reference implementation: 4:03.22

@felix19350 (Contributor Author) commented:

@gunnarmorling rebased PR as per your request. Sorry about that!


public static Stream<AverageAggregatorTask> createStreamOf(List<MemorySegment> memorySegments) {
    return memorySegments.stream().map(AverageAggregatorTask::new);
}

public Map<String, ResultRow> processChunk() {
    final var result = new TreeMap<String, ResultRow>();
    final var measurements = new HashMap<Integer, ResultRow>(EXPECTED_MAX_NUM_CITIES);
    final var cityNames = new HashMap<Integer, String>(EXPECTED_MAX_NUM_CITIES);

@gunnarmorling (Owner) commented:

What's used as the key here? How does it prevent collisions between entries?

@felix19350 (Contributor Author) commented:

Hmm good point, it doesn't really. Let me see what I can do in terms of proper perfect hashing :)
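
The concern can be illustrated with a minimal sketch. It assumes the `Integer` key is a hash code derived from the city name, which the excerpt above does not confirm; `HashKeyCollisionDemo` and `countByHash` are hypothetical names, not part of the PR. Two different names that share a hash code end up in the same map entry:

```java
import java.util.HashMap;
import java.util.Map;

final class HashKeyCollisionDemo {
    // Hypothetical helper: counts rows in a map keyed only by the name's hash code.
    static Map<Integer, Integer> countByHash(String... names) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String name : names) {
            // The name itself is never compared, only its hash code.
            counts.merge(name.hashCode(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same String.hashCode() (2112).
        Map<Integer, Integer> counts = countByHash("Aa", "BB");
        System.out.println(counts); // prints {2112=2}: both names were merged into one entry
    }
}
```

With real station names such collisions may be rare, but nothing in a hash-only key rules them out.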

@felix19350 (Contributor Author) commented Jan 14, 2024:

@gunnarmorling addressed your comment above (check the equals method on the CityRef class) and squeezed a bit more performance out of this while (hopefully) maintaining a relatively readable and idiomatic solution.

I'm seeing some fluctuation on my machine; I was able to go as low as 15s in some runs. Curious to see how this behaves in the test environment.
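
The `CityRef` class itself is not shown in this conversation, so the following is only a sketch of the general idea of a map key whose `equals` compares the name bytes. `CityKey` and `countByName` are hypothetical names, and the demo builds keys from `String` for brevity, whereas a real parser would presumably read the bytes from the memory segment:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type; not the PR's actual CityRef implementation.
final class CityKey {
    private final byte[] name;
    private final int hash;

    CityKey(byte[] name) {
        this.name = name;
        this.hash = Arrays.hashCode(name); // computed once per key
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CityKey)) return false;
        // Content comparison keeps distinct names apart even when hashes collide.
        return Arrays.equals(name, ((CityKey) other).name);
    }

    // Hypothetical helper: counts rows per city using the byte-based key.
    static Map<CityKey, Integer> countByName(String... names) {
        Map<CityKey, Integer> counts = new HashMap<>();
        for (String n : names) {
            counts.merge(new CityKey(n.getBytes(StandardCharsets.UTF_8)), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" have equal Arrays.hashCode values but stay separate entries.
        System.out.println(countByName("Aa", "BB", "Aa").size()); // prints 2
    }
}
```

Because `HashMap` falls back to `equals` when hash codes match, a collision only costs an extra comparison instead of merging two cities.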

@gunnarmorling (Owner) commented:

00:26.500. It fluctuates quite a bit indeed.

@gunnarmorling merged commit bb5679f into gunnarmorling:main on Jan 14, 2024
1 check passed