Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
memory mapped files, branchless parsing, bitwiddle magic #5
memory mapped files, branchless parsing, bitwiddle magic #5
Changes from all commits
7ca2a69
58f01c3
0d02423
0f58035
581f099
bfb72f7
4079ba6
a27b369
d85b38f
b65fec6
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my input I get IOOBE here:
Let me know if you need the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes please, perhaps there is some other bug that's platform dependent; do share.
Can you specify what platform you're running it on, and could you please also check this (improved) version:
https://github.com/royvanrijn/1brc/blob/8db31e6a36fbc305765a2393efb06ba6bff23f42/src/main/java/dev/morling/onebrc/CalculateAverage_royvanrijn.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running on Windows. Will check the new version later (it is quite late here now and starting from tomorrow I won't have access to PC for the weekend).
Let me know if you can suggest how to send you the data file - I started bzipping it and it takes forever, but even part way through, the archive is 2Gb (you can mail me upload coordinates at dimitar.dimitrov at gmail dot com)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, can you narrow it down? Perhaps run a very small test? Do they all crash, just this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ddimtirov for future reference, the default compression level on zstd will be a lot faster and offer reasonable compression:
On a MacBook Pro 2020 - 2 GHz Quad-Core Intel Core i5:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See related #61
We've also added some basic samples within #82
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it shouldn't be the number of leading ones?
@royvanrijn @gunnarmorling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird, I thought by setting explicitly on 105 to LE would make the compatibility issues disappear. So running it on my machine would automatically mean it works on the target, although perhaps having a performance hit.
afk atm, I’ll check tomorrow, if somebody wants to fix it and tell me, be my guest 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure actually, for these things I need an old school paper and a pencil :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some local classes that test; problem is that I believe it works on my machine, just not on the target machine, I’ll check soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, annoying, it runs fine locally (the code that was pushed), sigh. Kind of debugging in the dark haha... a challenge!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it twice, I'm wrong at #5
let's say we have a byte[] data = { 0x01, 0x03 }
And we assume to have a short-based version of SWAR
reading the content of data with (a short) little-endian means:
0x0301
which have the less significant part at the lower address,
hence the binary hex SWAR result obtained (I'm using the Netty algorithm, but here should be the same) will be
0x8000
and, in order to find out 0x03, we have to use the trailing zeros (here 8 + 7 = 15 -> 15/8 = 1) .
Which means that is fine as it is!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gunnarmorling Is this assumption acceptable? E.g.
1
,0
,2.00
are all valid doubles.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also there should be an acceptance test suite for the implementations, I am pretty sure this implementation does not produce the same output as the baseline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure it produces the exact same output, I check regularly with each change.
The input and output have one decimal place precision (as stated by the website "rounded to one fractional digit").
I'm internally storing the doubles as 10x integers because the precision is just a single digit, and I'm making sure the rounding is correct afterwards for the average.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The README only says about output format but not input
1brc/README.md
Lines 27 to 28 in e7e7deb
so its worth clarifying.
There are many rounding modes - this is also not specified, e.g. in go https://pkg.go.dev/math#Round is not the same as in java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also reference implementation does ad-hoc rounding
Math.round(value * 10.0) / 10.0
but actual output depends on the string concatenation which performs another rounding, see https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#toString(double)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #36 to address this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the the exact mode does not really matter but what matters is that the baseline is correct.
I propose to change baseline to use BigDecimal for results accumulation, use scale 1 (instead of round(x*10)/10) and HALF_UP rounding mode (as most common at school) at the final step:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that's the thing, I don't think we can change the behavior of the reference implementation at this point, as it would render existing submissions invalid if they implement a different behavior. So I'd rather make the behavior of the RI explicit, also if it's not the most natural one (agreed that
HALF_UP
behavior would have been better).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think its possible to fix RI because it uses double division and rounds twice.
Since there are no acceptance tests I bet a lot of implementations (those that do not parse and calculate values the same way) will not match RI anyways.
I think RI should favor correctness over performance, then it can be used to build acceptance test suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've logged #49 for getting this one sorted out separately and get this PR merged. Let's continue the rounding topic over there. Thx!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hash code here has a data dependency: you either manually unroll this or just relax the hash code by using a var handle and use getLong amortizing the data dependency in batches, handing only the last 7 (or less) bytes separately, using the array.
In this way the most of computation would like resolve in much less loop iterations too, similar to https://github.com/apache/activemq-artemis/blob/25fc0342275b29cd73123523a46e6e94582597cd/artemis-commons/src/main/java/org/apache/activemq/artemis/utils/ByteUtil.java#L299