Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rounding functions to the Column type #6817
Add rounding functions to the Column type #6817
Changes from 27 commits
8c34ebb
3d8b55c
2aa2302
5f0d00f
b8e5282
97ab8ec
07527b6
b0ed569
c3818dc
35d83a1
430090f
f97554d
2db1924
8a62af2
759de4d
d616cea
b85f558
82ed443
0eeb10e
e9be9d5
91babe2
01f0f65
d6aa9ef
4a4e886
25cbea1
ac93d31
45c1754
ea1fd0b
a48cf30
2fab69b
a31681f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not sure if these operations should be implemented in Enso.
Currently most of the 'primitive' operations, especially on numeric columns are implemented in Java, for better efficiency.
Do we want to move towards doing these in Enso? I think we should measure what is the difference in performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another issue.
Currently I see that the
ceil
method may return an EnsoBigInteger (refer to thedecimal/CeilNode.java
).However, currently the Table library storage does not support Big Integer storage - only
LongStorage
. So this will likely fail in some way.I think we need to have tests checking what happens in the scenario where the
double
value magnitude exceeds thelong
range, to verify the behaviour.IMO currently it could be valid to return
Nothing
in such a case and attach some kind ofArithmetic_Overflow
exception. Extending Table to handle Big Integers is a separate issue (currently unscheduled, @jdunkerley do we have plans to support this?)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear about the first issue. The
map_over_storage
is OK. Its usage is warranted when the operation we perform on each row must run some Enso logic (some complicated Enso function that is not worth replicating into Java, or has so expensive logic anyway the benefit of inlining it to Java would be negligible).But I think we shouldn't use Enso operations for primitive operations and instead implement them directly in Java. That is the current architecture of the primitive-value storages in the Table library. We can discuss if it should be changed, but to do so I think we need to do some performance measurements to take informed decisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think worth making this a vectorised operation on the NumberStorage.
We can use the Core_Utils to share the common implementation. Lets put it as a follow up ticket though please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make the
round
implementation inCore_Utils
, but then it would no longer be able to be written in Enso...As for all the others -
ceil
,floor
and I thinktruncate
- we seem to be doing just the basic Java operationsMath.ceil
etc. in the 'primitive' path. The only additional handling is to handle promotion toBigInteger
if the result exceeds the range oflong
. ButBigInteger
is currently not supported by Table anyway, so I don't think sharing the implementation of these gives us any value at this moment - because the pure Enso side needs the additional BigInteger handling that has some Truffle magic, and the Table side needs to just report a warning/failure in case of an overflow - the common part is just delegating toMath.ceil
and co. which IMO isn't enough to make sharing worth it.(But I'd make these vectorized nonetheless, I'm only unsure about
round
as that would require us moving the whole implementation into Java - it could make it more efficient for tables though, so may as well be worth it, but it's quite a bit more work).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok - not need to put in
Core_Utils
then.Worth us seeing the relative performance of rounding 1,000,000 values as a map vs a vectorised op.