-
Couldn't load subscription status.
- Fork 3.9k
ARROW-4563: [Python] Validate decimal128() precision input #3647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-4563: [Python] Validate decimal128() precision input #3647
Conversation
|
cc @pravindra for the gandiva changes. |
Also add a debug check on the C++ side.
26e2dcc to
5a4cd6a
Compare
| Decimal128Type::Decimal128Type(int32_t precision, int32_t scale) | ||
| : DecimalType(16, precision, scale) { | ||
| DCHECK_GE(precision, 1); | ||
| DCHECK_LE(precision, 38); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, check for scale >= 0 && scale <= precision ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Why can't scale be arbitrary? It's simply an exponent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From this
Precision is the number of digits in a number. Scale is the number of digits to the right of the decimal point in a number. For example, the number 123.45 has a precision of 5 and a scale of 2.
The number of digits after the decimal must be >=0 and must be <= the digits in the number. Am I missing something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is Microsoft-specific. There is no a priori reason why scale should be limited. Apparently Oracle allows scales between -128 and 127 (?):
https://docs.oracle.com/cd/A81042_01/DOC/server.816/a76965/c10datyp.htm#743
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 0 to 38 thing is pretty ubiquitous in the database landscape
- Spark SQL https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/types/DecimalType.html
- Impala https://github.com/apache/impala/blob/master/be/src/udf/udf.h#L664
- Hive https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-DecimalsdecimalDecimals
- Presto https://prestodb.github.io/docs/current/functions/decimal.html
MapD seems to be capped at 19 digits of precision instead of 38, presumably to fit in 64 bits https://www.omnisci.com/docs/latest/5_datatypes.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add a debug check on the C++ side.