Skip to content

Commit

Permalink
[apache#2921] fix(catalog-lakehouse-iceberg): Add width param to buck…
Browse files Browse the repository at this point in the history
…et and truncate functions in Gravitino SortOrder (apache#2928)

Add width param to bucket and truncate functions in Gravitino SortOrder.
for example:
change `bucket([id]) `to `bucket(10, id)`
change `truncate([name]) `to `truncate(2, name)`

SortOrder in Iceberg supports `FunctionExpression`, such as `year,
month, bucket, truncate, etc`.
`truncate` and `bucket` functions both have two parameters, such as
`bucket(10, col1), truncate(2, col2)`.

However, in gravitino, when converting an iceberg sortorder with
`bucket` or `truncate` to gravitino sortOrder, there is only one
parameter in `bucket` and `truncate` functions.

This picture shows the details of the parameters of the `bucket` and
`truncate` functions in gravitino sortorder, we can test it in
`com.datastrato.gravitino.catalog.lakehouse.iceberg.converter.TestFromIcebergSortOrder#testFromSortOrder`

![image](https://github.com/datastrato/gravitino/assets/94670132/06fd65f5-7d33-4197-8871-dda02fd70a26)

And if we want to convert the gravitino sortOrder with `bucket` or
`truncate` to iceberg sortOrder, we will get the following error as the
first param is missing.
```
java.lang.IllegalArgumentException: Bucket sort should have 2 arguments

	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
	at com.datastrato.gravitino.catalog.lakehouse.iceberg.converter.ToIcebergSortOrder.toSortOrder(ToIcebergSortOrder.java:56)
	at com.datastrato.gravitino.catalog.lakehouse.iceberg.converter.TestFromIcebergSortOrder.testFromSortOrder(TestFromIcebergSortOrder.java:86)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
```

Fix: apache#2921

No

New UTs and ITs.
  • Loading branch information
caican00 committed May 23, 2024
1 parent 13cc345 commit 3654b15
Show file tree
Hide file tree
Showing 7 changed files with 280 additions and 120 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

import com.datastrato.gravitino.rel.expressions.FunctionExpression;
import com.datastrato.gravitino.rel.expressions.NamedReference;
import com.datastrato.gravitino.rel.expressions.literals.Literals;
import com.datastrato.gravitino.rel.expressions.sorts.NullOrdering;
import com.datastrato.gravitino.rel.expressions.sorts.SortOrder;
import com.datastrato.gravitino.rel.expressions.sorts.SortOrders;
Expand Down Expand Up @@ -38,13 +39,13 @@ public SortOrder field(String sourceName, int id, SortDirection direction, NullO
@Override
public SortOrder bucket(
String sourceName, int id, int width, SortDirection direction, NullOrder nullOrder) {
return functionSortOrder("bucket", id, direction, nullOrder);
return functionSortOrder("bucket", width, id, direction, nullOrder);
}

@Override
public SortOrder truncate(
String sourceName, int id, int width, SortDirection direction, NullOrder nullOrder) {
return functionSortOrder("truncate", id, direction, nullOrder);
return functionSortOrder("truncate", width, id, direction, nullOrder);
}

@Override
Expand Down Expand Up @@ -80,6 +81,15 @@ private SortOrder functionSortOrder(
toGravitino(nullOrder));
}

private SortOrder functionSortOrder(
String name, int width, int id, SortDirection direction, NullOrder nullOrder) {
return SortOrders.of(
FunctionExpression.of(
name, Literals.integerLiteral(width), NamedReference.field(idToName.get(id))),
toGravitino(direction),
toGravitino(nullOrder));
}

private com.datastrato.gravitino.rel.expressions.sorts.SortDirection toGravitino(
SortDirection direction) {
return direction == SortDirection.ASC
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
import com.datastrato.gravitino.rel.expressions.sorts.NullOrdering;
import com.datastrato.gravitino.rel.expressions.sorts.SortDirection;
import com.datastrato.gravitino.rel.expressions.sorts.SortOrder;
import com.datastrato.gravitino.rel.types.Types;
import com.google.common.base.Preconditions;
import java.util.Locale;
import org.apache.commons.lang3.ArrayUtils;
Expand Down Expand Up @@ -58,9 +59,10 @@ public static org.apache.iceberg.SortOrder toSortOrder(Schema schema, SortOrder[

Expression firstArg = sortFunc.arguments()[0];
Preconditions.checkArgument(
firstArg instanceof Literal && ((Literal<?>) firstArg).value() instanceof Integer,
firstArg instanceof Literal
&& ((Literal<?>) firstArg).dataType() instanceof Types.IntegerType,
"Bucket sort's first argument must be a integer literal");
int numBuckets = (Integer) ((Literal<?>) firstArg).value();
int numBuckets = Integer.parseInt(String.valueOf(((Literal<?>) firstArg).value()));

Expression secondArg = sortFunc.arguments()[1];
Preconditions.checkArgument(
Expand All @@ -77,9 +79,10 @@ public static org.apache.iceberg.SortOrder toSortOrder(Schema schema, SortOrder[

firstArg = sortFunc.arguments()[0];
Preconditions.checkArgument(
firstArg instanceof Literal && ((Literal<?>) firstArg).value() instanceof Integer,
firstArg instanceof Literal
&& ((Literal<?>) firstArg).dataType() instanceof Types.IntegerType,
"Truncate sort's first argument must be a integer literal");
int width = (Integer) ((Literal<?>) firstArg).value();
int width = Integer.parseInt(String.valueOf(((Literal<?>) firstArg).value()));

secondArg = sortFunc.arguments()[1];
Preconditions.checkArgument(
Expand Down
Loading

0 comments on commit 3654b15

Please sign in to comment.