Flink: Support filter pushdown in IcebergTableSource #1893
Conversation
@openinx could you help me review the PR when you have time? Thanks.
Thanks @zhangjun0x01 for contributing, I will review this patch today or tomorrow.
Force-pushed from 2fe26ee to e05b6e9
public static Expression convert(org.apache.flink.table.expressions.Expression flinkExpression) {
  if (!(flinkExpression instanceof CallExpression)) {
    return null;
I'm not familiar with Flink, but I wonder whether it is a valid use case to return null here and in the other places; should we throw instead?
Iceberg supports the following expressions:
https://iceberg.apache.org/api/#expressions
Some expressions supported by Flink are not supported by Iceberg; I did not convert those because they cannot be used for an Iceberg table scan.
If it is an unsupported expression, there is no need to do filter pushdown, so I don't think we should throw an exception.
I see. Do we want to return Optional<Expression> here then? That signals that the returned value could be absent, so when we add the converted expression to the list we can decide not to add empty results, and we don't have to do a null check when calling toString?
Thanks for your suggestion, I updated it to Optional, and I added a not-pushed-down test case which returns Optional.empty().
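For reference, a minimal sketch of the Optional-based contract (the convertCall helper here is hypothetical, just to show the shape):

public static Optional<Expression> convert(org.apache.flink.table.expressions.Expression flinkExpression) {
  if (!(flinkExpression instanceof CallExpression)) {
    // unsupported expressions are skipped instead of returning null
    return Optional.empty();
  }

  return convertCall((CallExpression) flinkExpression); // hypothetical helper
}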
FieldReferenceExpression field = (FieldReferenceExpression) args.get(0);
List<ResolvedExpression> values = args.subList(1, args.size());

List<Object> expressions = values.stream().filter(
Nit: I think these are input values, not expressions
}

private static boolean literalOnRight(List<ResolvedExpression> args) {
  return args.get(0) instanceof FieldReferenceExpression && args.get(1) instanceof ValueLiteralExpression ? true :
Nit: no need to have ? true : false
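i.e. the simplified form returns the boolean expression directly (assuming the elided else-branch was false, as the nit suggests):

private static boolean literalOnRight(List<ResolvedExpression> args) {
  return args.get(0) instanceof FieldReferenceExpression &&
      args.get(1) instanceof ValueLiteralExpression;
}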
private FlinkFilters() {
}

private static final Map<BuiltInFunctionDefinition, Operation> FILTERS = ImmutableMap
Seems that this is mapping Flink operations to Iceberg operations for the following switch statement? If that's the case we probably don't really need it, and could directly switch based on the input flinkExpression.getFunctionDefinition()?
Also, there's a recent change that requires rewriting NaN in equals/notEquals to isNaN/notNaN, as Iceberg's equals no longer accepts NaN as a literal, so we will have to rewrite here as well. Here is a similar change done in Spark filters.
> Seems that this is mapping flink operations to iceberg operations in the following switch statement? If that's the case we probably don't really need it, and could directly switch based on the input flinkExpression.getFunctionDefinition()?

flinkExpression.getFunctionDefinition() returns an implementation class of FunctionDefinition, which cannot be used directly in a switch, so we add a mapping, similar to SparkFilters.
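For context, a trimmed sketch of that mapping (Operation here is Iceberg's Expression.Operation; the entries shown are illustrative, not the full list):

private static final Map<BuiltInFunctionDefinition, Operation> FILTERS = ImmutableMap
    .<BuiltInFunctionDefinition, Operation>builder()
    .put(BuiltInFunctionDefinitions.EQUALS, Operation.EQ)
    .put(BuiltInFunctionDefinitions.NOT_EQUALS, Operation.NOT_EQ)
    .put(BuiltInFunctionDefinitions.GREATER_THAN, Operation.GT)
    .put(BuiltInFunctionDefinitions.IS_NULL, Operation.IS_NULL)
    .build();

// look up the Iceberg operation, then switch over the enum
Operation op = FILTERS.get(call.getFunctionDefinition());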
@@ -112,6 +119,11 @@ public String explainSource() {
    explain += String.format(", LimitPushDown : %d", limit);
  }

  if (filters != null) {
    explain += String.format(", FilterPushDown,the filters :%s",
        filters.stream().map(filter -> filter.toString()).collect(Collectors.joining(",")));
I think returning null in the filters class will cause an NPE here as well.
Nit: it's simpler to rewrite this as:
explain += String.format(", FilterPushDown,the filters :%s", Joiner.on(",").join(filters));
Force-pushed from e05b6e9 to 7f7cbf5
@yyanyy thanks for your review, I updated all of them.
Object value = valueLiteralExpression.getValueAs(clazz).get();

BuiltInFunctionDefinition functionDefinition = (BuiltInFunctionDefinition) call.getFunctionDefinition();
if (functionDefinition.equals(BuiltInFunctionDefinitions.EQUALS)) {
I think we may want to rewrite NOT_EQUALS to notNaN as well, since notEquals in Iceberg also doesn't accept NaN as a literal; I think SparkFilters doesn't do that because there's no NotEqualTo filter in Spark.
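A hedged sketch of that rewrite, mirroring the SparkFilters pattern (isNaN/notNaN are existing Iceberg Expressions methods; the surrounding variable names follow the diff above):

// Iceberg rejects NaN literals, so rewrite `col = NaN` and `col <> NaN`
if (NaNUtil.isNaN(value)) {
  if (functionDefinition.equals(BuiltInFunctionDefinitions.EQUALS)) {
    return Optional.of(Expressions.isNaN(name));
  } else if (functionDefinition.equals(BuiltInFunctionDefinitions.NOT_EQUALS)) {
    return Optional.of(Expressions.notNaN(name));
  }
}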
Force-pushed from 658b61d to 629c5af
switch (op) {
  case IS_NULL:
    FieldReferenceExpression isNullFilter = (FieldReferenceExpression) call.getResolvedChildren().get(0);
    return Optional.of(Expressions.isNull(isNullFilter.getName()));
How does FieldReferenceExpression.getName() reference nested fields?
I tested it. For example, we have a table whose Flink DDL is like this:

CREATE TABLE iceberg_nested_test (
  id VARCHAR,
  title VARCHAR,
  properties ROW(`foo` VARCHAR)
) WITH (
  'connector'='iceberg'
);

If the query SQL is select * from iceberg_nested_test where properties is null, Flink supports filter pushdown and the field name is properties. If the SQL is select * from iceberg_nested_test where properties.foo is null, Flink does not support filter pushdown and the code never enters the IcebergTableSource#applyPredicate method.
Okay, sounds fine that Flink doesn't currently support predicate pushdown on nested fields. @openinx, any plans to change this?
Yes, Flink does not support nested field pushdown now. We will need to file an issue to address it in the Apache Flink repo.
I created a Flink issue to track this: https://issues.apache.org/jira/browse/FLINK-20767
}

String name = fieldReferenceExpression.getName();
Class clazz = valueLiteralExpression.getOutputDataType().getConversionClass();
Class is parameterized, so this should be Class<?>.
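i.e. the declaration in the diff would become:

Class<?> clazz = valueLiteralExpression.getOutputDataType().getConversionClass();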
String name = fieldReferenceExpression.getName();
Class clazz = valueLiteralExpression.getOutputDataType().getConversionClass();
Object value = valueLiteralExpression.getValueAs(clazz).get();
ValueLiteralExpression allows the value to be null, in which case get here will throw an exception. How is this avoided? Does the parser reject col = null expressions?
@openinx may be able to help here, too.
Actually, a few lines down there is an assertion that the value isn't null. This looks like a bug to me.
I tested it: if the SQL is select * from mytable where data = null, Flink does not push down the filter, and we cannot get any data. If the SQL is select * from mytable where data is null, it works normally and enters the IS_NULL branch of the switch.
I think it would be good to fix this case rather than making the assumption that Flink won't push the = null filter. Handling null will be good for maintainability.
I agree with @rdblue that we should handle null in this function rather than assuming Flink won't push down the null.
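A minimal sketch of that handling, assuming Flink's getValueAs already returns an empty Optional for a null literal (the helper name is illustrative):

// Optional.empty() when the literal is null, so the caller abandons
// pushdown for this predicate instead of calling a bare get()
private static Optional<Object> convertLiteral(ValueLiteralExpression expression) {
  return expression.getValueAs(expression.getOutputDataType().getConversionClass())
      .map(o -> (Object) o);
}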
expression -> {
  if (expression instanceof ValueLiteralExpression) {
    return !((ValueLiteralExpression) flinkExpression).isNull();
  }
This should not discard anything that is not a ValueLiteralExpression. Instead, if there is a non-literal this should either throw IllegalArgumentException or return Optional.empty to signal that the expression cannot be converted.
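In the surrounding conversion, that guard might look like this (a sketch, not the PR's code):

if (!(expression instanceof ValueLiteralExpression) ||
    ((ValueLiteralExpression) expression).isNull()) {
  // a non-literal or null argument means this predicate cannot be converted
  return Optional.empty();
}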
case IN:
  List<ResolvedExpression> args = call.getResolvedChildren();
  FieldReferenceExpression field = (FieldReferenceExpression) args.get(0);
  List<ResolvedExpression> values = args.subList(1, args.size());
I think this should be converted to List<ValueLiteralExpression> to simplify value conversion.
IN, BETWEEN, and NOT_BETWEEN are automatically converted by Flink, so we will not enter the IN block. Should we delete the IN branch?
If we're sure that Flink won't enter the IN block, then I think we should remove it. Please add a comment saying IN will be converted to multiple ORs.
I removed the IN block and added some comments.
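The added comment presumably reads along these lines (a sketch):

// Flink's optimizer rewrites IN into a chain of OR'd equality predicates
// before pushdown, so no IN case is needed here.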
  return !((ValueLiteralExpression) flinkExpression).isNull();
}

return false;
Null values can't be ignored. This should either return Optional.empty or throw IllegalArgumentException if there is a null value.
return Optional.of(Expressions.in(field.getName(), inputValues));

case NOT:
  Optional<Expression> child = convert(call.getResolvedChildren().get(0));
There are several calls to getResolvedChildren().get(0). I think that should be converted to a method that validates there is only one child and also validates the type:

private <T extends ResolvedExpression> Optional<T> getOnlyChild(CallExpression call, Class<T> expectedChildClass) {
  List<ResolvedExpression> children = call.getResolvedChildren();
  if (children.size() != 1) {
    return Optional.empty();
  }

  ResolvedExpression child = children.get(0);
  if (!expectedChildClass.isInstance(child)) {
    return Optional.empty();
  }

  return Optional.of(expectedChildClass.cast(child));
}
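A possible use at the NOT case, combining this helper with the Optional-returning convert (illustrative):

case NOT:
  return getOnlyChild(call, CallExpression.class)
      .flatMap(FlinkFilters::convert)
      .map(Expressions::not);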
case NOT:
  Optional<Expression> child = convert(call.getResolvedChildren().get(0));
  if (child.isPresent()) {
    return Optional.of(Expressions.not(child.get()));
This can be child.map(Expressions::not).
  }
}

return Optional.of(function.apply(name, value));
The literal value needs to be converted to Iceberg's internal representation before being passed to create an expression. Flink will return LocalDate, LocalTime, LocalDateTime, etc. from the getValueAs method, and it isn't clear whether the value stored in the literal is the correct representation for other types as well.
@openinx, could you help recommend how to do the conversion here?
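One possible direction, using Iceberg's org.apache.iceberg.util.DateTimeUtil to move the java.time values into Iceberg's internal representation (a sketch of the idea, not necessarily what the PR settled on):

// dates become days since epoch; times and timestamps become microseconds
static Object toIcebergLiteral(Object value) {
  if (value instanceof LocalDateTime) {
    return DateTimeUtil.microsFromTimestamp((LocalDateTime) value);
  } else if (value instanceof LocalDate) {
    return DateTimeUtil.daysFromDate((LocalDate) value);
  } else if (value instanceof LocalTime) {
    return DateTimeUtil.microsFromTime((LocalTime) value);
  }
  return value;
}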
@@ -102,4 +105,154 @@ public void testLimitPushDown() {
    Assert.assertEquals("should have 1 record", 1, mixedResult.size());
    Assert.assertArrayEquals("Should produce the expected records", mixedResult.get(0), new Object[] {1, "a"});
  }

  @Test
  public void testFilterPushDown() {
Tests should be broken into individual methods that are each a test case. To share code, use @Before and @After and different test suites.
Can you also add a test case that listens for a ScanEvent and validates that the expression was correctly passed to Iceberg?
I added a listener to validate the filter pushdown in TestFlinkTableSource.
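Roughly, the listener setup can look like this (Listeners and ScanEvent are Iceberg's org.apache.iceberg.events API; the lastScanEvent field is introduced here for illustration):

private ScanEvent lastScanEvent = null;

@Before
public void registerScanEventListener() {
  // capture the most recent scan so tests can assert on the pushed filter
  Listeners.register(event -> lastScanEvent = event, ScanEvent.class);
}

// in a test, after executing the query:
// Assert.assertEquals("Should push down the expected filter",
//     expectedFilter.toString(), lastScanEvent.filter().toString());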
import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
import org.apache.iceberg.util.NaNUtil;

public class FlinkFilters {
This class needs an extensive test suite that checks the conversion from expected Flink expressions, not just a test for the source.
The conversion needs to cover at least these cases:
- Equals with null
- Not equals with null
- In with null
- Not in with null
- Equals with NaN
- Not equals with NaN
- In with NaN
- Not in with NaN
- All inequalities with null
- All inequalities with NaN
- All expressions with a non-null and non-NaN value (preferably one string and one numeric)
- Each data type that is supported by Iceberg/Flink
I looked up the Flink docs and the source code and tested it; it seems that NaN and Infinity are not supported by Flink now.
The data types supported by Flink are listed here.
Thanks for working on this @zhangjun0x01! It looks like a great start to me, and I'd really like to get this working in Flink.
@rdblue, thank you very much for your review. I am very sorry that some situations were not well considered; I will be more careful next time, and I will update the PR later.
No problem, this is why we review!
I'm sorry that I did not review this PR in time before (was focusing on Flink CDC DataStream/SQL test cases and more optimization work for the next release, 0.11.0); I will take a look tomorrow.
I'd like to get this into the 0.11.0 release, if possible. Thanks for working on this, @zhangjun0x01! It will be great to have this feature done.
Force-pushed from edad5ee to 7daffa6
@rdblue thanks very much for your review, I updated it.
Force-pushed from d33aa49 to 217d059
org.apache.iceberg.expressions.Expressions.equal("field1", 1));

Assert.assertEquals("Predicate operation should match", expected.op(), not.op());
assertPredicatesMatch((UnboundPredicate<?>) expected.child(), (UnboundPredicate<?>) not.child());
These casts are unnecessary.
flink/src/test/java/org/apache/iceberg/flink/TestFlinkTableSource.java (outdated review thread, resolved)
@zhangjun0x01, there are still a few things to fix in the tests, mostly minor, but I also found a major problem. The filter pushdown tests only need to run for one catalog and one file format, because the purpose of those tests is to validate the assumptions of the …
flink/src/test/java/org/apache/iceberg/flink/TestFlinkTableSource.java (outdated review thread, resolved)
Force-pushed from 217d059 to c93eaa7
Force-pushed from c93eaa7 to 138fa46
I refactored it with …
Thanks, @zhangjun0x01! It is great that this will be in the 0.11.0 release. Thanks for getting it done!
add filter push down for IcebergTableSource