Flink: Ignore the Forbidden when creating a database #7795
Conversation
It may be that the user running a Flink job doesn't have the privileges to create a database. In that case we just assume that it already exists.
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java
Looks reasonable to me.
```java
if (!databaseExists(defaultDatabase)) {
  try {
    createDatabase(getDefaultDatabase(), ImmutableMap.of(), true);
  } catch (DatabaseAlreadyExistException e) {
    // ignored: another job may have created the database concurrently
  }
}
```
This is also a little weird, since we already did the databaseExists check earlier. I understand why it is needed: it handles the race condition when multiple jobs run this code path concurrently. The first attempt of ignoring ForbiddenException was still not ideal to me, because it can mask a real issue where the default database wasn't created due to a permission problem.
I checked the HiveCatalog and GenericInMemoryCatalog implementations in the Flink code. They don't try to create the default database in the open method; Flink's HiveCatalog just asserts that the default database exists. That behavior seems reasonable to me.
cc @pvary
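The race the comment above describes can be sketched with plain Java collections. This is a hypothetical illustration, not the real FlinkCatalog code: two jobs can both observe that the database is missing and then both attempt to create it, so the create path must tolerate "already exists" (the `ignoreIfExists` parameter here mirrors the boolean third argument of `createDatabase` in the snippet).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the check-then-create race discussed above.
class CreateRaceSketch {
  private final ConcurrentMap<String, Boolean> databases = new ConcurrentHashMap<>();

  boolean databaseExists(String name) {
    return databases.containsKey(name);
  }

  void createDatabase(String name, boolean ignoreIfExists) {
    // putIfAbsent returns non-null when another caller won the race
    boolean lostRace = databases.putIfAbsent(name, Boolean.TRUE) != null;
    if (lostRace && !ignoreIfExists) {
      throw new IllegalStateException("Database already exists: " + name);
    }
  }
}
```

Catching DatabaseAlreadyExistException in the original snippet serves the same purpose as the `ignoreIfExists` flag here: the second creator simply proceeds instead of failing the job.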
Are you suggesting removing the create-database call? The person who initially reported this issue was very surprised that this call was there, so removing it seems reasonable to me.
Yeah, I was thinking about removing the create-database call and just adding a Preconditions check that the default database exists. That is also consistent with Flink's HiveCatalog implementation. The Preconditions check is also debatable: I am not very clear about the expectations for the catalog. Is the default database always expected/required? Not sure if @rdblue has any input.
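The suggested alternative can be sketched without any Flink dependencies. This is a hypothetical illustration (the class and field names are made up, and a real implementation would use Guava's Preconditions and the catalog's database listing): open() only verifies that the default database exists, like Flink's HiveCatalog, instead of creating it.

```java
import java.util.Set;

// Hypothetical sketch: open() asserts the default database exists
// rather than creating it, so no create permission is needed.
class AssertDefaultDatabaseSketch {
  private final Set<String> existingDatabases;
  private final String defaultDatabase;

  AssertDefaultDatabaseSketch(Set<String> existingDatabases, String defaultDatabase) {
    this.existingDatabases = existingDatabases;
    this.defaultDatabase = defaultDatabase;
  }

  void open() {
    // Fail fast with a clear message instead of masking a permission issue.
    if (!existingDatabases.contains(defaultDatabase)) {
      throw new IllegalStateException(
          "Default database " + defaultDatabase + " does not exist");
    }
  }
}
```

Failing fast here surfaces a missing default database at catalog startup, rather than as a confusing downstream error when the first table operation runs.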
I'd be okay with removing the check, although as long as attempting to create the database doesn't cause a failure, it is probably slightly better to create it.
@Fokko I am still leaning toward removing the create. This is a Flink Catalog implementation (not an Iceberg catalog). I think we can stick with the Flink HiveCatalog style, where the open method doesn't create the default database automatically.
@stevenzwu I agree. I would not expect writing data to also create a database. I'll update the PR.
I've updated the tests as well. I think this one is good to go, @stevenzwu.
* Hive: Set commit state as Unknown before throwing CommitStateUnknownException (apache#7931) (apache#8029)
* Spark 3.4: WAP branch not propagated when using DELETE without WHERE (apache#7900) (apache#8028)
* Core: Include all reachable snapshots with v1 format and REF snapshot mode (apache#7621) (apache#8027)
* Spark 3.3: Backport 'WAP branch not propagated when using DELETE without WHERE' (apache#8033) (apache#8036)
* Flink: remove the creation of default database in FlinkCatalog open method (apache#7795) (apache#8039)
* Core: Handle optional fields (apache#8050) (apache#8064) — we expect current-snapshot-id, properties, and snapshots to be there, but they are actually optional. Use AssertJ.
* Core: Abort file groups should be under same lock as committerService (apache#7933) (apache#8060)
* Spark 3.4: Fix rewrite_position_deletes for certain partition types (apache#8059)
* Spark 3.3: Fix rewrite_position_deletes for certain partition types (apache#8059) (apache#8069)
* Spark: Add actions for disaster recovery; fix the compile error; fix merge conflicts and formatting; all tests are working and code integrated with Spark 3.3; fix union error and snapshots test; fix Spark broadcast error; add RewritePositionDeleteFilesSparkAction

Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Xianyang Liu <liu-xianyang@hotmail.com>
Co-authored-by: Szehon Ho <szehon.apache@gmail.com>
Co-authored-by: Yufei Gu <yufei_gu@apple.com>
Co-authored-by: yufeigu <yufei@apache.org>
Co-authored-by: Laith Alzyoud <laith.alzyoud@revolut.com>
Co-authored-by: vaultah <4944562+vaultah@users.noreply.github.com>