[Bug report] hive catalog include iceberg table? #3403

Closed
mygrsun opened this issue May 15, 2024 · 16 comments · Fixed by #3703
Labels: 0.5.2 (Release v0.5.2), 0.6.0 (Release v0.6.0), bug (Something isn't working)

Comments

@mygrsun
Contributor

mygrsun commented May 15, 2024

Version

main branch

Describe what's wrong

A schema in the Hive catalog contains an Iceberg table:
[image]

But the Iceberg catalog doesn't contain the Hive table:
[image]

Error message and/or stacktrace

empty

How to reproduce

Use beeline to create a Hive table and an Iceberg table in the same database.

Additional context

No response

@mygrsun mygrsun added the bug Something isn't working label May 15, 2024
@jerryshao
Contributor

@mchades Can you please take a look? At a cursory glance, I feel that the Hive catalog should filter out non-Hive tables when fetching from HMS. WDYT?

@jerryshao
Contributor

@mygrsun would you like to give it a try and fix it?

@mchades
Contributor

mchades commented May 15, 2024

> @mchades Can you please take a look? At a cursory glance, I feel that the Hive catalog should filter out non-Hive tables when fetching from HMS. WDYT?

Don't the tables in HMS belong to Hive? How do we distinguish whether a table in HMS belongs to Hive or Iceberg? If it is distinguished by the values of the InputFormat and OutputFormat properties, then what kind of table should an Iceberg table created through Hive be?

@jerryshao
Contributor

There is a reserved property or some other marker to distinguish whether it is a Hive table or an Iceberg table. For Hudi and others, I think they should also have a flag to differentiate.

@mchades
Contributor

mchades commented May 15, 2024

If I directly show tables in Hive, can I also see the Iceberg table?

@jerryshao
Contributor

I guess it will; you can give it a try. You can probably list Iceberg tables in Hive, but not from the Iceberg catalog.

@FANNG1
Contributor

FANNG1 commented May 15, 2024

The Iceberg catalog uses a specific parameter, table_type, to check whether a table is an Iceberg table:

      List<String> tableNames = clients.run(client -> client.getAllTables(database));
      List<TableIdentifier> tableIdentifiers;

      if (listAllTables) {
        tableIdentifiers =
            tableNames.stream()
                .map(t -> TableIdentifier.of(namespace, t))
                .collect(Collectors.toList());
      } else {
        List<Table> tableObjects =
            clients.run(client -> client.getTableObjectsByName(database, tableNames));
        tableIdentifiers =
            tableObjects.stream()
                .filter(
                    table ->
                        table.getParameters() != null
                            && BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE
                                .equalsIgnoreCase(
                                    table
                                        .getParameters()
                                        .get(BaseMetastoreTableOperations.TABLE_TYPE_PROP)))
                .map(table -> TableIdentifier.of(namespace, table.getTableName()))
                .collect(Collectors.toList());
      }
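
For illustration, the inverse of that check (keeping only tables that are not tagged as Iceberg) could look roughly like the following. This is a minimal sketch, not the Gravitino implementation: `IcebergTableCheck`, `TableInfo`, and `nonIcebergTableNames` are hypothetical names, the HMS `Table` object is replaced by a tiny stand-in class so the snippet runs without a metastore, and the two constant values are assumed to match Iceberg's `BaseMetastoreTableOperations.TABLE_TYPE_PROP` and `ICEBERG_TABLE_TYPE_VALUE`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class IcebergTableCheck {
  // Assumed to mirror the constants in Iceberg's BaseMetastoreTableOperations.
  static final String TABLE_TYPE_PROP = "table_type";
  static final String ICEBERG_TABLE_TYPE_VALUE = "iceberg";

  /** Hypothetical stand-in for an HMS table: a name plus its parameter map. */
  static final class TableInfo {
    final String name;
    final Map<String, String> parameters; // may be null

    TableInfo(String name, Map<String, String> parameters) {
      this.name = name;
      this.parameters = parameters;
    }
  }

  /** True when the table_type parameter marks the table as Iceberg (case-insensitive). */
  static boolean isIcebergTable(TableInfo table) {
    return table.parameters != null
        && ICEBERG_TABLE_TYPE_VALUE.equalsIgnoreCase(table.parameters.get(TABLE_TYPE_PROP));
  }

  /** Keeps only the names of tables that are not tagged as Iceberg. */
  static List<String> nonIcebergTableNames(List<TableInfo> tables) {
    return tables.stream()
        .filter(t -> !isIcebergTable(t))
        .map(t -> t.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<TableInfo> tables =
        Arrays.asList(
            new TableInfo("hive_orders", null),
            new TableInfo(
                "iceberg_events", Collections.singletonMap(TABLE_TYPE_PROP, "ICEBERG")));
    System.out.println(nonIcebergTableNames(tables)); // prints [hive_orders]
  }
}
```

A table with no parameters at all is treated as a plain Hive table here, which matches the null check in the Iceberg snippet above.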

@mchades
Contributor

mchades commented May 20, 2024

@mygrsun How do you distinguish between Hive tables and Iceberg tables, and what behavior do you expect?

@mygrsun
Contributor Author

mygrsun commented May 21, 2024

> @mygrsun How do you distinguish between Hive tables and Iceberg tables, and what behavior do you expect?

We want to get separate lists of Iceberg and Hive tables. I think the approach provided by FANNG1 is OK.

@mchades
Contributor

mchades commented May 27, 2024

@mygrsun do you want to fix this?

@mygrsun
Contributor Author

mygrsun commented May 28, 2024

> @mygrsun do you want to fix this?

Yes, I plan to fix it.

@mchades
Contributor

mchades commented May 28, 2024

> > @mygrsun do you want to fix this?
>
> Yes, I plan to fix it.

Great! Can your fix make it into the 0.5.1 release? We plan to release it this week.

@mygrsun
Contributor Author

mygrsun commented May 28, 2024

Please check my design.

The goal is to be able to either list all tables or list only Hive tables, without Iceberg tables.

My design adds a property to the catalog properties.
The property controls whether to list all tables or only Hive tables, without Iceberg tables.
The property name is list-table-with-iceberg:
public static final String LIST_TABLE_WITH_ICEBERG = "list-table-with-iceberg";

Do you think this is OK?
@FANNG1 @mchades

@mchades
Contributor

mchades commented May 28, 2024

> Please check my design.
>
> The goal is to be able to either list all tables or list only Hive tables, without Iceberg tables.
>
> My design adds a property to the catalog properties. The property controls whether to list all tables or only Hive tables, without Iceberg tables. The property name is list-table-with-iceberg: public static final String LIST_TABLE_WITH_ICEBERG = "list-table-with-iceberg";

I saw that the Iceberg community has also encountered similar issues before. It is worth noting that when there are too many tables, filtering tables may cause performance issues.

So I think we should add a list-all-tables property with a default value of true in the Hive catalog. This is consistent with the behavior of the Hive client, and users can set it to false when they need to filter. WDYT? @mygrsun @FANNG1 @jerryshao
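
To make the proposal concrete, here is a rough sketch of how such a flag might gate the filtering. It is only an illustration under stated assumptions, not the code merged for this issue: `HiveTableLister`, `HmsTable`, `listAllTables`, and `listTables` are hypothetical names, the HMS table is modeled by a small stand-in class, and the default of `true` simply follows the suggestion above (the released default may differ, so check the Hive catalog documentation).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class HiveTableLister {
  // Property name from this thread; the default follows the proposal above (assumption).
  static final String LIST_ALL_TABLES = "list-all-tables";
  static final boolean DEFAULT_LIST_ALL_TABLES = true;

  /** Hypothetical stand-in for an HMS table: a name plus its parameter map. */
  static final class HmsTable {
    final String name;
    final Map<String, String> parameters; // may be null

    HmsTable(String name, Map<String, String> parameters) {
      this.name = name;
      this.parameters = parameters;
    }
  }

  /** Reads the flag from the catalog properties, falling back to the default when unset. */
  static boolean listAllTables(Map<String, String> catalogProperties) {
    String raw = catalogProperties.get(LIST_ALL_TABLES);
    return raw == null ? DEFAULT_LIST_ALL_TABLES : Boolean.parseBoolean(raw.trim());
  }

  static boolean isIcebergTable(HmsTable table) {
    return table.parameters != null
        && "iceberg".equalsIgnoreCase(table.parameters.get("table_type"));
  }

  /** Returns every table name when the flag is true, otherwise skips Iceberg tables. */
  static List<String> listTables(Map<String, String> catalogProperties, List<HmsTable> tables) {
    boolean all = listAllTables(catalogProperties);
    return tables.stream()
        .filter(t -> all || !isIcebergTable(t))
        .map(t -> t.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<HmsTable> tables =
        Arrays.asList(
            new HmsTable("hive_orders", null),
            new HmsTable("iceberg_events", Collections.singletonMap("table_type", "ICEBERG")));
    // Flag unset: behaves like the Hive client and lists everything.
    System.out.println(listTables(Collections.emptyMap(), tables));
    // Flag set to false: Iceberg tables are filtered out.
    System.out.println(listTables(Collections.singletonMap(LIST_ALL_TABLES, "false"), tables));
  }
}
```

In a real catalog the `true` path would only need the table names, while the `false` path has to fetch the full table objects to read their parameters; that extra fetch is where the performance concern mentioned above comes from.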

@mygrsun
Contributor Author

mygrsun commented May 28, 2024

> > Please check my design.
> > The goal is to be able to either list all tables or list only Hive tables, without Iceberg tables.
> > My design adds a property to the catalog properties. The property controls whether to list all tables or only Hive tables, without Iceberg tables. The property name is list-table-with-iceberg: public static final String LIST_TABLE_WITH_ICEBERG = "list-table-with-iceberg";
>
> I saw that the Iceberg community has also encountered similar issues before. It is worth noting that when there are too many tables, filtering tables may cause performance issues.
>
> So I think we should add a list-all-tables property with a default value of true in the Hive catalog. This is consistent with the behavior of the Hive client, and users can set it to false when they need to filter. WDYT? @mygrsun @FANNG1 @jerryshao

I think that's okay.

@mygrsun
Contributor Author

mygrsun commented May 28, 2024

> > > @mygrsun do you want to fix this?
> >
> > Yes, I plan to fix it.
>
> Great! Can your fix make it into the 0.5.1 release? We plan to release it this week.

Yes, I can.

mygrsun pushed a commit to mygrsun/gravitino that referenced this issue May 31, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 1, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 1, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 1, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 3, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 3, 2024
mygrsun pushed a commit to mygrsun/gravitino that referenced this issue Jun 5, 2024
@mchades mchades added the 0.6.0 Release v0.6.0 label Jun 5, 2024
yuqi1129 pushed a commit that referenced this issue Jun 5, 2024
…3703)

### What changes were proposed in this pull request?

Add a Hive catalog property "list-all-tables". This property controls
whether Iceberg tables are displayed in the Hive table list.

### Why are the changes needed?

The bug is that a schema in the Hive catalog lists Iceberg tables.

Fix: #3403

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

1. Create a Hive catalog with the "list-all-tables" property.
2. Create a database and an Iceberg table in the catalog using Hive beeline.
3. Check whether the table is displayed in the catalog.

---------

Co-authored-by: ericqin <ericqin@tencent.com>
github-actions bot pushed a commit that referenced this issue Jun 5, 2024
…3703)

FANNG1 pushed a commit that referenced this issue Jun 6, 2024
…#3794)


Co-authored-by: mygrsun <codeqin@gmail.com>
Co-authored-by: ericqin <ericqin@tencent.com>
@FANNG1 FANNG1 added the 0.5.2 Release v0.5.2 label Jun 6, 2024
diqiu50 pushed a commit to diqiu50/gravitino that referenced this issue Jun 13, 2024
…ables (apache#3703)

4 participants