Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Add tests for catalogs supporting empty namespaces #9890

Merged
merged 1 commit into from
Jan 13, 2025

Conversation

nastra
Copy link
Contributor

@nastra nastra commented Mar 7, 2024

From all the catalogs that extend CatalogTests, only Hive + REST don't support creating a table/view and also listing tables/views in an empty namespace.

JDBC + InMemory do support that and therefore override the supportsEmptyNamespace() feature in their tests.

Additionally, this PR also fixes listing tables/views for InMemory as previously this would list all available tables/views, instead of just the ones in that namespace.

I've decided to not adjust the Nessie tests, since the behavior deviates from the current behavior that can be found for JDBC + InMemory. While Nessie doesn't support creation/removal of empty namespaces, it does support creating a table/view in an empty namespace. Listing tables/views in an empty namespace returns all available tables/views.

@@ -102,12 +102,12 @@ protected JdbcCatalog catalog() {
}

@Override
protected boolean supportsNamespaceProperties() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intellij flags this because supportsNamespaceProperties() is true by default in CatalogTests, so no need to override this method here

@@ -140,17 +142,17 @@ protected boolean supportsNestedNamespaces() {
return true;
}

@Override
protected boolean supportsNamespaceProperties() {
return true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intellij flags this because supportsNamespaceProperties() is true by default in CatalogTests, so no need to override this method here

@nastra nastra force-pushed the empty-namespace-handling branch from a3ddfab to 61cfa9d Compare March 7, 2024 09:41
@nastra nastra changed the title Handle listing tables/views in empty namespaces Core: Handle listing tables/views in empty namespaces Mar 7, 2024
@nastra nastra force-pushed the empty-namespace-handling branch from 61cfa9d to 6fecd82 Compare March 7, 2024 10:42
@github-actions github-actions bot added the spark label Mar 7, 2024
@nastra nastra force-pushed the empty-namespace-handling branch from 6fecd82 to ea594d8 Compare March 7, 2024 11:14
@@ -468,6 +468,10 @@ private JdbcUtil() {}

static Namespace stringToNamespace(String namespace) {
Preconditions.checkArgument(namespace != null, "Invalid namespace %s", namespace);
if (namespace.isEmpty()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously this method would return the identifier as .tblName instead of tblName when using an empty namespace, because L475 passes an array with a single empty string to Namespace.of().

I guess the question here is: is Namespace.of("") the same as Namespace.empty() and should we maybe fix Namespace.of()?
It seems there are currently no existing tests that would indicate what the correct behavior should be in this case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they should be treated the same, I also hope it doesn't come up very often

@nastra nastra force-pushed the empty-namespace-handling branch 2 times, most recently from 3a9fde1 to e4cee23 Compare March 7, 2024 11:46
@@ -987,6 +991,38 @@ public void testListTables() {
assertEmpty("Should not contain ns_2.table_1 after drop", catalog, ns2);
}

@Test
public void emptyNamespace() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this test because JDBC was previously returning namespace "" (empty string) when listing all namespaces


// TODO this lists all available tables and deviates from the behavior in other catalogs and
// should probably be fixed
Assertions.assertThat(catalog().listTables(Namespace.empty()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snazy, @adutra, @dimas-b: Nessie doesn't support empty namespace creation. So, should we allow listing tables from empty namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But nessie supports creating a table/view withing the empty namespace, which is a valid use case. If creating a table/view in an empty namespace is supported, then listing should be supported as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty namespace is not an "entity", right? Why would we want to "create" it?

That said, creating tables in the "empty" / "root" namespace should work with Nessie, IIRC.

@nastra nastra force-pushed the empty-namespace-handling branch from f062549 to 95b0416 Compare March 8, 2024 10:58
@@ -987,6 +995,61 @@ public void testListTables() {
assertEmpty("Should not contain ns_2.table_1 after drop", catalog, ns2);
}

@Test
public void listNamespacesWithEmptyNamespace() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this behavior applies for all catalogs

@nastra nastra force-pushed the empty-namespace-handling branch from 95b0416 to b6129ef Compare March 8, 2024 11:37
@github-actions github-actions bot removed the NESSIE label Mar 8, 2024
@@ -288,6 +288,7 @@ public List<Namespace> listNamespaces(Namespace namespace) throws NoSuchNamespac

List<Namespace> filteredNamespaces =
namespaces.keySet().stream()
.filter(n -> !n.isEmpty())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would previously fail in L303 when calling listNamespaces(Namespace.empty())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this one support empty namespaces or not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering what the behavior should be here, since it is a valid namespace so shouldn't we be able to list it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, InMemoryCatalog supports empty namespaces and this comes back to the TODO I added in https://github.com/apache/iceberg/pull/9890/files#diff-de393039ffcf049ebb22892ac1b9aad88d53274c1e86431b45e776e4568246f8R1037 to determine what the listing behavior with empty namespaces should be

@nastra nastra changed the title Core: Handle listing tables/views in empty namespaces Core: Add tests for catalogs supporting empty namespaces Mar 8, 2024
Object[] v1 = row(NAMESPACE.toString(), "v1", false);
Object[] v2 = row(NAMESPACE.toString(), "prefixV2", false);
Object[] v3 = row(NAMESPACE.toString(), "prefixV3", false);
assertThat(sql("SHOW VIEWS")).contains(v2, v3, v1, tempView);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (ocd) It tares me up that you define v1, v2, v3 and then check contains v2, v3, v1

@@ -45,4 +47,10 @@ public void testFilterAndRemovePrefix() {

assertThat(expected).isEqualTo(actual);
}

@Test
public void emptyNamespaceInIdentifier() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Object[] v1 = row(NAMESPACE.toString(), "v1", false);
Object[] v2 = row(NAMESPACE.toString(), "prefixV2", false);
Object[] v3 = row(NAMESPACE.toString(), "prefixV3", false);
assertThat(sql("SHOW VIEWS")).contains(v2, v3, v1, tempView);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order here

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only small issue with test rows being defined, 1, 2, 3 then used 2, 3, 1. I keep thinking there is a secret there but I know there isn't.

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 22, 2024
@nastra nastra removed the stale label Oct 22, 2024
@nastra nastra force-pushed the empty-namespace-handling branch 3 times, most recently from 6ae6aec to 39f4033 Compare December 4, 2024 14:09
@nastra nastra force-pushed the empty-namespace-handling branch from 39f4033 to e02ef8a Compare December 4, 2024 14:28
Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @nastra. Thanks @RussellSpitzer for the review, let's get this in and follow up on the TODO in a separate PR

@Fokko Fokko merged commit 472ec6c into apache:main Jan 13, 2025
49 checks passed
@nastra nastra deleted the empty-namespace-handling branch January 14, 2025 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants