Skip to content

Implement GenericTableCatalogAdapter #1264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Apr 2, 2025
Original file line number Diff line number Diff line change
Expand Up @@ -183,4 +183,11 @@ protected FeatureConfiguration(
"How many times to retry refreshing metadata when the previous error was retryable")
.defaultValue(2)
.buildFeatureConfiguration();

public static final FeatureConfiguration<Boolean> ENABLE_GENERIC_TABLES =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Question: how do we configure this? Do we just put the key in the file application.properties like this?

ENABLE_GENERIC_TABLES=true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The syntax is a little different, but basically yeah. It will be similar to existing configs, so:

polaris.features.defaults."ENABLE_GENERIC_TABLES"=false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will need a doc for this config. Not a blocker though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config descriptions are meant to be self-describing, but I really want us to auto-generate some docs based on these. I think we've talked about this for a while but not sure if anyone is working on it yet

PolarisConfiguration.<Boolean>builder()
.key("ENABLE_GENERIC_TABLES")
.description("If true, the generic-tables endpoints are enabled")
.defaultValue(false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we want it to be enabled by default. Do we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to keep this as false until the whole feature finish development. Once all server change is done, @eric-maynard will have a PR to switch it to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the REST API is not wired up, so I think it's okay to keep it flagged off for the time being. Once the series of PR is finished and the feature is "complete" we should enable by default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we will enable by default when all the relevant code parts are ready. The same shall apply for policy endpoints

.buildFeatureConfiguration();
}
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import com.fasterxml.jackson.annotation.JsonValue;
import jakarta.annotation.Nonnull;
import jakarta.annotation.Nullable;
import java.util.List;

/** List of privileges */
public enum PolarisPrivilege {
Expand All @@ -41,21 +42,45 @@ public enum PolarisPrivilege {
TABLE_CREATE(6, PolarisEntityType.NAMESPACE),
VIEW_CREATE(7, PolarisEntityType.NAMESPACE),
NAMESPACE_DROP(8, PolarisEntityType.NAMESPACE),
TABLE_DROP(9, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_DROP(
9,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
VIEW_DROP(10, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
NAMESPACE_LIST(11, PolarisEntityType.NAMESPACE),
TABLE_LIST(12, PolarisEntityType.NAMESPACE),
VIEW_LIST(13, PolarisEntityType.NAMESPACE),
NAMESPACE_READ_PROPERTIES(14, PolarisEntityType.NAMESPACE),
TABLE_READ_PROPERTIES(15, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_READ_PROPERTIES(
15,
PolarisEntityType.TABLE_LIKE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what READ_PROPERTIES privilege means here? does it mean we if there is not read properties privilege, the load table response will not have any properties show up in the property fields ? if that is the case, we probably need to make sure we respect that for generic table also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right, the privilege we require for loadGenericTable requires this

List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
VIEW_READ_PROPERTIES(16, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
NAMESPACE_WRITE_PROPERTIES(17, PolarisEntityType.NAMESPACE),
TABLE_WRITE_PROPERTIES(18, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_WRITE_PROPERTIES(
18,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we are providing that capabilities to generic tables today. we can probably still keep it here for consistency, but will have no effect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove it in that case? We can always add it back once we need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to be clear about how we expect these privileges to relate to the new entity type; if we add generic tables to some but not others it sort of implies that we only expect those privileges to be related to generic tables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think a bit more. It should be required for credential vending in the future with these privileges. I still don't think it's necessary to add now, but I'm fine to add them.

PolarisEntityType.CATALOG_ROLE),
VIEW_WRITE_PROPERTIES(19, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
TABLE_READ_DATA(20, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_WRITE_DATA(21, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_READ_DATA(
20,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, do we need this privilege for Generic table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is similar to above

PolarisEntityType.CATALOG_ROLE),
TABLE_WRITE_DATA(
21,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
NAMESPACE_FULL_METADATA(22, PolarisEntityType.NAMESPACE),
TABLE_FULL_METADATA(23, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_FULL_METADATA(
23,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
VIEW_FULL_METADATA(24, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
CATALOG_CREATE(25, PolarisEntityType.ROOT),
CATALOG_DROP(26, PolarisEntityType.CATALOG),
Expand All @@ -70,12 +95,19 @@ public enum PolarisPrivilege {
CATALOG_ROLE_LIST_GRANTS(35, PolarisEntityType.PRINCIPAL),
CATALOG_LIST_GRANTS(36, PolarisEntityType.CATALOG),
NAMESPACE_LIST_GRANTS(37, PolarisEntityType.NAMESPACE),
TABLE_LIST_GRANTS(38, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
TABLE_LIST_GRANTS(
38,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
VIEW_LIST_GRANTS(39, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
CATALOG_MANAGE_GRANTS_ON_SECURABLE(40, PolarisEntityType.CATALOG),
NAMESPACE_MANAGE_GRANTS_ON_SECURABLE(41, PolarisEntityType.NAMESPACE),
TABLE_MANAGE_GRANTS_ON_SECURABLE(
42, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_TABLE),
42,
PolarisEntityType.TABLE_LIKE,
List.of(PolarisEntitySubType.ICEBERG_TABLE, PolarisEntitySubType.GENERIC_TABLE),
PolarisEntityType.CATALOG_ROLE),
VIEW_MANAGE_GRANTS_ON_SECURABLE(
43, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ICEBERG_VIEW),
PRINCIPAL_CREATE(44, PolarisEntityType.ROOT),
Expand Down Expand Up @@ -111,31 +143,48 @@ public enum PolarisPrivilege {
*
* @param code internal code associated to this privilege
* @param securableType securable type
* @param securableSubType securable subtype, mostly NULL_SUBTYPE
* @param securableSubTypes securable subtypes, mostly NULL_SUBTYPE
* @param granteeType grantee type, generally a ROLE
*/
PolarisPrivilege(
int code,
@Nonnull PolarisEntityType securableType,
@Nonnull PolarisEntitySubType securableSubType,
@Nonnull List<PolarisEntitySubType> securableSubTypes,
@Nonnull PolarisEntityType granteeType) {
this.code = code;
this.securableType = securableType;
this.securableSubType = securableSubType;
this.securableSubTypes = securableSubTypes;
this.granteeType = granteeType;
}

/**
* Shorthand for a single securable subtype
*
* @param code internal code associated to this privilege
* @param securableType securable type
* @param securableSubType securable subtype, mostly NULL_SUBTYPE
* @param granteeType grantee type, generally a ROLE
*/
PolarisPrivilege(
int code,
@Nonnull PolarisEntityType securableType,
@Nonnull PolarisEntitySubType securableSubType,
@Nonnull PolarisEntityType granteeType) {
this(code, securableType, List.of(securableSubType), granteeType);
}

/**
* Simple constructor, when the grantee is a role and the securable subtype is NULL_SUBTYPE
*
* @param code internal code associated to this privilege
* @param securableType securable type
*/
PolarisPrivilege(int code, @Nonnull PolarisEntityType securableType) {
this.code = code;
this.securableType = securableType;
this.securableSubType = PolarisEntitySubType.NULL_SUBTYPE;
this.granteeType = PolarisEntityType.CATALOG_ROLE;
this(
code,
securableType,
List.of(PolarisEntitySubType.NULL_SUBTYPE),
PolarisEntityType.CATALOG_ROLE);
}

/**
Expand All @@ -149,10 +198,7 @@ public enum PolarisPrivilege {
int code,
@Nonnull PolarisEntityType securableType,
@Nonnull PolarisEntitySubType securableSubType) {
this.code = code;
this.securableType = securableType;
this.securableSubType = securableSubType;
this.granteeType = PolarisEntityType.CATALOG_ROLE;
this(code, securableType, List.of(securableSubType), PolarisEntityType.CATALOG_ROLE);
}

// internal code used to represent this privilege
Expand All @@ -162,7 +208,7 @@ public enum PolarisPrivilege {
private final PolarisEntityType securableType;

// the subtype of the securable for this privilege
private final PolarisEntitySubType securableSubType;
private final List<PolarisEntitySubType> securableSubTypes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that we will need to update somewhere for the authentication implementation to make sure it an understand the list of subtypes now, but didn't see it, did I miss it somewhere? I am also fine that we do the actual privilege in a separate PR.

Copy link
Contributor Author

@eric-maynard eric-maynard Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this check is not actually enforced since no test changes were needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked how authorizeOrThrow is implemented https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisAuthorizerImpl.java#L510, it seems that is just checking if the user role have the corresponding privilege attached for a given op, it doesn't really check if the target matches the privilege definition, so the definition seems really just an definition, and not used anywhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not used, is it OK to remove to reduce the complexity? Or do we foresee any near-term requirement?

Copy link
Contributor Author

@eric-maynard eric-maynard Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yun expressed a concern about us not updating these types in a previous PR and I think it's still valid. Even if the securableSubType is not enforced, we should probably update it to reflect how we want these grants to be used. Otherwise, we could just yank securableSubType out of the current code altogether.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for definition clearness and completeness, the definition extension is needed here. The entity type and subtype used in the privilege here today seems really just a definition purpose to me (none of them seems used for validation), for that purpose, i think make sure it have a complete subtype list necessary.


// the type of the securable for this privilege
private final PolarisEntityType granteeType;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@
public class GenericTableEntity extends TableLikeEntity {

public static final String FORMAT_KEY = "format";
public static final String DOC_KEY = "doc";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker: either doc or description is fine to me. I like description a bit more, it is more commonly used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec calls this field doc, otherwise I agree

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "doc" is used as the API field, to keep things consistent, how about let's just keep it as doc. I was using doc since it was some term used in iceberg https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2079. i was originally intended to use name "comment", it was a more commonly used name in other table catalog like unity and gravitino


public GenericTableEntity(PolarisBaseEntity sourceEntity) {
super(sourceEntity);
Expand All @@ -52,6 +53,11 @@ public String getFormat() {
return getInternalPropertiesAsMap().get(GenericTableEntity.FORMAT_KEY);
}

@JsonIgnore
public String getDoc() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Non-blocking question] Just to confirm my understanding, we reserve "properties" for generic table's location, connection info, etc, so we put other information like format and doc in the internalProperties? Do we have any convention like things in the internal properties should not expose to users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, internalProperties and properties work that way right now. This is specifically relevant when we need to build the properties map in the response

return getInternalPropertiesAsMap().get(GenericTableEntity.DOC_KEY);
}

public static class Builder
extends PolarisEntity.BaseBuilder<GenericTableEntity, GenericTableEntity.Builder> {
public Builder(TableIdentifier tableIdentifier, String format) {
Expand All @@ -68,6 +74,11 @@ public GenericTableEntity.Builder setFormat(String format) {
return this;
}

public GenericTableEntity.Builder setDoc(String doc) {
internalProperties.put(GenericTableEntity.DOC_KEY, doc);
return this;
}

public GenericTableEntity.Builder setTableIdentifier(TableIdentifier identifier) {
Namespace namespace = identifier.namespace();
setParentNamespace(namespace);
Expand Down
2 changes: 2 additions & 0 deletions quarkus/service/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ dependencies {
implementation(project(":polaris-core"))
implementation(project(":polaris-api-management-service"))
implementation(project(":polaris-api-iceberg-service"))
implementation(project(":polaris-api-catalog-service"))

implementation(project(":polaris-service-common"))
implementation(project(":polaris-quarkus-defaults"))

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@
import org.apache.polaris.core.persistence.transactional.TransactionalPersistence;
import org.apache.polaris.service.admin.PolarisAdminService;
import org.apache.polaris.service.catalog.PolarisPassthroughResolutionView;
import org.apache.polaris.service.catalog.generic.GenericTableCatalog;
import org.apache.polaris.service.catalog.iceberg.IcebergCatalog;
import org.apache.polaris.service.catalog.io.FileIOFactory;
import org.apache.polaris.service.config.DefaultConfigurationStore;
Expand Down Expand Up @@ -131,6 +132,10 @@ public Map<String, String> getConfigOverrides() {
// One table directly under ns1
protected static final TableIdentifier TABLE_NS1_1 = TableIdentifier.of(NS1, "layer1_table");

// A generic table directly under ns1
protected static final TableIdentifier TABLE_NS1_1_GENERIC =
TableIdentifier.of(NS1, "layer1_table_generic");

// Two tables under ns1a
protected static final TableIdentifier TABLE_NS1A_1 = TableIdentifier.of(NS1A, "table1");
protected static final TableIdentifier TABLE_NS1A_2 = TableIdentifier.of(NS1A, "table2");
Expand Down Expand Up @@ -175,6 +180,7 @@ public Map<String, String> getConfigOverrides() {
@Inject protected FileIOFactory fileIOFactory;

protected IcebergCatalog baseCatalog;
protected GenericTableCatalog genericTableCatalog;
protected PolarisAdminService adminService;
protected PolarisEntityManager entityManager;
protected PolarisMetaStoreManager metaStoreManager;
Expand All @@ -201,7 +207,9 @@ public void before(TestInfo testInfo) {

Map<String, Object> configMap =
Map.of(
"ALLOW_SPECIFYING_FILE_IO_IMPL", true, "ALLOW_EXTERNAL_METADATA_FILE_LOCATION", true);
"ALLOW_SPECIFYING_FILE_IO_IMPL", true,
"ALLOW_EXTERNAL_METADATA_FILE_LOCATION", true,
"ENABLE_GENERIC_TABLES", true);
polarisContext =
new PolarisCallContext(
managerFactory.getOrCreateSessionSupplier(realmContext).get(),
Expand Down Expand Up @@ -302,6 +310,8 @@ public void before(TestInfo testInfo) {
baseCatalog.buildTable(TABLE_NS1B_1, SCHEMA).create();
baseCatalog.buildTable(TABLE_NS2_1, SCHEMA).create();

genericTableCatalog.createGenericTable(TABLE_NS1_1_GENERIC, "format", "doc", Map.of());

baseCatalog
.buildView(VIEW_NS1_1)
.withSchema(SCHEMA)
Expand Down Expand Up @@ -442,6 +452,8 @@ private void initBaseCatalog() {
CATALOG_NAME,
ImmutableMap.of(
CatalogProperties.FILE_IO_IMPL, "org.apache.iceberg.inmemory.InMemoryFileIO"));
this.genericTableCatalog =
new GenericTableCatalog(metaStoreManager, callContext, passthroughView);
}

@Alternative
Expand Down
Loading