Skip to content

Event Listeners #922

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 7, 2025
Merged

Conversation

andrew4699
Copy link
Contributor

@andrew4699 andrew4699 commented Jan 31, 2025

Implementation of event listeners discussed here.

I decided to keep this implementation generic and not take a dependency on Jakarta Events nor Vertx busses. It's easy to extend this, either within Polaris or in an external PolarisEventListener, and handle events however one wishes.

Some high level notes:

  • PolarisEventListener is the main interface with all the event methods such as onBeforeRequestRateLimited
  • DefaultPolarisEventListener is an empty implementation which allows users to only partially implement event handlers
  • polaris.events.type is the config that lets you specify your event listener implementation

@andrew4699 andrew4699 force-pushed the aguterman-event-listeners branch 3 times, most recently from 74b27ce to e65726f Compare February 26, 2025 22:44
@andrew4699 andrew4699 force-pushed the aguterman-event-listeners branch from e65726f to f7c8391 Compare February 26, 2025 22:55
@andrew4699 andrew4699 changed the title [PROTOTYPE] Event listeners using a custom interface (not Jakarta Events) Event Listeners Feb 26, 2025
@andrew4699 andrew4699 marked this pull request as ready for review February 26, 2025 23:08
@andrew4699 andrew4699 requested a review from eric-maynard March 4, 2025 17:05
* DefaultPolarisEventListener and only have to handle a subset of events. Event details are
* documented under the event objects themselves.
*/
public interface PolarisEventListener {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we really need the Polaris prefix here.

Also, unfortunately, I think we need javadocs on the methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea we could get rid of the Polaris prefix.

WDYT about @link javadocs that point to the actual event data class? I just didn't wanna duplicate the comment in 2 places. Spark seems to have fairly vague comments on the methods that don't provide much more info than the method name itself, but also Spark doesn't seem to have the detailed comments on the event data classes themselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's fair, let's just add @links

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Polaris" prefix would deserve a discuss thread imho. Many components have this prefix, but not all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many have it, and many don't -- agreed that we should standardize and agreed that a discussion thread is a good place to do that.

For this individual decision on this PR, I'm not sure a decision either way would require a huge discussion given the current lack of standardization.

eric-maynard
eric-maynard previously approved these changes Mar 6, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Mar 6, 2025
* DefaultPolarisEventListener and only have to handle a subset of events. Event details are
* documented under the event objects themselves.
*/
public interface PolarisEventListener {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Polaris" prefix would deserve a discuss thread imho. Many components have this prefix, but not all.


void onBeforeTableCommit(BeforeTableCommitEvent event);

void onAfterTableCommit(AfterTableCommitEvent event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally speaking, what are the guarantees about "after" commits? Are they triggered if the operation failed, or just when they succeeded? But then, do we need events for failed operations as well?

I see that AfterAttemptTaskEvent has a boolean success flag, but other "after" events do not. Also, maybe the full throwable is preferable than just a boolean flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should be as explicit as possible about this, although I think it might be ambitious to guarantee the exact semantics for each event out of the gate. At the very least, we should try our best to define the behavior with tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just need to rename to make it clear. I suggested to use a name like TableCommitted to indicate this event happens only if the table committed successfully. #922 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadoc should describe the semantics to a reasonable extent. It does describe the table commit event as not being emitted during failure, and that each attempt of a task regardless of success is emitted.

Docs aside, there's a question of which semantics we choose to actually implement, and since this is meant to be a broad interface there will be differing opinions on which make the most sense. Rather than trying to pick the perfect one for everyone, folks can always add new events and/or fields to existing events that represent different semantics.

@@ -1254,6 +1267,7 @@ public void doRefresh() {
if (latestLocation == null) {
disableRefresh();
} else {
polarisEventListener.onBeforeRefreshTable(new BeforeTableRefreshEvent(tableIdentifier));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach of manually triggering events is fine as a starting point, but I think a more elegant approach would be to create a decorator catalog that would decorate calls to each method with before/after events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that would be interesting. It might not be ideal to couple events to methods though, as that could force you to unnecessarily extract things into a separate method.

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding a new event in Polaris shouldn't break existing listeners. We will need to consider a different design.

* DefaultPolarisEventListener and only have to handle a subset of events. Event details are
* documented under the event objects themselves.
*/
public interface PolarisEventListener {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a new event shouldn't require recompilation or redeployment of listeners. How do we archive that with the current design? One of solutions is to remove the specific type in the listener interface, only keep the a generic event type, like the following:

  public void handleEvent(Event event) {
        switch(event.getEventType()) {
            case "BeforeTableCommit":
                handleBeforeTableCommit(event.getPayload());
                break;
           case "TableCommitted":
                handleTableCommitted(event.getPayload());
                break;        
            default:
                // Ignore or log unknown event
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree -- if I was maintaining a listener implementation and a new event type was added I would prefer that things fail at build time rather than have my listener silently ignore the new event

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a common use case that one implementation only cares about a subset of Polaris events, instead of all of them. For example, user A may only care about the table commit event, so it's fine to ignore other events for user A's listener, including new added events.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To Andrew's point, if you want that behavior then you just have your implementation extend DefaultPolarisEventListener.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. I thought we need to recompile to allow the subclass to get the new methods from the parent class. Java did a good job to keep the binary compatibility, so that the subclass doesn't have to recompile, the caller can still resolve the methods from its parent, https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html. My mindset for this part is still with c++. Sorry for the confusion, @eric-maynard and @andrew4699.

Copy link
Contributor Author

@andrew4699 andrew4699 Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's use cases for both wanting to know about new events (ie auditing) and wanting to ignore them (if you only care about a very specific event). Given that, I'm not sure we should sell a single extension point but rather give the option.

I don't think I understand quite enough about how people use dependencies in Java nor the plugin pattern to fully understand the implications of that though. If something is to break, it would ideally happen at compile time and not at runtime.

Also I agree that DefaultPolarisEventListener isn't a great name for an extension point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation gives you both options

Does it? If the interface PolarisEventListener is public, it's part of the API. If it's part of the API, you cannot add regular methods to it, even if you implement them in DefaultPolarisEventListener.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrew4699 , two use cases are both valid. However, it shouldn't notify users in a way that breaks existing listeners. If a listener need to include all new events, it has to recompile and redeployment anyways. At the same time, Polaris couldn't just break other listeners who may only care a subset of events.

For example, Polaris will introduce events that covers a wide range of activities like the followings:

  1. Table/namespace create/update events
  2. Catalog create/update events
  3. Principal/role create/update events.

It's common that some listeners only care about table/namespace events, as they are essential for catalog federation, while other listeners only care about principal/role events for identity federation. Adding a new table related event shouldn't break the identity federation listeners.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to disagree & commit to one option here and do this iteratively, but let me make 1 more case for the current approach:

Spark does almost exactly what this does. SparkListenerInterface is public and they just recommend against using it. The only difference in Spark is that SparkListener (our DefaultPolarisEventListener) is abstract. However, I'm not sure if we have an elegant way to specify "no event listener" with our config singleton pattern, so a non-abstract default listener may actually be needed.

If the interface PolarisEventListener is public, it's part of the API. If it's part of the API, you cannot add regular methods to it, even if you implement them in DefaultPolarisEventListener.

I think as a general rule it's good to apply this principle, but users are getting an explicit warning here. We could make the warning even louder.

Copy link
Contributor

@eric-maynard eric-maynard Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SparkListenerInterface is actually private[spark] in Scala, but Scala's private is pretty fake anyway. Java or Scala callers can easily access it.

Does it? If the interface PolarisEventListener is public, it's part of the API. If it's part of the API, you cannot add regular methods to it, even if you implement them in DefaultPolarisEventListener.

The current implementation gives you both the options to:

  1. Implement a listener that extends PolarisEventListener and must implement exactly the methods that appear there or fail at build time (or runtime to some ClassNotFoundException if you don't build against the dependency you actually run against).
  2. Implement a listener that extends DefaultPolarisEventListener and can optionally override various methods of that class. If new methods appear, your already-built artifact may continue working against a different version of polaris than the one you built against.

In both cases, you also have the option to go into the code, add new events, and handle those events in your listener(s).

It seems like Yufei thinks that people running against a different version than the one they built against will be a common pattern we should support whereas I considered this something to be avoided and discouraged through our design here.


In the end I think @flyrain's suggestion that we "hide" the interface like Spark does is fine; expert users who want it can break glass via fork or something and use it anyway. If the DefaultPolarisEventListener being the more accessible extension point turns out to be problematic in the future (e.g. users are surprised by new events being ignored) then we can always expose the interface and update our guidance. In general, I expect anyone deploying a custom listener to be a power-user.

*
* @param tableIdentifier The identifier of the table that was refreshed.
*/
public record AfterTableRefreshEvent(TableIdentifier tableIdentifier) implements PolarisEvent {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious the use case of this event. I'd assume it's not for the performance instrument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same naming suggestion here, the name would be TableRefreshed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly speaking I think the service doesn't need to be too opinionated on how events might be consumed; the service just emits them. You could imagine something as simple as wanting to keep a counter of how many tables were refreshed in the last 24h.

@andrew4699
Copy link
Contributor Author

I think adding a new event in Polaris shouldn't break existing listeners. We will need to consider a different design.

That's what DefaultPolarisEventListener is for. Users who are concerned with that issue should extend DefaultPolarisEventListener instead of PolarisEventListener.

@andrew4699
Copy link
Contributor Author

I updated to use an ABC instead of an interface and rebased. It's ready now.

flyrain
flyrain previously approved these changes Mar 18, 2025
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @andrew4699 !

snazy
snazy previously requested changes Mar 18, 2025
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the event types are missing some important attributes and some event types are exposing implementation specifics.

I'll comment on the dev-ML as well, but overall I don't understand how this is intended to be used ("what are the concrete use cases for this?").


/** Event listener that stores all emitted events forever. Not recommended for use in production. */
@ApplicationScoped
@Identifier("test")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is test-only code and shouldn't be reachable in production code at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I don't think that the argument that there's already a bad pattern is good enough to justify this.

Copy link
Contributor Author

@andrew4699 andrew4699 Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused why you approved this earlier then #594

It's not that I agree with the pattern (I don't and have expressed this), it's that maintaining consistency is less confusing to all developers, and this PR doesn't make it materially harder to change the pattern in the future. A separate discussion should happen about changing this but that's out of scope of this PR.

* Represents an event listener that can respond to notable moments during Polaris's execution.
* Event details are documented under the event objects themselves.
*/
public abstract class PolarisEventListener {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan/approach for this type when new events are added?
For example: generic tables - are those handled via on*Table* or do those get separate events?

Style: this is rather an interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's anything special about event listeners for the question of "if we implement X, how will that affect Y?" The same principle of maintaining reasonable behavior for existing users should apply.

See #922 (comment) for why it's an ABC instead of an interface.

* @param success Whether or not the attempt succeeded.
*/
public record AfterTaskAttemptedEvent(
long taskEntityId, CallContext callContext, int attempt, boolean success)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taskEntityId is a persistence internal property, attempt is actually wrong - see how loadTasks is implemented.
I'm also not sure this event makes sense, given that task handling is incomplete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on the use case of this one?

Copy link
Contributor Author

@andrew4699 andrew4699 Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on the attempt being wrong? I don't see what you're referring to.

Here are some threads talking about use cases:

Also the design doc, mailing list thread, and this PR talk about events specifically being intended to be super flexible because different Polaris users will have different needs. Not all events will be useful for everyone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Flexible" is not the same as "implementation specific".
Also, there's nothing in the code that could emit this one.
So this one should actually be removed.

If there's a specific use case, let's discuss the use case specifically and please not add stuff that's not usable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there's nothing in the code that could emit this one.

specific use case

Logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below - logging isn't a good use case.
Logging services already allow that - just implement the necessary logging in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* @param method The request's HTTP method
* @param absolutePath The request's absolute path
*/
public record BeforeRequestRateLimitedEvent(String method, String absolutePath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case of this one and how would an implementation behave in case of a DoS attack attempt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #922 (comment)

Can you elaborate on your DoS concerns? If you're using the no-op listener, nothing changes. It's up to you to respond to events in your preferred way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to counter the question: What's the concrete use case for this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that "logging" is a good enough reason for an event.
Logging can happen from the limiter, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People have different logging requirements. Logs are a controversial change, which you explained elegantly here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm strongly -1 on this event. DoS w/ millions or billions of requests are already an issue - this event just makes such a situation worse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on your DoS concerns? If you're using the no-op listener, nothing changes. It's up to you to respond to events in your preferred way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the only one that's there - "noop". The code yields exactly nothing. The events can't be consumed by anybody. I would really like to see concrete use cases for Apache Polaris users, how the architecture would look like and how things are integrated.

The missing attributes that I noticed let me think that the consuming side (aka use cases) hasn't been thought through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code yields exactly nothing. The events can't be consumed by anybody.

Isn't exactly the opposite true? After this PR, events can be consumed by everybody!

I've listed demonstrable use cases for each event in the design doc.

@@ -1229,11 +1243,14 @@ public void doRefresh() {
Set.of(PolarisStorageActions.READ));
return TableMetadataParser.read(fileIO, metadataLocation);
});
polarisEventListener.onAfterTableRefreshed(new AfterTableRefreshedEvent(tableIdentifier));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't failures deserve to be emitted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Events are supposed to be flexible and cover a variety of needs, so there isn't a clear minimum nor maximum for what should be implemented. This PR aims to introduce the concept and cover a large surface area, but by design it's easy to add more features.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of having an "onBefore*" but not always an "onAfter*" then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that there's no common list of events that cover what everyone needs, so lets implement a big chunk of them. The beauty of open source is that if someone wants more events, it's easy for them after this PR to add them :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, strong -1 on "lets implement a big chunk just because it's possible".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's impossible to know all concrete use cases

Exactly that! But to me the justification "let's implement everything we can think of because it's possible" does not work. Having code just because we can have code - sorry, no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, "because it's possible" is not the justification. There are use cases for each event implemented here, and there should be a demonstrable one for events added in the future. The design doc says this:

The intent is for the list of events to grow with Polaris and for adding an event to feel like a relatively lightweight change, compared to refactoring an entire component into one/many swappable interfaces. Not everything should be an event though; there should be at least 3 criteria for adding a new event:

  • Difficult to achieve through other means (configuration/dependency injection/etc)
  • A use case can be demonstrated, although it may not be obviously useful to all users of the OSS project
  • Cannot be folded into existing event listeners

All 3 criteria are true for events in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, "because it's possible" is not the justification.

You mentioned above: there's no common list of events that cover what everyone needs, so lets implement a big chunk of them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm asking for concrete requirements.

Copy link
Contributor Author

@andrew4699 andrew4699 Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so lets implement a big chunk of them

Replace "them" with "events with demonstrable use cases".

I'm asking for concrete requirements.

Ok, I've added a new column to the design doc with example use cases for each event.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Mar 18, 2025
@andrew4699 andrew4699 requested a review from snazy March 21, 2025 00:43
@andrew4699 andrew4699 dismissed stale reviews from flyrain and eric-maynard via ab2560d March 30, 2025 05:38
@andrew4699 andrew4699 force-pushed the aguterman-event-listeners branch 4 times, most recently from 4cd0a4d to 95b4db7 Compare April 17, 2025 22:21
eric-maynard
eric-maynard previously approved these changes Apr 18, 2025
Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the updated examples in the doc, this LGTM!

@eric-maynard
Copy link
Contributor

eric-maynard commented Apr 18, 2025

To quickly capture my understanding wrt. how this relates to, and differs from, the events APIs proposed in Iceberg and the observability APIs proposed for Polaris, I mocked up an example of what an architecture might look like:

Screenshot 2025-04-17 at 8 26 25 PM

The tl;dr for readers confused about how these events accomplish observability is that they don't... they allow Polaris to emit events to some external source which might later be used to power observability. They also allow you to do other handy things -- like file a Jira if more than 100 tables are created in a given namespace -- if you're willing to write a custom event listener.

@andrew4699
Copy link
Contributor Author

andrew4699 commented Apr 18, 2025

Yes exactly. Also solving atomicity (ie of table commit + event) is out of scope of this PR. The model for these event listeners is that they are as if you are adding new specific code to Polaris. Like Polaris, the code running in these event listeners can be interrupted at any time, due to ie a process crash. The external-facing events implementation can choose to build on top of this or not.

@eric-maynard
Copy link
Contributor

@snazy given the examples added to the doc in March are there still specific events you have concerns about emitting?

@andrew4699 andrew4699 force-pushed the aguterman-event-listeners branch from 0189d55 to b891810 Compare May 2, 2025 18:19
@andrew4699 andrew4699 requested a review from HonahX as a code owner May 5, 2025 16:43
@eric-maynard eric-maynard dismissed snazy’s stale review May 5, 2025 17:16

@snazy can you take a second look to confirm whether you still have concerns about certain events? I looked through the linked doc and the example there make sense to me. In the meantime, we have added a lot of new actions where we should consider putting an event listener hook e.g. policy creation.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board May 6, 2025
@eric-maynard eric-maynard merged commit e5e0c0d into apache:main May 7, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board May 7, 2025
snazy added a commit to snazy/polaris that referenced this pull request May 22, 2025
* Policy Store: PolicyMappingRecord with Persistence Impl (apache#1104)

* Spark: Setup repository code structure and build (apache#1190)

* Added freshness aware table loading using metadata file location for ETag (apache#1037)

* Pulled in iceberg 1.8.0 spec changes for freshness aware table loading and added feature to Polaris

* Changed etag support to use entityId:version tuple

* fixed getresponse call

* Changed etagged response to record and gave default implementation to ETaggableEntity

* Made iceberg rest spec docs clearer

* Added HTTP Compliant ETag and IfNoneMatch representations and separated persistence from etag logic

* Changed ETag to be a record and improved semantics of IfNoneMatch

* Fixed semantics of if none match

* Removed ETag representation, consolidated in IfNoneMatch

* fixed if none match parsing

* Added table entity retrieval method to table operations

* removed accidental commit of pycache folders

* Fixed formatting

* Changed to use metadata location hash

* Ran formatting

* use sha256

* Moved out ETag functions to utility class and removed ETaggedLoadTableResponse

* Addressed comments

* Fixed IcebergTableLikeEntity package rename

* main: Update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.31.0 (apache#1288)

* Update LICENSE and NOTICE in the distributions (admin and server) (apache#1258)

* Gradle/Quarkus: make imageBuild task depend on jandex (apache#1290)

* Core: Clarify the atomicity of BasePersistence methods (apache#1274)

* Implement GenericTableCatalogAdapter (apache#1264)

* rebase

* more fixes

* autolint

* working on tests

* stable test

* autolint

* polish

* changes per review

* some changes per review

* grants

* autolint

* changes per review

* changes per review

* typofix

* Improve code-containment and efficiency of etag-aware loading (apache#1296)

* Improve code-containment and efficiency of etag-aware loading

-Make the hash generation resilient against null metadataLocation
-Use getResolvedPath instead of getPassthroughResolvedPath to avoid redundant persistence round-trip
-Only try to calculate the etag for comparison against ifNoneMatch if the ifNoneMatch is actually provided

* Add strict null-checking at callsites to generateETag, disallow passing null to generator

* Add TODO to refactor shared logic for etag generation

* Core: Add Endpoints and resource paths for Generic Table (apache#1286)

* main: Update dependency com.nimbusds:nimbus-jose-jwt to v10.1 (apache#1299)

* [JDBC] Part1 : ADD SQL script for Polaris setup (apache#1276)

* main: Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.22-1.1743605859 (apache#1300)

* done (apache#1297)

* Add Polaris Community Meeting for April 3, 2025 (apache#1304)

* Use config-file to define errorprone rule (apache#1233)

Also enabled a couple more simple rules, and adding suppressions/fixes for/to the code.

The two rules `EqualsGetClass` and `UnusedMethod`, which I think are useful, are not enabled yet, because that would mean actual code changes, which I do not want to do in this PR.

The rule `PatternMatchingInstanceof`, introduced in apache#393, is disabled in this PR. It does not work before errorrpone 2.37.0 (via apache#1213) - requires additional changes to enable the rule (see apache#1215).

* Add Yun as a contributor (apache#1310)

* Refactor CatalogHandler to comply with ErrorProne rules (apache#1312)

Fix the CI error after apache#1233

* Implement PolicyCatalog Stage 1: CRUD + ListPolicies (apache#1294)

* main: Update dependency io.opentelemetry:opentelemetry-bom to v1.49.0 (apache#1316)

* main: Update docker.io/jaegertracing/all-in-one Docker tag to v1.68.0 (apache#1317)

* main: Update dependency boto3 to v1.37.28 (apache#1328)

* main: Update dependency software.amazon.awssdk:bom to v2.31.16 (apache#1329)

* Make `BasePolaritsMetaStoreManagerTest` and `(Base)ResolverTest` reusable (apache#1308)

Moves the test cases into the `Base*` classes and make sure the classes can be reused by other persistence implementations.

* main: Update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.32.0 (apache#1293)

* main: Update mockito monorepo to v5.17.0 (apache#1311)

* PySpark Update AWS Region (apache#1302)

Co-authored-by: Travis Bowen <travis.bowen@snowflake.com>

* main: Update dependency com.nimbusds:nimbus-jose-jwt to v10.2 (apache#1334)

* main: Update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.0.3 (apache#1335)

* Maven publication: Produce correct `<scm><tag>` in `pom.xml` (apache#1330)

`project.scm.tag` in a Maven pom is intended to refer to the SCM (Git) tag. We currently publish `main`, which is incorrect.

This change omits the SCM tag for snapshot builds, but emits the Git tag for releases.

* Remove `@StaticInitSafe` annotation (apache#1331)

There was an issue around mapped configurations having the `@StaticInitSafe` annotation that led to _two_ instances (a "static" one and a "somewhet application-scoped" one) - this was fixed in Quarkus 3.21. One bug in smallrye-config is fixed for Quarkus > 3.21.0, another issue however remains.

Since `@StaticInitSafe` annotated configs seem to cause some weird issues, it seems legit to remote that annotation altogether. This approach was [taken in Nessie](projectnessie/nessie#10606) as well. Investigations (via practical experiments) have proven that there's no measurable impact (runtime + heap) when doing this - and that's also been confirmed by Quarkus + Smallrye-config maintainers.

Hence this change remotes that annotation from the code base.

* Build/Release: Add a "generate digest" task and use for source tarball and Quarkus distributables (apache#1271)

* Ensure that digest and signature are generated for both Polaris-Server and admin tar/zip distribution
* Move "generate digest" functionality to a Gradle task

* main: Update dependency com.google.errorprone:error_prone_core to v2.37.0 (apache#1213)

* main: Update Quarkus Platform and Group to v3.21.1 (apache#1291)

* main: Update dependency io.netty:netty-codec-http2 to v4.2.0.Final (apache#1301)

* Remove unnecessary `clean` and `--no-build-cache` from Gradle invocations (apache#1338)

`quarkusAppPartsBuild --rerun` is the right way to force a Docker image build.

* Generalize bootstrapping in servers (apache#1313)

* Remove `instanceof` checks from `QuarkusProducers`.

* Remove the now unused `onStartup` method from `InMemoryPolarisMetaStoreManagerFactory`.

* Instead, call the good old `bootstrapRealms` method from `QuarkusProducers`.

* Add new config property to control which MetaStore types are bootstrapped automatically (defaults to `in-memory` as before).

* There is no bootstrap behaviour change in this PR, only refactorings to simplify code.

* Add info log message to indicate when a realm is bootstrapped in runtime using preset credentials.

Future enhancements may include pulling preset credentials from a secret manager like Vault for bootstrapping (s discussed in comments on apache#1228).

* main: Update actions/stale digest to 816d9db (apache#1341)

* main: Update dependency com.adobe.testing:s3mock-testcontainers to v4 (apache#1342)

* main: Update dependency org.eclipse.persistence:eclipselink to v4.0.6 (apache#1343)

* main: Update dependency io.quarkus to v3.21.2 (apache#1344)

* main: Update dependency com.google.guava:guava to v33.4.7-jre (apache#1340)

Co-authored-by: Robert Stupp <snazy@snazy.de>

* Spark: Add Namespaces and View support for SparkCatalog (apache#1332)

* Demote technical log messages to DEBUG in PolarisCallContextCatalogFactory (apache#1346)

These messages appear to be logging low-level technical details
about what is going on in the factory and are not likely to be
of interest to most users on a daily basis.

* Core/Service: Implement PolicyCatalog Stage 2: detach/attach/getApplicablePolicies (apache#1314)

* Spec: Add 'inherited' and 'namespace' Fields to GetApplicablePolicies API Response (apache#1277)

* Properly track bootstrappedRealms in InMemoryPolarisMetaStoreManagerFactory (apache#1352)

Fixes apache#1351

* Implement GenericTableCatalogAdapter; admin-related fixes (apache#1298)

* initial commit:

* debugging

* some polish

* autolint

* spec change

* bugfix

* bugfix

* various fixes

* another missing admin location

* autolint

* false by default

* fixes per review

* autolint

* more fixes

* DRY

* revert small change for a better error

* integration test

* extra test

* autolint

* stable

* wip

* rework subtypes a bit

* stable again

* autolint

* apply new lint rule

* errorprone again

* adjustments per review

* update golden files

* add another test

* clean up logic in PolarisAdminService

* autolint

* more fixes per review

* format

* Update versions in distribution LICENSE and NOTICE (apache#1350)

* Spark: Add CreateTable and LoadTable implementation for SparkCatalog (apache#1303)

* Add a weigher to the EntityCache based on approximate entity size (apache#490)

* initial commit

* autolint

* resolve conflicts

* autolint

* pull main

* Add multiplier

* account for name, too

* adjust multiplier

* add config

* autolint

* remove old cast

* more tests, fixes per review

* add precise weight test

* autolint

* populate credentials field for loadTableResponse (apache#1225)

* populate credentials field for loadTableResponse

* spotless

* spotless

* remove unused hashset

* fix merge

* fix empty credential case

* spotlessApply

---------

Co-authored-by: David Lu <dalu@hubspot.com>

* main: Update dependency io.smallrye.common:smallrye-common-annotation to v2.12.0 (apache#1355)

* Build: Avoid adding duplicated projects for Intelij IDE usage (apache#1333)

* main: Update dependency org.junit:junit-bom to v5.12.2 (apache#1354)

* main: Update dependency org.apache.commons:commons-text to v1.13.1 (apache#1358)

* main: Update dependency boto3 to v1.37.33 (apache#1360)

* main: Update dependency software.amazon.awssdk:bom to v2.31.21 (apache#1361)

* main: Update dependency io.micrometer:micrometer-bom to v1.14.6 (apache#1362)

* main: Update dependency com.google.guava:guava to v33.4.8-jre (apache#1366)

* Update LICENSE/NOTICE with latest versions (apache#1364)

* Use "clean" LICENSE and NOTICE in published jar artifacts (apache#1292)

* main: Update dependency io.projectreactor.netty:reactor-netty-http to v1.2.5 (apache#1372)

* Add `Varint` type for variable-length integer encoding (apache#1229)

* main: Update docker.io/prom/prometheus Docker tag to v3.3.0 (apache#1375)

* Set version to 0.10.0-beta in prepaaration for the next release (apache#1370)

* Update the link to OpenAPI in the documentation (apache#1379)

* Integration test for Spark Client (apache#1349)

* add integration test

* add change

* add comments

* rebase main

* update class comments

* add base integration

* clean up comments

* main: Update dependency net.ltgt.gradle:gradle-errorprone-plugin to v4.2.0 (apache#1392)

* Add generic table documentations (apache#1374)

* add generic table documentation (incomplete)

* fix table and spacing

* remove documentation in client api since there is no implementation yet

* remove spacing

* minor fix - proof read

* review fix, wording

* add generic table documentation (incomplete)

* fix table and spacing

* remove documentation in client api since there is no implementation yet

* remove spacing

* minor fix - proof read

* review fix, wording

* proof read - punctuation fix

* change table privilege reference

* Unblock test `listNamespacesWithEmptyNamespace` (apache#1289)

* Unblock test `listNamespacesWithEmptyNamespace`

* Use `containsExactly` to simplify the test

* Fix empty namespace behavior

* Address comments

* Block dropping empty namespace

* Improve error messages

* Revamp the Quick Start page (apache#1367)

* First Draft with AWS

* try again

* try again

* try again

* try again

* try again

* try now

* should work

* AWS First Draft Complete

* ensure file changed

* Azure First Draft Complete

* Azure First Draft, pt. 2

* Azure Completed

* GCP First Draft

* GCP Verified

* File structure fixed

* Remove Trino-specific tutorial

* Restructured Quick Start

* Addresses minor comments from @eric-maynard

* Added reference to Deploying Polaris in Production

* Fix MD Link Checker

---------

Co-authored-by: Adnan Hemani <adnan.hemani@snowflake.com>

* Update README with links to new Quickstart experience (apache#1393)

* Update the StorageConfiguration to invoke singleton client objects, a… (apache#1386)

* Update the StorageConfiguration to invoke singleton client objects, and add a test

* Fix formatting

* using guava suppliers

* Add aws region

* Cleanup and mock test

* Spark: Add rest table operations (drop, list, purge and rename etc) for Spark Client (apache#1368)

* Initial MVP implementation of Catalog Federation to remote Iceberg REST Catalogs (apache#1305)

* Initial prototype of catalog federation just passing special properties into internal properties.

Make Resolver federation-aware to properly handle "best-effort" resolution of
passthrough facade entities.

Targets will automatically reflect the longest-path that we happen to have stored
locally and resolve grants against that path (including the degenerate case
where the longest-path is just the catalog itself).

This provides Catalog-level RBAC for passthrough federation.

Sketch out persistence-layer flow for how connection secrets might be pushed
down into a secrets-management layer.

* Defined internal representation classes for connection config

* Construct and initialize federated iceberg catalog based on connection config

* Apply the same spec renames to the internal ConnectionConfiguration representations.

* Manually pick @XJDKC fixes for integration tests and omittign secrets in response objects

* Fix internal connection structs with updated naming from spec PR

* Push CreateCatalogRequest down to PolarisAdminService::createCatalog just like UpdateCatalogRequest in updateCatalog.

This is needed if we're going to make PolarisAdminService handle secrets management without ever putting the secrets
into a CatalogEntity.

* Add new interface UserSecretsManager along with a default implementation

The default UnsafeInMemorySecretsManager just uses an inmemory ConcurrentHashMap
to store secrets, but structurally illustrates the full flow of intended
implementations.

For mutual protection against a compromise of a secret store or the core
persistence store, the default implementation demonstrates storing only
an encrypted secret in the secret store, and a one-time-pad key in the
returned referencePayload; other implementations using standard crypto
protocols may choose to instead only utilize the remote secret store as
the encryption keystore while storing the ciphertext in the referencePayload
(e.g. using a KMS engine with Vault vs using a KV engine).

Additionally, it demonstrates the use of an integrity check by storing a
basic hashCode in the referencePayload as well.

* Wire in UserSecretsManager to createCatalog and federated Iceberg API handlers

Update the internal DPOs corresponding to the various ConnectionConfigInfo API objects
to no longer contain any possible fields for inline secrets, instead holding the
JSON-serializable UserSecretReference corresponding to external/offloaded secrets.

CreateCatalog for federated catalogs containing secrets will now first extract
UserSecretReferences from the CreateCatalogRequest, and the CatalogEntity will
populate the DPOs corresponding to ConnectionConfigInfos in a secondary pass
by pulling out the relevant extracted UserSecretReferences.

For federated catalog requests, when reconstituting the actual sensitive
secret configs, the UserSecretsManager will be used to obtain the secrets
by using the stored UserSecretReferences.

Remove vestigial internal properties from earlier prototypes.

* Since we already use commons-codec DigestUtils.sha256Hex, use that for the hash in UnsafeInMemorySecretsManager
just for consistency and to illustrate a typical scenario using a cryptographic hash.

* Rename the persistence-objects corresponding to API model objects with a new naming
convention that just takes the API model object name and appends "Dpo" as a suffix;

* Use UserSecretsManagerFactory to Produce the UserSecretsManager (#1)

* Move PolarisAuthenticationParameters to a top-level property according to the latest spec

* Create a Factory for UserSecretsManager

* Fix a typo in UnsafeInMemorySecretsManagerFactory

* Gate all federation logic behind a new FeatureConfiguration - ENABLE_CATALOG_FEDERATION

* Also rename some variables and method names to be consistent with prior rename to ConnectionConfigInfoDpo

* Change ConnectionType and AuthenticationType to be stored as int codes in persistence objects.

Address PR feedback for various nits and javadoc comments.

* Add javadoc comment to IcebergCatalogPropertiesProvider

* Add some constraints on the expected format of the URN in UserSecretReference and placeholders
for next steps where we'd provide a ResolvingUserSecretsManager for example if the runtime ever
needs to delegate to two different implementations of UserSecretsManager for different entities.

Reduce the `forEntity` argument to just PolarisEntityCore to make it more clear that the
implementation is supposed to extract the necessary identifier info from forEntity for
backend cleanup and tracking purposes.

---------

Co-authored-by: Rulin Xing <rulin.xing+oss@snowflake.com>
Co-authored-by: Rulin Xing <xjdkcsq3@gmail.com>

* Add Adnan and Neelesh to collaborators list (apache#1396)

* Replace authentication filters with Quarkus Security (apache#1373)

* Implement PolicyCatalogHandler and Add Policy Privileges Stage 1: CRUD + ListPolicies (apache#1357)

* Add PolicyCatalogHandler and tests

* Fix style

* Address review comments

* Address review comments 2

* fix nit

* Remove CallContext.getAuthenticatedPrincipal() (apache#1400)

* main: Update dependency info.picocli:picocli-codegen to v4.7.7 (apache#1408)

* main: Update dependency com.google.errorprone:error_prone_core to v2.38.0 (apache#1404)

* Add Polaris Community Meeting 2025-04-17 (apache#1409)

* main: Update dependency boto3 to v1.37.37 (apache#1412)

* EclipseLink: add PrimaryKey to policy mapping records JPA model (apache#1403)

* Re-instate dependencies between Docker Compose services (apache#1407)

* Do not rotate bootstrapped root credentials (apache#1414)

* Add Getting Started Button to the Apache Polaris Webshite Homepage (apache#1406)

* Core: change to return ApplicablePolicies (apache#1415)

* Rename the Snapshot Retention policy (apache#1284)

* Rename the Snapshot Retention policy

* Resolve comments

* Resolve comments

---------

Co-authored-by: Yufei Gu <yufei.apache.org>

* main: Update dependency com.adobe.testing:s3mock-testcontainers to v4.1.0 (apache#1419)

* rename snapshotRetention to snashotExpiry (apache#1420)

* main: Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.22-1.1744796716 (apache#1394)

* main: Update dependency software.amazon.awssdk:bom to v2.31.26 (apache#1413)

* main: Update dependency com.adobe.testing:s3mock-testcontainers to v4.1.1 (apache#1425)

* Fix releaseEmailTemplate task (apache#1384)

* Update distributions LICENSE and NOTICE with AWS SDK 2.31.26 update (apache#1423)

* Support snapshots=refs (apache#1405)

* initial commit

* autolint

* small revert

* rebase

* autolint

* simpler

* autolint

* tests

* autolint

* stable

* fix leak

* ready for review

* improved test

* autolint

* logic flip again

* Update service/common/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java

Co-authored-by: Alexandre Dutra <adutra@users.noreply.github.com>

* Update integration-tests/src/main/java/org/apache/polaris/service/it/env/CatalogApi.java

Co-authored-by: Alexandre Dutra <adutra@users.noreply.github.com>

* adjustments for committed suggestions

* autolint

---------

Co-authored-by: Alexandre Dutra <adutra@users.noreply.github.com>

* Remove activatedPrincipalRoles property from AuthenticatedPolarisPrincipal (apache#1410)

This seems to be a leftover from when ActiveRolesProvider was introduced. The setter was still used, but the getter wasn't, which hints at the fact that this property can be safely removed.

As a bonus, AuthenticatedPolarisPrincipal now becomes immutable, which is imho a very good thing.

* Implement PolicyCatalogHandler and Add Policy Privileges Stage 2: AttachPolicy + DetachPolicy (apache#1416)

* add auth test for attach/detach

* apply formatter

* refactor authorizePolicyAttachmentOperation

* address comment

* better naming

* Ship eclipselink and PostgreSQL JDBC driver by default in Polaris distribution (apache#1411)

* Fix Connection Config DPOs (apache#1422)

* Fix connection config dpos

* Run spotlessApply

* Doc: Fix the issue that html tags are not working in Hugo (apache#1382)

* Implement PolicyCatalogHandler Stage 3: GetApplicablePolicies (apache#1421)

* [JDBC] Part2: Add Relational JDBC module (apache#1287)

* Bump version to 0.11.0-beta-incubating-SNAPSHOT (apache#1429)

* Make entity lookups by id honor the specified entity type (apache#1401)

* Make entity lookups by id honor the specified entity type

All implementations of `TransactionalPersistence.lookupEntityInCurrentTxn()` are currently ignoring the `typeCode` parameter completely and could potentially return an entity of the wrong type.

This can become very concerning during authentication, since a principal lookup could return some entity that is not a principal, and that would be considered a successful authentication.

* review

* Remove "test" Authenticator (apache#1399)

* Propagate SQLException as "caused by" (apache#1430)

* Remove logging for DbOps (apache#1433)

* Spark: Add regtests for Spark client to test built jars (apache#1402)

* main: Update dependency com.google.cloud:google-cloud-storage-bom to v2.51.0 (apache#1436)

* main: Update dependency org.testcontainers:testcontainers-bom to v1.21.0 (apache#1437)

* main: Update actions/setup-python digest to a26af69 (apache#1440)

* Spark-IT: use correct configurations (apache#1444)

... do not let Spark leak into Quarkus

* PolarisRestCatalogIntegrationTest: Always purge generic tables (apache#1443)

* Add missing Postgresql dependency (apache#1447)

* Add Request Timeouts  (apache#1431)

* add timeout

* add iceberg exception mapping

* dont use quarkus bom, disable timeout

* nits

* Fix sparks sql regtests with up to date config (apache#1454)

* Refactor BasePolarisTableOperations & BasePolarisViewOperations (apache#1426)

* initial copy paste

* Reorder

* view copy paste

* fixes, polish

* stable

* yank

* CODE_COPIED_TO_POLARIS comments

* autolint

* update license

* typofix

* update comments

* autolint

* Use .sha512 extension instead of -sha512 (apache#1449)

* main: Update dependency org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api to v4.1.2 (apache#1451)

* Doc: Update Local Root Principal Credentials in Quickstart (apache#1452)

* Update the Getting Started Workflow with each Cloud Provider's Blob Storage (apache#1435)

* AWS First Draft

* Debug

* revert typo

* Add JQ to docker runtime

* Debug, pt2

* debug

* debug

* Allow Instance Profile Roles

* change random suffix

* change instance profile to regular IAM roles

* AWS Final Draft

* Azure First Draft

* debug

* Azure First Draft

* debug

* typo

* GCP First Try

* GCP Complete

* GCP Final

* add all jars to Spark

* refactor

* Implement PolicyCatalogAdapter (apache#1438)

* Generic Table/Policy Store: Move feature config check to Adapter and some small refactoring (apache#1465)

* update refs (apache#1464)

* [JDBC] Part3: Plumb JDBC module to Quarkus (apache#1371)

* Allow BasePolarisTableOperations to skip refreshing metadata after a commit (apache#1456)

* initial commit

* fix another test

* changes per comments

* visibility

* changes per review

* autolint

* oops

* main: Update dependency com.fasterxml.jackson:jackson-bom to v2.19.0 (apache#1455)

* Doc: Added set custom credentials instruction in README (apache#1461)

* Doc: Add policy documentation (apache#1460)

* main: Update dependency software.amazon.awssdk:bom to v2.31.30 (apache#1475)

* main: Update dependency gradle to v8.14 (apache#1459)

* main: Update dependency gradle to v8.14

* fix PR

---------

Co-authored-by: Robert Stupp <snazy@snazy.de>

* Remove unused class TokenInfoExchangeResponse (apache#1479)

This is an oversight from apache#1399.

* Upgrade Polaris to Iceberg 1.9.0 (apache#1309)

* Doc: Update on access-control policy docs (apache#1472)

* main: Update Quarkus Platform and Group (apache#1381)

* Added link to the Spark-Jupyter Notebook Getting Started from the main Getting Started Page (apache#1453)

* Added link to the Spark-Jupyter Notebook Getting Started from the main Quickstart page

* Typo

Co-authored-by: Eric Maynard <emaynard@apache.org>

* Suggestions as per @eric-maynard's review

* Fix Typo

---------

Co-authored-by: Eric Maynard <emaynard@apache.org>

* [JDBC] Support Policy (apache#1468)

* Refactor EntityCache into an interface (apache#1193)

* Refactor EntityCache to an interface

* fix

* spotless

* Remove unused PolarisCredentialVendor.validateAccessToLocations() (apache#1480)

* Remove unused PolarisCredentialVendor.validateAccessToLocations()

* review: remove ValidateAccessResult and comments

* Policy Store: Check whether Policy is in use before dropping and support `detach-all` flag (apache#1467)

* fix error (apache#1492)

* Ensure writeToPolicyMappingRecord update existing record if primary key equals in EclipseLink Persistence Impl (apache#1469)

* update PolicyMappingRecord if not exists

* update test

* add TODO

* Eliminate getCurrentContext() call in PolarisAuthorizerImpl (apache#1494)

* Add getting-started for Polaris Spark Client with Delta tables (apache#1488)

* Fix: Pull Postgres image automatically (apache#1495)

* Fix Outdated Information and add Information regarding `docker compose down` to Quickstart  (apache#1497)

* Fix Outdated Information and Add Information regarding docker compose down to Quickstart

* Revision 2

* Remove shutdown from README

* typo

* Upgrade Iceberg REST Spec to match Iceberg 1.8 (apache#1283)

* prep for review

* reset

* more changes

* fixes

* github action change

* another build change

* try api revert

* re-all

* custom type mappings, rebuild

* autolint

* polish

* yank custom types

* update

* autolint

* wip

* Revert build changes

* example

* autolint

* Fix FileIOExceptionsTest to conform to new Iceberg 1.8 API (apache#1501)

It looks like after apache#1283, this test no longer compiles as the Iceberg API has changed. I'm not sure how this wasn't caught by CI on that PR itself.

* JDBC: Optimize writeEntity calls (apache#1496)

* Remove transaction from atomic writes

* remove if-else

* main: Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.22-1.1745840590 (apache#1499)

* Support for external identity providers (apache#1397)

* JDBC: create objects without reflection (apache#1434)

* Include quarkus-container-image and README in the binary distributions (apache#1493)

* Site: Fix Management and Catalog Spec links (apache#1507)

* Lazy iteration over JDBC ResultSet (apache#1487)

* refactor

* autolint

* polish

* autolint

* changes per review

* autolint

* unwrapping caller

* changes per review

* Update distributions LICENSE and NOTICE with artifacts and versions sync (apache#1509)

* Avoid using deprecated `NestedField.of()` (apache#1514)

* Fix compile warning: unknown enum constant Id.NAME (apache#1513)

* Doc: Add getting started with JDBC source (apache#1470)

* Site: Add Polaris Spark client webpage under unreleased (apache#1503)

* Add new committers (apache#1518)

* Docs: Fix the wrong catalog name in `using polaris` page (apache#1471)

* fix

Signed-off-by: owenowenisme <mses010108@gmail.com>

* update docker compose

Signed-off-by: owenowenisme <mses010108@gmail.com>

---------

Signed-off-by: owenowenisme <mses010108@gmail.com>

* main: Update dependency org.apache.commons:commons-configuration2 to v2.12.0 (apache#1481)

* main: Update dependency com.google.cloud:google-cloud-storage-bom to v2.52.1 (apache#1485)

* main: Update dependency com.azure:azure-sdk-bom to v1.2.34 (apache#1490)

* main: Update docker.io/prom/prometheus Docker tag to v3.3.1 (apache#1510)

* Add new committers on website (apache#1521)

* main: Update dependency software.amazon.awssdk:bom to v2.31.35 (apache#1524)

* fix overlapping menu item on the nav bar (apache#1520)

* fix overlapping menu item on the nav bar

* prevent dropdowns expanding inside the navbar

* Additional refs update for iceberg 1.9.0 (apache#1491)

* Additional refs update for iceberg 1.9.0

* Additional refs update for iceberg 1.9.0

* Additional refs update for iceberg 1.9.0

* Fix typo on Pierre's github URL (apache#1527)

* Refactor storage access configuration handling (apache#1504)

* Refactor storage access configuration handling

This is a step towards supporting non-AWS S3 storage, but this
refactoring is relevant to all storage backends.

There is no change to existing behaviours.

* Rename PolarisCredentialProperty to StorageAccessProperty
  and introduce non-credential properties (as an example for now)

* StorageAccessProperty values are ultimately meant to be
  produced by PolarisStorageIntegration implementations

* Some previous entries in StorageAccessProperty are not really
  credential properties, but their treatment is not changed in this
  PR to maintain exactly the same bahaviour as before.

* Add AccessConfig to represent both credential and non-credential
  properties related to storage access.

* [JDBC] : Deprecate EclipseLink (apache#1515)

* Auto-bootstrap: add verbose logging (apache#1376)

Log explicit messages around auto-bootstrapping and unnecessary/left-over secrets that are (still) available.

* Add nightly build GH action to publish SNAPSHOT on Nexus (apache#1383)

* Add nightly build GH action to publish SNAPSHOT on Nexus (apache#1383)

* Build: Fix `fetchAsfProjectName` and make the publishing extension more flexible (apache#1442)

The added flexibility is intended to be ported to the multiple project in the polaris-tools repository.

(Follow up of apache#1384)

* Poetry v2 (apache#898)

* PEP 621 and Poetry v2

* PEP 621 and Poetry v2

* Update min python to 3.9

* Add back flask8 for apache#1096

* Add Integration tests for Delta tables for Spark Client (apache#1500)

* main: Update dependency com.google.cloud:google-cloud-storage-bom to v2.52.2 (apache#1536)

* main: Update dependency poetry to v2.1.3 (apache#1534)

* main: Update dependency io.netty:netty-codec-http2 to v4.2.1.Final (apache#1533)

* main: Update dependency boto3 to v1.38.10 (apache#1525)

* Fix test failure (apache#1541)

* Fix the URL of the KEYS file in the release vote email template (apache#1538)

* Event Listeners (apache#922)

Implementation of event listeners discussed [here](https://lists.apache.org/thread/03yz5wolkvy8l7rbcwjnqdq1bl8p065v).

I decided to keep this implementation generic and not take a dependency on Jakarta Events nor Vertx busses. It's easy to extend this, either within Polaris or in an external PolarisEventListener, and handle events however one wishes.

Some high level notes:
- PolarisEventListener is the main interface with all the event methods such as `onBeforeRequestRateLimited`
- DefaultPolarisEventListener is an empty implementation which allows users to only partially implement event handlers
- `polaris.events.type` is the config that lets you specify your event listener implementation

* Update metastores.md (apache#1537)

* Update metastores.md

* Resolve comment.

* Resolve comment.

---------

Co-authored-by: Yufei Gu <yufei.apache.org>

* Doc: Document the Concept of realm (apache#1478)

* main: Update dependency boto3 to v1.38.11 (apache#1542)

* Fix compile warning: [unchecked] unchecked cast (apache#1544)

Use `Class.cast()` instead of implicit cast.

* NoSQL: Adopt to "Make entity lookups by id honor the specified entity type (apache#1401)"

* NoSQL: Filter on correct subtype

* NoSQL: merge/rebase 2025/04/30

* additional merge-relaged changes

---------

Signed-off-by: owenowenisme <mses010108@gmail.com>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: gh-yzou <167037035+gh-yzou@users.noreply.github.com>
Co-authored-by: Mansehaj Singh <msehajs@gmail.com>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Alexandre Dutra <adutra@users.noreply.github.com>
Co-authored-by: Yufei Gu <yufei@apache.org>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Dennis Huo <7410123+dennishuo@users.noreply.github.com>
Co-authored-by: Prashant Singh <35593236+singhpk234@users.noreply.github.com>
Co-authored-by: Travis Bowen <122238243+travis-bowen@users.noreply.github.com>
Co-authored-by: Travis Bowen <travis.bowen@snowflake.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@dremio.com>
Co-authored-by: Juichang Lu <wolflex888@gmail.com>
Co-authored-by: David Lu <dalu@hubspot.com>
Co-authored-by: gfakbar20 <gfakbar20@gmail.com>
Co-authored-by: Liam Bao <90495036+liamzwbao@users.noreply.github.com>
Co-authored-by: Adnan Hemani <adnan.h@berkeley.edu>
Co-authored-by: Adnan Hemani <adnan.hemani@snowflake.com>
Co-authored-by: Neelesh Salian <nssalian@users.noreply.github.com>
Co-authored-by: Rulin Xing <rulin.xing+oss@snowflake.com>
Co-authored-by: Rulin Xing <xjdkcsq3@gmail.com>
Co-authored-by: fabio-rizzo-01 <fabio.rizzocascio@jpmorgan.com>
Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>
Co-authored-by: Richard Liu <35082658+RichardLiu2001@users.noreply.github.com>
Co-authored-by: Michael Collado <40346148+collado-mike@users.noreply.github.com>
Co-authored-by: Owen Lin (You-Cheng Lin) <106612301+owenowenisme@users.noreply.github.com>
Co-authored-by: Eric Maynard <emaynard@apache.org>
Co-authored-by: Andrew Guterman <andrew.guterman1@gmail.com>
Co-authored-by: MonkeyCanCode <yongzheng0809@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants