-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Working impl of HK2 dependency injection #493
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of quick comments... I'll do a full review a bit later :)
...s-service/src/main/java/org/apache/polaris/service/persistence/cache/EntityCacheFactory.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the nice improvement towards CDI!
hk2 adds a DI implementation, but in a couple of places this conflicts with the way Jackson is used as a DI-ish mechanism to instantiate things, leading to multiple DI-(ish) implementations.
I'm not sure how an approach to add "generic" extensions/plugins would look with hk2, as we cannot expect to have initialization code for all extensions/plugins in the "main" code base.
polaris-core/src/main/java/org/apache/polaris/core/persistence/MetaStoreManagerFactory.java
Outdated
Show resolved
Hide resolved
...is-core/src/main/java/org/apache/polaris/core/storage/PolarisStorageIntegrationProvider.java
Outdated
Show resolved
Hide resolved
new AbstractBinder() { | ||
@Override | ||
protected void configure() { | ||
bind(RealmScopeContext.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there's a lot of manual coding needed for hk2, which feels like it makes things complicated, when we eventually have a plugin/extension mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be considered the smallest change necessary to get CDI working. There are improvements that can be made, e.g., classpath scanning for detecting Resource and Service impls, but the primary goal right now is to have runtime framework dependent on @Inject
for wiring. Once that's done, lots of other changes can happen in parallel - e.g., improvements to the authorizer, authentication changes, as well as improvements to the CDI framework. I think this accomplishes the smallest change necessary to allow those other changes to progress.
polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/config/PolarisApplicationConfig.java
Outdated
Show resolved
Hide resolved
...main/java/org/apache/polaris/service/persistence/InMemoryPolarisMetaStoreManagerFactory.java
Outdated
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/tracing/TracingFilter.java
Show resolved
Hide resolved
@@ -54,4 +54,4 @@ USER root | |||
RUN chmod -R go+rwx /home/spark/regtests | |||
USER spark | |||
|
|||
CMD ["./run.sh"] | |||
ENTRYPOINT ["./run.sh"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added when I was debugging this test failure- e5367d9 . TBH, I'm not sure how tests in main were passing
public void dispose(T instance) {} | ||
} | ||
|
||
public <T> T findService(Class<T> serviceClass) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a CDI concern rather than a configuration concern. I mean that it is perfectly fine to drive CDI with config settings, but findService
does not appear to fit the DI paradigm 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't, but the servlet filters don't use DI, so the PolarisApplication has to find them somehow
[org.apache.polaris.service.auth.TestInlineBearerTokenPolarisAuthenticator]S | ||
contract={io.dropwizard.auth.Authenticator} | ||
name=test | ||
qualifier={jakarta.inject.Named} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the purpose of this declaration. It seems rather redundant to me. The class of TestInlineBearerTokenPolarisAuthenticator
already has the @Named("test")
annotation and implements Authenticator
. This file does not appear to add any new information to what is already known from class files 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that would normally be generated at build time by scanning the build artifacts. But that requires using the HK2-specific annotations (@Contract
and @Service
), which I think we agreed to avoid
Hm, looking into the hk2 github repo it looks like its rather unmaintained (minus version bumps and rather tiny changes). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this PR looks good to me as it enables CDI and paves the way to further improvements / simplifications 👍
I'd also be willing to work on extracting services (generated API code + impl. classes) into a dedicated gradle module (to be pulled into DW runtime), unless you want to work on this from your side @collado-mike :) |
|
||
@SuppressWarnings("unchecked") | ||
@Override | ||
public <U> U findOrCreate(ActiveDescriptor<U> activeDescriptor, ServiceHandle<?> root) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to implement @RealmScope
as a @RequestScoped
with custom bean producers that handle the realm lookup and caching? I think that might be more intuitive from the CDI perspective... and probably more portable. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd worry that the intention was not clear - readers would just have to know that an item annotated with @RequestScoped
could actually live beyond the scope of a single request. In fact, they could be used by multiple concurrent requests, while the @RequestScoped
annotation would convey that it's safe for an object to mutate state possibly leading to thread safety errors. I think all of the CDI providers offer support for custom scopes and it doesn't seem like a lot of work for what amounts to (IMO) clear intentions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point about concurrent access
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are mixing concepts here, a request scoped bean should not outlive the request. In the Quarkus PR the RealmContext
is a request-scoped bean, and one instance of it is created for each request. It two requests target the same realm, they will get two distinct instances of RealmContext
, but each would have the same realm ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're specifically talking about beans that should be reused across requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not have a global store for realm-specific stuff and expose it to callers via thin request-scoped wrappers? Disposing of the wrapper does not have to dispose of the long-lived data.
... besides, we never destroy realm-scoped beans anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that's exactly what the @RealmScoped
annotation is doing. The RealmScopeContext
maintains the cache per-realm and manages the instantiated values for each ActiveDescriptor
for the currently active realm. This is a centralized place where we can do a lot of cleanup of stale realms (e.g., somebody made a call two days ago, but hasn't invoked an API since), rather than having to sprinkle this cache eviction in a bunch of different bean-specific factories.
I think the contention is whether we should be relying on a ThreadLocal to access the current RealmContext
. I can see where there are flaws with this approach (e.g., we already ran into one during the asynchronous cleanup of metadata files after table drop), but at the same time, I think most (every?) implementation of @RequestScoped
relies on ThreadLocal variables (e.g., AFAICT, ArC relies on ThreadLocalCurrentContext in its RequestContext ). Injecting an accessor to retrieve the current realm makes a lot of sense to me, but at the end of the day, I think we're still going to be relying on ThreadLocals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request scopes are probably implemented on top of some thread locals, but the important point here is that whatever mechanism is used to store and retrieve contextual data, it has to be managed by the runtime.
…iceLocator and handle DI for configured instances
377cbde
to
2f60a7b
Compare
That sounds good to me! I have other things to work on :) |
polaris-core/src/main/java/org/apache/polaris/core/context/RealmScope.java
Outdated
Show resolved
Hide resolved
@@ -39,8 +40,8 @@ | |||
* Default implementation of the {@link OAuth2ApiService} that generates a JWT token for the client | |||
* if the client secret matches. | |||
*/ | |||
@JsonTypeName("default") | |||
public class DefaultOAuth2ApiService implements OAuth2ApiService, HasMetaStoreManagerFactory { | |||
@Named("default") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that @Named
has some shortcomings. From CDI spec 3.0:
The use of
@Named
as an injection point qualifier is not recommended, except in the case of integration with legacy code that uses string-based names to identify beans.
One of them is that the name is application-wide, so you cannot have two beans named @Named("default")
.
Instead, we could use some other qualifier. People generally use @Identifier
from Smallrye-Common library. The Quarkus PR is using that for now.
This article explains the pros and cons of each annotation:
polaris-service/src/main/java/org/apache/polaris/service/auth/DefaultOAuth2ApiService.java
Outdated
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/auth/DefaultPolarisAuthenticator.java
Outdated
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/auth/JWTRSAKeyPairFactory.java
Outdated
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/auth/JWTSymmetricKeyFactory.java
Show resolved
Hide resolved
polaris-service/src/main/java/org/apache/polaris/service/context/DefaultContextResolver.java
Outdated
Show resolved
Hide resolved
private static Logger LOGGER = LoggerFactory.getLogger(EntityCacheFactory.class); | ||
@Inject PolarisMetaStoreManager metaStoreManager; | ||
|
||
@RealmScope |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I think it's much easier to turn this bean into application-scoped and have it hold a map of entity caches keyed by realm id, like RealmEntityManagerFactory
.
...ore/src/main/java/org/apache/polaris/core/storage/PolarisStorageIntegrationProviderImpl.java
Outdated
Show resolved
Hide resolved
@@ -48,17 +46,11 @@ protected void run( | |||
Bootstrap<PolarisApplicationConfig> bootstrap, | |||
Namespace namespace, | |||
PolarisApplicationConfig configuration) { | |||
MetaStoreManagerFactory metaStoreManagerFactory = configuration.getMetaStoreManagerFactory(); | |||
MetaStoreManagerFactory metaStoreManagerFactory = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commands (boostrap/purge) are going to be fun to adapt to CDI. I suggest we look into them later but my feeling is that we would need to move them to separate modules.
@adutra I haven't seen this behavior during testing. E.g., the |
I suspect HK2 might not be fully compliant with the spec. From CDI spec 4.0:
So in our case, if both For example with Quarkus, if I declare the following beans: @Singleton @Named("foo")
public class Foo1 {}
@Singleton @Named("foo")
public class Foo2 {} I get a deployment error when the app starts (even if the beans aren't injected anywhere):
To get past this issue, I would suggest that we copy SmallRye's |
Yeah, it's possible to use the custom qualifiers. I think it's fine to switch the annotations - but copying the |
Copying wouldn't break Quarkus afaict, it's just a matter of telling it which identifier we intend to use. That said, if you are OK adding the dependency, that's even better 👍 |
@adutra , I think I've addressed all of your comments. The one outstanding concern is the support for the In the meantime, however, I think the basic CDI support is blocking work on several other tasks (e.g., the authentication/authorization PRs Dmitri and I have open, fixing the bootstrapping, table metadata logging). With this PR, I think we can work on better/more complete CDI support (e.g., through Quarkus) without any significant changes to the work done here and it can also accommodate the other work that's ongoing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving in the spirit of allowing us to move forward. There will be issues to solve for sure, but I think we can fix them along the way. 👍
@adutra , I'll need another approval because I had to merge from main to resolve conflicts 🙄 |
Description
As discussed in the mailing list, this change proposes using the HK2 framework to support CDI-compliant dependency injection using the existing Dropwizard application structure and yaml configuration. This change does not make a logic changes or signature changes except to remove some of the injection-oriented interfaces (e.g.,
ConfigurationStoreAware
) and updates constructors to accept dependencies via@Inject
annotations.The
PolarisApplicationConfig
is configured to be managed by HK2, but populated by the Jackson during the typical Dropwizard configuration parsing. Because thePolarisApplicationConfig
is managed by HK2, it can invoke a@PostConstruct
method that registers configured beans with theserviceLocator
using the config class's own getter methods asFactory
implementations. Because the other beans are not managed by HK2 (they can't be, since they declare dependencies on other beans that may not yet be instantiated by Jackson), the factory manually callsinject
on each bean before returning.Jersey has its own
serviceLocator
instance under the hood, so all of the beans bound to thePolarisApplicationConfig
instance are registered again with the JerseyserviceLocator
so that they can be found by the jersey components at runtime (I tried using the HK2 bridge that supposedly allows connecting twoserviceLocator
s, but it didn't seem to work and also the licensing on that package suggests possible GPL violation).This also introduces a
@RealmScope
that declares certain dependencies to be created and managed per-realm, rather than per-application or per-request. Some of the instances, such as thePolarisMetaStoreManager
and theEntityCache
are intended to be reused across requests, but of course, cannot be reused across realms, as some entities (such asroot
principal andservice_admin
principal role) have the same name across realms.One additional test class is added to ensure that the
@RealmScope
dEntityCache
is reused across requests, but is not reused across different realms.Fixes # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test Configuration:
Checklist:
Please delete options that are not relevant.