-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hosted Client #3362
Hosted Client #3362
Conversation
5c57757
to
e7d7a2e
Compare
Update: the current revision implements the local client as an automatic option for when calls are made on a non-Orleans thread. This means that |
If we generally agree on the approach then I'll fix the merge conflicts, improve tests, and move things to a better namespace. I believe this new approach is superior from an end-user perspective.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beside that one comment it looks good to me, I like the simplified functionality, it is something I "envisioned" as "must have" when did the HTTP prototyping.
} | ||
} | ||
|
||
public class LocalObjectData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why it is made public, since everything is internal, it has Dispose, but not IDisposable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, I should change it. The Dispose should be moved above this class, it's just an illusion
@ReubenBond, I need to look more in the code (I can't do it now), but if u do ask my opinion, at this stage, I think I would advise against this approach, due to all the reasons you mentioned yourself. I would stick with strong isolation and faster bugs discovery. |
Thanks @gabikliot. Initially I was against it, too, but I convinced myself otherwise. Calling a grain from a non-grain context is not what's unsafe: what's unsafe is mutating grain internals from a non-grain context. It's a useful check, but there are more accurate checks we can make. Eg, we could perform checks whenever accessing Today, we wouldn't complain if the user had this in their grain code: Task.Run(async () =>
{
this.State.Value++;
// we will complain if the storage provider is implemented using grains, but not otherwise.
await this.WriteStateAsync();
}); Of course, we could add those checks anyway - nothing to do with this PR. I'm just saying that IMO we're checking for the wrong thing. So we should consider whether it's worthwhile to lose that check, gain local client (which fixes our issue last week with non-SystemTarget services making grain calls) and add more rigorous checks on the base classes. |
f7533b7
to
ba5ae4a
Compare
src/Orleans/Core/ClusterClient.cs
Outdated
Utils.SafeExecute(() => outsideClient.Disconnect()); | ||
} | ||
|
||
Utils.SafeExecute(() => outsideClient.Reset(gracefully)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was the Reset
method removed from the interface, forcing a need to downcast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface is for all IRuntimeClient
, but Reset(bool)
is only implemented in OutsideRuntimeClient
- it would throw on InsideRuntimeClient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make sense to have some kind of specific ClusterClient
or an interface for the external runtime client.
@@ -732,7 +732,7 @@ public StreamDirectory GetStreamDirectory() | |||
throw new OrleansException("Failed to register " + typeof(TExtension).Name); | |||
} | |||
|
|||
IAddressable currentGrain = this.CurrentActivationData.GrainInstance; | |||
IAddressable currentGrain = (runtimeContext.ActivationContext as SchedulingContext)?.Activation.GrainInstance; | |||
var currentTypedGrain = currentGrain.AsReference<TExtensionInterface>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are using ?.
above, so I assume there is a chance for it to be null. If that's the case, then the AsReference call might throw a NullReferenceException
@@ -175,11 +177,25 @@ internal static void AddDefaultServices(IServiceCollection services) | |||
|
|||
services.AddSingleton(typeof(IKeyedServiceCollection<,>), typeof(KeyedServiceCollection<,>)); | |||
|
|||
// Local client support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge conflict
This looks like a clever hack to me. I'm sure we can make it work, but I'm afraid it'll still be a hack. The sheer number of if (RuntimeContext.Current == null) in this PR gives me the hacky vibe. Couldn't we instead make an honest instance/subclass of
While this statement may be true, it seems a bit misleading to me. The goal here is to have an efficient embedded client to be used by transports like HTTP or gRPC. Hence, the question of safety of invoking a grain from a non-grain context I think is orthogonal.
Did I miss something? |
The issue with having separate inside/outside The
I don't think you're really missing anything, but our takes are a little different. I figured that the current approach was less error prone than the alternatives whilst retaining efficiency. One alternative is to make the A non-solution (in my view) is to require that all grain referenced be explicitly re-bound before use - that spills too much complexity into user code and is error prone. The main thing I wanted to achieve agreement on before performing cleanup was the idea of a |
Hi, this would be helpful in many scenarios. |
Hi, |
@ivanpaisqcl We haven't decided yet. It's a very powerful feature, and we want to carefully assess the downsides of enabling potentially very error prone cases. |
@sergeybykov Is there any timeline to make the first decision. We could really use such a feature, however if it is not going to be anytime soon, we may have to decide on an alternative. |
@ivanpaisqcl I'm on the same boat as you expecting this feature to be released. In meanwhile, I'm creating a client in the same process and making it to connect directly to the localhost instead of using the membership provider until we have this feature merged. |
@galvesribeiro - I'm doing the same :) |
@shlomiw it is not the ideal since you still have serialization and local networking happening but, at least it is a temporary work around and should not be complicated to move to LocalClient once it is merged. |
My intention is to revisit this after we release 2.0 |
91ea5e1
to
a434413
Compare
We had a team meeting to reach consensus on this PR. As such, I've added a new extension method to enable this functionality: Let me know if the name makes you feel suitably nauseous (it's supposed to) or if you want it changed to, eg, |
If that is the case, how do we get an I'm not following the decision (too confuse)... IMHO, a clear way to be opt-in would be:
That way, we could get an |
My concern with just a global opt in, is that it's mostly framework code that will benefit from this feature. So having a framework component opt in on behalf of the user will most likely behave as the mandatory choice, even if the end user himself doesn't want the risk of making calls outside of the grain context. So having this method is not really making it safer. |
I don't see that the major use case would be framework/component developers... In fact, I never heard anyone asking for it to have an idea of embed it on their frameworks. It is quite the opposite... Everyone who I see asking for it (including myself) are looking for a way to embed SignalR/HTTP/TCP server on Orleans and make local calls to grains without have to go thru the membership/serialization/networking dance or localhost hacks that we are doing Today. Yes, I agree with you that it must be opt-in in a sense that the user must explicitly ask for the |
You can get
Issues with this approach:
Note that
I pretty much agree with that.
What is the risk here? User experience is a top priority. I am yet to see a more fine-grained opt-in/out mechanism which doesn't significantly degrade the user experience. Users, particularly non-advanced users, should not have any concept of what is executing in an Orleans scheduler context and what is not. My problem with opt-in/opt-out is that it forces users to know about these concepts. From my perspective, the only reason for an opt-out is if there is an issue with the implementation or design of the feature which results in unintended behavior. |
@ReubenBond - once again, I'm very excited to use this feature soon! My usages:
How I'd address that? This way you'd have the same experience when going from other contexts to Orleans and back again, only with IClusterClient. And as I see it, if it's possible, I wouldn't add this |
Enabling or disabling this feature should be done before the silo is built, IMHO. I believe the behaviour would be unexpected if a call to GetHostedClient (or similar) at one place in the code allowed a grain reference obtained in another place in the code to start being callable. Certainly the main way to access this feature is by accessing I still believe the best UX is for this feature to be enabled by default. @shlomiw this PR enables all of your scenarios without extra work (other than enabling it, for now) |
I agree to ReubenBond. When you are new to Orleans the whole backend vs frontend thing is very "scary". |
Having this global opt in method or enabled by default is the same, so if it's not more granular, then I don't mind any further how we decide to enable it. |
@jdom I don't consider making calls from outside of a grain context to be unsafe at all. The only distinction in the code path is setting the return address of the message, but that information is not accessible from user code. When a call is made from outside of a grain context, the return address will include the hosted client's id. The return call path doesn't hit HostedClient, it's the same as for calls from grains: callback is looked up in InsideRuntimeClient based on message id and executed |
So, when we call it, will it return a |
The high-level argument for opt-in is that this feature changes the existing behavior, in certain cases significantly, and it is hard for us to predict the impact of that rather subtle change. Hence, making users that know why they need it, such as people in this discussion, add an explicit line of code seems least risky at this point, as it'll have zero unintended impact on others. The global opt-in option allows us later to change it to a more granular way of configuring the behavior (if we figure out how to do that) and even to flip the default. I think we shouldn't change the default until at least Orleans 3.0. This will give us time to learn how the feature is used and if any adjustments are needed. |
The reason I mentioned global opt in is almost the same as default, is that because this will be used mainly by infrastructure components (BTW @galvesribeiro I don't mean just from the Orleans team, but I mean not grain code, from whomever that comes from), then as soon a as placement directors, storage implementations, etc start using it, then the user will not have a real chance to opt out for his grain code (where the main risk is), as core components will need it. |
I think some of the difficulty here is that we're attempting to address a number of disparate needs that can all be addressed in some way by enabling grain calls from external threading contexts, so that capability is what's being added. My concern with this is that context matters for a significant set of logic. Providers (storage, streams, ..) as well as facets (Transactions, future storage and streams), reminders, and other feature sets were all developed and tested with the understanding that they would be called from Orleans threading context and the single thread protections they provide. Many of these systems don't really depend on that and work fine from outside context (barring grain calls) but changing the threading expectations of existing components without testing is definitely risky. This change does not imply that these systems should now be expected to be context agnostic, as EnableExternalContextCalls only makes context irrelevant for grain calls, but it definitely blurs the lines (IMO). How do we communicate that context is only irrelevant for grain calls in a way that doesn't lead users to the assumption that this is also true for most of the other capabilities? The current expectations are that any capability that is exposed via IClusterClient is safe to use from outside of an Orleans threading context. Many of the asks in this thread are for an IClusterClient which is available in the same process as the silo without explicitly configuring it and without the unnecessary networking overhead. If we confine our commitment to this capability (alone), the lines are no longer blurred, as the current rules remain intact, we're just making it easier and more performant as users no longer need to configure an IClusterClient themselves to talk to an in-process silo, nor incur the performance cost of the unnecessary networking. Even if we keep the current implementation as is, and change EnableExternalContextCalls to EnableSiloClient (or something of the sort), the expected and supported patterns remain the same as is, preventing ambiguity or the expectation that users can ignore context (as it -does- matter). Please keep in mind that having clear threading expectations are not only important for application developers but also to framework extension developers. If a team or community contributor wants to develop a new facet, stream provider, grain storage, or whatever, they need to understand the threading expectations, so having these expectations well defined is important to the extensibility, maintainability, and testability of the framework. Side note: I'm not fond of the "hosted client" name, as multiple silos per host may be possible at some point. This client is associated with the silo, not the host. |
Yes! The only thing we want with this, is to avoid networking and serialization when calls are made from the same process, period. We still want to have the same threading guarantees that we have Today and that is why I'm saying that from an consumer perspective, I expect this to be just another implementation of |
To be clear, I'm entirely ok with this PR, with the soul exception of the EnableExternalContextCalls name, which in my view should indicate the enabling of a local cluster client, not a behavioral change, as long as we don't claim support of a behavioral change (at this time). The current expectations that any calls outside of an Orleans threading context should be made via an IClusterClient should remain in tact (IMO). Should the behavioral changes introduced in this PR in relation to calls via grain references prove safe and we verify other systems function correctly without the threading guarantees provided by the Orleans threading context, then we can enable this by default and officially change what behaviors we support. After we're confident we can support the behavior change for grains and other supporting systems. |
@galvesribeiro it's not as simple as @jason-bragg what about |
@@ -145,5 +160,13 @@ private string MakeErrorMsg(string what, Exception exc) | |||
return string.Format("Error from storage provider {0} during {1} for grain Type={2} Pk={3} Id={4} Error={5}" + Environment.NewLine + " {6}", | |||
$"{this.store.GetType().Name}.{this.name}", what, name, grainRef.GrainId.ToDetailedString(), grainRef, errorCode, LogFormatter.PrintException(exc)); | |||
} | |||
|
|||
private static void CheckRuntimeContext() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put [MethodImpl(MethodImplOptions.AggressiveInlining)] on this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method probably shouldn't be inlined as-is: first we should split the throw into a different method.
Submit a PR? 😛
This PR adds seamless support to silos for calling grain methods, using streams, local objects, and any other client functionality from outside of the context of an activation or system target.
The primary scenarios this enables are:
IPlacementDirectors
, without having to useTaskScheduler
capturing tricks.Since this enables grain calls from non-grain threads, this also adds some guard rails: methods on
Grain
/Grain<T>
are barred from outside of an activation context by virtue ofRuntimeContext
checks onGrainRuntime
andGrainStorageBridge
.CI looks good, VSO looks good, load tests look good.
Reviewed by @dVakulen.
I changed the name from LocalClient/Silo-Local Client to Hosted Client. I think that sounds better, but please let me know if you feel that's an unsuitable name.
How it works
IHostedClient
intoISiloMessageCenter
andIClientObserverRegistrar
.IHostedClient
creates and registers its own ClientId so that it can receive calls.IHostedClient
if the id matchesIRuntimeClient
methods behave differently depending on whetherRuntimeContext.Current
isnull
(hosted client) or not (running in activation context). Generally speaking, instead of throwing an exception when the context is null, we handle the operation via the hosted client.InsideRuntimeClient
pathway, but calls to grain observers (aka LocalObject aka InvokableObject) are routed to theIHostedClient
implementation, which maintains a thread to scheduleMessages
which are destined for local objects. This logic was largely extracted fromOutsideRuntimeClient
and into a shared class,InvokableObjectManager
. So this thread is only used for those calls and not for regular calls. This Grain Observer flow is an area to optimize in future (via shared executor service), but is not in need of optimization now: performance for those objects ought to be the same as it is for the external client today.OLD, busted description (ignore this)
This PR is a work-in-progress of an
IClusterClient
implementation which communicates directly with a silo in the same process.The current implementation requires serialization for the
Message
objects sent between the client and silo. This is becauseGrainReferences
must have theirIRuntimeClient
property set to the appropriate value (in this case either theLocalClient
instance or theInsideRuntimeClient
instance).An alternative implementation might be feasible: instead of providing a completely separate
IRuntimeClient
implementation we could modifyInsideRuntimeClient
so that when it's being called from a non-Orleans context (RuntimeContext.Current == null
) it defers to the implementations in this PR'sLocalClient
. As a result, issues such as #3273 and #3256 might not arise. However this also means that it's more difficult to catch bugs where we should be executing within aSystemTarget
orGrain
but have not correctly set theRuntimeContext
. It's not inherently wrong to make grain calls from within a grain viaTask.Run()
, but that is the primary way we catch things which are bad (eg, modifying grain state). Another way to catch these kinds of bugs is to insert checks onGrain
andGrain<T>
members: that way, accessingGrainFactory
or callingWriteStateAsync()
from withinTask.Run()
would be disallowed (it's permitted today). I think this is a good approach - we can gain some coverage there, even if we're also losing some coverage because of this PR.@gabikliot & others, do you have thoughts on that previous paragraph?
This ought to help with use cases such as alternative networking layers such as for HTTP & gRPC support.
ILocalClientGateway
into the siloILocalClient
implementations can be created and customized based upon the silo instance (maybe this should be changed to one!)