-
Notifications
You must be signed in to change notification settings - Fork 166
[WIP] Allow POJOs as CloudEvent data, round 3 #211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Allow POJOs as CloudEvent data, round 3 #211
Conversation
Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
| */ | ||
| @Nullable | ||
| byte[] getData(); | ||
| Object getData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make me abandon the SDK altogether and introduce my own type. A strong no (-1).
Motivation:
Currently, one does not need core to serialize an instance of CloudEvent. If you change it to Object, it would mean that one will have to do something like:
if (getData() instanceof byte[]) {
// Ok, I know how to deal with it
} else {
// panic!
}This is very JS-style plus makes the whole point of the api module obsolete.
It is "fine"-ish if getData will return Data type that gives methods like asBytes, asByteBuffer, or a custom type that will let accessing the original object, but making it Object is a no go IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok ok i'm interested in exploring the Data alternative, what about (names are provisional):
public interface Data {
public byte[] asBytes();
public @Nullable Object raw();
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to understand your idea, what should it look like the Data interface for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few things to keep in mind:
-
Some implementations that care about the performance may want to implement
Dataon the same class that implements theCloudEventtype, so the method names should be carefully selected to not clash or confuse. -
This can as well be a set of methods of
CloudEventlikedataAsBytes(),dataAsByteBuffer(), etc etc. not genericdataAs(Class)tho. -
before we make 5 more rounds of PRs, I would like to step back and ask the question that was already asked:
Why do we even need to expose "raw" object access inCloudEventon a first place? What are we trying to solve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
so i wonder if
dataAsBytes()anddataAsObject()just solves the problem we have here (I still prefer to have a sub interface and then I implement it in the same class of the cloudevent impl).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataAsObject() would require doing a mandatory instanceof check on the value to proceed with processing. You can't even use it in the JSON format to encode the body because it may return byte[] of already seriallized payload. You must have a concrete type to hint the serializer to use some other field other than dataAsBytes().
How is it better than doing an instanceof check on the CloudEvent itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't even use it in the JSON format to encode the body because it may return byte[] of already seriallized payload.
I can and the implementation is pretty straightforward (look at the code here https://github.com/cloudevents/sdk-java/pull/211/files#diff-75b992e6fc5d4cd3890ea6e37c7fe2a4R114)
You must have a concrete type to hint the serializer to use some other field other than dataAsBytes().
WDYM? Every serializer just use val.getClass() to start the serialization.
How is it better than doing an instanceof check on the CloudEvent itself?
Of course you always need the instanceof, but makes simpler the usage and the impl:
- 90% of users won't have to subclass to put a concrete type of their choice, they just use the
CloudEventprovided by core - 10% of users subclass overriding the signature, and they return the concrete type, removing the need to use the
instanceof - Makes explicit that the data is dynamic by nature in a
CloudEvent. We can't build in this sdk all the differentJsonCloudEvent,XmlCloudEventbecause data can be anything... - Because it doesn't assume anything on the nature of
CloudEventdata, the builders/deserializers don't need to return different impls of the event depending on the payload, they always use the sameCloudEventV1andCloudEventV03
My feeling about subclassing CloudEvent depending on the type of data is that you're making the wrong thing "dynamic", while the Data idea makes more sense to me because is exactly the data field the dynamic one. It's another layer of indirection, but makes more explicit the concept
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can and the implementation is pretty straightforward (look at the code here https://github.com/cloudevents/sdk-java/pull/211/files#diff-75b992e6fc5d4cd3890ea6e37c7fe2a4R114)
this is not a "straightforward implementation", this is if {} else {} for a single case (byte[]) while it may also be ByteBuffer, InputStream, or many other possibilities. dataAsBytes/dataAsByteBuffer do provide a straightforward way of handling the case.
90% of users won't have to subclass to put a concrete type of their choice, they just use the CloudEvent provided by core
Then added it to core. I only care about api here.
10% of users subclass overriding the signature, and they return the concrete type, removing the need to use the instanceof
this is a wrong assumption because, in a pipeline, you will get instances of CloudEvent, not SomeCloudEvent, where getData() will be Object, not the concrete type. That's how Java's type system works.
Makes explicit that the data is dynamic by nature in a CloudEvent.
This does not mean that the SDK should be dynamic tho.
We can't build in this sdk all the different JsonCloudEvent, XmlCloudEvent because data can be anything...
Because it doesn't assume anything on the nature of CloudEvent data, the builders/deserializers don't need to return different impls of the event depending on the payload, they always use the same CloudEventV1 and CloudEventV03
That's not what I suggested. Please re-read.
My feeling about subclassing CloudEvent depending on the type of data is that you're making the wrong thing "dynamic", while the Data idea makes more sense to me because is exactly the data field the dynamic one.
Your perception is incorrect. I am saying that, if someone wants to apply an optimization, they can subclass and expose the raw object. Just re-read the original issue ( #196 ), the problem was in "double encoding".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a "straightforward implementation", this is if {} else {} for a single case (byte[]) while it may also be ByteBuffer, InputStream, or many other possibilities. dataAsBytes/dataAsByteBuffer do provide a straightforward way of handling the case.
That's the very same thing i have to do with subclassing CloudEvent. Also, keep in mind that i cannot write directly byte[] into a jackson serializer, i need or a json type or a pojo to map to json (look at the diff).
Then added it to core. I only care about api here.
Makes explicit that the data is dynamic by nature in a CloudEvent.
I know, but the users of this sdk ends up using the interface CloudEvent in api, hence I care to have well aligned interface and impls for 90% of users without letting them downcast because the 10% thinks data is not a dynamic field and we can "statically type it". Just look at the spec of CloudEvents https://github.com/cloudevents/spec/blob/v1.0/spec.md#event-data and you clearly see that "This specification does not place any restriction on the type of this information", hence what is the point of trying to make it look static with subclasses and whatnot?
I am saying that, if someone wants to apply an optimization, they can subclass and expose the raw object
I'm not talking only about optimizations, I'm talking about the fact that if the user is allowed to put pojos inside the event (eg using the event builder providing a non byte array parameter), she/he expects to be able to get it back with a getter, like getDataAsObject().
And as I said, I see no point for the user of the sdk to do Object data = ((BaseCloudEventImpl)event).getRawData() where they build with builder.withData(pojo). That makes the usage unnecessary awful...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the very same thing i have to do with subclassing CloudEvent. Also, keep in mind that i cannot write directly byte[] into a jackson serializer, i need or a json type or a pojo to map to json (look at the diff).
With byte[], you don't need Jackson at all, you have the bytes already. It is up to the implementation to decide how to return byte[]. That's the whole point.
|
At the moment we have 3 PRs trying to address the same thing. What's interesting is that no matter how much we discuss and how many idiomatic and intuitive approaches are recommended (type converters, serializers, specialized implementations of CloudEvent and so on), each new PR is something new and is clearly not the result of any of the discussions we're having. I am not sure what is the point of having these discussions. . .
We all worked very hard to get CloudEvent interface to where it is now. As a core strategy it serves everyone well. There is absolutely no reason to touch it (ever). My vote on this PR is definitive -1. |
|
Replaced by #250 |
More or less same as #208, but the difference is that there is no new
CloudEventinterface, I just relaxedgetData()in the originalCloudEventinterface to returnObject.The most important points of this PR are: