Skip to content

Conversation

@slinkydeveloper
Copy link
Member

@slinkydeveloper slinkydeveloper commented Aug 13, 2020

More or less same as #208, but the difference is that there is no new CloudEvent interface, I just relaxed getData() in the original CloudEvent interface to return Object.

The most important points of this PR are:

Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
*/
@Nullable
byte[] getData();
Object getData();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make me abandon the SDK altogether and introduce my own type. A strong no (-1).

Motivation:
Currently, one does not need core to serialize an instance of CloudEvent. If you change it to Object, it would mean that one will have to do something like:

if (getData() instanceof byte[]) {
    // Ok, I know how to deal with it
} else {
    // panic!
}

This is very JS-style plus makes the whole point of the api module obsolete.

It is "fine"-ish if getData will return Data type that gives methods like asBytes, asByteBuffer, or a custom type that will let accessing the original object, but making it Object is a no go IMO.

Copy link
Member Author

@slinkydeveloper slinkydeveloper Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok ok i'm interested in exploring the Data alternative, what about (names are provisional):

public interface Data {
  public byte[] asBytes();

  public @Nullable Object raw();
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to understand your idea, what should it look like the Data interface for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few things to keep in mind:

  1. Some implementations that care about the performance may want to implement Data on the same class that implements the CloudEvent type, so the method names should be carefully selected to not clash or confuse.

  2. This can as well be a set of methods of CloudEvent like dataAsBytes(), dataAsByteBuffer(), etc etc. not generic dataAs(Class) tho.

  3. before we make 5 more rounds of PRs, I would like to step back and ask the question that was already asked:
    Why do we even need to expose "raw" object access in CloudEvent on a first place? What are we trying to solve?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. so i wonder if dataAsBytes() and dataAsObject() just solves the problem we have here (I still prefer to have a sub interface and then I implement it in the same class of the cloudevent impl).

  2. [WIP] Allow POJOs as CloudEvent data #198 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataAsObject() would require doing a mandatory instanceof check on the value to proceed with processing. You can't even use it in the JSON format to encode the body because it may return byte[] of already seriallized payload. You must have a concrete type to hint the serializer to use some other field other than dataAsBytes().

How is it better than doing an instanceof check on the CloudEvent itself?

Copy link
Member Author

@slinkydeveloper slinkydeveloper Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't even use it in the JSON format to encode the body because it may return byte[] of already seriallized payload.

I can and the implementation is pretty straightforward (look at the code here https://github.com/cloudevents/sdk-java/pull/211/files#diff-75b992e6fc5d4cd3890ea6e37c7fe2a4R114)

You must have a concrete type to hint the serializer to use some other field other than dataAsBytes().

WDYM? Every serializer just use val.getClass() to start the serialization.

How is it better than doing an instanceof check on the CloudEvent itself?

Of course you always need the instanceof, but makes simpler the usage and the impl:

  • 90% of users won't have to subclass to put a concrete type of their choice, they just use the CloudEvent provided by core
  • 10% of users subclass overriding the signature, and they return the concrete type, removing the need to use the instanceof
  • Makes explicit that the data is dynamic by nature in a CloudEvent. We can't build in this sdk all the different JsonCloudEvent, XmlCloudEvent because data can be anything...
  • Because it doesn't assume anything on the nature of CloudEvent data, the builders/deserializers don't need to return different impls of the event depending on the payload, they always use the same CloudEventV1 and CloudEventV03

My feeling about subclassing CloudEvent depending on the type of data is that you're making the wrong thing "dynamic", while the Data idea makes more sense to me because is exactly the data field the dynamic one. It's another layer of indirection, but makes more explicit the concept

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can and the implementation is pretty straightforward (look at the code here https://github.com/cloudevents/sdk-java/pull/211/files#diff-75b992e6fc5d4cd3890ea6e37c7fe2a4R114)

this is not a "straightforward implementation", this is if {} else {} for a single case (byte[]) while it may also be ByteBuffer, InputStream, or many other possibilities. dataAsBytes/dataAsByteBuffer do provide a straightforward way of handling the case.

90% of users won't have to subclass to put a concrete type of their choice, they just use the CloudEvent provided by core

Then added it to core. I only care about api here.

10% of users subclass overriding the signature, and they return the concrete type, removing the need to use the instanceof

this is a wrong assumption because, in a pipeline, you will get instances of CloudEvent, not SomeCloudEvent, where getData() will be Object, not the concrete type. That's how Java's type system works.

Makes explicit that the data is dynamic by nature in a CloudEvent.

This does not mean that the SDK should be dynamic tho.

We can't build in this sdk all the different JsonCloudEvent, XmlCloudEvent because data can be anything...
Because it doesn't assume anything on the nature of CloudEvent data, the builders/deserializers don't need to return different impls of the event depending on the payload, they always use the same CloudEventV1 and CloudEventV03

That's not what I suggested. Please re-read.

My feeling about subclassing CloudEvent depending on the type of data is that you're making the wrong thing "dynamic", while the Data idea makes more sense to me because is exactly the data field the dynamic one.

Your perception is incorrect. I am saying that, if someone wants to apply an optimization, they can subclass and expose the raw object. Just re-read the original issue ( #196 ), the problem was in "double encoding".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a "straightforward implementation", this is if {} else {} for a single case (byte[]) while it may also be ByteBuffer, InputStream, or many other possibilities. dataAsBytes/dataAsByteBuffer do provide a straightforward way of handling the case.

That's the very same thing i have to do with subclassing CloudEvent. Also, keep in mind that i cannot write directly byte[] into a jackson serializer, i need or a json type or a pojo to map to json (look at the diff).

Then added it to core. I only care about api here.
Makes explicit that the data is dynamic by nature in a CloudEvent.

I know, but the users of this sdk ends up using the interface CloudEvent in api, hence I care to have well aligned interface and impls for 90% of users without letting them downcast because the 10% thinks data is not a dynamic field and we can "statically type it". Just look at the spec of CloudEvents https://github.com/cloudevents/spec/blob/v1.0/spec.md#event-data and you clearly see that "This specification does not place any restriction on the type of this information", hence what is the point of trying to make it look static with subclasses and whatnot?

I am saying that, if someone wants to apply an optimization, they can subclass and expose the raw object

I'm not talking only about optimizations, I'm talking about the fact that if the user is allowed to put pojos inside the event (eg using the event builder providing a non byte array parameter), she/he expects to be able to get it back with a getter, like getDataAsObject().
And as I said, I see no point for the user of the sdk to do Object data = ((BaseCloudEventImpl)event).getRawData() where they build with builder.withData(pojo). That makes the usage unnecessary awful...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the very same thing i have to do with subclassing CloudEvent. Also, keep in mind that i cannot write directly byte[] into a jackson serializer, i need or a json type or a pojo to map to json (look at the diff).

With byte[], you don't need Jackson at all, you have the bytes already. It is up to the implementation to decide how to return byte[]. That's the whole point.

@olegz
Copy link
Contributor

olegz commented Aug 13, 2020

At the moment we have 3 PRs trying to address the same thing. What's interesting is that no matter how much we discuss and how many idiomatic and intuitive approaches are recommended (type converters, serializers, specialized implementations of CloudEvent and so on), each new PR is something new and is clearly not the result of any of the discussions we're having. I am not sure what is the point of having these discussions. . .

CloudEvent interface is a strategy to represent Cloud Event in its simplest and unassuming form. . . the form that is as close to raw representation of cloud event as possible. Any type conversion, transformation, serialization, decoration are all optional concerns that are only relevant in certain cases and as such should not in any way be leaked into the design of the base CloudEvent interface.

We all worked very hard to get CloudEvent interface to where it is now. As a core strategy it serves everyone well. There is absolutely no reason to touch it (ever).

My vote on this PR is definitive -1.

@slinkydeveloper
Copy link
Member Author

Replaced by #250

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

discussion enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants