-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Map from allowed Context Attribute types #467
Remove Map from allowed Context Attribute types #467
Conversation
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
…we-dont-need-maps
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
http-transport-binding.md
Outdated
|
||
Two of the event attributes, `datacontenttype` and `data` are handled specially | ||
and mapped onto HTTP constructs, all other attributes are transferred as | ||
metadata without further interpretation. | ||
metadata without further interpretation (except that extensions may define |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/may/MAY/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Related to this is #457, where |
spec.md
Outdated
@@ -194,6 +195,11 @@ as a string. | |||
`String`, `Binary`, `Map`, `URI-reference` or `Timestamp`. The type system is | |||
intentionally abstract, and therefore it is left to implementations how to | |||
represent the `Any` type. | |||
- `Any-context` - A variant type which can take the shapes allowed in a context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the purpose or value of this. If it's really only for extensions, and we treat unknown extensions as strings, why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave Any
untouched. Stating that the Map type can't be used for context attributes is sufficient. Any
is not a submarine through which you can smuggle Map
undetected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was attempting to define the types additively (i.e. without any special-case rules)
I'd somewhat prefer a set of ABNF productions, though our use of an abstract type system makes that harder:
CloudEvent = "{" Attributes ["," Body-Attribute] "}"
Attributes = Attribute *( "," Attribute )
Attribute = Key "=" Attribute-Value
Key = 1*20( "a"-"z" / "0"-"9" )
Attribute-Value = Integer / String / Binary / URI-reference / Timestamp
Data-Attribute = "data" "=" Any
Any = ...
Integer = *DIGIT
String = ...
The current language, while serviceable, seems like it sometimes leads to confusion: for example, is ""
a valid key in a Map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch this back for now, but I think it would be clearer to have a different name.
Maybe Context-Value
?
We discussed this on today's call. I believe there was general agreement that there's an issue to solve, but it wasn't clear how to proceed. There were 4 options mentioned: My interpretation of what was said on the call : there seemed to be a lot of interest in dropping Maps entirely, but some people did worry that not allowing for structured extensions would limit things too much or end up with people inventing their own way to represent maps via encoding them as strings or other types. And then we might have an interop problem around decoding those custom things since there is no single way to do it. Please correct me if I got this wrong. Please chime in on this one as it's kind of a biggie to solve prior to v1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Any
is fine if the concrete type used is not a map.
The only thing I would add with this PR is this text:
Context and extension attributes MUST NOT have Map
values, even when their type is Any
.
spec.md
Outdated
@@ -210,6 +216,17 @@ and the event data will be materialized. For example, in the case of a JSON | |||
serialization, the context attributes and the event data might both appear | |||
within the same JSON object. | |||
|
|||
Context attributes MUST be one of the following types: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the half sentence in line 227 is sufficient for this PR.
Map
values MUST NOT be used in context attributes, including in extension attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I'm slightly uncomfortable with the formulation, but the result is clear if you read the whole spec:
The value of a context attribute is Integer, String, Binary, Map, URI-reference, Timestamp, or Any, except that it can't be Any or Map.
If we accept some form of #457, we can completely remove Map, and the sentence simplifies to:
The value of a context attribute is Integer, String, Binary, Map, URI-reference, Timestamp, or Any.
spec.md
Outdated
@@ -182,7 +182,8 @@ as a string. | |||
- `Binary` - Sequence of bytes. | |||
- String encoding: Base64 encoding per | |||
[RFC4648](https://tools.ietf.org/html/rfc4648). | |||
- `Map` - `String`-indexed dictionary of `Any`-typed values. | |||
- `Map` - `String`-indexed dictionary of `Any`-typed values. Only valid in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's sufficient to say this only in one place normatively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about confusion between line 150 and later, but I'll remove if this seems clear to everyone.
The following abstract data types are available for use in attributes.
spec.md
Outdated
@@ -194,6 +195,11 @@ as a string. | |||
`String`, `Binary`, `Map`, `URI-reference` or `Timestamp`. The type system is | |||
intentionally abstract, and therefore it is left to implementations how to | |||
represent the `Any` type. | |||
- `Any-context` - A variant type which can take the shapes allowed in a context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave Any
untouched. Stating that the Map type can't be used for context attributes is sufficient. Any
is not a submarine through which you can smuggle Map
undetected.
mqtt-transport-binding.md
Outdated
@@ -176,7 +176,8 @@ in the MQTT PUBLISH message. | |||
##### 3.1.3.2 User Property Values | |||
|
|||
The value for each MQTT PUBLISH User Property MUST be constructed from the | |||
respective CloudEvents attribute type's canonical string representation. | |||
respective CloudEvents attribute type's canonical string representation in the | |||
[JSON][json-format] string encoding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We define the canonical representations here. We've decoupled them from JSON now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the link to the type system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we accept some form of #457 which removes data
from the CloudEvents type system, I think we could remove Map entirely.
I'm happy to wait until 457 is resolved to update this and hopefully remove Map altogether. If we can't remove Map, I think it would be clearer to define Context-Value
and Data-Value
types separately, and remove the Any
type, rather than doing some sort of strikethrough on what's allowed in a context attribute.
spec.md
Outdated
@@ -194,6 +195,11 @@ as a string. | |||
`String`, `Binary`, `Map`, `URI-reference` or `Timestamp`. The type system is | |||
intentionally abstract, and therefore it is left to implementations how to | |||
represent the `Any` type. | |||
- `Any-context` - A variant type which can take the shapes allowed in a context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was attempting to define the types additively (i.e. without any special-case rules)
I'd somewhat prefer a set of ABNF productions, though our use of an abstract type system makes that harder:
CloudEvent = "{" Attributes ["," Body-Attribute] "}"
Attributes = Attribute *( "," Attribute )
Attribute = Key "=" Attribute-Value
Key = 1*20( "a"-"z" / "0"-"9" )
Attribute-Value = Integer / String / Binary / URI-reference / Timestamp
Data-Attribute = "data" "=" Any
Any = ...
Integer = *DIGIT
String = ...
The current language, while serviceable, seems like it sometimes leads to confusion: for example, is ""
a valid key in a Map?
spec.md
Outdated
@@ -194,6 +195,11 @@ as a string. | |||
`String`, `Binary`, `Map`, `URI-reference` or `Timestamp`. The type system is | |||
intentionally abstract, and therefore it is left to implementations how to | |||
represent the `Any` type. | |||
- `Any-context` - A variant type which can take the shapes allowed in a context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch this back for now, but I think it would be clearer to have a different name.
Maybe Context-Value
?
http-transport-binding.md
Outdated
|
||
Two of the event attributes, `datacontenttype` and `data` are handled specially | ||
and mapped onto HTTP constructs, all other attributes are transferred as | ||
metadata without further interpretation. | ||
metadata without further interpretation (except that extensions may define |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mqtt-transport-binding.md
Outdated
@@ -176,7 +176,8 @@ in the MQTT PUBLISH message. | |||
##### 3.1.3.2 User Property Values | |||
|
|||
The value for each MQTT PUBLISH User Property MUST be constructed from the | |||
respective CloudEvents attribute type's canonical string representation. | |||
respective CloudEvents attribute type's canonical string representation in the | |||
[JSON][json-format] string encoding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the link to the type system
spec.md
Outdated
@@ -182,7 +182,8 @@ as a string. | |||
- `Binary` - Sequence of bytes. | |||
- String encoding: Base64 encoding per | |||
[RFC4648](https://tools.ietf.org/html/rfc4648). | |||
- `Map` - `String`-indexed dictionary of `Any`-typed values. | |||
- `Map` - `String`-indexed dictionary of `Any`-typed values. Only valid in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about confusion between line 150 and later, but I'll remove if this seems clear to everyone.
The following abstract data types are available for use in attributes.
spec.md
Outdated
@@ -210,6 +216,17 @@ and the event data will be materialized. For example, in the case of a JSON | |||
serialization, the context attributes and the event data might both appear | |||
within the same JSON object. | |||
|
|||
Context attributes MUST be one of the following types: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I'm slightly uncomfortable with the formulation, but the result is clear if you read the whole spec:
The value of a context attribute is Integer, String, Binary, Map, URI-reference, Timestamp, or Any, except that it can't be Any or Map.
If we accept some form of #457, we can completely remove Map, and the sentence simplifies to:
The value of a context attribute is Integer, String, Binary, Map, URI-reference, Timestamp, or Any.
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
spec.md
Outdated
@@ -211,6 +211,9 @@ and the event data will be materialized. For example, in the case of a JSON | |||
serialization, the context attributes and the event data might both appear | |||
within the same JSON object. | |||
|
|||
`Any` and `Map` values MUST NOT be used in context attributes, including in | |||
extension attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this version of the PR better - it feels less confusing to me. One thing I think we need to do is expand a bit of this sentence. Normally I would say we should do it in the Primer because the text in the spec should be obvious and the Primer is just for a more in-depth discussion, but in this case there's no indication at all as to why we're making this rule.
Do we need to say "Any" or can "Map" alone cover it? I wonder if someone will read it as: Any == one of int, string, etc... and therefore I can't use any of those. Or maybe: Values of type Map
, either directly or via the Any
type, MUST NOT be used in context attributes, including in extension attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @evankanderson
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
FWIW, I find this construction more complicated than defining a separate type, because now we can't say elsewhere in the spec "Unknown extension attributes should be assumed to be of type Any
" (for example) without including a reference to this clarification.
General Comment : I understand the drive for this change but would ask that we consider retaining some model that allows for extensions to be represented as a map in some constrained form; i believe there is value in keeping them grouped together. I would propose that an extension be defined as a "string indexed dictionary of string typed values" (or something similar) to prevent the map-of-maps situation. Doing this means that existing transport bindings for extensions continue to work unchanged, and the spec.json change would be very subtle:
|
I have mixed feeling about this, but one of the things that I keep thinking about is when someone (maybe @clemensv) mentioned that just about any serialization of maps in binary format (e.g. as http headers) will result in a really non-trivial mechanism to do filtering over that data. And when I combine that with the idea that CloudEvents metadata should not really be used to carry business logic (that's what And I guess, it's worth reminding people that just because a piece of data isn't part of the CloudEvents properties doesn't mean that it can't still be used for filter/routing... people will just need to do something like Of course I also once suggested that everything be strings... so what do I know :-) |
…ap` is allowed. Signed-off-by: Evan Anderson <argent@google.com>
Observation from a number of transports:
|
Apologies if this is a bit rambling, hopefully you can see what i'm driving at: My understanding of how extensions that allow for mutiple attributes would be represented (currently and with my proposal) Example using the sequence and partioning extension along with a company (Bobs Bait Shop - BBS) private internal extension. HTTP, Structured, JSON "data"
HTTP, Binary, JSON data
Having sketched this out, and assuming i'm correct, then i believe my proposal has legs. :-) Without some formal namespace/grouping everything gets flattened (in both modes)...
As an SDK provider i can no longer say to the application code "here are all the attributes of Bobs-Bait-Shop custom extension". |
Actually, since we don't put extensions into a bag it would be:
The Binary version looks right.
I would question the choice of using a prefix of "sequence" and a property of "sequence" instead of the first being a bit more descriptive (or just prefixing those two |
Hmmm .. according to the spec.json file we do put extensions into a bag ;-) |
LOL well that's a bug - good catch! PR coming.... |
@evankanderson I think this line: https://github.com/cloudevents/spec/pull/467/files#diff-958e7270f96f5407d7d980f500805b1bR422 might need to be updated |
@duglin - I found the sequence/sequence weird as well, we might want to consider having guidance of extension writers to use more terse naming (much like JWT's). |
@JemDay I agree, but I wasn't suggesting that it should be that way. I was just showing what it would look like if you blindly squashed things. In a real-world situation, I wouldn't name things such that I ended up with |
Another attempt to clarify my concern with the proposed changes - how do we move between structured mode, binary mode, and back without loss? (1) Start with a structured event
(2) Now, re-encode it into Binary
(3) ProofSo, only with the information above , how do you transform the binary representation into the structured one we started with using the various binding sprecifications. If you can show a translation from (2) -> (1) with no prior knowledge of (1) then i'm happy. I believe my proposal (further up the PR) 'restricting a extension to be a bag-of-strings (or primitives)', eg using an |
We should probably fix that sentence anyway; we don't support sequences (arrays) or many other types (including boolean). |
Summary of pros and cons. Note that the "Cons" are motivations to accept this proposal, and the "Pros" are motivations to ignore this proposal. Cons of having
Pros of having
|
With respect to Jem's example, I would recommend the following instead: (1) Start with a structured event
(2) Now, re-encode it into Binary POST /... HTTP/1.1
Content-Type: application/json
Content-Length: nnnn
CE-Specversion: 0.5
CE-Type: org.jem.example
CE-Time: 2019-07-18T16:26:00Z
CE-Id : aaaa-bbbb-cccc
CE-Sequence: 99
CE-Sequence-Type: Integer
CE-Bbs-Ctx: ABCD
CE-Bbs-Correlation: 111-222-333
CE-Partition-Key: P1
{
"message" : "The quick brown fox jumped over the lazy dog"
} In addition to this PR, this assumes a second PR which would add "" to the allowed set of interior characters in context attributes (i.e. not "foo" or "foo"). "" would be a language- and content-encoding specific delimiter character; in HTTP, it would be mapped to "-" (and the character after a "-" would be capitalized in this implementation, though HTTP headers are case-insensitive). "_" is chosen over "-" or other characters as it is a valid separator character in most programming languages, even though it is uncommon in HTTP header names. |
As I said on the call, this was already discussed in the working group. I personally did not agree to do away with any kind of separation between words, but the working group had a vote and accepted the proposal by Clemens. See #321 , in particular
While I personally don't like the tradeoff made in #321 (and have voted against it), I'd prefer if we can just keep it as-is and move on. Re-discussing all tradeoff decision every couple of months is not a good way to make progress towards a |
Focusing just on this one:
I'm curious to know how people expect SDKs/code to deal with this. When the extension is in a format like JSON then I suspect the answer is the same as how you'd deal with any unknown JSON property since that could be a singular value or an object/map. When we're dealing with binary mode, e.g. http binary, and we allow any arbitrary depth of maps, then it's harder. However, if we limited maps to just one level and kept the binary serialization as I'd like to know what people think about the compromise position that @JemDay mentioned... keep maps but just limit them to one level. |
At the end of this discussion on yesterday's call we briefly touched on whether maps could be added back in post v1.0. At the time I had my doubts, however I now wonder if we could add them by doing this:
Thoughts? |
Assuming that we want to remove I may be out on Thursday; I'll see whether I can make the call, but I have a meeting conflict at that time. Secondary question -- would you prefer that I squash the commits before this is merged? |
Given the magnitude of this change, we decided to start an async vote that ends at the start of next week's call. The following have voting rights: Adobe, ControlBEAM, Google, IBM, NATS/Synadia, PayPal, Red Hat, RX-M, SAP, VMWare, Vlad Ionescu, Erik Erikson You can either vote via comment in this PR or at the start of the call next week - but a vote via comment (with reasons why) would be great for those who are unsure and would like to understand your reasoning. |
I vote in favor of this proposal, i.e. removing Map from the set of supported attribute types, and the reason is because it adds complexity to the spec, as has been discussed extensively in this PR. |
SAP votes for approving this PR. |
I'm in favor of this PR. We'll make a decision now, release 1.0-rc1 and as per Doug's comment we'll see what the real world feedback is. |
VMware votes for approving this PR. Agreed with simplicity for 1.0. |
Synadia votes to approve. Keep things simple. Maps can be added in the future if deemed necessary through a use case. |
…review. Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
Per discussion in #460 and #456, it's not clear that Map and Any types in Context Attributes are worth the trouble:
(me)
This is not quite correct, because Map and Any are used by the
data
attribute in core (and hence this PR does not remove those types entirely)It's not clear that context attributes needed for routing or other middleware decisions benefit from this flexibility (see also this thread), and the possibility of Any types as the value in a Map adds a lot of complexity to the encoding.