-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restricting character set for attribute names #321
Conversation
ef2fafe
to
54873d6
Compare
spec.md
Outdated
languages. Some of these treat metadata elements as case-sensitive while others | ||
do not, and a single CloudEvent might be routed via multiple hops that involve | ||
a mix of protocols, encodings, and runtimes. Therefore, this specification | ||
limits the available character set of all normatively named attributes such that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should drop "normatively named" because these rules would apply to extensions too, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clemensv thoughts on this one?
I understand the need for clarification, and like the simplicity of this approach. However, I think this approach simplifies the spec by shifting the cost to usability. Practically speaking this will force keys to be lowercased everywhere, which will not play well with many programming languages and will feel seriously out of place. I worry about the inertia on adoption this approach might cause.
I'm doubtful about the practicality of this approach and think it's a false promise. If the spec mandates all lowercase charactes with no word separator, each possible word would have to be known to the SDK for it to correctly apply word separation. That is impossible, because:
|
The point of this change is that on the wire the attributes are always lower case. Period. |
The call was very informative. I realise #321 (comment) might not be a problem, other than losing any possibility of casing all identifiers, which is kind of the point anyway but affects some languages.
As I said in the call, my concern is with dynamically typed languages, where an idiomatic way to access the event (dynamic property access, e.g.
To me extensions and core fields being idiomatically cased and not separated would've seemed a natural progression after extensions were moved out of a bag to the top-level, but I I jumped to conclusions 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #317 (mainly to get a discussion started), but I also support this PR. Both have their tradeoffs, but they solve the same issue (in different ways).
Words that are acronyms are written in all-caps, e.g. "ID" and "URL". | ||
The CloudEvents specifications define mappings to various protocols and | ||
encodings, and the accompanying CloudEvents SDK targets various runtimes and | ||
languages. Some of these treat metadata elements as case-sensitive while others |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that a rationale is provided. I think it could be extended to include a rationale for ASCII/alpha-numeric characters, and the length limit.
spec.md
Outdated
case-sensitivity issues or clashes with the permissible character set for | ||
identifiers in common languages are prevented. | ||
|
||
CloudEvents attribute names MUST consist of lower-case letter ('a' to 'z') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/letter/letters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clemensv please fix this one during your rebase - it should be non-controversial.
"otherValue": 5 | ||
"eventid" : "A234-1234-1234", | ||
"eventtime" : "2018-04-05T17:31:00Z", | ||
"comexampleextension1" : "value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly 20 characters 😉
I'm wondering if, with the 20 character limit, we should "examplify" the use of a reverse URL. Even for this rather short TLD, 10 characters are taken. "commicrosoft" is 12 chars, "comserverless" is 13 chars...
I don't have a good alternative suggestion, though.
On today's call we discussed this topic and narrowed our choices down to this PR and #317. We then decided to start a vote between the two. Please add a comment to this PR to vote regardless of which PR you would like to vote for (if you did not vote already during today's call). The vote ends at the start of next week's call. Voting members are: Adobe, Alibaba, Confluent, ControlBEAM, Fujitsu, Google, Huawei, IBM, Itau Unibanco, Microsoft, NAIC, NATS/Synadia, Nordstrom, Oracle, PayPal, Red Hat, SAP, Serverless, Solace, VMWare, Vlad Ionescu, Tapani Moilanen, Renato Gallis (Renato: do you work for Itau or are you independent?) On today's call, the following companies voted: |
Adobe: 321 |
I'm voting for 321 too. |
Synadia: 321 |
SAP: 321 |
IBM: 321 |
Fujitsu: 321 |
76ad9d1
to
af03f0a
Compare
Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
af03f0a
to
678e215
Compare
amqp-transport-binding.md
Outdated
All [CloudEvents][CE] attributes with exception of `contentType` and `data` | ||
MUST be individually mapped to and from the AMQP | ||
All [CloudEvents][CE] attributes with exception of `contenttype` and `data` | ||
are individually mapped to and from the AMQP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you drop the "MUST"?
http-transport-binding.md
Outdated
All [CloudEvents][CE] attributes with exception of `contentType` and `data` | ||
MUST be individually mapped to and from distinct HTTP message headers, | ||
All [CloudEvents][CE] attributes with exception of `contenttype` and `data` | ||
are individually mapped to and from distinct HTTP message headers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no MUST?
Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
LGTM I'd like to get at least one more set of eyes on this to make sure we didn't miss something before we merge it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
(I still think we should provide better guidance/examples on how to provide descriptive extension names with a limit of 20 characters. The reverse url "wastes" too much characters quickly. But we can do that in a follow-on PR)
Approved on 11/1 call. |
I've just heard of CloudEvents, so am very new to this. At the Wikimedia Foundation, we are undergoing a similar (internal) process to standardize our event schemas. I'd like it if we could conform to some larger standard rather than one we make up for ourselves. I appreciate the restriction of lower case here; often our events are imported into case-insensitive SQL based systems, and we also have issues with case insensitive HTTP headers too. Lowercasing everything helps. However, @clemensv, you wrote
Where do underscores clash? I'm sure this spec has to be compatible with many more languages and protocols than I'm familiar with, but off the top of my head I don't know of one. I understand that you are trying to just standardize 'wire' protocols here, but I doubt event data will only be used for async service communication. They are often also used for analytics purposes where they are imported into databases and need to be readable by humans. Underscores would make this much saner. |
Signed-off-by: Christoph Neijenhuis <christoph.neijenhuis@commercetools.de>
Extensions follow attribute naming scheme introduced in #321
cloudevents attributes should be handled. See cloudevents/spec#321 for context.
cloudevents attributes should be handled. See cloudevents/spec#321 for context.
* Fix issue with broker dropping messages with "No TTL Seen" * Treat the broker TTL attribute as case insensitive, in line with how cloudevents attributes should be handled. See cloudevents/spec#321 for context. * Remove unnecessary test change.
) * Fix issue with broker dropping messages with "No TTL Seen" * Treat the broker TTL attribute as case insensitive, in line with how cloudevents attributes should be handled. See cloudevents/spec#321 for context. * Remove unnecessary test change.
* Fix issue with broker dropping messages with "No TTL Seen" * Treat the broker TTL attribute as case insensitive, in line with how cloudevents attributes should be handled. See cloudevents/spec#321 for context. * Remove unnecessary test change.
At least week's meeting I was asked to create PR as a follow-up to my oral comment on #317
I appreciate the work done on that PR and the attempt to preserve case-sensitivity through an HTTP hop where the actors might perform case-folding for any reason, including semantically pointless aesthetics, due to HTTP being case-insensitive. That PR having to introduce complex rules is an issue with the main spec, and not something the HTTP mapping ought to fix, IMO.
This PR puts a constraint on the attribute name character set, only allowing lower-case letters and digits, with a leading lower-case letter. I am also including the rationale in the spec.
In effect, we are not taking a stance on case-sensitive or case-insensitive handling, but we rather eliminate the problem by only permitting lower-case letters for normatively named attributes.
While we're at it, I'm also adding a wire-footprint-greedy SHOULD clause regarding length.
To make up for the loss of the word separation by case, it would be great to have some kind of word separator available in addition to lower-case letters and digits, but dots and dashes and underscores clash with identifier rules in various languages/protocols. I don't think the result is all that terrible and the wire doesn't care about aesthetics.
If an SDK wants to "right case" identifiers for its clients, it's free to do so. A C# SDK ought to use PascalCase for strongly typed identifiers in order to comply with language conventions, and therefore "eventtime" ought to surface as "EventTime". The rules we make here must avoid that an implementer gets confused about what comes across and goes on the wire.
I have only modified the core spec for the purpose of the initial discussion. Specs that explicitly refer to attribute names will also have to change accordingly, of course, and I will amend the PR if we decide to go this route.