Description
Hi all, first: thanks for providing this library. It's going to be really useful to have a standard way to work with CloudEvents from Python.
I took a preliminary look at it and my first impression was that the interface provided by this library isn't very "Pythonic", meaning that it doesn't provide an interface in a way that (I believe) would be expected by most Python developers. There are some small things (like using method names like FromRequest
instead of from_request
, using New
in class/method names) but also larger things (like the use of method chaining, setters/getters, etc.)
I'm hoping that since this project is still in alpha, there's still time to improve the interface here (and that y'all are open to suggestions). I'll try to provide some examples below of what I'm talking about but I'm happy to clarify if anything's unclear. I'm also not intimately familiar with the CloudEvents spec, so I may have some misunderstandings as well.
Parsing upstream Event
from HTTP Request
The current example for turning an HTTP request into an event:
import io
from cloudevents.sdk.event import v02
from cloudevents.sdk import marshaller
m = marshaller.NewDefaultHTTPMarshaller()
event = m.FromRequest(
v02.Event(),
{
"content-type": "application/cloudevents+json",
"ce-specversion": "0.2",
"ce-time": "2018-10-23T12:28:22.4579346Z",
"ce-id": "96fb5f0b-001e-0108-6dfe-da6e2806f124",
"ce-source": "<source-url>",
"ce-type": "word.found.name",
},
io.BytesIO(b"this is where your CloudEvent data"),
lambda x: x.read()
)
- This makes the assumption that the user is only interested in supporting a single spec, but I think it's possible that either they would want to support multiple versions (and automatically detect which one is being used), or be able to upgrade to newer versions of the spec (and this module) without having to change their code.
- There's only one type of marshaller provided by the library right now,
HTTPMarshaller
, and if it's the default maybe we should just get it by default when creating anEvent
, instead of making the user initialize it every time? - The
data_unmarshaller
is maybe not necessary? We're taking a bytestring, turning it into aBytesIO
object to pass asdata
, and then usingdata_unmarshaller
to turn it back into a bytestring. Why not just have the user do any data unmarshalling themselves before constructing the event?
Instead, consider:
from cloudevents.sdk import Event
event = Event(
headers={
"content-type": "application/cloudevents+json",
"ce-specversion": "0.2",
"ce-time": "2018-10-23T12:28:22.4579346Z",
"ce-id": "96fb5f0b-001e-0108-6dfe-da6e2806f124",
"ce-source": "<source-url>",
"ce-type": "word.found.name",
},
body=b"this is where your CloudEvent data",
)
- This initializes a generic
Event
, detects which spec it matches, and gives back a correspondingEvent
subclass. (If the user knew which spec they were supporting, instead of constructing anEvent
they could construct av02.Event
and skip the event detection) - The
HTTPMarshaller
is used by default. This also helps avoid issues like Reusing a marshaller causes failures? #12. - The
data_unmarshaller
field is removed (or at the very least, optional)
Creating a minimal CloudEvent
The provided example is:
from cloudevents.sdk.event import v01
event = (
v01.Event().
SetContentType("application/json").
SetData('{"name":"john"}').
SetEventID("my-id").
SetSource("from-galaxy-far-far-away").
SetEventTime("tomorrow").
SetEventType("cloudevent.greet.you")
)
- Similar to above, the user has to specify the exact event type they're expecting
- Method chaining / fluent API: while this pattern is used by some Python libraries (pandas, various ORMs) it's not typical for the majority of Python libraries and generally not recommended.
Instead, consider the following based on the same Event
class proposed above:
from cloudevents.sdk import Event
event = Event(
data=b'{"name":"john"}',
content_type="application/json",
event_id="my-id",
event_time="tomorrow",
event_type="cloudevent.greet.you",
source="from-galaxy-far-far-away",
)
- The spec is automatically determined, but the user can also use a specific subclass if they prefer.
- The constructor uses keyword arguments, so that an
Event
object can be created with a single method call. If the user needs to modify the event after initialization, they can act on the attributes directly (e.g.event.event_time = "yesterday"
).
Getting event attributes
Currently, after creating an event, if the user wanted to get one of the fields such as the event time, the user would have to do something like:
event = ...
data = event.ce__eventTime.value
- The
ce
-prefix is a bit unexpected and the double-underscore is unconventional - Having to call
.value
to get the value is also unexpected
Instead, consider something like:
event = ...
data = event.event_time
- No prefix or double underscore
- Access the attribute value directly
Final thoughts
I realize this is proposing a pretty big overhaul of the current interface, but I think doing so would probably go a long way to lend towards the usability and maintainability of this library, and it'd be better to do it sooner than later. I'm happy to help out here, including designing/implementing these changes and helping maintain them afterwards, if it makes sense. Thanks!