Skip to content

Replace Order with an attrs dataclass #403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

Conversation

sgillies
Copy link
Contributor

@sgillies sgillies commented Mar 1, 2022

Shift responsibility for turning API responses into instances of Order to OrdersClient.

Previously, the Order class constructor took a blob of JSON directly from the Orders API response and did some computation to come up with an order description object. This coupled the API response directly to our information model. I have a hypothesis that using a less tightly coupled Order data class and shifting the responsibility for translation to the OrdersClient will give us more freedom and fun down the road.

Resolves #402


orders = [order.to_dict() for order in orders]
Copy link
Contributor Author

@sgillies sgillies Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreiberkyle let's keep it so only the outermost layer of our system, the CLI commands, are concerned about JSON. On the inside, let's use instances of the classes in models.py: Object, ObjectComponent, etc. My reasoning is based on an analogy to Ned Batchelder's "Unicode Sandwich".

As we saw with Fact of Life #1, the data coming into and going out of your program must be bytes. But you don’t need to deal with bytes on the inside of your program. The best strategy is to decode incoming bytes as soon as possible, producing unicode. You use unicode throughout your program, and then when outputting data, encode it to bytes as late as possible.

This creates a Unicode sandwich: bytes on the outside, Unicode on the inside.

Also, methods that return one type of thing or another type based on an keyword argument are harder to maintain and refactor in the long run. We'll feel good about a single return type in 6 months or a year, I predict.

@tschaub I'm cc'ing you on this since you're getting back into Python programming. I hope the STAC ecosystem agrees with me about the JSON 🥪 😄

Decode JSON to data classes as soon as possible. Use data classes throughout the program. Encode back to JSON as late as possible.

cc @mkshah605

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. This is an interesting proposal and I can understand it, especially the analogy, and I still wonder about the tradeoff between working with JSON internally and working with data structures which are pretty much just wrapped JSON. Does the Python library user prefer the user experience of managing data structures vs managing JSON? I don't know the answer, though my original assumption was that they would prefer JSON. I would love to learn more about this ux decision in the wild. I will check out the python libraries associated with aws and gcloud - any others I could check?

Also, methods that return one type of thing or another type based on an keyword argument are harder to maintain and refactor in the long run. We'll feel good about a single return type in 6 months or a year, I predict.

This I full-heartedly agree with 😄

@@ -228,4 +229,4 @@ async def create(ctx, name, ids, bundle, item_type, email, cloudconfig, clip,
async with orders_client(ctx) as cl:
order = await cl.create_order(request)

echo_json(order.json, pretty)
echo_json(order.to_dict(), pretty)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're encoding to JSON as late as possible this can just be an ordinary function and not a property that we might cache or whatever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The property json could be a function that calls e.g. to_dict(), so it seems the change would be based on whether it improves ux, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. We're only going to call Order.to_dict() once in a CLI command, just before we print the JSON-encoded result, and I think that'll very likely be true in user code.

Also I think json.dumps(order.to_dict()) makes more sense that json.dumps(order.json). The second one reads like we're JSON-encoding twice.

@@ -326,24 +326,23 @@ async def download_order(
planet.exceptions.APIException: On API error.
"""
order = await self.get_order(order_id)
locations = order.locations
locations = [component.location for component in order.results]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consequence of adding a new OrderComponent class (alluded to in https://developers.planet.com/docs/orders/reference/#operation/getOrder).

"""Get all order requests.

Parameters:
state: Filter orders to given state.
limit: Limit orders to given limit.
as_json: Return orders as a json dict.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method will be much easier to maintain if it only returns a list of Orders.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed!

class Order:
id: str
state: str
results: typing.List[OrderComponent]
Copy link
Contributor Author

@sgillies sgillies Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, Order exposed results and locations as separate lists. Here we only expose a list of OrderComponents, which is a simplification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Order is only a data class now. I think this is a great match for its role in our information model: it's only a snapshot of the state of an order, the truth of which exists entirely in our API server.

class Orders(Paged):
'''Asynchronous iterator over Orders from a paged response describing
orders.'''
LINKS_KEY = '_links'
NEXT_KEY = 'next'
ITEMS_KEY = 'orders'

def __init__(self, request, do_request_fcn, order_factory, limit=None):
super().__init__(request, do_request_fcn, limit=limit)
self._order_factory = order_factory
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We inject a factory for Orders.

@@ -166,23 +165,6 @@ async def test_list_orders_limit(order_descriptions, session):
assert oids == ['oid1']


@respx.mock
@pytest.mark.asyncio
async def test_list_orders_asjson(order_descriptions, session):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed.

@sgillies sgillies requested a review from jreiberkyle March 2, 2022 01:23
@@ -65,7 +65,11 @@ def test_cli_orders_list_basic(invoke, order_descriptions):

result = invoke(['list'])
assert not result.exception
assert [order1, order2, order3] == json.loads(result.output)
Copy link
Contributor Author

@sgillies sgillies Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decoupling I've done means that CLI output is no longer exactly the same as the API response.

Specifically:

  • Order properties other than the ones specified in the class definition are gone from the output JSON. They were already hidden by the Python API (only id and state were surfaced). We can add any ones that we want.
  • _links.results has been flattened to results.

To me it looks incidental that components of an order are under _links.results and not just results. I don't think _links serves users of the Python client and we can safely eliminate it.

@sgillies sgillies marked this pull request as ready for review March 2, 2022 22:27
@sgillies
Copy link
Contributor Author

sgillies commented Mar 2, 2022

Tests are passing, this is ready for review.

Copy link
Contributor

@jreiberkyle jreiberkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off, thank you for this work. There are some considerable gems in here and I am appreciating the conversation this initiates. As for my review, I am finding it difficult to focus on the large number of actual proposed changes in light of the equally large number of formatting changes. To help me get traction, I will focus on commenting on the high level changes I see proposed here:

  1. Moving managing Order over to the OrdersClient does make sense from a separation of concerns point of view.
  2. I would further advocate for managing (paged)Orders to the OrdersClient as well, and possibly spending some time on design of a pager that, similar to the waiter, can be abstracted across all clients (as aws does)
  3. I am curious why you are advocating for having models manage the Order dataclass. Could OrdersClient do the same just as well? While it does seem nice to have one 'go to' for a data structure (aka models here), it seems to tightly couple OrdersClient to models, as evidenced by the orders module importing from models and the models tests import from the orders module.
  4. Regarding the proposal to use JSON as IO in the CLI and data structures as IO for the Python library, I think this is an excellent discussion point and may be evidence that we could be well served by going through the Python library public interface, similar to what we've done for the CLI, and coming to a consensus.

"""Get all order requests.

Parameters:
state: Filter orders to given state.
limit: Limit orders to given limit.
as_json: Return orders as a json dict.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed!

sgillies added 3 commits March 7, 2022 12:31
Shift responsibility for turning API responses into instances of
Order to OrdersClient.
@sgillies sgillies force-pushed the attrs-order-descr branch from e4f2993 to 8013136 Compare March 7, 2022 19:41
@define
class OrderComponent:
delivery: str
expires: datetime
Copy link
Contributor Author

@sgillies sgillies Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreiberkyle the API response has expires_at, but that's an idiosyncracy. I think users will be less surprised to see expires.


@define
class Order:
created: datetime
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id: str
last_message: str
last_modified: datetime
request: dict
Copy link
Contributor Author

@sgillies sgillies Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request primarily exists to contain the user-provided order and delivery criteria. In this implementation it will also contain the entire order description sent back by the API because it's simpler to just do that than strip out API-provided items that are also exposed by class attributes.

Comment on lines +133 to +136
created=created,
last_modified=last_modified,
last_message=contents["last_message"],
error_hints=contents["error_hints"],
Copy link
Contributor Author

@sgillies sgillies Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New attributes that fully cover the API-provided and dynamic parts of an order description.

last_modified=last_modified,
last_message=contents["last_message"],
error_hints=contents["error_hints"],
request=contents)
Copy link
Contributor Author

@sgillies sgillies Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request is the user-provided order and delivery criteria, the data that we post to the API when we create an order.

@sgillies
Copy link
Contributor Author

sgillies commented Mar 7, 2022

@jreiberkyle I rebased on the newly formatted v2 branch and fleshed out the Order class with more API-generated attributes.

I wouldn't be opposed to moving Order, OrderComponent, and Orders into planet.clients.orders. Could we do that in another PR?

results (list of OrderComponent): The canonical links to this
Order's results. API provided.
state (str): The current state of this Order request. API
provided.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the attribute descriptions come straight from https://developers.planet.com/docs/orders/reference/#operation/getOrder.

@sgillies sgillies closed this Mar 29, 2022
@jreiberkyle jreiberkyle deleted the attrs-order-descr branch June 28, 2023 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants