Skip to content

Browseable and Children features from 1.0.0-beta5 #336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

moradology
Copy link
Collaborator

@moradology moradology commented Jan 25, 2022

Related Issue(s): #311 #312

Description:
This PR adds a workaround which is intended to allow children and browsable conformance classes to be enabled in stac-fastapi while deeper changes for a more robust implementation are being developed in pgstac. For a sense of what specifying browsable hierarchies looks like, pay special attention to hierarchy.json and hierarchy.py

Additionally, clients have been broken out of the hundreds-of-lines file they used to live in and organized into their own package.

image
image
image

PR Checklist:

  • Code is formatted and linted (run pre-commit run --all-files)
  • Tests pass (run make test)
  • Documentation has been updated to reflect changes, if applicable, and docs build successfully (run make docs)
  • Changes are added to the CHANGELOG.

@moradology moradology force-pushed the feature/children-1.0.0-beta.5 branch 2 times, most recently from 658c75d to d9570a0 Compare January 25, 2022 00:23
@moradology
Copy link
Collaborator Author

With the most recent commit (191b75f), each catalog endpoint is, itself, a STAC API. Reviewers ought to keep an eye out for code that can be deduplicated without losing important context (e.g. a try/catch block which issues a 404 on exception is quite likely more useful duplicated as it doesn't hide relevant information for understanding an endpoint's behavior)

@moradology moradology force-pushed the feature/children-1.0.0-beta.5 branch from 3fbbac9 to 8baf324 Compare April 19, 2022 14:27
@moradology moradology requested review from geospatial-jeff and philvarner and removed request for geospatial-jeff April 19, 2022 14:27
@moradology moradology marked this pull request as ready for review April 19, 2022 14:27
@@ -83,9 +86,15 @@ class StacApi:
api_version: str = attr.ib(default="0.1")
stac_version: str = attr.ib(default=STAC_VERSION)
description: str = attr.ib(default="stac-fastapi")
search_get_request_base_model: Type[BaseSearchGetRequest] = attr.ib(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the difference between this and search_get_request_model? Same comment for post models.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a potentially ugly and maybe a point where you could provide some counsel. Basically, the need to supply a catalog URI parameter changes the constructed search model so what I'm doing here is passing in the base search_get_request_base_model because it might possibly differ from the fully constructed search_get_request_model used by standard searches. Perhaps the answer is to just construct both of these inside of core.py. Let me know what you think.

if "collection_id" in child
]
children_links = catalog_links + collection_links
item_links = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to link to items from the GET /children endpoint? The spec says:

The /children endpoint must return the all the Catalog and Collection objects referenced by these child link relations.

And doesn't reference items at all.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nevermind this is a requirement of browseable

@@ -319,6 +479,16 @@ def register_core(self):
self.register_get_collection()
self.register_get_item_collection()

# Browseable endpoints
self.register_catalog_conformance_classes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to hide these behind some sort of flag (disabled by default) for two reasons:

  • These two extensions are really new, and we are the first implementation of it.
  • They increase the surface area of the API quite a bit, and may not be required by most use cases.

@@ -103,6 +109,53 @@ async def get_collection(self, collection_id: str, **kwargs) -> Collection:

return Collection(**collection)

async def get_root_children(self, **kwargs) -> Children:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note to self that the duplication between async/sync clients is starting to get a little much and we may want to think of a way to fix that in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. In this PR it already got to the point that organization in a single file was getting absurd.

@@ -29,14 +32,20 @@
TokenPaginationExtension(),
ContextExtension(),
]
with open(settings.browseable_hierarchy_definition, "r") as definition_file:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do the hierarchy parsing behind a function so its not happening in global scope (to avoid side effects).

@@ -0,0 +1,35 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned with the scaleability of encoding larger hierarchies into a JSON file. I'm also concerned that this approach requires knowing item IDs ahead of time (this JSON file is read before the API starts up) which presents some challenges for an API that is adding/deleting items through transaction endpoints. Ideally item browseability could be done at runtime.

This is a hard problem, and I don't have a better solution for it at present; so happy to go forward with this approach and iterate/improve as needed!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your scalability concern is reasonable. I'm having a hard time imagining it being a performance bottleneck, but JSON is brittle and it is (as currently imagined) directly maintained by developers (often a recipe for trouble). I'm not 100% sure what the best approach is to avoid such a difficulty. Perhaps rules for generating trees rather than fully specified trees? I'm open to suggestions.

Runtime addition of nodes should be possible with a small set of modifications. I agree that's important to add and am also in support of adding it in future updates

request_path = kwargs["request"]["path"]
split_path = request_path.split("/")[2:-1]
remaining_hierarchy = self.hierarchy_definition.copy()
# PGSTAC TODO: Add optional 'catalog' parameter to search, which will determine which
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are any of these PGSTAC TODO laying around the PR important to resolve before we merge?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were initially added to provide some pointers to @bitner about what kinds of functionality could move back into pgstac, but I'm happy to get rid of them!

if "collection_id" in node
]
search_request.collections = child_collections
return await self.post_search(search_request, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad this is just a wrapper on top of the existing post_search, super cool!



# PGSTAC TODO: function to allow ingest of hierarchy
def parse_hierarchy(d: dict) -> BrowseableNode:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return type should be a Union of all three nodes, but this function is only ever called on a dictionary that would return a BrowseableNode. So I'm not sure if the return type is wrong or the CollectionNode and CatalogNode returns aren't needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very possible that I'm missing something peculiar about python types or TypedDict inheritance, but I think we can avoid the Union here because of the inheritance structure - both CollectionNode and CatalogNode are subsumed under the BrowseableNode type. Let me know if I'm off base in this assertion

@moradology moradology changed the title Browsable and Children features from 1.0.0-beta5 Browseable and Children features from 1.0.0-beta5 May 5, 2022
@gadomski
Copy link
Member

@moradology is this workaround still necessary? I'm doing a little housekeeping and don't want to try and resolve these merge conflicts if its not needed.

@moradology
Copy link
Collaborator Author

Hey @gadomski, I don't know that the children/browseable features are 100% necessary so it is probably not absolutely necessary that this workaround be included in the project. I'll close the PR for now but keep the branch around in case people want to refer back for a future implementation

@moradology moradology closed this Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants