-
Notifications
You must be signed in to change notification settings - Fork 123
Description
@volaya @duckontheweb @lossyrob and #264
Introduced here pystac/summaries.py#L40, FIELDS_JSON_URL causes the default path to go load the field definitions from a remote file. I am not a fan of having an additional code path (there was already the validation schema fetches) outside of testing:
- That file is not versioned with pystac
- It incurs a runtime cost of fetching 24089 bytes across a network. For some folks this might be an issue.
- It's another code path that I have to patch out for hermetic testing
- It blocks using using this code when offline or sandboxed
- If something is wrong with that file, it's not obvious how to approach fixing it
- What if the file moves?
I would much rather have a script to rebuild a python file that contains just the resulting table that is only the parts used here.
Currently it's this:
FIELDS_JSON_URL = (
"https://cdn.jsdelivr.net/npm/@radiantearth/stac-fields/fields-normalized.json"
)
@lru_cache(maxsize=None)
def _get_fields_json(url: str) -> Dict[str, Any]:
return pystac.StacIO.default().read_json(url)
# ...
class Summarizer:
# ...
def __init__(self, fields: Optional[str] = None):
fieldspath = fields or FIELDS_JSON_URL
try:
jsonfields = _get_fields_json(fieldspath)
except:
if fields is None:
raise Exception(
"Could not read fields definition file at "
f"{fields} or it is invalid.\n"
"Try using a local fields definition file."
)
else:
raise
self._set_field_definitions(jsonfields)
def _set_field_definitions(self, fields: Dict[str, Any]) -> None:
self.summaryfields: Dict[str, SummaryStrategy] = {}
for name, desc in fields["metadata"].items():
if isinstance(desc, dict):
strategy_value = desc.get("summary", True)
try:
strategy: SummaryStrategy = SummaryStrategy(strategy_value)
except ValueError:
strategy = SummaryStrategy.DEFAULT
if strategy != SummaryStrategy.DONT_SUMMARIZE:
self.summaryfields[name] = strategy
else:
self.summaryfields[name] = SummaryStrategy.DEFAULT
Additionally, having an enum of mixed types is likely to bite us down the road. An importer should translate all of those strategies into single character strings. Especially with True vrs 3 other strings that eval to True
. It's not obvious here what the default even is.
class SummaryStrategy(Enum):
ARRAY = "v"
RANGE = "r"
SCHEMA = "s"
DONT_SUMMARIZE = False
DEFAULT = True
Also, the above uses DONT
for don't
. That's hard on some users. Please use DO_NOT_SUMMARIZE
or something else more readable. I generally suggest avoiding contractions in code even when it is just in comments.