-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: modification and creation date in the aggregate messages #473
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use one query instead of two.
src/aleph/db/accessors/aggregates.py
Outdated
from sqlalchemy import ( | ||
join, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused import. Note that PyCharm can remove unused imports for you (Code > Optimize Imports)
src/aleph/db/accessors/aggregates.py
Outdated
@@ -38,13 +37,32 @@ def get_aggregates_by_owner( | |||
return session.execute(select_stmt).all() # type: ignore | |||
|
|||
|
|||
def get_aggregates_info_by_owner( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly you perform 2 DB accesses, one for the aggregate content and the other one for the info. Performing several DB accesses instead of one should usually be avoided for the following reasons:
- Performance: more round trips to the DB -> more time spent in establishing connections, looking for the correct rows, etc.
- Atomicity: aggregates may be added/deleted between your two queries, so you may end up with less/more entries in info than you'd expect. If you perform one query, you won't have this problem as the query will happen as a single DB transaction.
It would be better to just add a with_info
parameter to get_aggregates_by_owner
and use different queries for each case, but always selecting the content
field.
Typing-wise, you can add a typing.overload
to tell mypy what to expect:
AggregateContentWithInfo = Tuple[str, Dict[str, Any], dt.datetime, dt.datetime, str, str]
AggregateContent = Tuple[str, Dict[str, Any]]
@overload
def get_aggregates_by_owner(..., with_info: Literal[True]) -> Iterable[AggregateContentWithInfo]): ...
@overload
def get_aggregates_by_owner(..., with_info: Literal[False]) -> Iterable[AggregateContent]): ...
|
||
info[aggregate_key] = { | ||
"created": created.isoformat(), # Convert datetime to ISO format | ||
"last_updated": last_updated.isoformat(), # Use actual 'last_updated' value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "Use actual 'last_updated' value"? I don't get this comment.
@@ -34,6 +39,8 @@ async def address_aggregate(request: web.Request) -> web.Response: | |||
""" | |||
|
|||
address = request.match_info["address"] | |||
with_info_param = request.query.get("with_info", "false") | |||
with_info = with_info_param.lower() == "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add with_info
to AggregatesQueryParams
just above. We already have a Pydantic model, no need to perform custom input validation.
if not aggregates: | ||
return web.HTTPNotFound(text="No aggregate found for this address") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert.
info = {} | ||
if query_params.with_info: | ||
for result in aggregates: | ||
( | ||
aggregate_key, | ||
content, | ||
created, | ||
last_updated, | ||
original_item_hash, | ||
last_update_item_hash, | ||
) = result | ||
|
||
if isinstance(created, dt.datetime): | ||
created = created.isoformat() | ||
if isinstance(last_updated, dt.datetime): | ||
last_updated = last_updated.isoformat() | ||
|
||
info[aggregate_key] = { | ||
"created": created, | ||
"last_updated": last_updated, | ||
"original_item_hash": original_item_hash, | ||
"last_update_item_hash": last_update_item_hash, | ||
} | ||
output["info"] = info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you're doing this, you're iterating twice over the aggregates
list. The output dict can be generated with only one iteration.
tests/api/test_aggregates.py
Outdated
@@ -26,12 +26,17 @@ def make_uri(address: str) -> str: | |||
return AGGREGATES_URI.format(address=address) | |||
|
|||
|
|||
async def get_aggregates(api_client, address: str, **params) -> aiohttp.ClientResponse: | |||
async def get_aggregates( | |||
api_client, address: str, with_info: str, **params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why str
? Use bool
. aiohttp
takes care of casting iirc.
tests/api/test_aggregates.py
Outdated
|
||
assert address == aggregates["address"] | ||
assert aggregates["data"]["test_key"] == {"a": 1, "b": 2} | ||
assert aggregates["data"]["test_target"] == {"a": 1, "b": 2} | ||
assert aggregates["data"]["test_reference"] == {"a": 1, "b": 2, "c": 3, "d": 4} | ||
assert aggregates["info"]["test_reference"] is not None | ||
print(aggregates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the print statement.
tests/api/test_aggregates.py
Outdated
|
||
assert address == aggregates["address"] | ||
assert aggregates["data"]["test_key"] == {"a": 1, "b": 2} | ||
assert aggregates["data"]["test_target"] == {"a": 1, "b": 2} | ||
assert aggregates["data"]["test_reference"] == {"a": 1, "b": 2, "c": 3, "d": 4} | ||
assert aggregates["info"]["test_reference"] is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do more than just checking if the key is present? Like, compare to the content of the aggregate message.
* Fix regression introduced by (#473) * Fix aggregate filter by keys * Feature: * Add param to only return the value of an aggregate by its key * add Tests
Problem :
To avoid fetching all the messages to find the creation/modification time of a specific aggregate entry it will be useful to have this info directly in the aggregate message
Solution:
Update '/api/v0/aggregates' Endpoint
param :
From :
To:
Fixes #470.