Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: modification and creation date in the aggregate messages #473

Merged
merged 6 commits into from
Oct 6, 2023
Merged

Feature: modification and creation date in the aggregate messages #473

merged 6 commits into from
Oct 6, 2023

Conversation

1yam
Copy link
Collaborator

@1yam 1yam commented Sep 11, 2023

Problem :
To avoid fetching all the messages to find the creation/modification time of a specific aggregate entry it will be useful to have this info directly in the aggregate message

Solution:
Update '/api/v0/aggregates' Endpoint

param :

  • with_info: bool default : False

From :

{
  "address": "0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10",
  "data": {
    "key1": ...,
    "key2": ...,
  }
}

To:

{
  "address": "0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10",
  "data": {
    "key1": ...,
    "key2": ...,
  }
  "info": {
    "key1": {
      "created": ...,
      "last_updated": ...,
      "original_item_hash": ...,
      "last_update_item_hash: ...,
    },
    "key2": ...
  }
}

Fixes #470.

Copy link
Contributor

@odesenfans odesenfans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use one query instead of two.

from sqlalchemy import (
join,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import. Note that PyCharm can remove unused imports for you (Code > Optimize Imports)

@@ -38,13 +37,32 @@ def get_aggregates_by_owner(
return session.execute(select_stmt).all() # type: ignore


def get_aggregates_info_by_owner(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly you perform 2 DB accesses, one for the aggregate content and the other one for the info. Performing several DB accesses instead of one should usually be avoided for the following reasons:

  1. Performance: more round trips to the DB -> more time spent in establishing connections, looking for the correct rows, etc.
  2. Atomicity: aggregates may be added/deleted between your two queries, so you may end up with less/more entries in info than you'd expect. If you perform one query, you won't have this problem as the query will happen as a single DB transaction.

It would be better to just add a with_info parameter to get_aggregates_by_owner and use different queries for each case, but always selecting the content field.

Typing-wise, you can add a typing.overload to tell mypy what to expect:

AggregateContentWithInfo = Tuple[str, Dict[str, Any], dt.datetime, dt.datetime, str, str]
AggregateContent = Tuple[str, Dict[str, Any]]


@overload
def get_aggregates_by_owner(..., with_info: Literal[True]) -> Iterable[AggregateContentWithInfo]): ...

@overload
def get_aggregates_by_owner(..., with_info: Literal[False]) -> Iterable[AggregateContent]): ...


info[aggregate_key] = {
"created": created.isoformat(), # Convert datetime to ISO format
"last_updated": last_updated.isoformat(), # Use actual 'last_updated' value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "Use actual 'last_updated' value"? I don't get this comment.

@@ -34,6 +39,8 @@ async def address_aggregate(request: web.Request) -> web.Response:
"""

address = request.match_info["address"]
with_info_param = request.query.get("with_info", "false")
with_info = with_info_param.lower() == "true"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add with_info to AggregatesQueryParams just above. We already have a Pydantic model, no need to perform custom input validation.

Comment on lines 69 to 70
if not aggregates:
return web.HTTPNotFound(text="No aggregate found for this address")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert.

Comment on lines 76 to 99
info = {}
if query_params.with_info:
for result in aggregates:
(
aggregate_key,
content,
created,
last_updated,
original_item_hash,
last_update_item_hash,
) = result

if isinstance(created, dt.datetime):
created = created.isoformat()
if isinstance(last_updated, dt.datetime):
last_updated = last_updated.isoformat()

info[aggregate_key] = {
"created": created,
"last_updated": last_updated,
"original_item_hash": original_item_hash,
"last_update_item_hash": last_update_item_hash,
}
output["info"] = info
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way you're doing this, you're iterating twice over the aggregates list. The output dict can be generated with only one iteration.

@@ -26,12 +26,17 @@ def make_uri(address: str) -> str:
return AGGREGATES_URI.format(address=address)


async def get_aggregates(api_client, address: str, **params) -> aiohttp.ClientResponse:
async def get_aggregates(
api_client, address: str, with_info: str, **params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why str? Use bool. aiohttptakes care of casting iirc.


assert address == aggregates["address"]
assert aggregates["data"]["test_key"] == {"a": 1, "b": 2}
assert aggregates["data"]["test_target"] == {"a": 1, "b": 2}
assert aggregates["data"]["test_reference"] == {"a": 1, "b": 2, "c": 3, "d": 4}
assert aggregates["info"]["test_reference"] is not None
print(aggregates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the print statement.


assert address == aggregates["address"]
assert aggregates["data"]["test_key"] == {"a": 1, "b": 2}
assert aggregates["data"]["test_target"] == {"a": 1, "b": 2}
assert aggregates["data"]["test_reference"] == {"a": 1, "b": 2, "c": 3, "d": 4}
assert aggregates["info"]["test_reference"] is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you do more than just checking if the key is present? Like, compare to the content of the aggregate message.

@odesenfans odesenfans merged commit 5e7c10a into aleph-im:master Oct 6, 2023
2 checks passed
@odesenfans odesenfans deleted the 1yam-better-aggregates branch October 6, 2023 14:20
aliel added a commit that referenced this pull request Feb 19, 2024
* Fix aggregate filter by keys
aliel added a commit that referenced this pull request Feb 19, 2024
* Fix aggregate filter by keys
hoh pushed a commit that referenced this pull request Feb 20, 2024
* Fix aggregate filter by keys
MHHukiewitz pushed a commit that referenced this pull request Feb 23, 2024
* Fix regression introduced by (#473)

* Fix aggregate filter by keys

* Feature:

* Add param to only return the value of an aggregate by its key

* add Tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add the modification and creation date in the aggregate messages
2 participants