
arc28 and other log data in JSON format (may be) garbage #6128

Open
kirse opened this issue Sep 12, 2024 · 0 comments

Labels
bug Something isn't working

kirse commented Sep 12, 2024

Subject of the issue

When using the JSON format on /v2/blocks, it returns serialized lg (log) data that is unusable / garbage in many cases.

In this case, the bad serialization is most easily identifiable by the visual presence of UTF-8 replacement characters (U+FFFD) -> https://www.compart.com/en/unicode/U+FFFD

Steps to reproduce

Method One

Look at txn index 16 (assuming a start index of 0), which is a Tinyman swap:

https://mainnet-api.algonode.cloud/v2/blocks/41335116

Fields in dt.ld.1 contain garbage data, as does lg, particularly logs such as output_asset_id and output_amount.
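
A quick way to confirm this without the SDK is to fetch the JSON above and scan the lg entries for the replacement character. This is only a rough sketch using the standard library, and it assumes the response shape used in the script under Method Two (block -> txns -> dt -> lg):

import json
import urllib.request

# Fetch the block JSON directly from the endpoint referenced above.
url = "https://mainnet-api.algonode.cloud/v2/blocks/41335116"
with urllib.request.urlopen(url) as resp:
    block = json.loads(resp.read())

# Scan the swap's log entries for U+FFFD, the marker of lossy decoding.
for entry in block["block"]["txns"][16]["dt"]["lg"]:
    if "\ufffd" in entry:
        print("replacement character in:", repr(entry))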

Method Two

Run this script for a more thorough example of trying to access the Tinyman log data returned by /v2/blocks in JSON format.

from algosdk.v2client import algod
from base64 import b64decode
import pprint

algod_address = "https://mainnet-api.algonode.cloud"
algod_token = ""
algod_client = algod.AlgodClient(algod_token, algod_address)

def decode_logs(logs: list) -> dict:
    """Decode "<name>%i<uint64>" style log entries into a {name: value} dict."""
    decoded_logs = dict()
    for log in logs:
        print("Decoding:", log)
        if isinstance(log, str):
            # JSON responses return each log entry as a string; base64-decode it back to bytes.
            log = b64decode(log.encode())
        if b"%i" in log:
            # Everything before "%i" is the log name; everything after is a big-endian uint64.
            i = log.index(b"%i")
            s = log[0:i].decode()
            value = int.from_bytes(log[i + 2 :], "big")
            decoded_logs[s] = value
        else:
            raise NotImplementedError()
    return decoded_logs

block = algod_client.block_info(block=41335116)
pp = pprint.PrettyPrinter(indent=2)

# Txn index 16 is the Tinyman swap; its 'dt' entry holds the 'lg' logs.
tinyman_tx = block.get('block').get('txns')[16]
log = tinyman_tx.get('dt').get('lg')
pp.pprint(log)

result = decode_logs(log)
pp.pprint(result)
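
For comparison, the same logs come through intact when the block is requested in msgpack format (at least from Python; the JavaScript quirks mentioned in the Thoughts below are a separate problem). A rough sketch, continuing from the script above and assuming algosdk's response_format="msgpack" returns the raw response bytes and that the msgpack package is installed:

import msgpack

# Ask algod for the raw msgpack block instead of JSON.
raw = algod_client.block_info(block=41335116, response_format="msgpack")

# raw=True keeps every string as bytes, so nothing is forced through UTF-8.
block_mp = msgpack.unpackb(raw, raw=True)

# With raw=True the map keys are bytes too.
for entry in block_mp[b"block"][b"txns"][16][b"dt"][b"lg"]:
    print(entry)  # intact log bytes, no U+FFFD replacement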

Thoughts

There are a few issues here:

  1. The JSON format is currently serializing data that may be unusable, wasting loads of bandwidth + CPU cycles
     • Not a total waste in cases where the bytestream can be serialized
  2. Switching to msgpack is replete with all sorts of quirks across various languages, the biggest being that the main JavaScript library can't even decode it without issues
  3. @joe-p has mentioned:

We technically could "fix" the JSON response to use surrogate escaping like Python supports, but again not every client would support that and that could actually break code that is currently working.

Which really sounds like the ideal answer, combined with path-versioning like /v2.1/blocks to provide a new endpoint while keeping the legacy endpoint functioning (see the sketch after this list).

  4. @pbennett notes:

problem likely stems from the 'logs' being defined as string not []byte - so it should be base64 encoded by algod for eg and its not.
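
Both suggestions are easy to demonstrate in Python. The snippet below only illustrates the two ideas (surrogate escaping, and treating logs as base64-encoded []byte); it is not something algod does today:

import base64

# A log entry whose value bytes are not valid UTF-8.
raw_log = b"output_amount%i" + (123456789).to_bytes(8, "big")

# 1) Surrogate escaping (joe-p's suggestion): a lossless str <-> bytes round trip.
as_str = raw_log.decode("utf-8", errors="surrogateescape")
assert as_str.encode("utf-8", errors="surrogateescape") == raw_log

# 2) Base64 (pbennett's point): treat logs as []byte and base64-encode them for JSON.
encoded = base64.b64encode(raw_log).decode("ascii")
assert base64.b64decode(encoded) == raw_log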
