Skip to content

Implement new default constr ID #262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 31, 2023

Conversation

nielstron
Copy link
Contributor

This is a possible implementation of a fix for #239

Properties

  • almost unique
    • pretty much random value between 1-2**32
    • based on class name, field names and field types
  • deterministic (sha256 defined behaviour)
  • overwritable (just an attribute)

@nielstron
Copy link
Contributor Author

nielstron commented Jul 14, 2023

As suggested by @jkoppel we should maybe use constructor IDs up to 2^32 since the numbers up to this value are treated all the same by the plutus VM wrt costing. I suspect that the additional 3 bytes don't make much of a costing difference either when looking at script and inline datum size.

@codecov-commenter
Copy link

codecov-commenter commented Jul 14, 2023

Codecov Report

Merging #262 (8b7d2d6) into main (a88d725) will increase coverage by 0.13%.
The diff coverage is 100.00%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main     #262      +/-   ##
==========================================
+ Coverage   85.01%   85.14%   +0.13%     
==========================================
  Files          26       26              
  Lines        2983     2996      +13     
  Branches      715      719       +4     
==========================================
+ Hits         2536     2551      +15     
+ Misses        336      335       -1     
+ Partials      111      110       -1     
Files Changed Coverage Δ
pycardano/plutus.py 89.06% <100.00%> (+1.46%) ⬆️

Copy link
Collaborator

@cffls cffls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elegant implementation!

on class attributes, types and class name.
"""
det_string = (
cls.__name__ + "*" + "*".join([f"{f.name}~{f.type}" for f in fields(cls)])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to sort the fields before hashing?

Personally, I'd like to see both

    @dataclass
    class A(PlutusData):
        a: int
        b: bytes

and

    @dataclass
    class A(PlutusData):
        b: bytes
        a: int

hash into the same digest.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a way to cache this result? The calculation might be expensive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to sort the fields before hashing?

No this doesnt make sense. The order of fields is relevant for plutusData! (think of how C objects have their fields arranged in memory, same applies to Plutus data how their fields are arranged in the internal list). In particular, the two classes you listed do not constitute the same plutus data

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay makes sense, thanks for the explanation!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with some trivial caching methods (i.e. overriding CONSTR_ID after initial call) but get some really weird errors in pytest (CONSTR_IDs are the same for different classes, but only if run in the suite, not when run individually)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with some trivial caching methods (i.e. overriding CONSTR_ID after initial call) but get some really weird errors in pytest (CONSTR_IDs are the same for different classes, but only if run in the suite, not when run individually)

Will take a look at this issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work:

    @classproperty
    def CONSTR_ID(cls):
        """
        Constructor ID of this plutus data.
        It is primarily used by Plutus core to reconstruct a data structure from serialized CBOR bytes.
        The default implementation is an almost unique, deterministic constructor ID in the range 1 - 2^32 based
        on class attributes, types and class name.
        """
        if not hasattr(cls, "_CONSTR_ID"):
            det_string = (
                cls.__name__ + "*" + "*".join([f"{f.name}~{f.type}" for f in fields(cls)])
            )
            det_hash = sha256(det_string.encode("utf8")).hexdigest()
            setattr(cls, "_CONSTR_ID", int(det_hash, 16) % 2**32)

        return cls._CONSTR_ID

If you could grant me write access to https://github.com/OpShin/pycardano.git, I can push this change. Also, I think there is an option to allow others to push changes to your PR when creating it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I have given you write access

@nielstron nielstron linked an issue Jul 22, 2023 that may be closed by this pull request
@cffls cffls force-pushed the feat/default_unique_constr branch from 71b54a9 to 59a64ee Compare July 23, 2023 06:30
"""
if not hasattr(cls, "_CONSTR_ID"):
det_string = (
cls.__name__ + "*" + "*".join([f"{f.name}~{f.type}" for f in fields(cls)])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some fields are metadata, e.g. ClassVar, which is not serialized to cbor. I think we should try to exclude those fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can recycle the logic from the to_cbor or to_primitive function? these would exactly describe which parts are not serialized if I understand correctly?

Copy link
Collaborator

@cffls cffls Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit tricky to reuse to_cbor or to_primitive here because they are not class methods. We can check if the type is in primitive list (recursively).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, all fields are checked and serialized. We can revisit this when field exclusion is enabled in the future.

@cffls cffls force-pushed the feat/default_unique_constr branch from 6142794 to c4e06d6 Compare July 23, 2023 16:33
CONSTR_ID: ClassVar[int] = 0
"""Constructor ID of this plutus data.
It is primarily used by Plutus core to reconstruct a data structure from serialized CBOR bytes."""
AUTO_ID: ClassVar[bool] = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would honestly much prefer if AUTO ID would default to true.

When writing a smart contract and defining classes, setting the constr id manually requires more expertise than automatically having the unique constructor. If you think about it, there is no reason you should even know about constructor IDs when writing smart contracts, you should be able to assume that classes are distinct and distinguishable. So setting the constructor manually should be the non-default, reserved for experts that know what they are doing.

When you are using pycardano to model some existing contract written in Plutus, you definitely should be an expert and manually specify the constructor id, as you are working across different language implementations.

This is a breaking change, but I think it would much benefit the usability of pycardano.

Making AUTO ID default to True would then make the entire flag superfluous - when you are setting it to false manually you can just as well manually specify the constructor id.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to enable auto id by default as you explained. Added auto id because I wanted to make it a non-breaking change. I will revert this commit.

@nielstron
Copy link
Contributor Author

@cffls any update on this?

@cffls
Copy link
Collaborator

cffls commented Aug 30, 2023

Added a comment regarding field exclusion. The PR looks good to me. If you don't have more to add, I will merge it.

@nielstron
Copy link
Contributor Author

It looks good to me as well :)

@cffls cffls merged commit 0d5b3d7 into Python-Cardano:main Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic unique CONSTR_IDs
3 participants