feat: Allow semantic comparison of schemas #55
Merged
Motivation
Two schemas that define the same columns with the same constraints and the same set of schema-level rules are generally considered equal.
Recently, I talked to @AndreasAlbertQC and @MoritzPotthoffQC about adding serialization of schemas to dataframely. This PR is a preparation to allow a schema `X` to be considered equal to a schema `Y` that is dynamically created from the serialization of `X`.
Changes
- Add a `matches` method to the `Schema` class. One could also implement `__eq__` in the metaclass to allow `Schema == Schema` to be a semantic comparison. This has a bunch of weird effects though, so I refrained from doing that.
- Add a `matches` function for `Column` and `Rule`. Initially, I implemented `__eq__` for them, but it feels more consistent to also implement a method named `matches`. It also allows passing an additional parameter for evaluating `Column` equality (i.e. the name of the column), which will come in handy when performing serialization.
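To illustrate the idea, here is a minimal, self-contained sketch of semantic comparison through a `matches` classmethod. It does not use dataframely: `ToySchema`, `Column`, `Rule` and all of their fields and signatures are hypothetical stand-ins invented for this example, and the actual implementation in this PR may look different.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column definition: a dtype name plus a nullability flag."""

    dtype: str
    nullable: bool = False

    def describe(self, name: str) -> str:
        # Render the column definition for a given column name.
        suffix = "" if self.nullable else " NOT NULL"
        return f"{name}: {self.dtype}{suffix}"

    def matches(self, other: Column, name: str) -> bool:
        # The column name is passed in because a column object does not
        # know its own name in this toy model.
        return self.describe(name) == other.describe(name)


@dataclass(frozen=True)
class Rule:
    """Hypothetical schema-level rule, stored as an expression string."""

    expr: str

    def matches(self, other: Rule) -> bool:
        return self.expr == other.expr


class ToySchema:
    """Hypothetical schema base class; schemas are classes, not instances."""

    columns: dict[str, Column] = {}
    rules: dict[str, Rule] = {}

    @classmethod
    def matches(cls, other: type[ToySchema]) -> bool:
        # Key views compare like sets, so definition order is ignored.
        if cls.columns.keys() != other.columns.keys():
            return False
        if cls.rules.keys() != other.rules.keys():
            return False
        same_columns = all(
            col.matches(other.columns[name], name)
            for name, col in cls.columns.items()
        )
        same_rules = all(
            rule.matches(other.rules[name]) for name, rule in cls.rules.items()
        )
        return same_columns and same_rules


class X(ToySchema):
    columns = {"id": Column("int64"), "price": Column("float64", nullable=True)}
    rules = {"positive_price": Rule("price > 0")}


# Stand-in for a schema rebuilt dynamically, e.g. from a serialized form of X.
Y = type("Y", (ToySchema,), {"columns": dict(X.columns), "rules": dict(X.rules)})

print(X == Y)        # False: classes compare by identity
print(X.matches(Y))  # True: same columns, constraints and rules
```

Keeping `==` as plain identity comparison on the classes and putting the semantic check in a separately named method leaves class hashing and dictionary lookups untouched, which is one reason to avoid overriding `__eq__` in a metaclass.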