Structurally compare js objects rather than YAML #404

Bj0rnen · 2021-09-16T14:37:57Z

We are using the TestRunner to test our own kpt functions. For our use-cases, we have found the output of failed test cases to be insufficient in helping us locate the actual problems. That is because the test assertion compares two rendered YAMLs. These YAML strings are simply too long and the testing library (jasmine) truncates the strings, typically before the actual difference has appeared.

With this change, we compare JavaScript objects instead, and the assertion errors show a vastly improved structural diff between the configs. Now, we have been careful here to not change the behavior in terms of when tests succeed or fail. We've done that by still converting the configs to YAML and then converting them back from YAML. For us, this has been working exactly as intended.

@emmmile

Before vs after

Both before and after the change, diffs are printed like Expected <foo> to equal <bar>.. Keep an eye out for the to equal part in the before diff below:

Expected 'apiVersion: v1
items:
- apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
  kind: BigtableInstance
  metadata:
    name: testtest-bigtable-instance
  spec:
    cluster:
    - clusterId: testtest-gr1-1
      numNodes: 0
      zone: gcp-region1-b
    - clusterId: testtest-gr2-1
      numNodes: 0
      zone: gcp-region2-b
    - clusterId: testtest-gr3-1
      numNodes: 0
      zone: gcp-region3-b
    displayName: testtest-bigtable-instance
- apiVersion: storage.example.com/v1alpha1
  kind: BigtableBackupSchedule
  metadata:
    name: testtest-bigtable-instance
- apiVersion: storage.example.com/v1alpha1
  kind: DatabaseAutoscaler
  metadata:
    name: testtest-gr3-1
  spec:
    configuration:
      pagerdutykey: xyz
      parameters:
        cpuTarget: 0.7
        maxNodes: 100
        minNodes: 5
      strategy: StandardStrategy
      version: v1alpha1
    databaseName: >-
      //bigtable.googleapis.com/projects/my-project/instances/testtest-bigtable-instance/clusters/testtest-gr3- ... to equal 'apiVersion: v1
items:
- apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
  kind: BigtableInstance
  metadata:
    name: testtest-bigtable-instance
  spec:
    cluster:
    - clusterId: testtest-gr1-1
      numNodes: 0
      zone: gcp-region1-b
    - clusterId: testtest-gr2-1
      numNodes: 0
      zone: gcp-region2-b
    - clusterId: testtest-gr3-1
      numNodes: 0
      zone: gcp-region3-b
    displayName: testtest-bigtable-instance
- apiVersion: storage.example.com/v1alpha1
  kind: BigtableBackupSchedule
  metadata:
    name: testtest-bigtable-instance
- apiVersion: storage.example.com/v1alpha1
  kind: DatabaseAutoscaler
  metadata:
    name: testtest-gr3-1
  spec:
    configuration:
      pagerdutykey: xyz
      parameters:
        cpuTarget: 0.7
        maxNodes: 100
        minNodes: 5
      strategy: StandardStrategy
      version: v1alpha1
    databaseName: >-
      //bigtable.googleapis.com/projects/my-project/instances/testtest-bigtable-instance/clusters/testtest-gr3- ....

And here's the after diff for the same test case:

Expected $.results[0].message = 'Word 'bigtable' in instance name is redundant. Complete blocklist: bigtable,cluster,database,instance.' to equal 'Word 'instance' in instance name is redundant. Complete blocklist: bigtable,cluster,database,instance.'.
Expected $.results[1].message = 'Word 'instance' in instance name is redundant. Complete blocklist: bigtable,cluster,database,instance.' to equal 'Word 'bigtable' in instance name is redundant. Complete blocklist: bigtable,cluster,database,instance.'.

The before diff has many issues:

Displaying the whole YAML encoding is too verbose.
Worst of all, the sheer length of the YAML encoding results in a truncated diff. Notice how the "expected" and "actual" both end with .... In this test case (and a majority of cases for us), the differences weren't even in the Kubernetes objects part of the Configs objects; they were in the results part. So the displayed, truncated, YAML strings are, in fact, 100% equal. The before diff gives us zero information about differences between "expected" and "actual".
Since the whole Configs objects are rendered as YAML, Kubernetes objects and results included, the diffing only reports that there is a single difference. In the after diff, we see two differences.

From the after diff, we can conclude that the two results result[0] and result[1] were simply placed in the wrong order in the results array.

google-cla · 2021-09-16T14:38:01Z

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.

What to do if you already signed the CLA

Individual signers

It's possible we don't have your GitHub username or you're using a different email address on your commit. Check your existing CLA data and verify that your email is set on your git commits.

Corporate signers

Your company has a Point of Contact who decides which employees are authorized to participate. Ask your POC to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the Google project maintainer to go/cla#troubleshoot (Public version).
The email used to register you as an authorized contributor must be the email used for the Git commit. Check your existing CLA data and verify that your email is set on your git commits.
The email used to register you as an authorized contributor must also be attached to your GitHub account.

ℹ️ Googlers: Go here for more info.

emmmile · 2021-09-16T14:45:24Z

👍 (I've worked on this with @Bj0rnen)

Bj0rnen · 2021-09-16T14:52:59Z

@googlebot I signed it!

mengqiy · 2021-09-17T19:09:41Z

@Bj0rnen and @emmmile Thanks for creating the PR!
Can you please paste the before vs after?

emmmile · 2021-09-18T11:21:42Z

@mengqiy on Monday we will restart working on this and we will provide some more information. You mean an example of how it looks with this change and how it was before?

About the typing problem, either we use load() (that has return type any) or set the type of valueOf() to any directly. I don't think we really need to reload the objects as Configs so maybe we can loosen the types a bit.

Bj0rnen · 2021-09-21T10:16:15Z

Hi @mengqiy! I have added a before vs after to the original comment. I think everything should be in order now. Does it look good from your end?

mengqiy · 2021-09-22T01:15:04Z

ts/kpt-functions/src/testing.ts

    indent: 2,
    noArrayIndent: true,
    skipInvalid: true,
    sortKeys: true,
  });
+  return safeLoad(yaml);


Why do you want to safeDump and then safeLoad?
Can we simply compare the ResourceList returned by configs.toResourceList()? Are there any downsides?

I know that it's a roundabout way to implement it. But in practice, returning configs.toResourceList() causes many test failures for us. Mainly these two types of inequalities:

Expected $.items[0].metadata not to have properties namespace: undefined

and

Expected $.items[0] to be a kind of IAMPolicyResource, but was Object({ apiVersion: 'iam.cnrm.cloud.google.com/v1alpha2', kind: 'IAMPolicy', metadata: Object({ name: 'iam-policy', namespace: 'project-level' }), spec: Object({ resourceRef: Object({ apiVersion: 'test', kind: 'Test' }), bindings: [ ] }) }).

The first kind of inequality is easy to understand. safeDump elides all undefined properties.

The second kind of inequality results from the call to deepCopy on line 102

kpt-functions-sdk/ts/kpt-functions/src/testing.ts

Line 102 in cfd7628

const configs = input.deepCopy();

I haven't dug into the implementation of Configs.deepCopy. But basically, before the equality comparison happens, the input object goes through a deepCopy and the expectedOutput object doesn't. That deepCopy is apparently not "equality-safe", however whatever differences it creates disappear again upon safeDump.

We have already relied on the semantics of post-yaml comparison in our test cases. 44/118 of our test cases break if we directly return configs.toResourceList() from valueOf. safeDump returns a string. And if strings s and t are equal (and valid YAML), then safeLoad(s) and safeLoad(t) must also be equal (safeDump is deterministic, surely). The proposed change can't break existing tests.

Alternatives

Patch the differences

I suppose it's possible to address the two types of equality differences that we have observed above. We can do a deepCopy of the o object that goes into the comparison. We can explicitly filter out undefined properties with some recursive function. Or maybe modify our kpt functions to be meticulous about this, so that all the undefineds match up.

But how can we know that these two equality differences are the only ones that could appear? Someone else might depend on other aspects of safeDump in their tests. It seems a bit arbitrary to patch over these two things specifically.

Rich YAML comparison

Maybe the expect(...).toEqual(...) can be replaced with some YAML-specific equality comparison function, which takes two YAMLs and outputs a rich diff on the strings? For example a line-based diff. Then we can go back to only using safeDump and not safeLoad. That should probably also output readable diffs.

Thanks for the detailed explanation.

mengqiy · 2021-09-24T19:25:45Z

This seems to be the best option at the moment. And it doesn't introduce any breaking changes to any user-facing functions (valueOf is an internal functions).
I guess we can safely merge it.
@Bj0rnen @emmmile Please let me know if I misunderstood anything.

emmmile · 2021-09-27T08:40:44Z

@mengqiy All our points have been well described by @Bj0rnen and I don't think you misunderstood anything :)

Bj0rnen · 2021-09-27T11:02:10Z

Yep. All good from me 👍

emmmile · 2021-09-29T14:52:46Z

@mengqiy is this going to be released automatically?

mengqiy · 2021-09-29T21:45:29Z

No. It needs some manual work to release it.
@emmmile Are you blocked by this?

emmmile · 2021-09-29T21:58:24Z

No I wouldn't say that we are blocked, but I guess would be good to understand if the new version is days/weeks/months ahead :) Thanks!

mengqiy · 2021-09-29T22:28:05Z

You can expect it in the next 7 days.

mengqiy · 2021-10-11T22:18:42Z

In case you missed it, kpt-functions@0.15.1 has been released.

Structurally compare js objects rather than YAML

604cbe6

Bj0rnen requested review from droot, etefera, frankfarzan, mengqiy and phanimarupaka as code owners September 16, 2021 14:37

google-cla bot added the cla: no Google CLA needed label Sep 16, 2021

google-cla bot added cla: yes Google CLA OK and removed cla: no Google CLA needed labels Sep 16, 2021

Correct the return type in the signature

e69a55b

Correct the return type in the signature (again)

cfd7628

mengqiy reviewed Sep 22, 2021

View reviewed changes

mengqiy approved these changes Sep 24, 2021

View reviewed changes

mengqiy merged commit 37e38c1 into GoogleContainerTools:master Sep 28, 2021

Bj0rnen deleted the patch-1 branch September 29, 2021 14:08

mengqiy mentioned this pull request Oct 8, 2021

kpt-functions@0.15.1 #426

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Structurally compare js objects rather than YAML #404

Structurally compare js objects rather than YAML #404

Bj0rnen commented Sep 16, 2021 •

edited

Loading

google-cla bot commented Sep 16, 2021

emmmile commented Sep 16, 2021

Bj0rnen commented Sep 16, 2021

mengqiy commented Sep 17, 2021

emmmile commented Sep 18, 2021 •

edited

Loading

Bj0rnen commented Sep 21, 2021

mengqiy Sep 22, 2021 •

edited

Loading

Bj0rnen Sep 23, 2021

mengqiy Sep 24, 2021

mengqiy commented Sep 24, 2021

emmmile commented Sep 27, 2021

Bj0rnen commented Sep 27, 2021

emmmile commented Sep 29, 2021

mengqiy commented Sep 29, 2021

emmmile commented Sep 29, 2021

mengqiy commented Sep 29, 2021

mengqiy commented Oct 11, 2021

Structurally compare js objects rather than YAML #404

Structurally compare js objects rather than YAML #404

Conversation

Bj0rnen commented Sep 16, 2021 • edited Loading

Before vs after

google-cla bot commented Sep 16, 2021

What to do if you already signed the CLA

Individual signers

Corporate signers

emmmile commented Sep 16, 2021

Bj0rnen commented Sep 16, 2021

mengqiy commented Sep 17, 2021

emmmile commented Sep 18, 2021 • edited Loading

Bj0rnen commented Sep 21, 2021

mengqiy Sep 22, 2021 • edited Loading

Choose a reason for hiding this comment

Bj0rnen Sep 23, 2021

Choose a reason for hiding this comment

Alternatives

Patch the differences

Rich YAML comparison

mengqiy Sep 24, 2021

Choose a reason for hiding this comment

mengqiy commented Sep 24, 2021

emmmile commented Sep 27, 2021

Bj0rnen commented Sep 27, 2021

emmmile commented Sep 29, 2021

mengqiy commented Sep 29, 2021

emmmile commented Sep 29, 2021

mengqiy commented Sep 29, 2021

mengqiy commented Oct 11, 2021

Bj0rnen commented Sep 16, 2021 •

edited

Loading

emmmile commented Sep 18, 2021 •

edited

Loading

mengqiy Sep 22, 2021 •

edited

Loading