Skip to content

fix bundle producing schema with corrupted refs in some cases #62

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 11, 2018
Merged

fix bundle producing schema with corrupted refs in some cases #62

merged 1 commit into from
Jan 11, 2018

Conversation

RomanHotsiy
Copy link
Contributor

@RomanHotsiy RomanHotsiy commented Dec 6, 2017

Hey @BigstickCarpet,

ReDoc users found an interesting issue in json-schema-ref-parser. This is hard to explain so I will just show an example. If we have two files:

index.yaml

paths:
  $ref: '#/components'
external:
  go:
    deeper:
      $ref: other.yaml#/thing'
components:
  test:
    $ref: 'other.yaml#/thing'

and other.yaml:

thing:
  type: string

bundle produced the following output:

{
  "paths": {
    "$ref": "#/components"
  },
  "external": {
    "go": {
      "deeper": {
        "$ref": "#/paths/test" // <--- broken ref here
      }
    }
  },
  "components": {
    "test": {
      "type": "string"
    }
  }
}

It actually depends on the deep level of the ref, I believe it has something to sorting order.

I've fixed this in this PR. The fix seems reasonable to me but I'm not sure if this is correct so please check it carefully. Also added test case but not sure if the names are so good.

@coveralls
Copy link

coveralls commented Dec 6, 2017

Coverage Status

Coverage increased (+0.04%) to 94.842% when pulling 7b55cfd0f7f2f5287ac338a062592dc9fcbfab2d on RomanGotsiy:fix-broken-refs into 707e626 on BigstickCarpet:master.

@coveralls
Copy link

coveralls commented Dec 6, 2017

Coverage Status

Coverage increased (+0.04%) to 94.842% when pulling 7b55cfd0f7f2f5287ac338a062592dc9fcbfab2d on RomanGotsiy:fix-broken-refs into 707e626 on BigstickCarpet:master.

@RomanHotsiy
Copy link
Contributor Author

Sorry, just noticed that the fix doesn't solve all the cases.

I've investigated a bit deeper and it seems something is wrong with sorting here: https://github.com/BigstickCarpet/json-schema-ref-parser/blob/master/lib/bundle.js#L161
I will investigate further, will update this PR and then ping you.

@coveralls
Copy link

coveralls commented Dec 7, 2017

Coverage Status

Coverage increased (+0.007%) to 94.813% when pulling ad183641e760f5e8cdf280ed6fad06886d057684 on RomanGotsiy:fix-broken-refs into 707e626 on BigstickCarpet:master.

@RomanHotsiy
Copy link
Contributor Author

Seems that sorting is not stable. Check out the following two screen-shots of console.table-ing
inventory array at this point: https://github.com/BigstickCarpet/json-schema-ref-parser/blob/master/lib/bundle.js#L166

image

here is same spec, just with one more reference added:

image

Check the order of item 3 and 4 at the first screenshot and 4 and 5 at the second: while all the details are the same except pathFromRoot they are in different order (401 first, 403 next and vice versa).
Because the pathFromRoot for all the refs is calculated in order of visiting corresponding items in inventory array should be ordered in the same order to not produce wrong pointers.
Updated my PR.

I wasn't able to minimize this use case to small reproducible sample (as it depends somehow on refs number in the array and is just crazy). But here is a repo with the spec that reproduces issue: https://github.com/mikunn/openapi-test-refs

Sorry, maybe I am not clear enough, but this issue is hard to explain for me :(

@JamesMessinger
Copy link
Member

Thanks for finding and debugging this, @RomanGotsiy. I'll take a look at it as soon as I get a chance (probably next week). As long as it doesn't break existing tests/functionality, I'll merge it.

@coveralls
Copy link

coveralls commented Dec 7, 2017

Coverage Status

Coverage increased (+0.007%) to 94.813% when pulling 7935a8e on RomanGotsiy:fix-broken-refs into 707e626 on BigstickCarpet:master.

@RomanHotsiy
Copy link
Contributor Author

🙌 , did you have a chance to look into this?

@JamesMessinger
Copy link
Member

Sorry, @RomanGotsiy - This week has been crazy. But I'm off next week for the holidays, so I'll have time to take a look at this.

@VatroslavByte
Copy link

VatroslavByte commented Jan 11, 2018

From what I figured, and what works in my case, a simple fix is needed.
In bundle.js line 175

else if (i.file === file && i.hash === hash) {
      // This $ref points to the same value as the prevous $ref, so remap it to the same path
      i.$ref.$ref = pathFromRoot;
}

pathFromRoot should be replaced with hash

else if (i.file === file && i.hash === hash) {
      // This $ref points to the same value as the prevous $ref, so remap it to the same path
      i.$ref.$ref = hash;
}

(Sorry, I'm not very familiar with pull requests)

@JamesMessinger JamesMessinger merged commit 4ee64b2 into APIDevTools:master Jan 11, 2018
@JamesMessinger
Copy link
Member

JamesMessinger commented Jan 11, 2018

@RomanGotsiy and @VatroslavByte, thank you both for your input on this. Here's my analysis:

The behavior was actually by design. The bundle() method produces a single document that only contains internal $refs. It makes no guarantees that those $refs don't reference other $refs. In fact, it can't guarantee that, since that's the only way to support circular references, self-references, etc.

If you look at @RomanGotsiy's example. The bundled output is actually valid:

{
  "paths": {
    "$ref": "#/components"
  },
  "external": {
    "go": {
      "deeper": {
        "$ref": "#/paths/test"   // <--- This is valid
      }
    }
  },
  "components": {
    "test": {
      "type": "string"
    }
  }
}

The reference to #/paths/test is valid. It points to #/paths, which points to #/components, which has a test property. Obviously, a human would have replaced #/paths/test with #/components/test instead, since it's more direct. So we can say that the desired behavior is for references to be direct if possible, rather than indirect.

@RomanGotsiy's PR produces a direct reference in this particular scenario, which is an improvement. However, the logic relies on the sequence order of the nodes in the document, so it can still produce indirect references depending on the order of nodes.

I'll experiment with the code a bit today and see if I can come up with a more reliable method of preventing indirect references. This PR serves as a good starting point for that. 👍

JamesMessinger added a commit that referenced this pull request Jan 11, 2018
…inimize the number of indirect references in the bundled output (see #62 (comment))
@JamesMessinger
Copy link
Member

@RomanGotsiy and @VatroslavByte - 👉 I need your help to test this, please 👈

I've modified the bundle() logic to account for the number of indirections that are required to resolve each $ref. This change should cause the bundle() method to use the most direct reference whenever possible, which should fix the issue you were seeing.

I've published a beta version of json-schema-ref-parser to npm that contains these changes. You'll need to install version 4.1.0-beta.1, which you can do by running the following command:

npm install json-schema-ref-parser@beta

Please give this a try and let me know if it works for your use-cases. I really appreciate your help and feedback.

@VatroslavByte
Copy link

The beta didn't fix the issue I was having.

The problem is that I was referencing objects through root json file, rather than directly referencing the json file where the object actually is.
And I can see why it is a problem once you introduce circular dependencies.

The behavior that I was expecting is that it would merge root json with all objects from external files, and refs in external files would be replaced with shortest path to that object in root file.

Example:
In file "models/Event.json" I have a ref to object "../root.json#/components/schemas/SportId" (SportId object is also in a separate file), and what I'm expecting is that it is replaced with "#/components/schemas/SportId" once "Event.json" is merged into "root.json".

The funny thing is, I created an test app that is literally copy/paste of my existing app (same code, jsons and json-schema-ref-parser version), and I get the output which I described in the example above.
How is it possible to have different outputs with same jsons?

NOTE:
When I reference the objects directly where the files are (not through root file), everything works fine.

@JamesMessinger
Copy link
Member

@VatroslavByte - thanks for your feedback. I tried reproducing the issue that you mentioned, but was unable to. Here is my reproduction. If you look at the output, you'll see that the reference to root.json#/components/schemas/SportId is replaced with #/components/schemas/SportId as expected. I get the same result with both the old code (v4.0.4) and the new code (v4.1.0-beta.1).

Perhaps I'm missing something and failing to accurately reproduce your example. If you could post reproduction code, similar to mine above, I'd really appreciate it.

@VatroslavByte
Copy link

Well, I'm also having trouble reproducing it. That's why I mentioned that it works in one app and not the other.

I'll try and figure out why that is and whip up an example for you these days.

@JamesMessinger
Copy link
Member

I decided to go ahead and release this change publicly as version 4.1.0, since it definitely fixes the original problem that was pointed out by @RomanGotsiy, and all of my tests show that it improves the algorithm without breaking any existing functionality.

@RomanHotsiy
Copy link
Contributor Author

Hey @BigstickCarpet,

Sorry for the long reply. This seems awesome and is fixing my original problem. Thanks so much!

Also, thanks for the explanation that the produced references are valid. Although I can't find anything about this behavior in specs (would be really great if you can point me to), I checked a few JSON schema implementations and they actually work fine with transitive refs like those.

@brettstack
Copy link

@BigstickCarpet any plans to fix this for all scenarios? Greatly appreciated.

@JamesMessinger
Copy link
Member

@brettstack - Not sure what you mean. Is there a specific scenario where it doesn't work? If so, can you post a repro? AFAIK, this fix solves the problem for all scenarios that I've tested.

@brettstack
Copy link

brettstack commented Jan 24, 2018

Simplest repro I could make:

definitions:
  Resource:
    properties:
      resourceId:
        $ref: '#/definitions/indirectId'
  indirectId:
    $ref: '#/definitions/id'
  id:
    title: ID

yields

{
    "definitions": {
        "Resource": {
            "properties": {
                "resourceId": {
                    "$ref": "#/definitions/id"
                }
            }
        },
        "indirectId": {
            "$ref": "#/definitions/id"
        },
        "id": {
            "title": "ID"
        }
    }
}

Expected definitions.Resource.properties.resourceId.$ref to be '#/definitions/indirectId'

@JamesMessinger
Copy link
Member

@brettstack - In your example above, that's exactly the output that I would expect, especially after applying this fix. #/definitions/indirectId is an indirect reference, whereas #/definitions/id is a direct reference, so the bundle() method will choose #/definitions/id, since it's the most direct reference.

@brettstack
Copy link

Thanks. What if I wanted to keep the indirect references? Is that possible?

@JamesMessinger
Copy link
Member

@brettstack - If you want to leave the references as-is, rather than re-mapping them, then just call the .resolve() method instead of .bundle(). The .bundle() method is specifically for re-mapping references.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants