Description
Describe the bug
When ingesting a JSON Schema using the json-schema source type, the ingestion process fails to detect and handle $ref loops correctly. This results in an infinite loop during import, and the process never completes.
To Reproduce
Run an ingestion recipe with source type json-schema against the following stripped-down schema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://schema.com/schema/1.0",
"type": "object",
"properties": {
"test": {
"$ref": "#/definitions/condition"
}
},
"required": [
"test"
],
"definitions": {
"condition": {
"type": "object",
"properties": {
"condition": {
"$ref": "#/definitions/condition"
}
}
}
}
}
Expected behavior
The ingestion process should detect the $ref loop and either:
- Resolve it safely (e.g., by limiting recursion depth), or
- Fail gracefully with a clear error message indicating the loop.
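To illustrate the second option, here is a minimal sketch of loop detection that reports the offending chain of references. It is not DataHub's code: the names `resolve_pointer`, `iter_refs`, and `find_ref_cycle` are hypothetical, and the sketch assumes only local references (those starting with `#`) need to be handled.

```python
from typing import Any, Iterator, List, Optional, Set


def resolve_pointer(root: Any, ref: str) -> Any:
    """Resolve a local JSON pointer such as '#/definitions/condition'."""
    if not ref.startswith("#"):
        raise ValueError(f"only local refs are handled in this sketch: {ref}")
    node = root
    for part in ref[1:].split("/"):
        if part == "":
            continue
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        part = part.replace("~1", "/").replace("~0", "~")
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every $ref string found under node, without following it."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def find_ref_cycle(schema: Any) -> Optional[List[str]]:
    """Return the chain of refs that forms a loop, or None if there is none."""
    safe: Set[str] = set()  # refs already explored and known to be loop-free

    def visit(ref: str, stack: List[str]) -> Optional[List[str]]:
        if ref in stack:
            return stack[stack.index(ref):] + [ref]
        if ref in safe:
            return None
        for nxt in iter_refs(resolve_pointer(schema, ref)):
            cycle = visit(nxt, stack + [ref])
            if cycle:
                return cycle
        safe.add(ref)
        return None

    for ref in iter_refs(schema):
        cycle = visit(ref, [])
        if cycle:
            return cycle
    return None
```

Run against the schema above, `find_ref_cycle` should return the self-referencing `#/definitions/condition` chain, which an importer could include in an error message such as "$ref loop detected: ..." instead of hanging.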
Actual behavior
The ingestion process enters an infinite loop and does not complete.
Desktop (please complete the following information):
- Docker Image: acryldata/datahub-ingestion:v1.1.0
- DataHub version: v1.2.0
Additional context
This kind of $ref loop is valid in some schema use cases (e.g., recursive structures), and many JSON Schema parsers handle it by tracking visited references. It would be helpful if DataHub could implement similar handling or provide guidance on supported schema patterns.
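For reference, the visited-reference approach can be sketched roughly as follows. This is an illustration under assumptions, not how DataHub or any particular parser implements it: `lookup` and `expand` are hypothetical helpers, the `$recursive_ref` marker key is made up for this example, and only local references into objects are handled.

```python
from typing import Any, Tuple


def lookup(root: Any, ref: str) -> Any:
    """Follow a local reference like '#/definitions/condition' (objects only)."""
    node = root
    for part in ref.lstrip("#").split("/"):
        if part:
            node = node[part]
    return node


def expand(node: Any, root: Any, active: Tuple[str, ...] = ()) -> Any:
    """Inline $ref targets, stopping when a ref is already being expanded.

    `active` holds the refs on the current expansion path. Meeting one of
    them again means the schema is recursive, so a marker is emitted
    instead of descending forever.
    """
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            if ref in active:
                # Hypothetical marker: records the recursion without expanding it.
                return {"$recursive_ref": ref}
            return expand(lookup(root, ref), root, active + (ref,))
        return {key: expand(value, root, active) for key, value in node.items()}
    if isinstance(node, list):
        return [expand(item, root, active) for item in node]
    return node
```

Tracking only the refs on the current path, rather than every ref seen so far, lets the same definition be expanded in several unrelated places while still cutting off true recursion.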