Schema descriptor with path property raises a "resource-error" #1688

Closed
pierrecamilleri opened this issue Sep 23, 2024 · 2 comments · Fixed by #1690
Labels
bug Something isn't working

pierrecamilleri commented Sep 23, 2024

Overview

If a schema descriptor has a path property, validating the descriptor from Python code returns a "resource-error". At the command line, the same file is reported as valid.

For context, the path property is defined for a Data Resource, but not in the Table Schema specification. It is also not among the Schema class properties.

I have no clue what is going on here.

Reproduce

Schema descriptor in file "schema.json":

{
  "name": "schema",
  "path": "abc",
  "fields": [
    {
      "name": "id",
      "type": "integer"
    }
  ]
}

Python code:

import frictionless

schema = frictionless.Schema("./schema.json")
descr = schema.to_descriptor()
print(frictionless.validate(descr, type="schema"))
# {'valid': False,
#  'stats': {'tasks': 0, 'errors': 1, 'warnings': 0, 'seconds': 0},
#  'warnings': [],
#  'errors': [{'type': 'resource-error',
#              'title': 'Resource Error',
#              'description': 'A validation cannot be processed.',
#              'message': "The data resource has an error: [{'name': 'id', "
#                         "'type': 'integer'}] is not of type 'integer' at "
#                         "property 'fields'",
#              'tags': [],
#              'note': "[{'name': 'id', 'type': 'integer'}] is not of type "
#                      "'integer' at property 'fields'"}],
#  'tasks': []}

What I tested

  • Validating at the command line (frictionless validate --type=schema --json schema.json) returns VALID.
  • Validating with the file path instead of the descriptor returns 'valid': True.
  • Renaming the path property to patch makes the error disappear.
  • Changing the path value to a URL does not solve the issue.

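A short investigation shows that Resource is the culprit: it has a path property and a fields: Optional[int] = None attribute, which explains the error. The schema descriptor is being checked as if it were a resource, so its fields list is rejected for not being an integer.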
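
To illustrate the mismatch, here is a minimal sketch in plain Python. It is not frictionless code: it only assumes, as noted above, that a Resource-like profile expects fields to be an integer, and it checks the schema descriptor from the reproduction against that expectation. The names RESOURCE_LIKE_TYPES and check_descriptor are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical, simplified stand-in for a Resource-like profile:
# "fields" is expected to be an integer (or absent), "path" a string.
RESOURCE_LIKE_TYPES: Dict[str, type] = {"path": str, "fields": int}


def check_descriptor(descriptor: Dict[str, Any]) -> List[str]:
    """Return one message per property whose value has the wrong type."""
    problems = []
    for name, expected in RESOURCE_LIKE_TYPES.items():
        if name not in descriptor:
            continue  # every property is optional in this sketch
        value = descriptor[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            problems.append(
                f"{value!r} is not of type {expected.__name__!r} at property {name!r}"
            )
    return problems


# The schema descriptor from the reproduction above.
schema_descriptor = {
    "name": "schema",
    "path": "abc",
    "fields": [{"name": "id", "type": "integer"}],
}

for problem in check_descriptor(schema_descriptor):
    print(problem)
```

Only the fields property is reported, which mirrors the shape of the "is not of type 'integer' at property 'fields'" message in the validation report.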


pierrecamilleri commented Sep 23, 2024

Wow, there is some magic going on there.

So:

  • The resource Factory class calls Detector.detect_metadata_type(path, format=options.get("format")) (with path = source assigned a couple of lines above), even though the type is explicitly set.
  • detect_metadata_type uses METADATA_TRAITS to select the type, and ...
  • It detects the Resource type if path is present:
    ...
    "resource": {
        "names": ["resource.json", "resource.yaml"],
        "props": ["path", "data"],
    },
    ...
