feat: Add optional debug output #223
base: main
Conversation
```python
if os.getenv("CDP_DEBUG"):
    rprint("", properties)
    rprint(*issues)
    rprint(explain(issues))
```
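For context, `rprint` here is presumably rich's drop-in replacement for `print`, imported along these lines (an assumption about this module, the import is not shown in the diff):

```python
from rich import print as rprint  # rich's print with markup and colour support
```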
To activate, you can run a script like this:

```sh
CDP_DEBUG=true uv run joel_checking.py
```

I tried fancier solutions with the rich logging module, but I couldn't find a way to get pytest to pipe through the colours of rich's output when using the logging handler. With printing it "just works" (tm).

There is also no built-in way to pass a debug level flag to Python's logging module, so we would end up reading an env variable or making our script accept a command line parameter, both of which seemed like unnecessary code.
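For comparison, a minimal sketch of what the logging-based setup might have looked like (not code from this PR; the `CDP_LOG_LEVEL` variable and logger name are made up for illustration):

```python
import logging
import os

from rich.logging import RichHandler  # rich's logging handler for coloured output

# Hypothetical env variable to pick the log level, defaulting to WARNING.
level = os.getenv("CDP_LOG_LEVEL", "WARNING").upper()

logging.basicConfig(
    level=level,
    format="%(message)s",
    handlers=[RichHandler()],
)

logger = logging.getLogger("check_datapackage")
logger.debug("properties: %s", {"name": "example"})
```

Even this small amount of wiring is what the plain env-variable-plus-print approach avoids.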
As an example of output, let's run this command:

```sh
CDP_DEBUG=true uv run python -c 'import check_datapackage as cdp; cdp.check({"resources": [{"title": "Title"}]})'
```

which gives us:

(output screenshot not shown)

It is easy to inspect the two error messages from `explain` this way. In this case, the first one seems to make sense, whereas the second doesn't make it clear what went wrong (it seems like we passed something to `name`, so why is it telling us it didn't?). Follow-up on @martonvago's comment in #208 (comment).
I put a comment on the issue about this too 😅
Force-pushed 10ea1b0 to 1352ccc
```python
import os


def pytest_report_teststatus(report, config):
    if os.getenv("CDP_DEBUG"):
        if report.when == "call":
            # Add newlines to separate test results
            category = report.outcome
            shortletter = "\n\n"  # normally the dot / F / X / etc.
            verbose = "\n\n"  # normally "PASSED", "FAILED", ...
            return category, shortletter, verbose
    return None
```
We can already get debug output from tests by running them with:

```sh
CDP_DEBUG=true uv run pytest -sv tests/test_check.py
```

However, it gets hard to see which output belongs to which test since there is no separation:

("PASSED" belongs to the previous test and the tests melt into each other)

With this little code snippet in conftest.py, the output is easier to read:

(I'm thinking we don't need the test status since we would just use this to look at formatting, but it could easily be added back in)
@seedcase-project/developers I'll mark this as ready for review to get some early thoughts on whether this will be useful and whether you'd prefer a different approach.
martonvago left a comment
I think this is very helpful to get a general idea about the types of messages!
I cannot say whether we actually want the debug option to be part of the released thing; I'll leave that to Luke :P
About the wording of the messages (not sure if you're interested in that at this point, but I think Luke wanted to rewrite them):
One thing to note is that while there are many tests, they are testing cdp-specific behaviour and not the jsonschema output. So there are presumably more message types than what shows up in the tests. They format their messages based on the JSON schema validator (their validator, our type), so getting a list of validators that are used in the Data Package JSON schema could also give us an idea of how many message types there are.
But yeah, the answer is likely a bit too many for this to be a pleasant task.
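As a rough way to get that list of validators used in the schema, one could walk it and collect every key that jsonschema recognises as a validator. A sketch (the local `datapackage.json` path is hypothetical, and property names that happen to collide with keywords would be over-counted):

```python
import json
from pathlib import Path

from jsonschema import Draft202012Validator


def collect_keywords(node, found):
    """Recursively collect schema keys that jsonschema treats as validators."""
    if isinstance(node, dict):
        found.update(key for key in node if key in Draft202012Validator.VALIDATORS)
        for value in node.values():
            collect_keywords(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keywords(item, found)
    return found


# Hypothetical local copy of the Data Package JSON schema.
schema = json.loads(Path("datapackage.json").read_text())
print(sorted(collect_keywords(schema, set())))
```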
Now, jsonschema can be customised in various ways, including using a custom validator to check the input object. In a custom validator it should be possible to tweak the behaviour of each individual JSON schema rule (so minItems, required, type, etc.) and redefine the messages. I don't know if this would be easier to do than remapping the default messages after validation, but it's something to explore.
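For instance, a minimal sketch of redefining the message for the `required` rule via `jsonschema.validators.extend` (the toy schema and the wording of the message are made up, not from check-datapackage):

```python
from jsonschema import Draft202012Validator, ValidationError, validators


def required_with_custom_message(validator, required, instance, schema):
    """Re-implement the 'required' rule with a friendlier message."""
    if not validator.is_type(instance, "object"):
        return
    for prop in required:
        if prop not in instance:
            yield ValidationError(f"Please add the missing field '{prop}'.")


# Build a validator class where only the 'required' rule is redefined.
CustomValidator = validators.extend(
    Draft202012Validator,
    validators={"required": required_with_custom_message},
)

toy_schema = {"type": "object", "required": ["name"]}
for error in CustomValidator(toy_schema).iter_errors({"title": "Title"}):
    print(error.message)  # -> Please add the missing field 'name'.
```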
Description
I think we need to go through many types of error messages manually and see if the way the error is presented via `explain` is helpful. This debug option helps with that by always printing the properties dict, its issues, and the output from `explain`. It also enables us to easily do this on all test messages.

Here is an example of an error message where all the `^^^^^` are not that helpful:

(screenshot not shown)

This is now easy to detect by running the test suite with the debug option enabled, but we still need to make a decision about what to do with these errors by opening a new issue for each of them and discussing the best solutions.
Needs an in-depth review.
Checklist
- `just run-all`