Skip to content

Conversation

ntBre
Copy link
Contributor

@ntBre ntBre commented Jul 3, 2025

Summary

This was originally stacked on #19129, but some of the changes I made for JSON also impacted the Azure format, so I went ahead and combined them. The main changes here are:

  • Implementing FileResolver for Ruff's EmitterContext
  • Adding FileResolver::notebook_index and FileResolver::is_notebook methods
  • Adding a DisplayDiagnostics (with an "s") type for rendering a group of diagnostics at once
  • Adding Azure, Json, and JsonLines as new DiagnosticFormats

I tried a couple of alternatives to the FileResolver::notebook methods like passing down the NotebookIndex separately and trying to reparse a Notebook from Ruff's SourceFile. The latter seemed promising, but the SourceFile only stores the concatenated plain text of the notebook, not the re-parsable JSON. I guess the current version is just a variation on passing the NotebookIndex, but at least we can reuse the existing resolver argument. I think a lot of this can be cleaned up once Ruff has its own actual file resolver.

As suggested, I also tried deleting the corresponding Emitter files in ruff_linter, but it doesn't look like git was able to follow this as a rename. It did, however, track that the tests were moved, so the snapshots should be easy to review.

Test Plan

Existing Ruff tests ported to tests in ruff_db. I think some other existing ruff tests also cover parts of this refactor.

@ntBre ntBre added ty Multi-file analysis & type inference diagnostics Related to reporting of diagnostics. labels Jul 3, 2025
Copy link
Contributor

github-actions bot commented Jul 3, 2025

mypy_primer results

No ecosystem changes detected ✅
No memory usage changes detected ✅

Copy link
Contributor

github-actions bot commented Jul 3, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ntBre ntBre force-pushed the brent/json-rendering branch from 6601b4c to 200c1a7 Compare July 4, 2025 16:03
@ntBre ntBre force-pushed the brent/json-rendering branch from 200c1a7 to c3e3077 Compare July 4, 2025 16:04
@ntBre ntBre changed the base branch from brent/diagnostic-rendering to main July 4, 2025 16:05
@ntBre ntBre changed the title [WIP] Render JSON output with the new diagnostics Render Azure, JSON, and JSON lines output with the new diagnostics Jul 4, 2025
@ntBre ntBre marked this pull request as ready for review July 4, 2025 16:19
@ntBre
Copy link
Contributor Author

ntBre commented Jul 10, 2025

I introduced some intentional typos to see if any integration tests fail, and we have one for JSON (ruff::integration_test::stdin_json) but not for JSON lines or Azure. I'll write some CLI tests.

ntBre added a commit that referenced this pull request Jul 10, 2025
Summary
--

I spun this off from #19133 to be sure to get an accurate baseline before
modifying any of the formats. I picked the code snippet to include a lint
diagnostic with a fix, one without a fix, and one syntax error. I'm happy to
expand it if there are any other kinds we want to test.

Test Plan
--

New CLI tests
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Feel free to ignore/disagree with my nits about json!. :-)

"end_location": location_to_json(end_location.unwrap_or_default()),
"filename": filename.unwrap_or_default(),
"noqa_row": noqa_location.map(|location| location.line)
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, how come the json! macro instead of types? I'd imagine json! to be slower since it has to shove everything into the serde_json::Value type, but I guess I'd also imagine it probably doesn't matter here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pulled these over from the current Ruff version, so I'm not too sure. @MichaReiser might know, it looks like there were types deriving Serialize before #3895.

json! is kind of handy here for the preview behavior, but otherwise I'd lean toward types too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. Wasn't it already like this before my PR and all I did was splitting the code into different modules.

I'm generally in favor of structs. It makes accidental schema changes easier to spot

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah sorry, I didn't realize this was copied code. I thought it was newly written.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I should have pointed it out to help with the review. I opened #19270 to explore using structs.


s.end()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like if you're using the json! macro everywhere already, it's probably not worth a hand-impl of the Serialize trait? There's a bit more ceremony with it compared to just building a Value::Array(Vec<Value>) directly.

ntBre added a commit that referenced this pull request Jul 10, 2025
Summary
--

I spun this off from #19133 to be sure to get an accurate baseline
before modifying any of the formats. I picked the code snippet to
include a lint diagnostic with a fix, one without a fix, and one syntax
error. I'm happy to expand it if there are any other kinds we want to
test.

I initially passed `CONTENT` on stdin, but I was a bit surprised to
notice that some of our output formats include an absolute path to the
file. I switched to a `TempDir` to use the `tempdir_filter`.

Test Plan
--

New CLI tests
ntBre added a commit that referenced this pull request Jul 11, 2025
## Summary

See #19133 (comment)
for recent discussion. This PR moves to using structs for the types in
our JSON output format instead of the `json!` macro.

I didn't rename any of the `message` references because that should be
handled when rebasing #19133 onto this.

My plan for handling the `preview` behavior with the new diagnostics is
to use a wrapper enum. Something like:

```rust
#[derive(Serialize)]
#[serde(untagged)]
pub(crate) enum JsonDiagnostic<'a> {
    Old(OldJsonDiagnostic<'a>),
}

#[derive(Serialize)]
pub(crate) struct OldJsonDiagnostic<'a> {
    // ...
}
```

Initially I thought I could use a `&dyn Serialize` for the affected
fields, but I see that `Serialize` isn't dyn-compatible in testing this
now.

## Test Plan

Existing tests. One quirk of the new types is that their fields are in
alphabetical order. I guess `json!` sorts the fields alphabetically? The
tests were failing before I sorted the struct fields.

## Other formats

It looks like the `rdjson`, `sarif`, and `gitlab` formats also use
`json!`, so if we decide to merge this, I can do something similar for
those before moving them to the new diagnostic format.
ntBre added 3 commits July 11, 2025 11:06
the remaining `message` fields are in the actual JSON schema and seem
appropriate. they aren't referring to the old internal `Message` implemenation,
that's just a reasonable field name for the diagnostic's main message.
}
};

json!(value)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit silly, but callers of this function expect a Display type. This avoids having to update the callers to use serde_json::to_writer, which in turn avoids updating their arguments from std::fmt::Formatters to something implementing std::io::Write and their return types to something that can accommodate a serde_json::Error instead of a std::fmt::Error.

We could also just call serde_json::to_value and unwrap it here or in the callers, which is all this macro expands into.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably move the json! call to where you need the Display implementation.

Comment on lines +86 to +87
"column": 1,
"row": 5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually correct based on testing a released version of Ruff against the real notebook. I think the manual notebook cell index was outdated.

Comment on lines 225 to 247
enum JsonDiagnostic<'a> {
Old {
cell: Option<OneIndexed>,
code: Option<&'a SecondaryCode>,
end_location: JsonLocation,
filename: &'a str,
fix: Option<JsonFix<'a>>,
location: JsonLocation,
message: &'a str,
noqa_row: Option<OneIndexed>,
url: Option<String>,
},
New {
cell: Option<OneIndexed>,
code: Option<&'a SecondaryCode>,
end_location: Option<JsonLocation>,
filename: Option<&'a str>,
fix: Option<JsonFix<'a>>,
location: Option<JsonLocation>,
message: &'a str,
noqa_row: Option<OneIndexed>,
url: Option<String>,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an enum here strikes me a bit weird. I would either use two separate structs or just ignore the old entirely and ensure that the values are all Some where the change isn't backwards compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, for some reason just wrapping the unwrap_or_default calls in Some didn't occur to me. That makes a lot more sense. Thanks!

@ntBre
Copy link
Contributor Author

ntBre commented Jul 11, 2025

I'm going to merge this and start working on more output formats, happy to follow up if I missed anything. Thanks for the reviews!

@ntBre ntBre merged commit b5c5f71 into main Jul 11, 2025
37 checks passed
@ntBre ntBre deleted the brent/json-rendering branch July 11, 2025 19:04
ntBre added a commit that referenced this pull request Jul 15, 2025
## Summary

Another output format like #19133. This is the
[reviewdog](https://github.com/reviewdog/reviewdog) output format, which
is somewhat similar to regular JSON. Like #19270, in the first commit I
converted from using `json!` to `Serialize` structs, then in the second
commit I moved the module to `ruff_db`.

The reviewdog
[schema](https://github.com/reviewdog/reviewdog/blob/320a8e73a94a09248044314d8ca326a6cd710692/proto/rdf/jsonschema/DiagnosticResult.json)
seems a bit more flexible than our JSON schema, so I'm not sure if we
need any preview checks here. I'll flag the places I wasn't sure about
as review comments.

## Test Plan

New tests in `rdjson.rs`, ported from the old `rjdson.rs` module, as
well as the new CLI output tests.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
diagnostics Related to reporting of diagnostics. internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants