[ty] Remap Jupyter notebook cell indices in `ruff_db` #19698

ntBre · 2025-08-01T20:44:16Z

Summary

This PR remaps ranges in Jupyter notebooks from plain row:column indices in the concatenated source code to cell:row:col, matching Ruff's output. This change is unlikely to land upstream in annotate-snippets, but I didn't see a good way around it.

The remapping logic is taken nearly verbatim from here:

ruff/crates/ruff_linter/src/message/text.rs

Lines 212 to 222 in cd6bf14

    // If we're working with a Jupyter Notebook, skip the lines which are
    // outside of the cell containing the diagnostic.
    if let Some(index) = self.notebook_index {
        let content_end_cell = index.cell(content_end_index).unwrap_or(OneIndexed::MIN);
        while end_index > content_end_index {
            if index.cell(end_index).unwrap_or(OneIndexed::MIN) == content_end_cell {
                break;
            }
            end_index = end_index.saturating_sub(1);
        }
    }
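For readers without the surrounding code, here is a minimal, self-contained sketch of the same clamping idea. `CellIndex` and `clamp_context_end` are hypothetical stand-ins invented for illustration (they are not Ruff's `NotebookIndex` or `OneIndexed` types), and rows are plain 1-based `usize` values.

```rust
/// Hypothetical stand-in for a notebook index: `cells[i]` is the 1-based cell
/// number that concatenated row `i + 1` belongs to.
struct CellIndex {
    cells: Vec<usize>,
}

impl CellIndex {
    /// Returns the cell containing the given 1-based concatenated row, if any.
    fn cell(&self, row: usize) -> Option<usize> {
        self.cells.get(row.checked_sub(1)?).copied()
    }
}

/// Shrinks `end` (the last row of trailing context) until it lies in the same
/// cell as `content_end` (the last annotated row), or until the two are equal.
/// Rows without a known cell fall back to cell 1, mirroring the
/// `unwrap_or(OneIndexed::MIN)` in the quoted snippet.
fn clamp_context_end(index: &CellIndex, content_end: usize, mut end: usize) -> usize {
    let content_cell = index.cell(content_end).unwrap_or(1);
    while end > content_end {
        if index.cell(end).unwrap_or(1) == content_cell {
            break;
        }
        end = end.saturating_sub(1);
    }
    end
}
```

With rows 1-2 in cell 1 and rows 3-6 in cell 2, a diagnostic ending on row 2 with two lines of trailing context (`end = 4`) is clamped back to row 2, so no lines from cell 2 leak into the first cell's code frame.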

Test Plan

New full rendering test for a notebook

I was mainly focused on Ruff, but in local tests this also works for ty:

error[invalid-assignment]: Object of type `Literal[1]` is not assignable to `str`
 --> Untitled.ipynb:cell 1:3:1
  |
1 | import math
2 |
3 | x: str = 1
  | ^
  |
info: rule `invalid-assignment` is enabled by default

error[invalid-assignment]: Object of type `Literal[1]` is not assignable to `str`
 --> Untitled.ipynb:cell 2:3:1
  |
1 | import math
2 |
3 | x: str = 1
  | ^
  |
info: rule `invalid-assignment` is enabled by default

This isn't a duplicate diagnostic, just an unimaginative example:

# cell 1
import math

x: str = 1
# cell 2
import math

x: str = 1
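As a rough illustration of the remapping itself, the sketch below converts a row in the concatenated source into a cell number and a row within that cell. `remap_row` and the `cell_lengths` slice are hypothetical names for this example only, and it assumes each cell above contributes three source lines (treating the `# cell N` markers as labels rather than source).

```rust
/// Maps a 1-based row in the concatenated source to `(cell, row_in_cell)`,
/// both 1-based. `cell_lengths[i]` is the number of lines in cell `i + 1`.
/// Returns `None` if the row is 0 or past the end of the last cell.
fn remap_row(cell_lengths: &[usize], row: usize) -> Option<(usize, usize)> {
    // Concatenated row at which the current cell starts.
    let mut first_row = 1;
    for (i, &len) in cell_lengths.iter().enumerate() {
        if row >= first_row && row < first_row + len {
            return Some((i + 1, row - first_row + 1));
        }
        first_row += len;
    }
    None
}
```

Under that assumption, concatenated row 6 (the second `x: str = 1`) maps to cell 2, row 3, which is the `cell 2:3:1` location shown in the second diagnostic.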

ntBre · 2025-08-01T20:46:59Z

crates/ruff_annotate_snippets/src/renderer/display_list.rs

+pub(crate) enum Position {
+    RowCol(usize, usize),
+    Cell(usize, usize, usize),
+}


I guess another option here is

`struct Position { row: usize, col: usize, cell: Option<usize> }`

or even just adding the `Option<usize>` to the tuple we had before. That could deduplicate a little code in the match above.

I think I'd prefer using an Option with named fields.

Yeah, I like the Option better here as well. Or at the very least, enum variants with named fields. (i.e., which usize is which?)

That sounds good. I think the old code had the names backwards even with only two fields in the tuple, so named fields will be helpful.

ruff/crates/ruff_annotate_snippets/src/renderer/display_list.rs

Lines 252 to 257 in af8587e

    if let Some((col, row)) = pos {
        buffer.append(line_offset, ":", stylesheet.none);
        buffer.append(line_offset, col.to_string().as_str(), stylesheet.none);
        buffer.append(line_offset, ":", stylesheet.none);
        buffer.append(line_offset, row.to_string().as_str(), stylesheet.none);
    }
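To make the named-field suggestion concrete, here is a small sketch of what such a type could look like, with a `Display` impl that produces the `:row:col` and `:cell N:row:col` suffixes seen in the rendered output. This is an illustration only: the field names follow the struct proposed in the thread, but the `Display` impl is an assumption and not how the renderer in `display_list.rs` actually writes into its buffer.

```rust
use std::fmt;

/// Hypothetical source position with named fields. `cell` is only set when
/// the diagnostic comes from a Jupyter notebook.
struct Position {
    row: usize,
    col: usize,
    cell: Option<usize>,
}

impl fmt::Display for Position {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Notebooks get an extra `:cell N` segment before the row and column.
        if let Some(cell) = self.cell {
            write!(f, ":cell {}", cell)?;
        }
        write!(f, ":{}:{}", self.row, self.col)
    }
}
```

With named fields, the row-before-column order is visible at the call site, which avoids the swapped `(col, row)` names in the old tuple.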

github-actions · 2025-08-01T20:54:30Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser · 2025-08-01T21:32:50Z

crates/ruff_annotate_snippets/src/renderer/display_list.rs

+pub(crate) enum Position {
+    RowCol(usize, usize),
+    Cell(usize, usize, usize),
+}


I think I'd prefer using an Option with named fields.

MichaReiser · 2025-08-01T21:34:09Z

crates/ruff_annotate_snippets/src/snippet.rs

+
+    /// The optional cell index in a Jupyter notebook, used for reporting source locations along
+    /// with the ranges on `annotations`.
+    pub(crate) cell_index: Option<usize>,


Hmm, we might need to add this to the Annotation because a Diagnostic with multiple ranges may span multiple cells. Or is this already what a snippet represents: a single code frame?

I believe this is what a snippet represents, based on our RenderableSnippet::new docs:

ruff/crates/ruff_db/src/diagnostic/render.rs

Lines 578 to 579 in 5f149fc

    /// Callers should guarantee that the `input` on every `ResolvedAnnotation`
    /// given is identical.

and this usage in the same function:

ruff/crates/ruff_db/src/diagnostic/render.rs

Lines 593 to 594 in 5f149fc

    let diagnostic_source = &anns[0].diagnostic_source;
    let source = diagnostic_source.as_source_code();

Oh, I guess this is documented just above this too:

ruff/crates/ruff_annotate_snippets/src/snippet.rs

Lines 67 to 70 in 5f149fc

    /// One `Snippet` is meant to represent a single, continuous,
    /// slice of source code that you want to annotate.
    #[derive(Debug)]
    pub struct Snippet<'a> {

I'm not sure this is sufficient. You could have one snippet with multiple annotations, but there's no guarantee that all annotations belong to the same cell. This would also be hard to guarantee in the linter (or formatter) because they run on the concatenated notebook source.

That's why I think that each annotation might need to have its own cell in addition to the snippet itself.

> in addition to the snippet itself

Ah, okay I think I see what you mean. It turns out we have an existing mechanism that helps here. If the context windows for the annotations don't overlap, the secondary annotation gets a small sub-header, which already works for our notebooks:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
  |
 ::: notebook.ipynb:cell 3:4:5
  |
2 | def foo():
3 |     print()
4 |     x = 1
  |     - second cell
  |
help: Remove unused import: `os`

The `- second cell` line is the secondary annotation with a very short underline.

This doesn't currently trigger if the context windows overlap:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
3 | # cell 2
4 | import math
  |        ---- second cell
5 |
6 | print('hello world')
  |
help: Remove unused import: `os`

So there may be a very neat solution here without having to modify the Snippet or Annotation representation. I might just need better bookkeeping of the cell indices when computing context windows. We already truncate the first cell's context in the first example to fit within the cell, so I think this is all that's missing.

Oh, I see. And we also never print the cell if they do overlap. So maybe that's fine as is? But it probably makes sense to at least add a test demonstrating the behavior.

I think we should render something like this in the second case or the line numbers are a bit misleading:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
  |
 ::: notebook.ipynb:cell 2:2:8
  |
1 | # cell 2
2 | import math
  |        ---- second cell
3 |
4 | print('hello world')
  |
help: Remove unused import: `os`

That's the test I'm working on passing now.

Oh, that makes sense!

MichaReiser · 2025-08-02T08:54:24Z

crates/ruff_db/src/diagnostic/render.rs

-            context,
-            anns.iter().map(|ann| ann.line_end).max().unwrap(),
-        );
+        let content_start_index = anns.iter().map(|ann| ann.line_start).min().unwrap();


Can you extend the documentation to account for notebooks?

MichaReiser · 2025-08-02T08:55:55Z

Thanks for working on this. Adding this to the rendering will also improve ty's notebook support :)

ntBre · 2025-08-04T16:36:09Z

This should be ready for another look. We now always render additional annotations in different cells with their own sub-headings, even if the concatenated source lines are adjacent:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8                
  |                                           
1 | # cell 1                                  
2 | import os                                 
  |        ^^                                 
  |                                           
 ::: notebook.ipynb:cell 2:2:8                
  |                                           
1 | # cell 2                                  
2 | import math                               
  |        ---- second cell                   
3 |                                           
4 | print('hello world')                      
  |                                           
help: Remove unused import: `os`

in this concatenated source:

# cell 1            
import os           
# cell 2            
import math         
                    
print('hello world')

I double-checked that we reuse the same snippet if they are in the same cell too:

error[test-diagnostic]: main diagnostic message
 --> notebook.ipynb:cell 2:2:8                 
  |                                            
1 | # cell 2                                   
2 | import math                                
  |        ^^^^ second cell                    
3 |                                            
4 | print('hello world')                       
  | ----- print statement                      
  |                                            
help: Remove `print` statement

I also updated the docs and Position representation, as suggested!
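The behavior described in this comment can be approximated with the sketch below, which splits annotations into groups whenever the cell changes, so each group would get its own header or sub-heading. `Ann`, `group_by_cell`, and `cell_of_row` are hypothetical names for illustration. The actual change handles notebooks inside the context-window computation (`context_{before,after}`), so this shows only the grouping rule, not the implementation.

```rust
/// Hypothetical annotation: a label at a 1-based row of the concatenated source.
#[derive(Debug, Clone, PartialEq)]
struct Ann {
    row: usize,
    label: &'static str,
}

/// Splits annotations (assumed sorted by row, with valid 1-based rows) into
/// per-cell groups. `cell_of_row[i]` is the 1-based cell of concatenated row
/// `i + 1`. Adjacent rows in different cells still end up in different groups.
fn group_by_cell(cell_of_row: &[usize], anns: &[Ann]) -> Vec<(usize, Vec<Ann>)> {
    let mut groups: Vec<(usize, Vec<Ann>)> = Vec::new();
    for ann in anns {
        let cell = cell_of_row[ann.row - 1];
        let same_cell = groups.last().map_or(false, |(last, _)| *last == cell);
        if same_cell {
            // Same cell as the previous annotation: reuse its group.
            groups.last_mut().unwrap().1.push(ann.clone());
        } else {
            // Different cell: start a new group with its own sub-heading.
            groups.push((cell, vec![ann.clone()]));
        }
    }
    groups
}
```

For the concatenated source above (rows 1-2 in cell 1, rows 3-6 in cell 2), annotations on rows 2 and 4 land in two groups, while annotations on rows 4 and 6 share a single group, matching the two rendered examples.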

BurntSushi

I buy what you're selling!

MichaReiser

Nice, this is great!

we want the first new test diagnostic to match the second. i.e. something like this:

```
error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
  |
 ::: notebook.ipynb:cell 2:2:8
  |
1 | # cell 2
2 | import math
  |        ---- second cell
3 |
4 | print('hello world')
  |
help: Remove unused import: `os`
```

I thought this was unreachable when adding it (#19644 (comment)), but obviously it's tested here! Without the spacing tweak, the snapshots were rendering like this:

```
error[unused-import][*] : `os` imported but unused
error[unused-import][*] : `math` imported but unused
```

with awkward extra spaces before the colons.

ntBre added the `ty` (Multi-file analysis & type inference) and `diagnostics` (Related to reporting of diagnostics) labels Aug 1, 2025

ntBre mentioned this pull request Aug 1, 2025

Use new Diagnostic functionality for Ruff rules #19690

Closed

7 tasks

ntBre marked this pull request as ready for review August 1, 2025 21:23

ntBre requested review from AlexWaygood, BurntSushi, MichaReiser, carljm, dcreager and sharkdp as code owners August 1, 2025 21:23

ntBre removed request for AlexWaygood, carljm, dcreager and sharkdp August 1, 2025 21:23

MichaReiser reviewed Aug 2, 2025

BurntSushi approved these changes Aug 4, 2025

MichaReiser approved these changes Aug 5, 2025

ntBre force-pushed the brent/full-file-diagnostics branch 2 times, most recently from 4603fc8 to 2cfad84 Compare August 5, 2025 14:56

Base automatically changed from brent/full-file-diagnostics to main August 5, 2025 15:20

ntBre added 6 commits August 5, 2025 11:31

add a failing test

662781e

pass the test

6b5c80d

move notebook handling into context_{before,after}, pass test

0c2ae78

struct Position instead of enum

e0b59a4

add a note about notebooks to RenderableSnippet::new docs

89a762b

ntBre added 3 commits August 5, 2025 11:31

clean up an extraneous whitespace change

ad3ca1b

add test where the contexts are in the same cell

a6577e8

ntBre force-pushed the brent/full-jupyter branch from 0d3c8dd to 5601e03 Compare August 5, 2025 15:45

ntBre closed this Aug 5, 2025

ntBre reopened this Aug 5, 2025

ntBre merged commit 5bfffe1 into main Aug 5, 2025
35 checks passed

ntBre deleted the brent/full-jupyter branch August 5, 2025 18:10
