Patch extractor can go (fully) out of bounds

- TIA Toolbox version: 1.4
- Python version: ALL
- Operating System: ALL

### Description

I think the logic for getting the coordinate list in patch extraction isn't quite right.

The following code:
https://github.com/TissueImageAnalytics/tiatoolbox/blob/eb49f66f17c84c44cfc215cefbbf39fa13cedf36/tiatoolbox/tools/patchextraction.py#L455-L462

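This only works if the stride is the same as the patch size. The end of the range is rounded up to a multiple of `patch_output_shape`, but the positions are then generated in steps of `stride_shape`. If the two differ, it's possible that some generated patch locations lie entirely outside the slide.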

### What I Did

An example:
```
import numpy as np
from tiatoolbox.tools.patchextraction import PatchExtractor

wsi_shape = [43668, 14634]
coords = PatchExtractor.get_coordinates(
    image_shape=wsi_shape,
    patch_input_shape=(256, 256),
    stride_shape=(128, 128),
    input_within_bound=False,
)
np.max(coords, axis=0)  # gives array([43648, 14720, 43904, 14976])
```
Note that there are patches with a top-left y coordinate of 14720, but the slide dimension is 14634. That means there are patches which lie wholly outside the slide bounds, which should not be happening. (`input_within_bound=False` just means we allow patches that *overlap* the slide boundary.)
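To see where the stray row comes from without needing a slide, the arithmetic can be reproduced for the y axis alone. This is a minimal sketch in plain NumPy, not the library code itself; it assumes `patch_output_shape` falls back to `patch_input_shape` (256) when it is not passed, which is consistent with the numbers in the example above.

```python
import numpy as np

image_len = 14634  # slide extent along y, from the example
patch_len = 256    # assumed effective patch_output_shape along y
stride = 128

# Current logic: round the end up to a multiple of the patch size...
end = int(np.ceil(image_len / patch_len) * patch_len)
# ...but step through the range with the (smaller) stride.
starts = np.arange(0, end, stride)

# Any start at or beyond the slide extent has no overlap with the slide.
fully_outside = starts[starts >= image_len]
print(end, fully_outside)  # 14848 [14720]
```

The last start position, 14720, is past the slide edge at 14634, so that whole row of patches has no overlap with the image.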

This also raises another discussion point. `WSIReader.read_rect` (or `read_bounds`) will happily read a patch that is entirely outside the bounds of the slide, and will do so silently. It's designed to safely pad regions that overlap the edge of the slide, and that is fine. But I think in most cases, if your code ends up reading patches from entirely outside the slide, there's something wrong somewhere and it would be good to know that it's happening. So we could have a discussion on what the behaviour for this case should be.
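As a starting point for that discussion, here is a rough sketch of the kind of check a reader could run before padding. The helper `is_fully_outside` is hypothetical and not part of TIA Toolbox; it assumes bounds are given as `(x_start, y_start, x_end, y_end)` in the same coordinate frame as the slide dimensions `(width, height)`.

```python
import warnings


def is_fully_outside(bounds, slide_dims):
    """Return True if the rectangle has no overlap with the slide.

    Hypothetical helper: bounds is (x_start, y_start, x_end, y_end),
    slide_dims is (width, height).
    """
    x_start, y_start, x_end, y_end = bounds
    width, height = slide_dims
    return x_end <= 0 or y_end <= 0 or x_start >= width or y_start >= height


slide_dims = (43668, 14634)
bounds = (43648, 14720, 43904, 14976)  # bottom-right patch from the example

if is_fully_outside(bounds, slide_dims):
    # Warning rather than raising is only one of the options to discuss.
    warnings.warn(f"Patch {bounds} lies entirely outside the slide", stacklevel=2)
```

Whether this should warn, raise, or stay silent behind a flag is exactly the open question.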

### Potential Solution

I think the code should look like:
```
        output_x_end = (
            np.ceil(image_shape[0] / stride_shape[0]) * stride_shape[0]
        )
        output_x_list = np.arange(0, int(output_x_end), stride_shape[0])
        output_y_end = (
            np.ceil(image_shape[1] / stride_shape[1]) * stride_shape[1]
        )
        output_y_list = np.arange(0, int(output_y_end), stride_shape[1])
```
With the example above, this gives:
```
wsi_shape = [43668, 14634]
np.max(coords, axis=0) # gives array([43648, 14592, 43904, 14848])
```
Now every patch has at least some overlap with the slide.
This removes the output shape from those equations entirely. I can't see a problem with that, but I'm also not 100% sure why it was there in the first place, so I want to make sure I'm not missing anything.
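A quick way to sanity-check the proposal is to compare both roundings over a few image sizes. This is a sketch under my own assumptions: `start_positions` is a hypothetical helper that mirrors the two snippets for a single axis, and the extra test sizes are arbitrary.

```python
import numpy as np


def start_positions(image_len, stride, round_to):
    """Patch start positions along one axis (hypothetical helper).

    round_to is the patch size for the current code and the stride
    for the proposed code.
    """
    end = int(np.ceil(image_len / round_to) * round_to)
    return np.arange(0, end, stride)


patch_len, stride = 256, 128
results = {}
for image_len in (43668, 14634, 1000, 300):
    current = start_positions(image_len, stride, round_to=patch_len)
    proposed = start_positions(image_len, stride, round_to=stride)
    # True means the last patch starts at or beyond the image edge.
    results[image_len] = (
        bool(current.max() >= image_len),
        bool(proposed.max() >= image_len),
    )

for image_len, (current_bad, proposed_bad) in results.items():
    print(image_len, "current out of bounds:", current_bad,
          "| proposed out of bounds:", proposed_bad)
```

Rounding to the stride means the last start is `(ceil(L / s) - 1) * s`, which is always strictly less than the image length `L`, so the final patch always overlaps the image.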

For reference, the code at the permalink above currently reads:

	output_x_end = (
	    np.ceil(image_shape[0] / patch_output_shape[0]) * patch_output_shape[0]
	)
	output_x_list = np.arange(0, int(output_x_end), stride_shape[0])
	output_y_end = (
	    np.ceil(image_shape[1] / patch_output_shape[1]) * patch_output_shape[1]
	)
	output_y_list = np.arange(0, int(output_y_end), stride_shape[1])