
Suggestion: Use Torch primitives for Gaussian blur to vastly speed it up #41

Open
ttulttul opened this issue May 30, 2024 · 1 comment

Comments

@ttulttul
Contributor

The torchvision Gaussian blur function runs on the CPU even if your tensor is on the GPU. Here's a blur function that guarantees the blur runs very quickly on the GPU. Some adaptation may be needed for other tensor shapes; this one works for the latent tensor format:

```python
import torch
import torch.nn.functional as F

def gaussian_blur(tensor, kernel_size=5, sigma=1.0):
    if len(tensor.shape) == 4:  # Batch of images
        batch_size, channels, height, width = tensor.shape
    else:
        raise ValueError("Expected a 4D tensor [B, C, H, W]")

    # Create 1D Gaussian kernel on the input's device and dtype
    # (matching dtype avoids a conv2d error with fp16 latents)
    x = torch.arange(-kernel_size // 2 + 1, kernel_size // 2 + 1,
                     device=tensor.device, dtype=tensor.dtype)
    x = torch.exp(-x**2 / (2 * sigma**2))
    x = x / x.sum()

    # Create 2D Gaussian kernel by outer product
    kernel = x[:, None] * x[None, :]

    # Expand to the conv2d weight shape [out_channels, in_channels/groups, kH, kW]
    kernel = kernel.expand(channels, 1, kernel_size, kernel_size)

    # Apply Gaussian blur as a depthwise (grouped) convolution
    blurred = F.conv2d(tensor, kernel, groups=channels, padding=kernel_size // 2)

    return blurred
```
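Since a Gaussian kernel is separable, the same blur can also be done as two 1D depthwise convolutions, which costs O(k) per pixel instead of O(k²) and matters for large kernel sizes. This is a hedged sketch of that variant, not code from this repo; the function name and defaults are illustrative:

```python
import torch
import torch.nn.functional as F

def gaussian_blur_separable(tensor, kernel_size=5, sigma=1.0):
    """Separable Gaussian blur: one horizontal and one vertical 1D pass.

    Hypothetical alternative to the 2D-kernel version above; assumes an
    odd kernel_size and a 4D [B, C, H, W] input.
    """
    if tensor.dim() != 4:
        raise ValueError("Expected a 4D tensor [B, C, H, W]")
    channels = tensor.shape[1]

    # 1D Gaussian kernel on the input's device and dtype
    x = torch.arange(-(kernel_size // 2), kernel_size // 2 + 1,
                     device=tensor.device, dtype=tensor.dtype)
    k1d = torch.exp(-x**2 / (2 * sigma**2))
    k1d = k1d / k1d.sum()

    pad = kernel_size // 2
    # Horizontal pass: depthwise weight of shape [C, 1, 1, k]
    kh = k1d.view(1, 1, 1, -1).expand(channels, 1, 1, kernel_size)
    out = F.conv2d(tensor, kh, groups=channels, padding=(0, pad))
    # Vertical pass: depthwise weight of shape [C, 1, k, 1]
    kv = k1d.view(1, 1, -1, 1).expand(channels, 1, kernel_size, 1)
    out = F.conv2d(out, kv, groups=channels, padding=(pad, 0))
    return out
```

Both versions keep every op on the tensor's device, so on CUDA the whole blur stays on the GPU; the separable form mainly pays off when the kernel is large.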
@cubiq
Owner

cubiq commented May 30, 2024

that's cool but when set to "GPU" the mask blur takes a fraction of a second even with a kernel size of 256...
