Fix: Downsample before embedding in multi-resolution mode and fix fps… #32

Open
maliozer wants to merge 1 commit into base: main

Ensure correct order of FPS downsampling and embedding for multi-resolution inputs; fix import location of fps

Summary

This PR fixes the following issues in the PerceiverCrossAttentionEncoder._forward() method:

Correct import location for fps:
The fps function from torch_cluster is now imported immediately before its use in the multi-resolution downsampling block, which prevents a NameError.

Correct order of downsampling and embedding:
Multi-resolution FPS downsampling of point clouds and associated features (pc, feats, sharp_pc, sharp_feat) is now performed before any embedding or projection. This ensures that only the downsampled tensors are passed through the embedder and input projections, so shapes are always aligned. No redundant recomputation is performed.
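The import placement described in the first item above can be sketched as follows. This is a minimal illustration, not the repository's code: select_point_indices and its list-of-coordinates input are hypothetical, and the multi-resolution branch assumes torch and torch_cluster are installed. The point is only that the import sits inside the branch that calls fps, so the name is bound whenever that branch runs.

```python
def select_point_indices(points, ratio=0.5, use_multi_reso=False):
    """Return the indices of the points to keep.

    `points` is a list of [x, y, z] coordinates (hypothetical input format).
    """
    if not use_multi_reso:
        # Single-resolution path: keep every point; torch_cluster is never needed.
        return list(range(len(points)))

    # Import directly before the call site, so `fps` is bound in this branch.
    import torch
    from torch_cluster import fps

    x = torch.tensor(points, dtype=torch.float)
    # fps returns a 1-D tensor of selected row indices into x.
    idx = fps(x, ratio=ratio, random_start=False)
    return idx.tolist()
```

With use_multi_reso left at False the function never touches torch_cluster, so the single-resolution path also works on machines where that package is missing.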

Changes

  • Moved the import statement (from torch_cluster import fps) to immediately before its use in the if self.use_multi_reso: block.
  • Input tensors are now downsampled first (when use_multi_reso is enabled) and only then embedded and projected.
  • This prevents shape/dimension mismatches and avoids unnecessary recomputation.
  • All logic paths now consistently process only the current (possibly downsampled) batch.
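The downsample-then-embed ordering listed above can be sketched in plain Python. Everything here is an assumption made for illustration: farthest_point_sampling is a simple greedy stand-in for torch_cluster's fps, toy_embed is a hypothetical placeholder for the real embedder, and encode only mirrors the order of operations, not the actual _forward() method.

```python
import math


def farthest_point_sampling(points, num_samples):
    """Greedy FPS: start at index 0, then repeatedly pick the point
    farthest from everything selected so far."""
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")
    selected = [0]
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(selected) < num_samples:
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


def toy_embed(point):
    """Hypothetical stand-in for the embedder: sin/cos of each coordinate."""
    return [math.sin(c) for c in point] + [math.cos(c) for c in point]


def encode(pc, feats, num_samples, use_multi_reso):
    """Downsample first (if enabled), then embed and attach features."""
    if use_multi_reso:
        idx = farthest_point_sampling(pc, num_samples)
        # Apply the same indices to both inputs so their rows stay paired.
        pc = [pc[i] for i in idx]
        feats = [feats[i] for i in idx]
    # Only the current (possibly downsampled) points are embedded.
    embedded = [toy_embed(p) for p in pc]
    return [e + f for e, f in zip(embedded, feats)]


pc = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0]]
feats = [[10.0], [11.0], [12.0], [13.0]]
rows = encode(pc, feats, num_samples=2, use_multi_reso=True)
print(len(rows), len(rows[0]))
```

Because the indices are applied to pc and feats before anything is embedded, the embedder runs once on the reduced set and the embedded rows and feature rows always have matching counts.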

Related Issue: #31
