Conversation

lgarrison
Collaborator

In looking at the gridlink/cell pair code for #276, I was reminded that we are explicitly computing the offset between pairs of cells, e.g. here:

const DOUBLE off_xwrap = ((ix + iix) >= 0) && ((ix + iix) < nmesh_x) ? 0.0: ((ix+iix) < 0 ? xwrap:-xwrap);

Which made me think: if we know the offset of each cell pair, and that offset is applied in the kernels, then what's the harm in adding duplicate cell pairs? They will not find duplicate particle pairs, because a cell pair can be added at most once with a positive offset, and once with a negative. So as long as we enforce Rmax < boxsize/2, there ought not to be duplicate counts.
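As a minimal sketch, the wrap-offset logic in that C expression can be mirrored in Python (`ix`, `iix`, `nmesh_x`, and `xwrap` as in the snippet above; the function name is illustrative, not Corrfunc's):

```python
def wrap_offset(ix, iix, nmesh_x, xwrap):
    """Offset applied to particle separations when the neighbour cell
    index ix+iix falls outside the mesh, i.e. crosses the periodic wrap."""
    if 0 <= ix + iix < nmesh_x:
        return 0.0  # neighbour lies inside the mesh: no wrap needed
    # crossed the left edge: shift by +xwrap; crossed the right edge: -xwrap
    return xwrap if ix + iix < 0 else -xwrap

# example: mesh of 4 cells, box of length 100
assert wrap_offset(0, -1, 4, 100.0) == 100.0   # wrap across the left edge
assert wrap_offset(3, +1, 4, 100.0) == -100.0  # wrap across the right edge
assert wrap_offset(1, +1, 4, 100.0) == 0.0     # interior neighbour
```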

So allowing duplicate cell pairs, the usual tests seem to be passing (but those have small Rmax). I did add a simple large Rmax test with two particles, which seems to work.

This is also related to the old issue #210. I did test that reproducer, which fails with an "Rmax too large" error in master but succeeds with this PR.

@manodeep Am I missing something? Is there a counterexample where this doesn't work?

This is relatively low priority, but it does relate to the question of the conditions to avoid duplicates in #276 (comment) (or rather, removes that question entirely).

@lgarrison lgarrison requested a review from manodeep July 19, 2022 18:03
@manodeep
Owner

I don’t think we can add duplicate cell pairs.

Consider a set of points - one that is close to the left edge and another that is close to the right edge (and where Rmax << Lbox). If we add duplicate cell-pairs (with a positive and a negative offset), then that particle pair would be counted twice.

@lgarrison
Collaborator Author

lgarrison commented Jul 21, 2022

But isn't the particle distance with the negative offset very small, and it is thus counted, and the distance with the positive offset very large, and thus not counted?

I implemented it as a test here: https://github.com/manodeep/Corrfunc/pull/277/files#diff-c7d4836bf94aa48cd9d4e4410ddab869be6ec08d83dfc0999ee5eac26cd8d30cR158

and it seems to work correctly. I may have misunderstood you, though. Could you look and see if the test is actually checking the scenario you had in mind?

(adding a milestone just to make the CI happy)

@lgarrison lgarrison added this to the v2.5.0 milestone Jul 21, 2022
@manodeep
Owner

The distance between the two points would be the same in both the positive and negative offset cases. In one case the particle close to the right edge is shifted by xwrap towards the left edge and therefore ends up to the left of the second particle.

For a more concrete 1D example, say two particles A and B are located at x_A = 0 + alpha, and x_B = Lbox - alpha.

  • For a positive offset, x_A would be shifted to x_A' = x_A + Lbox == Lbox + alpha => dx^2 = (x_B - x_A')^2 = (-2*alpha)^2
  • For a negative offset, x_B' = x_B - Lbox == -alpha => dx^2 = (x_A - x_B')^2 = (2*alpha)^2.

This new distance with the negative offset would be the same as computed for the positive offset case. Therefore, for an autocorr with duplicated cell-pairs, particle pairs would get repeated.
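The 1D arithmetic above can be checked numerically (the Lbox and alpha values are arbitrary):

```python
Lbox, alpha = 100.0, 1.0
x_A, x_B = 0.0 + alpha, Lbox - alpha

# positive offset: shift x_A by +Lbox
dx2_pos = (x_B - (x_A + Lbox))**2   # (-2*alpha)**2
# negative offset: shift x_B by -Lbox
dx2_neg = (x_A - (x_B - Lbox))**2   # (+2*alpha)**2

# both offsets recover the same wrapped separation, hence the concern
# that a duplicated cell-pair would double-count this particle pair
assert dx2_pos == dx2_neg == (2*alpha)**2
```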

@lgarrison
Collaborator Author

Sorry, I think I confused the issue by saying "positive offset" and "negative offset". In the case of an autocorrelation, the negative offset is actually "no offset". Offsets are only applied to achieve periodicity (as all particle positions are global), and one of the cell pairs crosses the wrap (the positive offset) and the other does not (no offset).

So in the two-particle example you gave, we have cell A and cell B. The cell pair generation goes as:

  1. We look for cell pairs using cell A as the primary, and find none because of the usual autocorrelation optimization to only consider half the cell pairs (icell < icell2 in the gridlink code).
  2. We look for cell pairs with cell B as the primary, and find two:
    • One at index offset -1, which is not a negative positional offset but a zero offset, because we did not have to cross the wrap
    • One at index offset +1, which does have a positive offset because it crosses the wrap.

Only the second cell pair brings the particles adjacent to each other, so only one particle pair is found, and the count is doubled at the end (the usual doubling since the autocorr considers half the particle pairs).

The cross-correlation case is the same, except we find two cell pairs that generate counts, and the counts are not doubled at the end.

So I think this all works because we know whether we had to cross the wrap to make a cell pair. Thus "duplicate" cell pairs don't have the same positional offset.
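A toy 1D version of the cell-pair generation just described makes the bookkeeping concrete (names and the sign convention are illustrative, not Corrfunc's actual gridlink code):

```python
def cell_pairs_1d(nmesh, boxsize, binref=1):
    """Enumerate (primary, secondary, offset) triplets for an autocorr,
    keeping only pairs with icell2 <= icell (the half-pair optimization)."""
    pairs = []
    for icell in range(nmesh):
        for iix in range(-binref, binref + 1):
            icell2 = (icell + iix) % nmesh
            if icell2 > icell:
                continue  # will be found with the other cell as primary
            # offset applied to separations when the neighbour crosses the wrap
            off = 0.0 if 0 <= icell + iix < nmesh else \
                (boxsize if icell + iix < 0 else -boxsize)
            pairs.append((icell, icell2, off))
    return pairs

pairs = cell_pairs_1d(nmesh=2, boxsize=100.0)
# the cell pair (1, 0) appears twice, but with DIFFERENT offsets:
# once with no offset (interior boundary), once crossing the wrap
assert pairs.count((1, 0, 0.0)) == 1
assert pairs.count((1, 0, -100.0)) == 1
assert len(set(pairs)) == len(pairs)  # no truly identical triplet
```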

I've stepped through the cell pair generation pretty closely, and I think the min-sep-opt will need to be re-validated for this PR to be production-ready. But the more I look at this, the more I think the basic principle is sound.

@manodeep
Owner

Hmm - I may have misunderstood - let me back up. I thought you were saying that adding duplicate cell-pairs was okay (i.e., adding both (cellA, cellB) and (cellB, cellA) as two separate cell-pairs) and would not result in duplicated particle pairs. Is that the case?

@lgarrison
Collaborator Author

Yes, because any time we add a duplicate cell pair, it is with a different offset (xwrap value).

@manodeep
Owner

Not following then how you could avoid duplicate pair-counts if you add duplicate cell-pairs. Let's pretend that we manually add a duplicate cell-pair corresponding to a real one for an auto-corr calculation. In the case without any offsets (i.e., cell-pairs where no domain edges are crossed), how could you avoid duplicating the pair-counts that were already counted in the first real cell-pair?

In the extreme case, where there is only one cell in the domain, then duplicating the cell-pair (i.e., the self-cell-pair), would certainly result in duplicate particle pair counts.

I must be missing something :(

@lgarrison
Collaborator Author

Oh, I definitely agree that if you add a truly identical cell pair—same icell & icell2, and same xwrap—you get duplicate particle counts! My point is just that generate_cell_pairs does not produce such truly identical pairs.

In fact, maybe this suggests the most conservative change is to adjust the duplicate detection to take into account the x/y/zwrap values. I've implemented this in the latest commit. I've also added tests for multiple combinations of bin refinement, max_cells_per_dim, min_sep_opt, and autocorr. What do you think about this approach?
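A sketch of that adjusted duplicate detection (hypothetical names; the real check in Corrfunc is a C macro):

```python
# A neighbour cell counts as a duplicate only if the cell indices AND all
# three wrap values match a pair that was already added.
def already_added(pairs, icell, icell2, xwrap, ywrap, zwrap):
    return (icell, icell2, xwrap, ywrap, zwrap) in pairs

pairs = {(1, 0, 0.0, 0.0, 0.0)}
assert already_added(pairs, 1, 0, 0.0, 0.0, 0.0)          # same wrap: duplicate
assert not already_added(pairs, 1, 0, -100.0, 0.0, 0.0)   # wrapped copy: keep it
```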

@lgarrison
Collaborator Author

Hey @manodeep, have you had a chance to look at the new approach to this problem?

@manodeep
Owner

Thanks for your patience @lgarrison - on parental leave at the moment, so time is extremely limited.

I took another look at the PR - I am not really sure that I understand the improvement. Thus far, Corrfunc has required a minimum of 3 cells for any calculation - otherwise, it fails with an "Rmax too large" error.

I am not seeing how a duplicate cell-pair could be added with different offsets. For an autocorr, a cell-pair (icell, icell2) should always be added with the same set of three offsets. The duplicate cell check macro only checks for the same cell-pair with the first cell as the primary cell - so the offsets should always be the same (i.e., there is no need to pass the xwrap into the macro). Now, if we consider the opposite cell-pair, i.e., (icell2, icell), then the relevant offsets would be negated, but then it would fail the icell2 > icell test. Once again, I feel that I am missing something... and now I can blame the sleep-deprived brain :)

@lgarrison
Collaborator Author

> Thanks for your patience @lgarrison - on parental leave at the moment, so time is extremely limited.

Oh, no worries! There's no rush.

> I took another look at the PR - I am not really sure that I understand the improvement. Thus far, Corrfunc has required a minimum of 3 cells for any calculation - otherwise, it fails with an "Rmax too large" error.

The improvement is to remove this 3-cell minimum requirement, and thus allow all Rmax < L/2. It also simplifies the Rmax logic when we have per-axis boxsizes, as in #276, which is why I started thinking about this.

> I am not seeing how a duplicate cell-pair could be added with different offsets. For an autocorr, a cell-pair (icell, icell2) should always be added with the same set of three offsets. The duplicate cell check macro only checks for the same cell-pair with the first cell as the primary cell - so the offsets should always be the same (i.e., there is no need to pass the xwrap into the macro). Now, if we consider the opposite cell-pair, i.e., (icell2, icell) then the relevant offsets would be negated but then would fail the icell2 > icell test.

Consider the case of two cells, evenly dividing the box. A particle pair could (1) straddle the boundary in the middle of the box, or (2) straddle the periodic wrap. Thus, we want to add the pair (icell2, icell) in duplicate, with two different offsets: zero, and -L. The zero offset will detect (1); the -L offset will detect (2).

(Recall the gridlink "offset" is the adjustment applied to the particle separations, rather than the distance between cell centers)

So that's why we want to add duplicate cell-pairs with different offsets: to detect both those cases. As you point out, icell2 will be the primary in both (1) and (2) because of the icell2 > icell test.

If you're asking if the duplicate macro check does anything at all, I think the answer is no! But it sounded like in #277 (comment) you were concerned about adversarial examples, hence keeping the check in place.

This same example is stepped through more carefully in #277 (comment). A version of this is also implemented here for concreteness.

@manodeep
Owner

manodeep commented Sep 8, 2022

@lgarrison Think I have finally understood where the update is relevant - the check for duplicate cells was throwing me off. If I understand correctly, the fix is relevant when the number of bins along any dimension is smaller than 2*bin_refine_factor + 1. Specifically, this would be relevant for cases with bin_refs = [1, 1, 1] but rmax such that only two cells are generated. In such a case, you want to add the cell-pair [0, 1] with no positional offset, and then add the (duplicate) cell-pair [0, 1] but with a negative positional offset. That's why the duplicate check requires the xwrap/ywrap/zwrap. Have I understood correctly?

If so, then perhaps the check for duplicate cells could be modified to only run when nxbins <= 2*xbinref || nybins <= 2*ybinref || nzbins <= 2*zbinref?

@lgarrison
Collaborator Author

> @lgarrison Think I have finally understood where the update is relevant - the check for duplicate cells was throwing me off. If I understand correctly, the fix is relevant when the number of bins along any dimension is smaller than 2*bin_refine_factor + 1. Specifically, this would be relevant for cases with bin_refs = [1, 1, 1] but rmax such that only two cells are generated. In such a case, you want to add the cell-pair [0, 1] with no positional offset, and then add the (duplicate) cell-pair [0, 1] but with a negative positional offset. That's why the duplicate check requires the xwrap/ywrap/zwrap. Have I understood correctly?

Yes, exactly!

> If so, then perhaps the check for duplicate cells could be modified to only run when nxbins <= 2*xbinref || nybins <= 2*ybinref || nzbins <= 2*zbinref?

I think this is already the case. The check_for_duplicate_ngb_cells variable does exactly this test: https://github.com/manodeep/Corrfunc/pull/277/files#diff-96f43764787ed164aaf40bc5cb9f61d347d18e748d51a6b29346f3865392eb94R484-R487

So do you think this means this PR is ready to go? I think we can also close #276 pretty quickly after this is merged.

* Update GNU assembler bug detection

* Cosmetic enhancement to suppress spurious warnings during GAS bug test

* Fix test error code
Owner

@manodeep manodeep left a comment

Now that I (mostly) understand what's going on, the changes seem good. Perhaps add one more test with ~100 particle positions, say 50 each around [$0 \pm \delta$, $Lbox/2 \pm \delta$] mod Lbox, and check that the pair counts from Corrfunc match those from a brute-force calculation.

@manodeep
Owner

> > If so, then perhaps the check for duplicate cells could be modified to only run when nxbins <= 2*xbinref || nybins <= 2*ybinref || nzbins <= 2*zbinref?
>
> I think this is already the case. The check_for_duplicate_ngb_cells variable does exactly this test: https://github.com/manodeep/Corrfunc/pull/277/files#diff-96f43764787ed164aaf40bc5cb9f61d347d18e748d51a6b29346f3865392eb94R484-R487

Can't tell what that code fragment points to and reading the macro definition didn't show any signs of the if condition I mentioned. What am I missing?

@lgarrison
Collaborator Author

> Now that I (mostly) understand what's going on, the changes seem good. Perhaps add one more test with ~100 particle positions, say 50 each around [$0 \pm \delta$, $Lbox/2 \pm \delta$] mod Lbox, and check that the pair counts from Corrfunc match those from a brute-force calculation.

Good idea; I've added this test, comparing against brute force, and it passes.

> Can't tell what that code fragment points to and reading the macro definition didn't show any signs of the if condition I mentioned. What am I missing?

Hmm, looks like that link is broken. Here is one that should work:

const int check_for_duplicate_ngb_cells = (any_periodic &&
                                           (nmesh_x < (2*xbin_refine_factor + 1) ||
                                            nmesh_y < (2*ybin_refine_factor + 1) ||
                                            nmesh_z < (2*zbin_refine_factor + 1))) ? 1 : 0;

The macro is only called if this condition holds.

Comment on lines 192 to 207
np.random.seed(1234)
pos = np.empty((3, 100), dtype='f8')
eps = 0.2
boxsize = 123.
bins = np.linspace(0.01, 0.49*boxsize, 20)

pos[:, :50] = np.random.uniform(low=-eps, high=eps, size=(3, 50))*boxsize
pos[:, 50:] = np.random.uniform(low=-eps, high=eps,
                                size=(3, 50))*boxsize + boxsize/2.

pdiff = np.abs(pos[:, None] - pos[:, :, None])
pdiff[pdiff >= boxsize/2] -= boxsize
pdiff = (pdiff**2).sum(axis=0).reshape(-1)
brutecounts, _ = np.histogram(pdiff, bins=bins**2)
assert brutecounts[0] == 2
assert brutecounts[-1] == 172
Owner

This is a bit too hard-coded - could use npts as a variable. I also don't follow the first set of positions - wouldn't line 198 generate positions that are between [-eps, +eps]*boxsize (rather than the more stringent [0, eps] and [boxsize-eps, boxsize])?

How did you get the rhs values in lines 206-207?

Collaborator Author

I should have left a comment; that check was strictly to verify that the comparison with Corrfunc was actually doing something, and not trivially passing due to returning all zeros. The real test is the comparison with Corrfunc which follows below.

Collaborator Author

On the positions, I tried to generate a cloud around (0,0,0) and (L/2,L/2,L/2). Each cloud has width 0.2*boxsize. The relatively large width is so that we can test pairs between the clouds---the two clouds must be within L/2 of each other on the diagonal!

Owner

> I should have left a comment; that check was strictly to verify that the comparison with Corrfunc was actually doing something, and not trivially passing due to returning all zeros. The real test is the comparison with Corrfunc which follows below.

In that case, checking with assert brutecounts[0] > 0 and brutecounts[-1] > 0 might be sufficient. Particularly so if the specific counts might change based on the random number seed.

Owner

import numpy as np
seed = 1234
np.random.seed(seed)

eps = 0.2
boxsize = 123.
npts = 10
pos = np.empty((3, npts), dtype='f8')
pos[:, :npts//2] = np.random.uniform(low=-eps, high=eps, size=(3, npts//2))*boxsize
pos[:, npts//2:] = np.random.uniform(low=-eps, high=eps, size=(3, npts//2))*boxsize + boxsize/2.

pos

array([[-1.51772430e+01,  6.00775154e+00, -3.06379524e+00,
         1.40396423e+01,  1.37748098e+01,  6.45108524e+01,
         6.16516917e+01,  3.75774077e+01,  7.49230698e+01,
         8.03259466e+01],
       [-1.11884438e+01, -1.09979586e+01,  1.48521111e+01,
         2.25404562e+01,  1.84958856e+01,  5.48523904e+01,
         6.71774920e+01,  4.06087571e+01,  5.50461411e+01,
         8.28104930e+01],
       [-6.99539032e+00,  4.89601758e-02,  9.02637641e+00,
         1.04649397e+01, -6.38366286e+00,  6.89478046e+01,
         5.64423668e+01,  7.57055230e+01,  5.24883372e+01,
         6.48504537e+01]])

That snippet produces invalid positions - i.e., positions not in [0, Lbox). Plus, the width of the distribution (which should be ~eps) is multiplied by the boxsize. We can get the expected behaviour by removing the multiplication by boxsize (or changing the semantics of eps such that eps is a fraction of boxsize, i.e., generating the random positions from [-eps*boxsize, +eps*boxsize]) and then also wrapping the positions into [0, Lbox).

Here's my attempt:

import numpy as np
seed = 1234
np.random.seed(seed)

# eps is the width of the point clouds as a fraction of boxsize
eps = 0.01 
boxsize = 123.
npts = 10
pos = np.empty((3, npts), dtype='f8')
pos[:, :npts//2] = np.random.uniform(low=-eps*boxsize, high=eps*boxsize, size=(3, npts//2))
pos[:, npts//2:] = np.random.uniform(low=-eps*boxsize, high=eps*boxsize, size=(3, npts//2)) + boxsize/2.

pos[pos < 0] += boxsize
pos[pos >= boxsize] -= boxsize

pos

array([[1.22241138e+02, 3.00387577e-01, 1.22846810e+02, 7.01982116e-01,
        6.88740488e-01, 6.16505426e+01, 6.15075846e+01, 6.03038704e+01,
        6.21711535e+01, 6.24412973e+01],
       [1.22440578e+02, 1.22450102e+02, 7.42605557e-01, 1.12702281e+00,
        9.24794281e-01, 6.11676195e+01, 6.17838746e+01, 6.04554379e+01,
        6.11773071e+01, 6.25655247e+01],
       [1.22650230e+02, 2.44800879e-03, 4.51318821e-01, 5.23246986e-01,
        1.22680817e+02, 6.18723902e+01, 6.12471183e+01, 6.22102762e+01,
        6.10494169e+01, 6.16675227e+01]])

What do you think?

Collaborator Author

Oops, the positions do indeed need to be wrapped! Using eps as a fraction of the boxsize was intentional; I am using eps=0.2 so that the two clouds are separated by < L/2. This allows us to test the large-Rmax functionality of this PR.

Using the correctly wrapped positions did reveal a bug. If the user has forced nmesh=1 (which ends up doing N^2 on the full particle set), then in the case of an autocorrelation, the icell2 > icell early-exit doesn't trigger (because there's only one cell), so we end up adding duplicate cell pairs.

I think my preferred solution is simply to require a minimum of two cells per dimension. In master, the minimum is 3 already. And nmesh=1 would be more efficient if one were to skip gridlink and thread over particles instead of cell pairs. Basically, I think nmesh=1 is never a good idea.
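The nmesh=1 failure mode can be seen with a toy 1D enumeration (illustrative names, not Corrfunc's code): with a single cell, the icell2 > icell test never skips anything, so the self-cell-pair is added with offsets 0, +L, and -L, and the ±L copies double-count every wrap-straddling particle pair.

```python
def cell_pairs_1d(nmesh, boxsize, binref=1):
    """Toy autocorr cell-pair enumeration keeping icell2 <= icell."""
    pairs = []
    for icell in range(nmesh):
        for iix in range(-binref, binref + 1):
            icell2 = (icell + iix) % nmesh
            if icell2 > icell:
                continue
            off = 0.0 if 0 <= icell + iix < nmesh else \
                (boxsize if icell + iix < 0 else -boxsize)
            pairs.append((icell, icell2, off))
    return pairs

# nmesh=1: the self-pair (0, 0) shows up three times, with offsets +L, 0, -L
assert cell_pairs_1d(1, 100.0) == [(0, 0, 100.0), (0, 0, 0.0), (0, 0, -100.0)]
# nmesh=2: every (cell, cell, offset) triplet is unique, so no double counting
pairs = cell_pairs_1d(2, 100.0)
assert len(set(pairs)) == len(pairs)
```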

…quire nmesh>=2 to avoid duplicate cell pairs
Owner

@manodeep manodeep left a comment

Think the actual code is good to go. I made some minor comments on the testing side.

Comment on lines 204 to 208
pdiff = np.abs(pos[:, None] - pos[:, :, None])
pdiff[pdiff >= boxsize/2] -= boxsize
pdiff = (pdiff**2).sum(axis=0).reshape(-1)
brutecounts, _ = np.histogram(pdiff, bins=bins**2)
print(brutecounts)
Owner

Generally speaking, I am opposed to "clever" test code. What's the performance hit if you write out a stupid for loop?
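For concreteness, the explicit double loop being suggested might look like this (a sketch with a hypothetical function name; pos is the (3, npts) position array and bins the bin edges from the test snippet above):

```python
import numpy as np

def brute_force_counts(pos, boxsize, bins):
    """Explicit-loop brute force: for every ordered particle pair, apply
    the minimum-image convention per axis, then histogram the squared
    separations in bins of squared radius."""
    npts = pos.shape[1]
    sqr_seps = []
    for i in range(npts):
        for j in range(npts):
            dx2 = 0.0
            for k in range(3):
                d = abs(pos[k, i] - pos[k, j])
                if d >= boxsize / 2:
                    d -= boxsize  # wrap: take the nearer periodic image
                dx2 += d * d
            sqr_seps.append(dx2)
    counts, _ = np.histogram(sqr_seps, bins=bins**2)
    return counts
```

This should match the vectorized numpy version pair for pair; the cost is O(npts^2) Python-level iterations, negligible for ~100 points but slow beyond a few thousand.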

Collaborator Author

I'm sure the performance is fine for 100 points, but it's way more lines of code! But more importantly, we're using numpy, and I think we should do things the "numpy way". Python for loops over array indices are definitely not that!

Owner

Sure thing, but then do you mind adding explicit comments to these lines, esp. 204 and 206, noting the shapes of each intermediate and result array? Particularly, if there is any specific assumption about numpy multi-dim indexing, it would be good to note that too.

Collaborator Author

Yes, done!

@manodeep
Owner

manodeep commented Oct 7, 2022

Thanks for your patience - yup, looks good to me now! Once the CI passes, this can be merged.

@lgarrison lgarrison merged commit 23453ae into boxsize-per-dim Oct 7, 2022
@lgarrison
Collaborator Author

Great, thanks! I'll revisit #276 next, once I remind myself what I was doing there 😅

@lgarrison lgarrison deleted the large-rmax branch October 7, 2022 15:52
manodeep added a commit that referenced this pull request Dec 7, 2022
* Implement non-cubic periodic box for theory.DD

* Use boxsize tuple when calling low-level python bindings

* Changelog

* pep8

* Fix gridlink for non-cubic

* Implement turning off periodicity per-dimension by specifying boxsize=-1 in that dimension

* Change binsize calculation to use particle extent, not periodic wrap

* Port per-dimension periodicity and boxsize to other modules

* Fix calling CPython modules

* Extend simple boxsize tests to other modules

* pep8

* Only apply minimum cell criterion in a dimension when the periodic wrap makes wrap-crossing pairs possible

* Assign particles to cell 0 when they are all in a plane

* Add warning when all particles fall in a plane

* Add comments and parens. More explicit particle positions in narrow extent test.

* WIP: enable larger Rmax (up to half the box size) (#277)

* Attempt to enable Rmax comparable to half boxsize by removing (unnecessary?) duplicate cell checks

* Comment out unused var

* Add test implementing Manodeep's example

* Restore duplicate cell pair check, but only count a pair as a duplicate if the wrap value is identical.

* Add pragmas for CI (will revisit)

* Update GNU assembler bug detection (#278)

* Update GNU assembler bug detection

* Cosmetic enhancement to suppress spurious warnings during GAS bug test

* Fix test error code

* Add another test of large Rmax, comparing against brute-force

* Add const qualifiers

* Apply @manodeep's fix for large Rmax test against brute-force, and require nmesh>=2 to avoid duplicate cell pairs

* Make boxsize non-trivial in test. Remove extra print statement.

* Add comments on array broadcasting to test

* Changed variable name for clarity

Co-authored-by: Manodeep Sinha <manodeep@gmail.com>

* Fix missing docstring escapes

* Add tests for anisotropic boxes against brute-force. Allow more boxsize arg types, and tests for those arg types. Remove old large Rmax test.

* Fix Rmax>L/2 for non-periodic. Greatly expand brute-force tests. Fix passing boxsize=None to the Python API.

Co-authored-by: Manodeep Sinha <manodeep@gmail.com>