primitive neighbor list #68
Can you tell me a little bit more about your problem? I guess it is not the conversion that takes time, but the fact that a new list of neighbors is created (a new array is allocated) on every call (the return type of the neighborlist function is equivalent to an array of `(i, j, distance)` tuples). It is of course possible to implement an in-place version. But, in general, one uses the neighbor list for something, and with CellListMap you can compute that directly without even having to build the list, which is even faster.
I read that, but I didn't get exactly what you want that is not already provided by the neighborlist function we have.
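As an illustration of that last point (this sketch is mine, not from the thread): the "compute directly" pattern from the CellListMap documentation accumulates a pairwise property with `Box`, `CellList`, and `map_pairwise!` instead of materializing the neighbor list. The 1/d example and the unit cell are arbitrary choices here, and the API names may differ between CellListMap versions.

```python
from juliacall import Main as jl

jl.seval("using CellListMap")

# Sketch only: sum 1/d over all pairs closer than the cutoff, accumulated
# on the fly instead of building and returning a neighbor list.
u = jl.seval("""
let
    x = rand(3, 10_000)                # positions, one particle per column
    box = Box([1.0, 1.0, 1.0], 0.05)   # periodic unit cell, cutoff 0.05
    cl = CellList(x, box)
    map_pairwise!((xi, xj, i, j, d2, u) -> u += 1/sqrt(d2), 0.0, box, cl)
end
""")
print(u)
```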
The motivation for my work is this: for each frame of a trajectory, I need to compute the neighbor list. This neighbor list should be converted to an edge_index = torch.tensor of shape (2, -1), and I should move it to the GPU. The Julia computation gives me a speedup of 180 times, but the conversion of the list to a torch.tensor reduces the speedup to only 2 times.
This is not the primary issue right now. This is the case with crystals, where the unit cell is smaller than the cutoff. We have to repeat the system on the fly (kx, ky, kz) times to get the neighbor list up to the cutoff.
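For reference, a minimal sketch of the edge_index conversion described above, assuming PyTorch is installed and that the pair indices are already available as two NumPy int64 arrays (like the iinds/jinds arrays produced further down in this thread; the placeholder data here is made up):

```python
import numpy as np
import torch

# iinds, jinds: the i/j indices of each neighbor pair (placeholder values).
iinds = np.array([0, 0, 1], dtype=np.int64)
jinds = np.array([1, 2, 2], dtype=np.int64)

# Stack into a (2, n_pairs) long tensor, then move it to the GPU if available.
edge_index = torch.from_numpy(np.stack((iinds, jinds)))
if torch.cuda.is_available():
    edge_index = edge_index.to("cuda", non_blocking=True)
print(edge_index.shape)  # torch.Size([2, 3])
```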
Maybe we could try to return exactly that already. How can I generate one of such tensors in Python? (I don't use torch.) The return type of CellListMap is, I think, the equivalent of a numpy array with a custom dtype. I can check that, and I will implement the in-place version of neighborlist, which will receive such an array as a parameter and update it. I think that, in principle, that will be easy.
Uhm... no, currently we do not support that. I never thought of that possibility. That is not straightforward to implement; for the moment the only idea that comes to mind is to replicate the system. At first sight, I think I would need to define two unit cells. I'll think about that.
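To make the "replicate the system" idea concrete, here is a rough NumPy sketch (my own, not something CellListMap provides) that tiles an orthorhombic cell kx × ky × kz times before handing the coordinates to the neighbor list; de-duplicating image pairs and handling triclinic cells are left out:

```python
import numpy as np

def replicate(positions, cell_lengths, kx, ky, kz):
    """Tile (N, 3) positions of an orthorhombic cell kx*ky*kz times.

    Rough sketch only: it ignores triclinic cells and does not remove the
    duplicate pairs that a proper periodic neighbor list would handle.
    """
    images = []
    for ix in range(kx):
        for iy in range(ky):
            for iz in range(kz):
                shift = np.array([ix, iy, iz]) * np.asarray(cell_lengths)
                images.append(positions + shift)
    return np.concatenate(images, axis=0)

# Example: a small 2 x 2 x 2 cell replicated 3 times in each direction.
x = np.random.random((10, 3)) * 2.0
x_big = replicate(x, (2.0, 2.0, 2.0), 3, 3, 3)
print(x_big.shape)  # (270, 3)
```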
I am a newbie in Python, so let me know how bad this is in terms of performance:

```python
In [4]: from juliacall import Main as jl

In [5]: x = jl.seval("[ (i,i,Float64(i)) for i in 1:3 ]")

In [6]: x
Out[6]:
3-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 1, 1.0)
 (2, 2, 2.0)
 (3, 3, 3.0)

In [7]: import numpy as np

In [8]: y = np.full(len(x), (0,0,0.0), dtype="i,i,f")

In [9]: for i in range(0,len(y)):
   ...:     y[i] = x[i]
   ...:

In [10]: y
Out[10]:
array([(1, 1, 1.), (2, 2, 2.), (3, 3, 3.)],
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<f4')])
```

This will copy the elements of the output list into the preallocated numpy array `y`.
I have played a little with converting the list output by CellListMap into a regular numpy array, and with this option there doesn't seem to be a large overhead in copying the data. The key is to perform the copy with a simple Julia function, and not on the Python side. I also tested preallocating all data, but that had only a minor effect.

Basically, if copying the data with the Julia function, you get:

```python
In [8]: %timeit convert.run_withcopy_juliaside(x)
49.1 ms ± 568 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

But if the data copying is done with a Python loop:

```python
In [9]: %timeit convert.run_withcopy_pyside(x)
1.27 s ± 80.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

The complete code is here; it can probably guide a practical implementation of this in general:

```python
"""
# Execute with:
import numpy as np
import convert
x = np.random.random((50_000,3))
%timeit convert.run_nocopy(x)
%timeit convert.run_withcopy_juliaside(x)
%timeit convert.run_withcopy_pyside(x)
xt = x.transpose()
iinds = np.full((0,), 0, dtype=np.int64)
jinds = np.full((0,), 0, dtype=np.int64)
# first call: needs to allocate all indices
%timeit convert.run_inplace(xt, iinds, jinds)
# second call: the list is almost of the correct size
%timeit convert.run_inplace(xt, iinds, jinds)
"""
from juliacall import Main as jl
import numpy as np
import timeit

jl.seval("using CellListMap")

# Simple Julia function to copy the data on the Julia side
jl.seval("""
function copy_to_np(jl_list, iinds, jinds)
    for i in eachindex(jl_list)
        iinds[i] = jl_list[i][1]
        jinds[i] = jl_list[i][2]
    end
    return nothing
end
""")

# Python function to resize arrays of indices and copy with the Julia function
def copy_from_jl(jl_list, iinds, jinds):
    iinds.resize((len(jl_list),), refcheck=False)
    jinds.resize((len(jl_list),), refcheck=False)
    jl.copy_to_np(jl_list, iinds, jinds)

# Python function to resize arrays of indices and copy with a Python loop
def copy_from_py(jl_list, iinds, jinds):
    iinds.resize((len(jl_list),), refcheck=False)
    jinds.resize((len(jl_list),), refcheck=False)
    for i in range(0, len(jl_list)):
        iinds[i] = jl_list[i][0]
        jinds[i] = jl_list[i][1]

# Run the neighborlist computation only
def run_nocopy(x):
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)

# Run and copy using the Julia function
def run_withcopy_juliaside(x):
    iinds = np.full((0,), 0, dtype=np.int64)
    jinds = np.full((0,), 0, dtype=np.int64)
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_jl(jl_list, iinds, jinds)
    return iinds, jinds

# Run and copy using the Python function
def run_withcopy_pyside(x):
    iinds = np.full((0,), 0, dtype=np.int64)
    jinds = np.full((0,), 0, dtype=np.int64)
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_py(jl_list, iinds, jinds)
    return iinds, jinds

# Run but do all in place to test the cost of allocations
def run_inplace(xt, iinds, jinds):
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_jl(jl_list, iinds, jinds)
    return iinds, jinds
```
Thanks for the code, I will run it this week to benchmark the performance.
My test also shows no improvement with pre-allocation.
It works fine and I get 20-100x faster neighbor list generation! Closing this.
Nice, thanks for the feedback. I will add something like that to the user guide.
I've added a simpler interface for Python users here: https://m3g.github.io/CellListMap.jl/stable/python/ You may also be able to run the Julia code multi-threaded and get some additional speedup.
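Regarding the multi-threading suggestion, a hedged sketch of one way to try it: juliacall reads its configuration from environment variables at import time, so the thread count has to be set before the first import (variable names as documented for juliacall at the time of writing; check the docs for your version):

```python
import os

# Must be set before juliacall is imported for the first time.
os.environ["PYTHON_JULIACALL_THREADS"] = "8"           # or "auto"
os.environ["PYTHON_JULIACALL_HANDLE_SIGNALS"] = "yes"  # recommended with threads

from juliacall import Main as jl
print(jl.seval("Threads.nthreads()"))  # check how many threads Julia actually got
```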
Hi,
Do you have support for a primitive neighbor list? https://wiki.fysik.dtu.dk/ase/_modules/ase/neighborlist.html
I am also using CellListMap in Python; the memory copy from the list to numpy and torch takes a long time. Is it possible to return a numpy array directly?
Best