Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph partitioning interface #28

Closed
mtfishman opened this issue Dec 20, 2022 · 7 comments
Closed

Graph partitioning interface #28

mtfishman opened this issue Dec 20, 2022 · 7 comments

Comments

@mtfishman
Copy link
Member

mtfishman commented Dec 20, 2022

Currently we have functions:

using NamedGraphs
using ITensors
using ITensorNetworks
using Metis

tn = ITensorNetwork(named_grid((4, 4)); link_space=4)
partition(tn, 2)
subgraphs(tn, nv(tn) ÷ 2)

for partitioning graphs, which output:

julia> partition(tn, 2)
16-element Dictionaries.Dictionary{Tuple{Int64, Int64}, Int64}
 (1, 1) │ 1
 (2, 1) │ 1
 (3, 1) │ 1
 (4, 1) │ 1
 (1, 2) │ 1
 (2, 2) │ 1
 (3, 2) │ 1
 (4, 2) │ 1
 (1, 3) │ 2
 (2, 3) │ 2
 (3, 3) │ 2
 (4, 3) │ 2
 (1, 4) │ 2
 (2, 4) │ 2
 (3, 4) │ 2
 (4, 4) │ 2

julia> subgraphs(tn, nv(tn) ÷ 2)
DataGraph{Int64, Vector{Tuple{Int64, Int64}}, Any, NamedGraph{Int64}, NamedEdge{Int64}} with 2 vertices:
2-element Vector{Int64}:
 1
 2

and 1 edge(s):
1 => 2

with vertex data:
2-element Dictionaries.Dictionary{Int64, Vector{Tuple{Int64, Int64}}}
 1 │ [(1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (4, 2)]
 2 │ [(1, 3), (2, 3), (3, 3), (4, 3), (1, 4), (2, 4), (3, 4), (4, 4)]

and edge data:
0-element Dictionaries.Dictionary{NamedEdge{Int64}, Any}

We should make them consistent so that they both can accept the partition size or the number of partitions. Additionally, their names should give a better indication for what they output.

For example, I could imagine a function:

julia> partition_vertices(tn, 2)
2-element Vector{Vector{Tuple{Int64, Int64}}}:
 [(1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (4, 2)]
 [(1, 3), (2, 3), (3, 3), (4, 3), (1, 4), (2, 4), (3, 4), (4, 4)]

julia> partition_vertices(tn; partition_size=nv(tn) ÷ 2)
2-element Vector{Vector{Tuple{Int64, Int64}}}:
 [(1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (4, 2)]
 [(1, 3), (2, 3), (3, 3), (4, 3), (1, 4), (2, 4), (3, 4), (4, 4)]

Or perhaps partition_vertices should output a DataGraph like subgraphs does right now, which also keeps track of which partitions have connected edges.

@JoeyT1994 curious about your opinion on this.

@emstoudenmire
Copy link
Contributor

Agreed the new function interface is better. Would it make sense in the future to allow partitioning by properties too? Like a keyword argument by=... that takes a function that determines which partition to put a vertex into? (Only if there's a use case for it of course.)

@mtfishman
Copy link
Member Author

mtfishman commented Dec 20, 2022

I think that kind of thing would be better handled externally by applying functions like SplitApplyCombine.group to the vertices vertices(tn): https://github.com/JuliaData/SplitApplyCombine.jl#grouping.

The functions I'm discussing above are more focused on graph partitioning based on graph properties, and wrap functionality from packages like Metis.jl and KaHyPar.jl.

@mtfishman
Copy link
Member Author

I could imagine a functionality like:

subgraph_vertices = [
  [(1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (4, 2)],
  [(1, 3), (2, 3), (3, 3), (4, 3), (1, 4), (2, 4), (3, 4), (4, 4)],
]
dg = subgraph_connectivity(tn, subgraph_vertices)

which outputs a graph with edges indicating which subgraphs are connected to each other (like the output of the currently available subgraphs function I show above), in this case an edge 1 => 2 indicating subgraph 1 has at least one edge connected to subgraph 2. The subgraph_vertices input could then be determined using functions like partition_vertices or SplitApplyCombine.group.

@JoeyT1994
Copy link
Contributor

JoeyT1994 commented Dec 21, 2022

Yes I think that a `partition_vertices' function which allows the user to choose partition_number vs partition_size would be helpful.

@emstoudenmire, there are a few cases from BP I can think of where partitioning by properties would be helpful..

For instance if we want to do belief propagation over a wave function tn but don't want to explicitly flatten the network due to memory overhead we can form Z = <tn|tn> without actually doing the contraction via

image

It is then logical to ensure those ITensors in Z which share a site index are placed into the same subgraphs and do belief propagation that way.

Additionally if we do want to efficiently do a Belief Propagation version of DMRG we would need to form (but not contract) <tn|H|tn> and again want to keep all ITensors which share site inds in the same subgraphs.

@mtfishman In the case of Simple Belief Propagation this seems quite easily done with SplitApplyCombine.group (as those tensors which share a site index have all elements of their vertex tuple identical - other than the first). What, however, if I wanted to have larger subgraphs for GBP (e.g. 2*n ITensors of Z in each subgraph, each ITensor must share the subgraph with its conjugate ITensor, and the network of ITensors formed within each subgraph must be connected)?

Perhaps we would need to write up a function for that?

With this functionality the code we have written in beliefpropagation.jl can be used to form SBP or GBP message tensors of such objects without needing any keyword arguments itself, as all of the subgraphing is taken care of elsewhere which is neat.

@mtfishman
Copy link
Member Author

mtfishman commented Dec 21, 2022

Good example, I think it hits on the question of how to do graph partitioning with constraints on which vertices will get grouped together, or another way of seeing it is partitioning a graph that has already been partitioned (i.e. how to work with and represent partitioned graphs).

Here is a solution with current tools that are available, where we do it in two passes, one where you first group the vertices you know you always want grouped together, and next you partition the resulting graph:

using ITensors
using Graphs
using NamedGraphs
using ITensorNetworks
using SplitApplyCombine
using Metis

using ITensorNetworks: findfirst_on_vertices

function subgraph_connectivity(g, vertex_groups)
  dg_subgraphs = DataGraph(NamedGraph(keys(vertex_groups)), vertex_groups)
  for e in edges(g)
    s1 = findfirst_on_vertices(vertex_groups -> src(e)  vertex_groups, dg_subgraphs)
    s2 = findfirst_on_vertices(vertex_groups -> dst(e)  vertex_groups, dg_subgraphs)
    if (!has_edge(dg_subgraphs, s1, s2) && s1 != s2)
      add_edge!(dg_subgraphs, s1, s2)
    end
  end
  return dg_subgraphs
end

s = siteinds("S=1/2", named_grid(8))
tn = ITensorNetwork(s; link_space=2)
Z = prime(tn; sites=[])  tn

# Group the vertices by the first dimension, so pairs vertices up like:
# [[(1, 1), (1, 2)], [(2, 1), (2, 2)], [(3, 1), (3, 2)], ...]
vertex_groups = group(v -> v[1], vertices(Z))

# Uses the function defined above to turn the vertex groups into a DataGraph
# which encodes which vertex groups have edges connected between them
# Probably to provide more information for partitioning, it should store how
# many of the original edges are connected
Z_subgraphs = subgraph_connectivity(Z, vertex_groups)

# Partition the partitioned graph
Z_partitions = subgraphs(Z_subgraphs, 2)

# Group the vertices in terms of the original vertices of `Z`
Z_verts = [reduce(vcat, (Z_subgraphs[p] for p in Z_partitions[v])) for v in vertices(Z_partitions)]

which returns:

4-element Vector{Vector{Tuple{Int64, Int64}}}:
 [(1, 1), (1, 2), (2, 1), (2, 2)]
 [(3, 1), (3, 2), (4, 1), (4, 2)]
 [(5, 1), (5, 2), (6, 1), (6, 2)]
 [(7, 1), (7, 2), (8, 1), (8, 2)]

This is a bit verbose and opaque right now, but I think with some helper functions and improvements to the partitioning interface we can make it nicer. I would prefer a solution for this kind of problem which involves composing various well defined and simple tools instead of making one big function that tries to do everything, since you can imagine there may be many variations that people will want.

This points to an original question I had about designing this package and the graph types, which was that it seemed natural to want to make use of partitioned graphs, for example a tensor network with multiple tensors grouped together and treated as a single vertex. I had actually originally designed a PartitionedGraph type but found that it was unwieldy and too complicated to design. I think that DataGraph can handle partitioned graphs quite naturally, since ultimately partitioned graphs can be viewed as graphs that store subgraphs in the metadata of the vertices (or information defining subgraphs, like subgraph vertex groups). Additionally, the vertex names of a NamedGraph can give some human-level information about the connectivity of the graph, which can help with partitioning as well, like in the example above.

@JoeyT1994
Copy link
Contributor

Hi Matt,

Thanks to the solution to that problem. That is really neat and I agree that writing some helper functions that can turn it into a few lines when it is needed to be used mid-program would be perfect.

Do you think such functions should go into subgraphs.jl? For instance one could be subgraph_connectivity as you wrote above, with the extra that the edge data is number of edges from the original graph actually involved.

Tomorrow, I could add that to subgraphs.jl and make a PR if you wish?

@mtfishman
Copy link
Member Author

Closed by #30.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants