Commit 6f2c2bb

shahar-biron, gkorland, AviAvni, and coderabbitai[bot] authored
Add documentation for missing procedures (fixes #268) (#269)
* Add documentation for missing procedures (fixes #268)

  - Add comprehensive documentation for algo.MSF (Minimum Spanning Forest)
  - Add db.idx.fulltext.queryRelationships to procedures table
  - Add db.idx.vector.queryNodes to procedures table with reference to vector indexing docs
  - Add db.idx.vector.queryRelationships to procedures table with reference to vector indexing docs
  - Update algorithms index to include MSF

  All four procedures mentioned in issue #268 are now documented:

  - algo.MSF: New dedicated page with examples and use cases
  - db.idx.fulltext.queryRelationships: Added to procedures table (already documented in indexing.md)
  - db.idx.vector.queryNodes: Added to procedures table (already documented in indexing.md)
  - db.idx.vector.queryRelationships: Added to procedures table (already documented in indexing.md)

* Fix PR issues: markdown, link fragments, and spelling

  - Fix invalid link fragment in procedures.md (removed broken #BFS anchor)
  - Add language specification to code block in msf.md (line 92)
  - Replace 'very large graphs' with specific metric '100K+ nodes'
  - Add missing technical terms to wordlist: MST, Kruskal, Prim

* Remove accidentally committed files and fix spelling

  - Remove FalkorDB_Product_Overview.md and PR_SUMMARY_268.md
  - Add MSTs, Kruskal's, and Prim's to wordlist

* Add MSF to wordlist

* Fix alphabetical order in wordlist per CodeRabbit review

* Apply suggestion from @coderabbitai[bot]

  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: Guy Korland <gkorland@gmail.com>
Co-authored-by: Avi Avni <avi.avni@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent efca0ae commit 6f2c2bb

4 files changed

+192
-3
lines changed

.wordlist.txt

Lines changed: 7 additions & 2 deletions
@@ -395,10 +395,15 @@ propvalue
395395
ro
396396
GenAI
397397

398-
WCC
399-
WSL
398+
Kruskal's
399+
MSF
400+
MST
401+
MSTs
402+
Prim's
400403
SPpath
401404
SSpath
405+
WCC
406+
WSL
402407

403408
undirected
404409
preprocessing

algorithms/index.md

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ This overview summarizes the available algorithms and links to their individual
3434
- **[SSpath](./sspath.md)**
3535
Enumerates all paths from a single source node to other nodes, based on constraints like edge filters and depth.
3636

37+
- **[MSF](./msf.md)**
38+
Computes the Minimum Spanning Forest of a graph, finding the minimum spanning tree for each connected component.
39+
3740
For path expressions like `shortestPath()` used directly in Cypher queries, refer to the [Cypher Path Functions section](../cypher/functions.md#path-functions).
3841

3942
## Centrality Measures

algorithms/msf.md

Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
1+
---
2+
title: "MSF"
3+
description: "Minimum Spanning Forest Algorithm"
4+
parent: "Algorithms"
5+
nav_order: 9
6+
---
7+
8+
# Minimum Spanning Forest (MSF)
9+
10+
The Minimum Spanning Forest algorithm computes the minimum spanning forest of a graph. A minimum spanning forest is a collection of minimum spanning trees, one for each connected component in the graph.
11+
12+
## What is a Minimum Spanning Forest?
13+
14+
- For a **connected graph**, the MSF is a single minimum spanning tree (MST) that connects all nodes with the minimum total edge weight
15+
- For a **disconnected graph**, the MSF consists of multiple MSTs, one for each connected component
16+
- The forest contains no cycles and has exactly `N - C` edges, where `N` is the number of nodes and `C` is the number of connected components
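
The `N - C` edge count can be checked independently of the database. The following is a minimal Python sketch, not part of FalkorDB: the function name `expected_forest_size` and the in-memory node and edge lists are hypothetical, chosen only to illustrate the property. It counts connected components with a depth-first search and subtracts that count from the number of nodes.

```python
from collections import defaultdict


def expected_forest_size(nodes, edges):
    """Return N - C, the number of edges in any spanning forest of the graph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Direction is ignored: a spanning forest is defined on the undirected graph.
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        # Every unseen node starts a new connected component.
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return len(nodes) - components


# Two components, {A, B, C} and {X, Y}: 5 nodes - 2 components = 3 edges.
nodes = ["A", "B", "C", "X", "Y"]
edges = [("A", "B"), ("B", "C"), ("X", "Y")]
print(expected_forest_size(nodes, edges))  # 3
```

A result set with more than `N - C` rows would necessarily contain a cycle, which makes this count a cheap sanity check on any spanning forest output.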
17+
18+
## Use Cases
19+
20+
- **Network Design**: Minimize cable/pipeline costs when connecting multiple locations
21+
- **Clustering**: Identify natural groupings in data by analyzing the forest structure
22+
- **Image Segmentation**: Group similar pixels using edge weights as similarity measures
23+
- **Road Networks**: Optimize road construction to connect all cities with minimum cost
24+
25+
## Syntax
26+
27+
```cypher
28+
CALL algo.MSF(
29+
config: MAP
30+
) YIELD src, dest, weight, relationshipType
31+
```
32+
33+
### Parameters
34+
35+
| Parameter | Type | Description |
36+
|-----------|------|-------------|
37+
| `config` | MAP | Configuration map containing algorithm parameters |
38+
39+
#### Configuration Options
40+
41+
| Option | Type | Required | Default | Description |
42+
|--------|------|----------|---------|-------------|
43+
| `sourceNodes` | List of Nodes | No | All nodes | Starting nodes for the algorithm. If not provided, all nodes in the graph are considered |
44+
| `relationshipTypes` | List of Strings | No | All types | Relationship types to traverse. If not provided, all relationship types are considered |
45+
| `relationshipWeightProperty` | String | No | `null` | Property name containing edge weights. If not specified, all edges have weight 1.0 |
46+
| `defaultValue` | Float | No | `1.0` | Default weight for edges that don't have the weight property |
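
One way to read the interaction between `relationshipWeightProperty` and `defaultValue` is sketched below in Python. This is an interpretation of the table above, not FalkorDB's implementation; `resolve_weight` and its arguments are hypothetical names used only for illustration.

```python
def resolve_weight(properties, weight_property=None, default_value=1.0):
    """Pick the weight of one edge from its property map.

    Assumed rules, taken from the options table:
    - no weight property configured: every edge weighs 1.0
    - property configured but absent on this edge: use default_value
    - otherwise: use the property's numeric value
    """
    if weight_property is None:
        return 1.0
    value = properties.get(weight_property)
    if value is None:
        return float(default_value)
    return float(value)


print(resolve_weight({"distance": 4}, "distance"))   # 4.0
print(resolve_weight({}, "distance", 2.5))           # 2.5
print(resolve_weight({"distance": 4}))               # 1.0
```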
47+
48+
### Returns
49+
50+
| Field | Type | Description |
51+
|-------|------|-------------|
52+
| `src` | Node | Source node of the edge in the spanning forest |
53+
| `dest` | Node | Destination node of the edge in the spanning forest |
54+
| `weight` | Float | Weight of the edge |
55+
| `relationshipType` | String | Type of the relationship |
56+
57+
## Examples
58+
59+
### Example 1: Basic MSF with Unweighted Graph
60+
61+
Find the minimum spanning forest treating all edges equally:
62+
63+
```cypher
64+
CALL algo.MSF({}) YIELD src, dest, weight, relationshipType
65+
RETURN src.name AS source, dest.name AS destination, weight, relationshipType
66+
```
67+
68+
### Example 2: MSF with Weighted Edges
69+
70+
Consider a graph representing cities connected by roads with distances:
71+
72+
```cypher
73+
// Create a weighted graph
74+
CREATE (a:City {name: 'A'}), (b:City {name: 'B'}), (c:City {name: 'C'}),
75+
(d:City {name: 'D'}), (e:City {name: 'E'})
76+
CREATE (a)-[:ROAD {distance: 2}]->(b),
77+
(a)-[:ROAD {distance: 3}]->(c),
78+
(b)-[:ROAD {distance: 1}]->(c),
79+
(b)-[:ROAD {distance: 4}]->(d),
80+
(c)-[:ROAD {distance: 5}]->(d),
81+
(d)-[:ROAD {distance: 6}]->(e)
82+
83+
// Find minimum spanning forest using distance weights
84+
CALL algo.MSF({
85+
relationshipWeightProperty: 'distance'
86+
}) YIELD src, dest, weight
87+
RETURN src.name AS from, dest.name AS to, weight AS distance
88+
ORDER BY weight
89+
```
90+
91+
**Result:**
92+
```text
93+
from | to | distance
94+
-----|----|---------
95+
B | C | 1.0
96+
A | B | 2.0
98+
B | D | 4.0
99+
D | E | 6.0
100+
```
101+
102+
### Example 3: MSF on Specific Relationship Types
103+
104+
Find the spanning forest considering only specific relationship types:
105+
106+
```cypher
107+
CALL algo.MSF({
108+
relationshipTypes: ['ROAD', 'HIGHWAY'],
109+
relationshipWeightProperty: 'distance'
110+
}) YIELD src, dest, weight, relationshipType
111+
RETURN src.name AS from, dest.name AS to, weight, relationshipType
112+
```
113+
114+
### Example 4: MSF Starting from Specific Nodes
115+
116+
Compute the spanning forest starting from a subset of nodes:
117+
118+
```cypher
119+
MATCH (start:City) WHERE start.name IN ['A', 'B']
120+
WITH collect(start) AS startNodes
121+
CALL algo.MSF({
122+
sourceNodes: startNodes,
123+
relationshipWeightProperty: 'distance'
124+
}) YIELD src, dest, weight
125+
RETURN src.name AS from, dest.name AS to, weight
126+
```
127+
128+
### Example 5: Disconnected Graph
129+
130+
For a graph with multiple components, MSF returns multiple trees:
131+
132+
```cypher
133+
// Create two disconnected components
134+
CREATE (a:Node {name: 'A'})-[:CONNECTED {weight: 1}]->(b:Node {name: 'B'}),
135+
(b)-[:CONNECTED {weight: 2}]->(c:Node {name: 'C'}),
136+
(x:Node {name: 'X'})-[:CONNECTED {weight: 3}]->(y:Node {name: 'Y'})
137+
138+
// Find MSF
139+
CALL algo.MSF({
140+
relationshipWeightProperty: 'weight'
141+
}) YIELD src, dest, weight
142+
RETURN src.name AS from, dest.name AS to, weight
143+
```
144+
145+
**Result:** Two separate trees (A-B-C and X-Y)
146+
147+
## Algorithm Details
148+
149+
FalkorDB's MSF implementation uses an efficient matrix-based approach optimized for graph databases:
150+
151+
1. **Connected Components**: First identifies all connected components in the graph
152+
2. **MST per Component**: Computes a minimum spanning tree for each component using a variant of Kruskal's or Prim's algorithm
153+
3. **Edge Selection**: Selects edges in order of increasing weight, avoiding cycles
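
The edge-selection step can be illustrated with a textbook Kruskal's algorithm. The Python sketch below is an assumption-laden illustration and not FalkorDB's matrix-based implementation: `minimum_spanning_forest` is a hypothetical helper, and the graph is the city graph from Example 2 written as plain tuples. A union-find structure tracks which nodes are already connected, so an edge is kept only when it joins two different trees.

```python
def minimum_spanning_forest(nodes, edges):
    """Kruskal's algorithm; edges is a list of (src, dest, weight) tuples."""
    parent = {node: node for node in nodes}

    def find(node):
        # Walk up to the root, halving the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    forest = []
    # Consider edges in order of increasing weight.
    for src, dest, weight in sorted(edges, key=lambda edge: edge[2]):
        root_src, root_dest = find(src), find(dest)
        if root_src != root_dest:
            # The edge joins two separate trees, so it cannot close a cycle.
            parent[root_src] = root_dest
            forest.append((src, dest, weight))
    return forest


nodes = ["A", "B", "C", "D", "E"]
edges = [
    ("A", "B", 2), ("A", "C", 3), ("B", "C", 1),
    ("B", "D", 4), ("C", "D", 5), ("D", "E", 6),
]
for edge in minimum_spanning_forest(nodes, edges):
    print(edge)
# ('B', 'C', 1)
# ('A', 'B', 2)
# ('B', 'D', 4)
# ('D', 'E', 6)
```

The edge `A-C` (weight 3) is rejected because `A` and `C` are already connected through `B`, and `C-D` (weight 5) is rejected for the same reason once `B-D` is in the forest. Four edges remain, which matches `N - C = 5 - 1`.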
154+
155+
### Performance Characteristics
156+
157+
- **Time Complexity**: O(E log V) where E is the number of edges and V is the number of vertices
158+
- **Space Complexity**: O(V + E)
159+
- **Optimized**: Uses sparse matrix representation for efficient computation
160+
161+
## Best Practices
162+
163+
1. **Weight Properties**: Ensure weight properties are numeric (integers or floats)
164+
2. **Missing Weights**: Use `defaultValue` to handle edges without weight properties
165+
3. **Large Graphs**: For large graphs (100K+ nodes), consider filtering by `sourceNodes` or `relationshipTypes`
166+
4. **Directed vs Undirected**: The algorithm treats relationships as undirected for spanning forest purposes
167+
168+
## Related Algorithms
169+
170+
- **[WCC (Weakly Connected Components)](./wcc.md)**: Identify connected components before running MSF
171+
- **[BFS](./bfs.md)**: Traverse the resulting spanning forest
172+
- **[SPpath](./sppath.md)**: Find shortest paths between two nodes. Note that a path within the spanning forest is not necessarily the shortest path in the original graph
173+
174+
## See Also
175+
176+
- [Cypher Procedures](../cypher/procedures.md)
177+
- [Graph Algorithms Overview](./index.md)

cypher/procedures.md

Lines changed: 5 additions & 1 deletion
@@ -39,6 +39,10 @@ GRAPH.QUERY social "CALL db.labels() YIELD label"
3939
| db.idx.fulltext.createNodeIndex | `label`, `property` [, `property` ...] | none | Builds a full-text searchable index on a label and the 1 or more specified properties. |
4040
| db.idx.fulltext.drop | `label` | none | Deletes the full-text index associated with the given label. |
4141
| db.idx.fulltext.queryNodes | `label`, `string` | `node`, `score` | Retrieve all nodes that contain the specified string in the full-text indexes on the given label. |
42+
| db.idx.fulltext.queryRelationships | `relationshipType`, `string` | `relationship`, `score` | Retrieve all relationships that contain the specified string in the full-text indexes on the given relationship type. See [Full-Text Indexing](/cypher/indexing#full-text-indexing) for details. |
43+
| db.idx.vector.queryNodes | `label`, `attribute`, `k`, `query` | `node`, `score` | Retrieve up to k nodes with vectors most similar to the query vector using the specified label and attribute. See [Vector Indexing](/cypher/indexing#vector-indexing) for details. |
44+
| db.idx.vector.queryRelationships | `relationshipType`, `attribute`, `k`, `query` | `relationship`, `score` | Retrieve up to k relationships with vectors most similar to the query vector using the specified relationship type and attribute. See [Vector Indexing](/cypher/indexing#vector-indexing) for details. |
4245
| algo.pageRank | `label`, `relationship-type` | `node`, `score` | Runs the pagerank algorithm over nodes of given label, considering only edges of given relationship type. |
43-
| [algo.BFS](#BFS) | `source-node`, `max-level`, `relationship-type` | `nodes`, `edges` | Performs BFS to find all nodes connected to the source. A `max level` of 0 indicates unlimited and a non-NULL `relationship-type` defines the relationship type that may be traversed. |
46+
| algo.BFS | `source-node`, `max-level`, `relationship-type` | `nodes`, `edges` | Performs BFS to find all nodes connected to the source. A `max-level` of 0 indicates unlimited and a non-NULL `relationship-type` defines the relationship type that may be traversed. See [BFS Algorithm](/algorithms/bfs) for details. |
47+
| algo.MSF | `config` | `src`, `dest`, `weight`, `relationshipType` | Computes the Minimum Spanning Forest of the graph. See [MSF Algorithm](/algorithms/msf) for details. |
4448
| dbms.procedures() | none | `name`, `mode` | List all procedures in the DBMS, yields for every procedure its name and mode (read/write). |
