| 1 | +--- |
| 2 | +title: Document Unfolding and Cycle Detection Reference |
| 3 | +nextjs: |
| 4 | + metadata: |
| 5 | + title: Document Unfolding and Cycle Detection Reference |
| 6 | + description: Understanding @unfoldable annotation, cycle detection, and performance characteristics of document traversal in TerminusDB |
| 7 | + keywords: TerminusDB, @unfoldable, document unfolding, cycle detection, self-referencing documents, performance |
| 8 | + openGraph: |
| 9 | + images: https://assets.terminusdb.com/docs/technical-documentation-terminuscms-og.png |
| 10 | + alternates: |
| 11 | + canonical: https://terminusdb.org/docs/document-unfolding-reference/ |
| 12 | +media: [] |
| 13 | +--- |
| 14 | + |
| 15 | +TerminusDB provides automatic document unfolding for linked documents whose class carries the `@unfoldable` schema annotation. This reference guide explains how unfolding works, how cycle detection prevents infinite recursion, and the performance characteristics of the implementation. |
| 16 | + |
| 17 | +--> Valid as of the 11.2 release. |
| 18 | + |
| 19 | +## What is Document Unfolding? |
| 20 | + |
| 21 | +Document unfolding is the process of automatically expanding referenced documents when retrieving data through the Document API, GraphQL, or WOQL. When a class is marked with `@unfoldable: []`, any references to documents of that class are automatically expanded inline instead of returning just an ID reference. |
| 22 | + |
| 23 | +### Example Schema |
| 24 | + |
| 25 | +```json |
| 26 | +{ |
| 27 | + "@type": "Class", |
| 28 | + "@id": "Person", |
| 29 | + "@unfoldable": [], |
| 30 | + "name": "xsd:string", |
| 31 | + "friend": { |
| 32 | + "@type": "Set", |
| 33 | + "@class": "Person" |
| 34 | + } |
| 35 | +} |
| 36 | +``` |
| 37 | + |
| 38 | +### Unfolded vs Non-Unfolded Results |
| 39 | + |
| 40 | +**Without `@unfoldable` (Reference Only):** |
| 41 | +```json |
| 42 | +{ |
| 43 | + "@id": "Person/Alice", |
| 44 | + "@type": "Person", |
| 45 | + "name": "Alice", |
| 46 | + "friend": ["Person/Bob"] // Just ID strings (friend is a Set) |
| 47 | +} |
| 48 | +``` |
| 49 | + |
| 50 | +**With `@unfoldable` (Automatically Expanded):** |
| 51 | +```json |
| 52 | +{ |
| 53 | + "@id": "Person/Alice", |
| 54 | + "@type": "Person", |
| 55 | + "name": "Alice", |
| 56 | + "friend": [ |
| 57 | + { |
| 58 | + "@id": "Person/Bob", |
| 59 | + "@type": "Person", |
| 60 | + "name": "Bob" |
| 61 | + } |
| 62 | + ] |
| 63 | +} |
| 64 | +``` |
| 65 | + |
| 66 | +## Cycle Detection |
| 67 | + |
| 68 | +When documents reference themselves directly or indirectly, TerminusDB's cycle detection mechanism prevents infinite recursion while ensuring all nodes are properly rendered. |
| 69 | + |
| 70 | +### How Cycle Detection Works |
| 71 | + |
| 72 | +The unfolding implementation uses a **path stack** to track the current traversal from root to the current node. When a document ID is encountered that's already in the current path, a cycle is detected: |
| 73 | + |
| 74 | +1. **Path Stack Maintained**: As traversal descends into children, document IDs are pushed onto the stack |
| 75 | +2. **Cycle Check**: Before expanding a document, check if its ID is already in the current path |
| 76 | +3. **ID Reference Returned**: If a cycle is detected, return just the `@id` string instead of expanding |
| 77 | +4. **Backtrack**: When returning from a child, pop its ID from the stack |
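
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TerminusDB's implementation: the in-memory `nodes` dictionary, the `unfold` function, and the rule "any string value that matches a stored ID is a link to an unfoldable document" are all hypothetical stand-ins for the real store and schema lookup.

```python
from __future__ import annotations

from typing import Any


def unfold(doc_id: str, store: dict, path: list | None = None) -> Any:
    """Expand doc_id inline; return the bare ID string if it is already on the path."""
    path = [] if path is None else path
    if doc_id in path:               # step 2: cycle check against the current path
        return doc_id                # step 3: return only the ID reference
    path.append(doc_id)              # step 1: push the ID while descending
    result: dict[str, Any] = {}
    for key, value in store[doc_id].items():
        if key.startswith("@"):
            result[key] = value      # keywords such as @id and @type are copied as-is
        elif isinstance(value, list):
            # Assumption: list members that are stored IDs are links to expand
            result[key] = [
                unfold(v, store, path) if isinstance(v, str) and v in store else v
                for v in value
            ]
        elif isinstance(value, str) and value in store:
            result[key] = unfold(value, store, path)
        else:
            result[key] = value
    path.pop()                       # step 4: backtrack when leaving the document
    return result


# Hypothetical two-node cycle: A -> B -> A
nodes = {
    "Node/A": {"@id": "Node/A", "@type": "Node", "name": "Node A", "next": "Node/B"},
    "Node/B": {"@id": "Node/B", "@type": "Node", "name": "Node B", "next": "Node/A"},
}
unfolded = unfold("Node/A", nodes)
```

In this sketch `unfolded["next"]` is the expanded `Node/B` document, while `unfolded["next"]["next"]` is the plain string `"Node/A"`, because `Node/A` is still on the path when `Node/B` links back to it.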
| 78 | + |
| 79 | +### Cycle Detection Behavior Examples |
| 80 | + |
| 81 | +#### Direct Self-Reference |
| 82 | + |
| 83 | +**Schema:** |
| 84 | +```json |
| 85 | +{ |
| 86 | + "@type": "Class", |
| 87 | + "@id": "LinguisticObject", |
| 88 | + "@unfoldable": [], |
| 89 | + "name": "xsd:string", |
| 90 | + "partOf": { |
| 91 | + "@type": "Set", |
| 92 | + "@class": "LinguisticObject" |
| 93 | + } |
| 94 | +} |
| 95 | +``` |
| 96 | + |
| 97 | +**Data:** |
| 98 | +```json |
| 99 | +{ |
| 100 | + "@id": "LinguisticObject/self", |
| 101 | + "@type": "LinguisticObject", |
| 102 | + "name": "Self Referencing", |
| 103 | + "partOf": ["LinguisticObject/self"] // Points to itself |
| 104 | +} |
| 105 | +``` |
| 106 | + |
| 107 | +**Result:** |
| 108 | +```json |
| 109 | +{ |
| 110 | + "@id": "LinguisticObject/self", |
| 111 | + "@type": "LinguisticObject", |
| 112 | + "name": "Self Referencing", |
| 113 | + "partOf": ["LinguisticObject/self"] // ID string, not expanded |
| 114 | +} |
| 115 | +``` |
| 116 | + |
| 117 | +#### Circular Reference Chain (A→B→A) |
| 118 | + |
| 119 | +**Data:** |
| 120 | +```json |
| 121 | +[ |
| 122 | + { |
| 123 | + "@id": "Node/A", |
| 124 | + "@type": "Node", |
| 125 | + "name": "Node A", |
| 126 | + "next": "Node/B" |
| 127 | + }, |
| 128 | + { |
| 129 | + "@id": "Node/B", |
| 130 | + "@type": "Node", |
| 131 | + "name": "Node B", |
| 132 | + "next": "Node/A" // Back to A |
| 133 | + } |
| 134 | +] |
| 135 | +``` |
| 136 | + |
| 137 | +**Result (retrieving Node/A):** |
| 138 | +```json |
| 139 | +{ |
| 140 | + "@id": "Node/A", |
| 141 | + "@type": "Node", |
| 142 | + "name": "Node A", |
| 143 | + "next": { |
| 144 | + "@id": "Node/B", |
| 145 | + "@type": "Node", |
| 146 | + "name": "Node B", |
| 147 | + "next": "Node/A" // Cycle detected, ID string returned |
| 148 | + } |
| 149 | +} |
| 150 | +``` |
| 151 | + |
| 152 | +#### Multiple Circular Paths |
| 153 | + |
| 154 | +For complex graphs with multiple interconnected cycles, each path is tracked independently. Nodes are expanded until they appear again in the current traversal path. |
| 155 | + |
| 156 | +**Graph:** |
| 157 | +``` |
| 158 | +A → B → C → A (cycle) |
| 159 | +A → D → A (cycle) |
| 160 | +B → D |
| 161 | +``` |
| 162 | + |
| 163 | +The cycle detection ensures no node is expanded more than once per path, preventing infinite recursion while rendering all reachable nodes. |
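
To make "once per path" concrete, the graph above can be run through a small Python sketch. The adjacency-list encoding, the `expand` function, and the nested-dictionary output shape are assumptions made for illustration only. The sketch passes an immutable tuple down each branch instead of pushing and popping a shared stack; the effect is the same, since each branch only sees its own ancestors.

```python
def expand(node, graph, path=()):
    """Expand node unless it already appears among its ancestors on this path."""
    if node in path:
        return node  # cycle on this path: emit the bare ID
    return {node: [expand(child, graph, path + (node,)) for child in graph[node]]}


# The graph from the text: A -> B -> C -> A, A -> D -> A, B -> D
graph = {
    "A": ["B", "D"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": ["A"],
}
tree = expand("A", graph)
# tree == {"A": [{"B": [{"C": ["A"]}, {"D": ["A"]}]}, {"D": ["A"]}]}
```

`D` is expanded twice here, once under `B` and once directly under `A`, because the two occurrences lie on different paths. Only the links back to `A` are cut, since `A` is an ancestor on every path.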
| 164 | + |
| 165 | +### Deep Nested Structures |
| 166 | + |
| 167 | +For long chains (e.g., 100+ nodes without cycles), TerminusDB traverses the entire structure: |
| 168 | + |
| 169 | +```json |
| 170 | +{ |
| 171 | + "@id": "ChainNode/0", |
| 172 | + "value": 0, |
| 173 | + "next": { |
| 174 | + "@id": "ChainNode/1", |
| 175 | + "value": 1, |
| 176 | + "next": { |
| 177 | + "@id": "ChainNode/2", |
| 178 | + "value": 2, |
| 179 | + // ... continues for all 100 nodes |
| 180 | + } |
| 181 | + } |
| 182 | +} |
| 183 | +``` |
| 184 | + |
| 185 | +## Work Limit Protection |
| 186 | + |
| 187 | +To prevent excessive resource consumption during document unfolding, TerminusDB implements a work limit that caps the total number of operations during traversal. |
| 188 | + |
| 189 | +### Configuration |
| 190 | + |
| 191 | +**Environment Variable:** `TERMINUSDB_DOC_WORK_LIMIT` |
| 192 | + |
| 193 | +**Default:** 500,000 operations |
| 194 | + |
| 195 | +**Setting Custom Limit:** |
| 196 | +```bash |
| 197 | +# Linux/macOS |
| 198 | +export TERMINUSDB_DOC_WORK_LIMIT=1000000 |
| 199 | + |
| 200 | +# Docker |
| 201 | +docker run -e TERMINUSDB_DOC_WORK_LIMIT=1000000 terminusdb/terminusdb-server:latest |
| 202 | + |
| 203 | +# Kubernetes (container env in a Deployment or Pod spec) |
| 204 | +env: |
| 205 | + - name: TERMINUSDB_DOC_WORK_LIMIT |
| 206 | + value: "1000000" |
| 207 | +``` |
| 208 | + |
| 209 | +### When Work Limit is Exceeded |
| 210 | + |
| 211 | +If document traversal exceeds the work limit: |
| 212 | + |
| 213 | +1. **Traversal Terminates**: Document retrieval stops |
| 214 | +2. **Error Returned**: Returns `DocRetrievalError::LimitExceeded` |
| 215 | +3. **No Partial Results**: No partial data is returned |
| 216 | +4. **Document IRI Included**: Error message includes the document IRI that triggered the limit |
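
The behavior listed above can be pictured with the following Python sketch. It is not the server's code: the `LimitExceeded` class is a hypothetical stand-in for `DocRetrievalError::LimitExceeded`, "one unit of work per document visited" is an assumption about what counts as an operation, and reporting the ID of the document being visited is one possible reading of "the document IRI that triggered the limit".

```python
import os

DEFAULT_WORK_LIMIT = 500_000  # default stated in the Configuration section


class LimitExceeded(Exception):
    """Hypothetical stand-in for DocRetrievalError::LimitExceeded."""


def work_limit() -> int:
    """Read TERMINUSDB_DOC_WORK_LIMIT, falling back to the documented default."""
    return int(os.environ.get("TERMINUSDB_DOC_WORK_LIMIT", DEFAULT_WORK_LIMIT))


def unfold_with_budget(root, graph, limit):
    """Unfold root over an adjacency-list graph, aborting once the budget is spent."""
    work = 0

    def visit(node, path):
        nonlocal work
        work += 1  # assumption: one unit of work per document visited
        if work > limit:
            # The exception unwinds the whole traversal, so no partial result escapes
            raise LimitExceeded(f"work limit {limit} exceeded at {node}")
        if node in path:
            return node
        return {node: [visit(child, path + (node,)) for child in graph[node]]}

    return visit(root, ())


# Hypothetical 10-node chain: ChainNode/0 -> ChainNode/1 -> ... -> ChainNode/9
chain = {f"ChainNode/{i}": [f"ChainNode/{i + 1}"] for i in range(9)}
chain["ChainNode/9"] = []
full = unfold_with_budget("ChainNode/0", chain, work_limit())
```

Calling `unfold_with_budget("ChainNode/0", chain, limit=5)` on the same chain raises `LimitExceeded` on the sixth visit instead of returning the first five levels.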
| 217 | + |
| 218 | +**Recommended Limits by Use Case:** |
| 219 | + |
| 220 | +| Use Case | Recommended Limit | Rationale | |
| 221 | +|----------|-------------------|-----------| |
| 222 | +| Simple documents | 100,000 | Below the default; enough for shallow documents | |
| 223 | +| Complex hierarchies | 500,000 (default) | Balanced performance/safety | |
| 224 | +| Large knowledge graphs | 1,000,000 - 5,000,000 | Deep traversals needed | |
| 225 | +| Real-time APIs | 50,000 - 100,000 | Prioritize response time | |
| 226 | + |
| 227 | +## Performance Characteristics |
| 228 | + |
| 229 | +### Path Stack Implementation |
| 230 | + |
| 231 | +TerminusDB uses a **Vec-based path stack** for cycle detection, which is optimal for this use case: |
| 232 | + |
| 233 | +**Why Vec (not HashSet):** |
| 234 | +- **Path stack semantics**: The `visited` collection tracks the current DFS path, not all visited nodes |
| 235 | +- **Small size**: Path depth is typically 10-50 nodes, not thousands |
| 236 | +- **Cache-friendly**: Sequential access pattern |
| 237 | +- **Stack mirroring**: Push/pop operations naturally mirror traversal stack |
| 238 | + |
| 239 | +Performance benchmarks show the Vec-based path stack running at approximately twice the speed of the HashSet alternative for both small and large documents. |
| 240 | + |
| 241 | +**Empirical Results:** |
| 242 | +- For path depth < 100: Vec is faster than HashSet (no hash overhead) |
| 243 | +- For path depth > 100: Difference is negligible in practice |
| 244 | +- Real-world path depths: typically 10-50 nodes |
| 245 | + |
| 246 | +### Schema Design Recommendations |
| 247 | + |
| 248 | +**1. Limit Depth:** |
| 249 | +```json |
| 250 | +{ |
| 251 | + "@type": "Class", |
| 252 | + "@id": "Category", |
| 253 | + "@unfoldable": [], |
| 254 | + "name": "xsd:string", |
| 255 | + "parent": { |
| 256 | + "@type": "Optional", |
| 257 | + "@class": "Category" // Parent-child hierarchy |
| 258 | + }, |
| 259 | + "subcategories": { |
| 260 | + "@type": "Set", |
| 261 | + "@class": "SubCategory" // Use different class for children |
| 262 | + } |
| 263 | +} |
| 264 | +``` |
| 265 | + |
| 266 | +**2. Separate Unfoldable and Non-Unfoldable Relationships:** |
| 267 | +```json |
| 268 | +{ |
| 269 | + "@type": "Class", |
| 270 | + "@id": "Person", |
| 271 | + "@unfoldable": [], |
| 272 | + "name": "xsd:string", |
| 273 | + "profile": { |
| 274 | + "@type": "Optional", |
| 275 | + "@class": "Profile" // Profile is @unfoldable |
| 276 | + }, |
| 277 | + "posts": { |
| 278 | + "@type": "Set", |
| 279 | + "@class": "Post" // Post is NOT @unfoldable (too many) |
| 280 | + } |
| 281 | +} |
| 282 | +``` |
| 283 | + |
| 284 | +**3. Use Optional or Set/Cardinality for Potentially Circular References:** |
| 285 | +```json |
| 286 | +{ |
| 287 | + "@type": "Class", |
| 288 | + "@id": "Node", |
| 289 | + "@unfoldable": [], |
| 290 | + "next": { |
| 291 | + "@type": "Optional", // Allows termination, similar to Set/Cardinality |
| 292 | + "@class": "Node" |
| 293 | + } |
| 294 | +} |
| 295 | +``` |
| 296 | + |
| 297 | +## Troubleshooting |
| 298 | + |
| 299 | +### Document Retrieval Returns Just IDs |
| 300 | + |
| 301 | +**Symptom:** Expected nested objects, got ID strings |
| 302 | + |
| 303 | +**Cause:** Cycle detected or class not marked `@unfoldable` |
| 304 | + |
| 305 | +**Solution:** |
| 306 | +1. Verify class has `@unfoldable: []` annotation |
| 307 | +2. Check if circular reference exists (expected behavior) |
| 308 | +3. Confirm the annotation is on the referenced (target) class, since references expand only when the class they point to is `@unfoldable` |
| 309 | + |
| 310 | +### Work Limit Exceeded Errors |
| 311 | + |
| 312 | +**Symptom:** `DocRetrievalError::LimitExceeded` during retrieval |
| 313 | + |
| 314 | +**Cause:** Document graph too large or deeply nested |
| 315 | + |
| 316 | +**Solutions:** |
| 317 | +1. **Increase limit**: Set `TERMINUSDB_DOC_WORK_LIMIT` environment variable |
| 318 | +2. **Reduce unfoldable depth**: Mark fewer classes as `@unfoldable` |
| 319 | +3. **Break circular references**: Ensure proper data structure |
| 320 | +4. **Use pagination**: Fetch large collections separately |
| 321 | + |
| 322 | +### Performance Degradation |
| 323 | + |
| 324 | +**Symptom:** Slow document retrieval |
| 325 | + |
| 326 | +**Cause:** Large unfoldable graphs |
| 327 | + |
| 328 | +**Solutions:** |
| 329 | +1. **Profile query**: Check path depth and node count |
| 330 | +2. **Reduce unfoldable scope**: Only unfold necessary relationships |
| 331 | + |
| 332 | +## API Examples |
| 333 | + |
| 334 | +### Document API |
| 335 | + |
| 336 | +```bash |
| 337 | +# Retrieve with automatic unfolding (default) |
| 338 | +curl -X GET "http://localhost:6363/api/document/admin/mydb" \ |
| 339 | + -H "Authorization: Basic YWRtaW46cm9vdA==" -H "Content-Type: application/json" \ |
| 340 | + -d '{"graph_type": "instance", "id": "Person/Alice", "as_list": true}' |
| 341 | +``` |
| 342 | + |
| 343 | +### GraphQL |
| 344 | + |
| 345 | +```graphql |
| 346 | +# Unfolding happens automatically for @unfoldable classes |
| 347 | +query { |
| 348 | + Person { |
| 349 | + name |
| 350 | + friend { # Automatically expanded |
| 351 | + name |
| 352 | + friend { # Nested expansion |
| 353 | + name |
| 354 | + } |
| 355 | + } |
| 356 | + } |
| 357 | +} |
| 358 | +``` |
| 359 | + |
| 360 | +### WOQL |
| 361 | + |
| 362 | +```javascript |
| 363 | +// Using WOQL to read documents with unfolding |
| 364 | +WOQL.read_document("Person/Alice", "v:Doc") |
| 365 | +``` |
| 366 | + |
| 367 | +## Related Documentation |
| 368 | + |
| 369 | +- [Schema Reference Guide](/docs/schema-reference-guide) - Complete schema annotation reference |
| 370 | +- [Document API Reference](/docs/document-insertion) - HTTP API for documents |
| 371 | +- [GraphQL Reference](/docs/graphql-query-reference) - GraphQL query syntax |
| 372 | +- [Path Queries](/docs/path-query-reference-guide) - Advanced path traversal |
| 373 | + |
| 374 | +## Summary |
| 375 | + |
| 376 | +**Key Takeaways:** |
| 377 | +- `@unfoldable` automatically expands linked documents |
| 378 | +- Cycle detection prevents infinite recursion using a path stack |
| 379 | +- Vec-based implementation is optimal for path-bounded traversal |
| 380 | +- `TERMINUSDB_DOC_WORK_LIMIT` protects against excessive operations |
| 381 | +- ID references are returned when cycles are detected (not an error) |
| 382 | +- Path depth typically 10-50 nodes (not total document count) |
| 383 | + |
| 384 | +**Performance Notes:** |
| 385 | +- Vec path stack: O(d) lookup where d = depth (typically < 50) |
| 386 | +- Work limit default: 500,000 operations |
| 387 | +- Memory overhead: 8 bytes per path depth level |
| 388 | +- Cache-friendly sequential access pattern |
| 389 | + |
| 390 | +--- |
| 391 | + |
| 392 | +**Last Updated:** October 31, 2025 |
| 393 | +**Applies to:** TerminusDB 11.2+ |