Hyperedge members/node_ids alias keys silently dropped (only nodes is read) #1561

@askalot-io

Summary

build_from_json (and the merge path in build.py) silently drops hyperedge membership when a subagent emits the member list under the alias key members (or node_ids) instead of nodes. Edges already alias from/to → source/target, and a node that uses source instead of source_file triggers a loud warning, but hyperedges have no equivalent alias handling and no warning, so the membership just disappears.

This matters because the extraction subagents the graphify skill dispatches frequently produce members/node_ids for hyperedges (it's a very natural key name for an LLM), so the loss happens on graphify's own primary code path.

Reproducer

from graphify.build import build_from_json

ext = {
  "nodes": [
    {"id": "a", "label": "A", "file_type": "concept", "source_file": "x.md"},
    {"id": "b", "label": "B", "file_type": "concept", "source_file": "x.md"},
    {"id": "c", "label": "C", "file_type": "concept", "source_file": "x.md"},
  ],
  "edges": [
    {"from": "a", "to": "b", "relation": "references",
     "confidence": "EXTRACTED", "confidence_score": 1.0},
  ],
  "hyperedges": [
    {"id": "h1", "label": "Triad", "members": ["a", "b", "c"],
     "relation": "participate_in", "confidence": "INFERRED", "confidence_score": 0.75},
  ],
}

G = build_from_json(ext, root=None, directed=False)
print("edges:", G.number_of_edges())                 # 1  -> from/to alias WORKS
he = G.graph["hyperedges"][0]
print("keys:", list(he.keys()))                       # [... 'members' ...] -> stored verbatim
print("has .nodes list:", isinstance(he.get("nodes"), list))   # False -> membership lost

Output:

edges: 1
keys: ['id', 'label', 'members', 'relation', 'confidence', 'confidence_score']
has .nodes list: False

The hyperedge object survives, but since every consumer reads he["nodes"] (e.g. the rekey loop at build.py:309-311, which guards on isinstance(he.get("nodes"), list)), the members list is never rekeyed, validated, or used. No warning is printed.
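The effect of that guard can be reproduced without graphify installed. The sketch below is a stand-in, not the code at build.py:309-311: rekey_hyperedges and id_map are hypothetical names, and only the isinstance(he.get("nodes"), list) check is taken from the description above.

```python
def rekey_hyperedges(hyperedges, id_map):
    """Hypothetical stand-in for a consumer that rewrites member IDs.

    Only the isinstance guard mirrors the behaviour described in this
    issue; everything else is illustrative.
    """
    touched = []
    for he in hyperedges:
        # Guard as described: anything without a list under "nodes" is skipped.
        if not isinstance(he.get("nodes"), list):
            continue
        he["nodes"] = [id_map.get(n, n) for n in he["nodes"]]
        touched.append(he["id"])
    return touched


hyperedges = [
    {"id": "h1", "members": ["a", "b", "c"]},  # alias key, as in the reproducer
    {"id": "h2", "nodes": ["a", "b", "c"]},    # recognized key
]
id_map = {"a": "x_a", "b": "x_b", "c": "x_c"}  # made-up rekey mapping

touched = rekey_hyperedges(hyperedges, id_map)
print("rekeyed:", touched)
print("h1 after:", hyperedges[0])
```

Only h2 is rekeyed. h1 passes through with its members list untouched and no "nodes" key, and nothing signals that it was skipped.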

Root cause

  • build.py:443-446 aliases edge from/to → source/target.
  • build.py:258-271 warns when a node uses source instead of source_file.
  • Hyperedge handling (build.py:309-311, build.py:518-526) only ever reads he.get("nodes"). There is no alias for members/node_ids and no warning when the recognized key is absent.

Suggested fix

In the hyperedge handling, normalize the member-list key the same way edges normalize from/to, e.g. early in build_from_json:

for he in extraction.get("hyperedges", []) or []:
    if isinstance(he, dict) and "nodes" not in he:
        for alias in ("members", "node_ids"):
            if isinstance(he.get(alias), list):
                he["nodes"] = he.pop(alias)
                break

Optionally emit a [graphify] WARNING: when an alias is used (mirroring the source → source_file node warning) so drift is visible rather than silent.
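A self-contained sketch of the normalization with the optional warning, in case it helps review. normalize_hyperedge_members and HYPEREDGE_MEMBER_ALIASES are hypothetical names, not existing graphify API, and the warning wording is my guess; only the [graphify] WARNING: prefix and the two alias keys come from this issue.

```python
import sys

# Alias keys observed in this issue; "nodes" is the key graphify reads.
HYPEREDGE_MEMBER_ALIASES = ("members", "node_ids")


def normalize_hyperedge_members(extraction, warn=True):
    """Rename an alias member list to "nodes" in place.

    Hypothetical helper, not part of graphify. Returns a list of
    (hyperedge id, alias) pairs for the hyperedges that were changed.
    """
    renamed = []
    for he in extraction.get("hyperedges", []) or []:
        # Leave non-dicts and hyperedges that already have "nodes" alone.
        if not isinstance(he, dict) or "nodes" in he:
            continue
        for alias in HYPEREDGE_MEMBER_ALIASES:
            if isinstance(he.get(alias), list):
                he["nodes"] = he.pop(alias)
                renamed.append((he.get("id"), alias))
                if warn:
                    # Message text is an assumption; only the prefix is from the issue.
                    print(
                        f"[graphify] WARNING: hyperedge {he.get('id')!r} uses "
                        f"'{alias}' instead of 'nodes'; treating it as 'nodes'.",
                        file=sys.stderr,
                    )
                break
    return renamed


ext = {"hyperedges": [
    {"id": "h1", "members": ["a", "b", "c"]},
    {"id": "h2", "node_ids": ["a", "b"]},
    {"id": "h3", "nodes": ["b", "c"], "members": ["ignored"]},
]}
renamed = normalize_hyperedge_members(ext)
print(renamed)
```

An explicit "nodes" key always wins (h3 is left as-is), so the change cannot alter extractions that are already well-formed.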

Environment

  • graphifyy 0.9.2
  • Python 3.12
  • Found while running the /graphify skill on a ~1,930-file monorepo; several extraction subagents emitted members/node_ids for hyperedges, which silently produced membership-less hyperedges.

Note

While debugging I also briefly suspected edge from/to and node source were being dropped — they are not; both are handled correctly. The hyperedge member-list alias is the only real gap I could confirm.
