fix: harden graph JSON loading against corruption in build, affected, diagnostics #1536

Description

@guyoron1

Summary

Three core graph-loading paths call json.loads() without catching JSONDecodeError or OSError. If graph.json is corrupted (incomplete write from disk-full, power loss, concurrent modification, or manual edit), these paths crash with an unhelpful traceback instead of an actionable error message.

Affected Code Paths

1. graphify/build.py:652, build_merge() (HIGH)

data = json.loads(graph_path.read_text(encoding="utf-8"))
  • Impact: Crashes --update (incremental builds) when the existing graph.json is corrupt.
  • User experience: Users cannot use incremental builds and get no guidance on recovery.

2. graphify/affected.py:212, load_graph() (HIGH)

raw = json.loads(path.read_text(encoding="utf-8"))
  • Impact: Crashes graphify prs when the graph file is corrupt.
  • User experience: PR impact analysis becomes unavailable.

3. graphify/diagnostics.py:277, _read_json_file() (MEDIUM)

data = json.loads(json_path.read_text(encoding="utf-8"))
  • Impact: Crashes graphify diagnose on corrupt JSON.
  • Note: This function already validates isinstance(data, dict) on the next line, but the json.loads itself is unguarded.

How Corruption Happens

  • Incomplete writes (disk full, power loss mid-write)
  • Race conditions during concurrent graphify extract runs
  • Manual edits introducing syntax errors
  • Version incompatibilities between graph formats
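
For reference, the first failure mode is easy to reproduce outside graphify. The sketch below is illustrative only: `simulate_truncated_write` and `try_load` are hypothetical helpers, and the `nodes`/`edges` payload is a stand-in rather than the real graph schema. It shows which exception types the unguarded `json.loads(path.read_text(...))` pattern produces for a missing file and for a file cut off mid-write.

```python
import json
import tempfile
from pathlib import Path


def simulate_truncated_write(path: Path) -> None:
    """Write only the first half of a valid JSON document (hypothetical helper)."""
    # Stand-in payload; the real graph.json schema is not assumed here.
    full = json.dumps({"nodes": [{"id": "a"}, {"id": "b"}], "edges": []})
    path.write_text(full[: len(full) // 2], encoding="utf-8")


def try_load(path: Path) -> str:
    """Return the exception class name raised by the load, or 'ok' on success."""
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError) as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    graph_path = Path(tmp) / "graph.json"
    missing_result = try_load(graph_path)    # file does not exist yet
    simulate_truncated_write(graph_path)
    truncated_result = try_load(graph_path)  # file exists but is cut off
```

The missing file surfaces as an `OSError` subclass and the truncated file as `json.JSONDecodeError`, which is why the suggested fix catches both.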

Suggested Fix

Wrap each json.loads call with try/except (json.JSONDecodeError, OSError) and raise a RuntimeError with:

  1. Which file failed to parse
  2. The underlying error
  3. Recovery guidance (e.g., "delete graph.json and run a full rebuild")
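
A minimal sketch of what the guarded load could look like, assuming a shared helper is acceptable. The name `load_graph_json` and the exact message wording are hypothetical and not taken from the codebase; each call site could equally inline the same try/except.

```python
import json
from pathlib import Path


def load_graph_json(graph_path: Path) -> dict:
    """Hypothetical helper: read and parse a graph file, failing with guidance."""
    try:
        data = json.loads(graph_path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError) as exc:
        # Name the file, include the underlying error, and suggest a recovery step.
        raise RuntimeError(
            f"Could not load graph file {graph_path}: {exc}. "
            "The file may be corrupt or unreadable; "
            "delete it and run a full rebuild."
        ) from exc
    return data
```

Chaining with `from exc` keeps the original traceback available for debugging while the top-level message stays actionable. One caveat: invalid UTF-8 bytes would raise `UnicodeDecodeError` from `read_text`, which this exception tuple does not cover, so whether to catch it as well is a separate decision.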

Note: export.py:492 has a similar json.loads but is already inside a try/except Exception: pass block (intentionally — if the old graph is unreadable, proceed with the write).

Scope

~15 lines per file, ~45 lines total. No behavioral changes to the happy path.
