Add serializable, lazy-loadable graph stats on Plotter for GFQL optimization #900

## Problem
Without cheap selectivity estimates, GFQL WHERE planning and pruning can become very expensive. Today, domain stats are recomputed for every query, and there are no persistent, reusable stats tied to a graph/Plotter instance.

## Proposal
Introduce a **first-class stats layer** on the Plotter stack (e.g., `g.stats`) that is **serializable** and **lazy-loadable**. Stats should be computed on demand, cached, and optionally persisted alongside a graph bundle or stored separately for reuse across sessions.
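
To make the "computed on demand, cached" behavior concrete, here is a minimal sketch in plain pandas. `LazyStats`, its method names, and the `compute_calls` counter are hypothetical illustrations and are not an existing Graphistry API; the eventual `g.stats` surface may look quite different.

```python
from typing import Any, Callable, Dict

import pandas as pd


class LazyStats:
    """Hypothetical stats holder: each stat is computed on first access, then cached."""

    def __init__(self, nodes: pd.DataFrame, edges: pd.DataFrame) -> None:
        self._nodes = nodes
        self._edges = edges
        self._cache: Dict[str, Any] = {}
        self.compute_calls = 0  # illustration only: counts real computations

    def _get(self, key: str, compute: Callable[[], Any]) -> Any:
        # Compute only on a cache miss; later requests reuse the stored value.
        if key not in self._cache:
            self.compute_calls += 1
            self._cache[key] = compute()
        return self._cache[key]

    def node_count(self) -> int:
        return self._get("node_count", lambda: int(len(self._nodes)))

    def edge_count(self) -> int:
        return self._get("edge_count", lambda: int(len(self._edges)))

    def ndv(self, table: str, column: str) -> int:
        # Exact distinct count; an approximate (HLL-style) sketch could replace it.
        df = self._nodes if table == "nodes" else self._edges
        return self._get(f"ndv:{table}:{column}", lambda: int(df[column].nunique()))


nodes = pd.DataFrame({"id": [0, 1, 2, 3], "kind": ["a", "b", "a", "a"]})
edges = pd.DataFrame({"src": [0, 1, 2], "dst": [1, 2, 3]})
stats = LazyStats(nodes, edges)  # nothing is computed at construction time
```

Constructing the object does no work, so the feature costs nothing until a stat is requested. A disk-backed cache could be layered under `_get` without changing callers.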

## Scope / Requirements
- **Serializable**: JSON (or msgpack) friendly; versioned schema; safe to round-trip across Python versions.
- **Lazy-loadable**: compute only when requested; allow a cache-backed mode (in-memory + optional disk).
- **First-class on Plotter**: accessible via `g.stats` with explicit compute APIs; carried with `Plotter`/`Graphistry` objects and optionally included in uploads.
- **DF-native**: pandas + cuDF compatible; avoid `.to_pandas()` in hot paths.
- **Optional**: zero overhead unless enabled/asked for.
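
As one possible shape for the serializable, versioned requirement, the sketch below round-trips a stats payload through JSON using only the standard library. The class names, field names, and `SCHEMA_VERSION` value are assumptions made for illustration, not a proposed final schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional

SCHEMA_VERSION = 1  # hypothetical version number for this sketch


@dataclass
class ColumnStats:
    ndv: Optional[int] = None
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    null_fraction: Optional[float] = None


@dataclass
class GraphStatsPayload:
    node_count: int
    edge_count: int
    columns: Dict[str, ColumnStats] = field(default_factory=dict)
    schema_version: int = SCHEMA_VERSION

    def to_json(self) -> str:
        # asdict() recurses into the nested ColumnStats values.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "GraphStatsPayload":
        raw = json.loads(text)
        version = raw.get("schema_version")
        if version != SCHEMA_VERSION:
            # Reject unknown versions rather than silently misreading them.
            raise ValueError(f"unsupported stats schema version: {version!r}")
        columns = {name: ColumnStats(**vals) for name, vals in raw["columns"].items()}
        return cls(
            node_count=raw["node_count"],
            edge_count=raw["edge_count"],
            columns=columns,
        )


payload = GraphStatsPayload(
    node_count=4,
    edge_count=3,
    columns={"nodes.kind": ColumnStats(ndv=2, null_fraction=0.0)},
)
restored = GraphStatsPayload.from_json(payload.to_json())
```

Keeping the payload to plain ints, floats, strings, and `None` is what makes it safe to move between Python versions and to swap JSON for msgpack later.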

## Candidate Stats (common in graph engines such as Neo4j and TigerGraph)
- Table cardinalities (nodes/edges)
- Per-column NDV (approx OK; HLL-style)
- Per-column min/max + null fraction
- Degree distributions or summary stats (min/max/mean/quantiles)
- Optional: per-label or per-type stats (if labels/types exist)
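
Most of the candidates above reduce to a few DataFrame operations. The following sketch computes them with pandas only; the helper names `column_stats` and `degree_summary` are hypothetical, NDV is exact rather than HLL-style, and a cuDF path would need its own concat and testing.

```python
from typing import Any, Dict

import pandas as pd


def column_stats(series: pd.Series) -> Dict[str, Any]:
    """Per-column NDV, null fraction, and (for numeric columns) min/max."""
    out: Dict[str, Any] = {
        "ndv": int(series.nunique(dropna=True)),
        "null_fraction": float(series.isna().mean()) if len(series) else 0.0,
    }
    if pd.api.types.is_numeric_dtype(series) and series.notna().any():
        out["min"] = float(series.min())
        out["max"] = float(series.max())
    return out


def degree_summary(edges: pd.DataFrame, src: str, dst: str) -> Dict[str, float]:
    """Summary of total degree (in + out) over nodes that appear in some edge.

    Isolated nodes (degree 0) are not counted in this sketch.
    """
    endpoints = pd.concat([edges[src], edges[dst]], ignore_index=True)
    deg = endpoints.value_counts()
    if deg.empty:
        return {"min": 0, "max": 0, "mean": 0.0, "median": 0.0}
    return {
        "min": int(deg.min()),
        "max": int(deg.max()),
        "mean": float(deg.mean()),
        "median": float(deg.quantile(0.5)),
    }


nodes = pd.DataFrame({"id": [0, 1, 2, 3], "score": [1.0, None, 3.0, 2.0]})
edges = pd.DataFrame({"src": [0, 1, 2], "dst": [1, 2, 3]})

cardinalities = {"nodes": len(nodes), "edges": len(edges)}
score_stats = column_stats(nodes["score"])
degrees = degree_summary(edges, "src", "dst")
```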

## Why (GFQL priorities)
- Clause ordering / gating based on selectivity
- Inequality bounds pruning (min/max or quantiles)
- Semijoin thresholds for domain intersections
- Query diagnostics (explain-style stats)
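
To illustrate how such stats could feed clause ordering and bounds pruning, the sketch below uses two textbook heuristics: equality selectivity of roughly 1/NDV, and range selectivity from min/max under a uniform-distribution assumption. The `Col`, `Clause`, `estimate_selectivity`, and `order_clauses` names are hypothetical and do not describe the current GFQL planner.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Col:
    ndv: int
    min_value: float
    max_value: float


@dataclass(frozen=True)
class Clause:
    column: str
    op: str  # "==", "<", or ">"
    value: float


def estimate_selectivity(clause: Clause, stats: Dict[str, Col]) -> float:
    """Estimated fraction of rows that pass the clause (0.0 means prunable)."""
    col = stats[clause.column]
    lo, hi, v = col.min_value, col.max_value, clause.value
    if clause.op == "==":
        if v < lo or v > hi:
            return 0.0  # value lies outside the column's bounds
        return 1.0 / max(col.ndv, 1)
    if clause.op == "<":
        if v <= lo:
            return 0.0
        if v > hi:
            return 1.0
        return (v - lo) / (hi - lo)  # here lo < v <= hi, so hi > lo
    if clause.op == ">":
        if v >= hi:
            return 0.0
        if v < lo:
            return 1.0
        return (hi - v) / (hi - lo)  # here lo <= v < hi, so hi > lo
    raise ValueError(f"unsupported operator: {clause.op!r}")


def order_clauses(clauses: List[Clause], stats: Dict[str, Col]) -> List[Clause]:
    # Most selective (smallest estimated fraction) first.
    return sorted(clauses, key=lambda c: estimate_selectivity(c, stats))


stats = {
    "age": Col(ndv=50, min_value=18, max_value=68),
    "bucket": Col(ndv=200, min_value=0, max_value=199),
}
clauses = [
    Clause("age", ">", 58),
    Clause("bucket", "==", 7),
    Clause("age", "<", 10),
]
ordered = order_clauses(clauses, stats)
```

A clause with an estimate of 0.0 can short-circuit the whole conjunction, and the same estimates could be surfaced in explain-style diagnostics.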

## Deliverables
1. Stats data model + serialization format
2. `Plotter.stats` API + lazy compute + caching hooks
3. Integration points in GFQL WHERE planner/executor (behind feature flags)
4. Tests for parity, persistence, and cuDF/pandas compatibility

## Non-goals (initial)
- Full cost-based optimizer
- Cross-graph/global stats registry

## Notes
This is intended to unblock GFQL planning and pruning work without baking in a full optimizer.

