[ENH, GRAPH] Experimental: A module for time-series graphs that relies on only networkx, and simulation and algorithms for time-series #21

Merged
merged 29 commits into from
Jan 5, 2023
21a438c
Initial commit
adam2392 Sep 21, 2022
9084f19
Store change
adam2392 Oct 11, 2022
b273f8a
Adding API for time series graph for dodiscover
adam2392 Oct 19, 2022
56e42cc
Fix lint for timeseries
adam2392 Oct 26, 2022
2bf9977
Merge branch 'main' into timeseries
adam2392 Oct 26, 2022
e82b74d
Adding new time-series graph
adam2392 Oct 27, 2022
0d00542
Working time-series graphs with forward and backwards homologous edge…
adam2392 Nov 3, 2022
4b2d950
Freeze state before refactoring to remove mixin classes for time-series
adam2392 Nov 9, 2022
242fa80
Adding updated networkx graph for time-series
adam2392 Nov 14, 2022
3167a78
Adding set max lag method
adam2392 Nov 15, 2022
66eadba
Added time series stuff
adam2392 Nov 16, 2022
9fc8739
Adding updated timeseries graphs
adam2392 Nov 18, 2022
06d0487
Adding pds algorithms for time-series graphs
adam2392 Nov 29, 2022
e09c23b
WIP
adam2392 Dec 17, 2022
a73eb04
Merge branch 'main' into timeseries
adam2392 Dec 17, 2022
dcc9c77
Working tests and clean refactor
adam2392 Dec 21, 2022
670012c
Merge main
adam2392 Dec 21, 2022
6042128
Merging
adam2392 Dec 22, 2022
c6d0c40
Working prototype
adam2392 Dec 29, 2022
50c1a2c
Trying to get timeseries full pipeline working
adam2392 Jan 5, 2023
502ea64
Update circle ci
adam2392 Jan 5, 2023
c258ecf
Fix unit test and integration test
adam2392 Jan 5, 2023
4e9c262
Now full ci
adam2392 Jan 5, 2023
072325a
Reset cache
adam2392 Jan 5, 2023
ca9ce52
Adding sys info
adam2392 Jan 5, 2023
432228f
Adding sys info
adam2392 Jan 5, 2023
c8d33eb
Fix note
adam2392 Jan 5, 2023
d1ccdb0
Fix
adam2392 Jan 5, 2023
1304d2e
Fix
adam2392 Jan 5, 2023
7 changes: 4 additions & 3 deletions .circleci/config.yml
@@ -46,8 +46,8 @@ jobs:
- run:
name: Install the latest version of Poetry
command: |
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | POETRY_UNINSTALL=1 python -
curl -sSL https://install.python-poetry.org | python3 -
curl -sSL https://install.python-poetry.org | python3 - --version 1.3.0
poetry --version
- run:
name: Set BASH_ENV
command: |
@@ -70,7 +70,7 @@ jobs:
command: sudo apt update && sudo apt install -y pandoc optipng
- python/install-packages:
pkg-manager: poetry
cache-version: "v2" # change to clear cache
cache-version: "v1" # change to clear cache
args: "--with docs"
- run:
name: Check poetry package versions
@@ -140,6 +140,7 @@ jobs:
- python/install-packages:
pkg-manager: poetry
cache-version: "v1" # change to clear cache
args: "--with docs"
- run:
name: make linkcheck
command: |
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -137,7 +137,7 @@ jobs:
run: pip install poetry-dynamic-versioning
- name: Install packages via poetry
run: |
poetry install --with test
poetry install --with test,ts
# TODO: uncomment, when MixedEdgeGraph PRed into networkx
# - name: Install Networkx (main)
# if: "matrix.networkx == 'main'"
29 changes: 0 additions & 29 deletions Makefile

This file was deleted.

9 changes: 5 additions & 4 deletions README.md
@@ -12,11 +12,15 @@ Note: The API is subject to change without deprecation cycles due to the current

## Why?

Representation of causal inference models in Python are severely lacking. Moreover, sampling from causal models is non-trivial. However, sampling from simulations is a requirement to benchmark different structural learning, causal ID, or other causal related algorithms.
Representations of causal graphical models in Python are severely lacking.

PyWhy-Graphs implements a graphical API layer for ADMG, CPDAG and PAG. For causal DAGs, we recommend using the `networkx.DiGraph` class and
ensuring acyclicity via the `networkx.is_directed_acyclic_graph` function.
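As a minimal sketch of the recommendation above, the following builds a small `networkx.DiGraph` and checks it with `networkx.is_directed_acyclic_graph`. The node names are purely illustrative.

```python
import networkx as nx

# Build a small causal DAG: x -> y -> z (node names are illustrative).
G = nx.DiGraph()
G.add_edges_from([("x", "y"), ("y", "z")])

# The graph is acyclic, so it is a valid causal DAG.
is_dag_before = nx.is_directed_acyclic_graph(G)

# Adding z -> x closes a directed cycle, so the check now fails.
G.add_edge("z", "x")
is_dag_after = nx.is_directed_acyclic_graph(G)

print(is_dag_before, is_dag_after)
```

Running the check after every structural change is one simple way to guard against accidentally introducing a cycle.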

Existing packages that aim to represent causal graphs either break from the NetworkX API or implement only a subset of the relevant causal graphs. By keeping in line with the robust NetworkX API, we aim to ensure a consistent user experience and a gentle introduction to causal graphical models.

Moreover, sampling from causal models is non-trivial, but it is a requirement for benchmarking many causal algorithms in discovery, ID, estimation and more. We aim to provide simulation modules that connect easily with causal graphs, giving a simple, robust API for modeling causal graphs and then simulating data.
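To make the idea of sampling from a causal model concrete, here is a rough sketch of ancestral sampling from a linear Gaussian structural causal model. It uses only the Python standard library and is not the simulation API of this package: the `parents` dictionary, the edge weights, and the function `sample_linear_scm` are all hypothetical names chosen for illustration.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical linear SCM: each node maps to {parent: edge weight}.
parents = {
    "x": {},
    "y": {"x": 2.0},
    "z": {"x": -1.0, "y": 0.5},
}


def sample_linear_scm(parents, n_samples, noise_scale=1.0, seed=0):
    """Draw samples by visiting nodes in topological order (ancestral sampling)."""
    rng = random.Random(seed)
    # TopologicalSorter expects node -> predecessors and yields parents first.
    sorter = TopologicalSorter({node: set(pa) for node, pa in parents.items()})
    order = list(sorter.static_order())

    samples = []
    for _ in range(n_samples):
        row = {}
        for node in order:
            # Each value is a weighted sum of its parents plus Gaussian noise.
            signal = sum(w * row[pa] for pa, w in parents[node].items())
            row[node] = signal + rng.gauss(0.0, noise_scale)
        samples.append(row)
    return samples


samples = sample_linear_scm(parents, n_samples=5)
```

Visiting nodes in topological order guarantees that every parent has a value before its children are computed, which is why acyclicity of the graph matters for simulation.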

# Documentation

See the [development version documentation](https://py-why.github.io/pywhy-graphs/dev/index.html).
@@ -47,9 +51,6 @@ To install the package from github, clone the repository and then `cd` into the

poetry install

# for time-series graph functionality
poetry install --extras ts

# for visualizing graph functionality
poetry install --extras viz

5 changes: 5 additions & 0 deletions docs/_templates/autosummary/base.rst
@@ -0,0 +1,5 @@
{{ objname | escape | underline }}

.. currentmodule:: {{ module }}

.. auto{{ objtype }}:: {{ objname }}
31 changes: 28 additions & 3 deletions docs/api.rst
@@ -13,8 +13,8 @@ for classes (``CamelCase`` names) and functions
(``underscore_case`` names) of pywhy-graphs, grouped thematically by analysis
stage.

Most-used classes
=================
Causal graph classes
====================
These are the causal classes for Structural Causal Models (SCMs), or various causal
graphs encountered in the literature.

@@ -78,7 +78,21 @@ welcome feedback.
MixedEdgeGraph
bidirected_to_unobserved_confounder
m_separated


Timeseries
==========
The following are useful functions that operate specifically on time-series graphs.

.. currentmodule:: pywhy_graphs.classes.timeseries
.. autosummary::
:toctree: generated/

complete_ts_graph
empty_ts_graph
get_summary_graph
has_homologous_edges
nodes_in_time_order
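The concepts behind the functions listed above (homologous edges, summary graphs, time-ordered nodes) can be illustrated with plain NetworkX. This is only a sketch under stated assumptions and does not call the `pywhy_graphs.classes.timeseries` API: it assumes nodes are `(variable, lag)` tuples with non-positive lags, and the helpers `add_homologous_edges` and `summary_graph` are hypothetical names written for this example.

```python
import networkx as nx

# Assumed convention: a node is a (variable, lag) tuple with lag <= 0.
max_lag = 2
variables = ["x", "y"]

G = nx.DiGraph()
for var in variables:
    for lag in range(0, -max_lag - 1, -1):
        G.add_node((var, lag))


def add_homologous_edges(G, u_var, u_lag, v_var, v_lag, max_lag):
    """Add an edge plus every time-shifted copy that still fits within max_lag."""
    shift = 0
    while u_lag - shift >= -max_lag and v_lag - shift >= -max_lag:
        G.add_edge((u_var, u_lag - shift), (v_var, v_lag - shift))
        shift += 1


def summary_graph(G):
    """Collapse the time dimension: one node per variable, one edge per relation."""
    S = nx.DiGraph()
    S.add_nodes_from(sorted({var for var, _ in G.nodes}))
    for (u_var, _), (v_var, _) in G.edges:
        S.add_edge(u_var, v_var)
    return S


# x at lag -1 causes y at lag 0; the same relation is repeated one step earlier.
add_homologous_edges(G, "x", -1, "y", 0, max_lag)
summary = summary_graph(G)

# Nodes ordered from the most distant lag to the present.
ordered = sorted(G.nodes, key=lambda node: node[1])
```

Under a stationarity assumption, repeating each edge at every admissible time shift is what keeps the graph structure identical across time points.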

Visualization of causal graphs
==============================
Visualization of causal graphs is different compared to networkx because causal graphs
@@ -91,3 +105,14 @@ to perform modular visualization of nodes and edges.
:toctree: generated/

draw
timeseries_layout

Utilities for debugging
=======================
.. currentmodule:: pywhy_graphs

.. autosummary::
:toctree: generated/

sys_info

49 changes: 39 additions & 10 deletions docs/conf.py
@@ -40,7 +40,7 @@

# If your documentation needs a minimal Sphinx version, state it here.
#
needs_sphinx = "4.0"
needs_sphinx = "5.0"

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
@@ -57,6 +57,7 @@
"sphinx_gallery.gen_gallery",
"sphinxcontrib.bibtex",
"sphinx_copybutton",
# 'sphinx.ext.napoleon',
"numpydoc",
# "IPython.sphinxext.ipython_console_highlighting",
]
@@ -68,16 +69,21 @@
# generate autosummary even if no references
# -- sphinx.ext.autosummary
autosummary_generate = True
autodoc_default_options = {"inherited-members": None}
autodoc_default_options = {
"inherited-members": None,
}
autodoc_inherit_docstrings = True
# autodoc_typehints = "signature"

# -- numpydoc
# Below is needed to prevent errors
numpydoc_xref_param_type = True
# numpydoc_xref_param_type = True
numpydoc_show_inherited_class_members = False
numpydoc_show_class_members = False
numpydoc_class_members_toctree = False
numpydoc_attributes_as_param_list = True
numpydoc_use_blockquotes = True
numpydoc_validate = True
# numpydoc_validate = True

numpydoc_xref_ignore = {
# words
@@ -108,13 +114,16 @@
"no",
"attributes",
"dictionary",
"ArrayLike",
"pywhy_nx.MixedEdgeGraph",
# pywhy-graphs
"causal",
"Node",
"circular",
"endpoint",
"TsNode",
"tsdict",
"TimeSeriesGraph",
"TimeSeriesDiGraph",
# networkx
"node",
"nodes",
@@ -136,6 +145,7 @@
"Graph",
"sets",
"value",
'edges is None', 'nodes is None', 'G = nx.DiGraph(D)',
# shapes
"n_times",
"obj",
@@ -159,12 +169,15 @@
"nx.MultiDiGraph": "networkx.MultiDiGraph",
"NetworkXError": "networkx.NetworkXError",
"pgmpy.models.BayesianNetwork": "pgmpy.models.BayesianNetwork",
"ArrayLike": "numpy.ndarray",
"ArrayLike": "numpy.typing.ArrayLike",
# pywhy-graphs
"ADMG": "pywhy_graphs.ADMG",
"PAG": "pywhy_graphs.PAG",
"CPDAG": "pywhy_graphs.CPDAG",
"pywhy_nx.MixedEdgeGraph": "pywhy_graphs.networkx.MixedEdgeGraph",
"TimeSeriesGraph": "pywhy_graphs.classes.timeseries.TimeSeriesGraph",
"TimeSeriesDiGraph": "pywhy_graphs.classes.timeseries.TimeSeriesDiGraph",
"TimeSeriesMixedEdgeGraph": "pywhy_graphs.classes.timeseries.TimeSeriesMixedEdgeGraph",
# joblib
"joblib.Parallel": "joblib.Parallel",
# pandas
@@ -199,15 +212,18 @@

intersphinx_mapping = {
"python": ("https://docs.python.org/3", None),
"numpy": ("https://numpy.org/devdocs", None),
"scipy": ("https://scipy.github.io/devdocs", None),
"numpy": ("https://numpy.org/doc/stable/", None),
"neps": ("https://numpy.org/neps", None),
"scipy": ("https://docs.scipy.org/doc/scipy/reference", None),
"networkx": ("https://networkx.org/documentation/latest/", None),
"nx-guides": ("https://networkx.org/nx-guides/", None),
"matplotlib": ("https://matplotlib.org/stable", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/dev", None),
"pgmpy": ("https://pgmpy.org", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/stable", None),
"sklearn": ("https://scikit-learn.org/stable", None),
"joblib": ("https://joblib.readthedocs.io/en/latest", None),
"pygraphviz": ("https://pygraphviz.github.io/documentation/stable/", None),
"graphviz": ("https://graphviz.readthedocs.io/en/stable/", None),
"sphinx-gallery": ("https://sphinx-gallery.github.io/stable/", None),
}
intersphinx_timeout = 5

@@ -315,10 +331,23 @@
("py:obj", "networkx.MixedEdgeGraph"),
("py:obj", "pywhy_graphs.networkx.MixedEdgeGraph"),
("py:obj", "pywhy_nx.MixedEdgeGraph"),
("py:class", "optional"),
("py:class", "array"),
("py:class", "pywhy_nx.classes.timeseries.TimeSeriesGraph"),
("py:class", "pywhy_nx.classes.timeseries.TimeSeriesDiGraph"),
("py:class", "pywhy_nx.classes.timeseries.TimeSeriesMixedEdgeGraph"),
("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesGraph"),
("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesDiGraph"),
("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesMixedEdgeGraph"),
("py:class", "pywhy_graphs.classes.timeseries.base.tsdict"),
("py:class", "networkx.classes.mixededge.MixedEdgeGraph"),
("py:class", "numpy._typing._array_like._SupportsArray"),
("py:class", "numpy._typing._nested_sequence._NestedSequence"),
]
nitpick_ignore_regex = [
('py:obj', r"pywhy_graphs\.classes\.timeseries*"),
('py:obj', r"networkx*"),
]
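A note on the two `nitpick_ignore_regex` patterns above: assuming Sphinx matches each pattern against the whole target string (as its documentation for this option describes), a trailing `*` only repeats the preceding character rather than acting as a wildcard. The sketch below uses `re.fullmatch` from the standard library to illustrate that assumption. The helper `matches_whole` and the `.*` variants are illustrative and not part of this PR.

```python
import re

# Patterns as written in conf.py, and hypothetical variants ending in ".*".
pattern_as_written = r"pywhy_graphs\.classes\.timeseries*"
pattern_with_wildcard = r"pywhy_graphs\.classes\.timeseries.*"
networkx_as_written = r"networkx*"
networkx_with_wildcard = r"networkx.*"

target = "pywhy_graphs.classes.timeseries.TimeSeriesGraph"


def matches_whole(pattern, text):
    """Return True only if the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


# "s*" repeats only the letter "s", so the dotted suffix is left unmatched.
print(matches_whole(pattern_as_written, target))
# ".*" consumes the rest of the dotted path.
print(matches_whole(pattern_with_wildcard, target))
print(matches_whole(networkx_as_written, "networkx.DiGraph"))
print(matches_whole(networkx_with_wildcard, "networkx.DiGraph"))
```

If the whole-string assumption holds, the patterns as written would silence fewer warnings than intended, which may explain why the explicit `nitpick_ignore` entries are also needed.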


# -- Warnings management -----------------------------------------------------
59 changes: 59 additions & 0 deletions docs/reference/classes/index
@@ -0,0 +1,59 @@
.. _classes:

***********
Graph types
***********

Pywhy-graphs provides data structures and methods for storing causal graphs.

The classes rely heavily on NetworkX and follow a similar API.

The choice of graph class depends on the structure of the
graph you want to represent.

Which graph class should I use?
===============================

+-------------------+----------------------------------+--------------------+
| Pywhy_graph Class | Edge Types                       | Latent confounders |
+===================+==================================+====================+
| ADMG              | directed, bidirected, undirected | Yes                |
+-------------------+----------------------------------+--------------------+

We also represent common equivalence classes of causal graphs.

+-------------------+----------------------------------+--------------------+
| Pywhy_graph Class | Edge Types                       | Latent confounders |
+===================+==================================+====================+
| CPDAG             | directed, undirected             | No                 |
+-------------------+----------------------------------+--------------------+
| PAG               | directed, bidirected, undirected | Yes                |
+-------------------+----------------------------------+--------------------+
| MultiDiGraph      | directed                         | Yes                |
+-------------------+----------------------------------+--------------------+

Causal graph types
==================

.. currentmodule:: pywhy_graphs.classes.timeseries
.. autoclass:: TimeSeriesGraph
:inherited-members:
.. autoclass:: TimeSeriesDiGraph
:inherited-members:
.. autoclass:: TimeSeriesMixedEdgeGraph
:inherited-members:
.. autoclass:: StationaryTimeSeriesCPDAG
:inherited-members:
.. autoclass:: StationaryTimeSeriesDiGraph
:inherited-members:
.. autoclass:: StationaryTimeSeriesGraph
:inherited-members:
.. autoclass:: StationaryTimeSeriesMixedEdgeGraph
:inherited-members:
.. autoclass:: StationaryTimeSeriesPAG
:inherited-members:

.. note:: NetworkX uses `dicts` to store the nodes and neighbors in a graph.
So the reporting of nodes and edges for the base graph classes may not
necessarily be consistent across versions and platforms; however, the reporting
for CPython is consistent across platforms and versions after 3.6.
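The ordering behavior described in the note above can be checked directly. This is a small sketch using only `networkx.DiGraph`, with illustrative node names: on Python 3.7+ (and CPython 3.6), dicts preserve insertion order, so nodes and edges are reported in the order they were added.

```python
import networkx as nx

G = nx.DiGraph()
# add_edge inserts the source node first, then the target node.
G.add_edge("b", "a")
G.add_node("c")
G.add_edge("c", "a")

# Reporting follows insertion order, not alphabetical order.
node_order = list(G.nodes)
edge_order = list(G.edges)
print(node_order, edge_order)
```

Code that needs a canonical order regardless of how the graph was built can sort the nodes explicitly, for example with `sorted(G.nodes)`.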
22 changes: 22 additions & 0 deletions docs/references.bib
@@ -36,6 +36,24 @@ @article{Gamez2011
doi = {10.1007/s10618-010-0178-6}
}


@InProceedings{Malinsky18a_svarfci,
title = {Causal Structure Learning from Multivariate Time Series in Settings with Unmeasured Confounding},
author = {Malinsky, Daniel and Spirtes, Peter},
booktitle = {Proceedings of 2018 ACM SIGKDD Workshop on Causal Discovery},
pages = {23--47},
year = {2018},
editor = {Le, Thuc Duy and Zhang, Kun and Kıcıman, Emre and Hyvärinen, Aapo and Liu, Lin},
volume = {92},
series = {Proceedings of Machine Learning Research},
month = {20 Aug},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v92/malinsky18a/malinsky18a.pdf},
url = {https://proceedings.mlr.press/v92/malinsky18a.html},
abstract = {We present constraint-based and (hybrid) score-based algorithms for causal structure learning that estimate dynamic graphical models from multivariate time series data. In contrast to previous work, our methods allow for both “contemporaneous” causal relations and arbitrary unmeasured (“latent”) processes influencing observed variables. The performance of our algorithms is investigated with simulation experiments and we briefly illustrate the proposed approach on some real data from international political economy.}
}


@article{Meek1995,
author = {Meek, Christopher},
year = {2013},
@@ -168,6 +186,10 @@ @article{Zhang2008
abstract = {Causal discovery becomes especially challenging when the possibility of latent confounding and/or selection bias is not assumed away. For this task, ancestral graph models are particularly useful in that they can represent the presence of latent confounding and selection effect, without explicitly invoking unobserved variables. Based on the machinery of ancestral graphs, there is a provably sound causal discovery algorithm, known as the FCI algorithm, that allows the possibility of latent confounders and selection bias. However, the orientation rules used in the algorithm are not complete. In this paper, we provide additional orientation rules, augmented by which the FCI algorithm is shown to be complete, in the sense that it can, under standard assumptions, discover all aspects of the causal structure that are uniquely determined by facts of probabilistic dependence and independence. The result is useful for developing any causal discovery and reasoning system based on ancestral graph models.}
}

@article{Zhang2008AncestralGraphs,
title = {Causal Reasoning with Ancestral Graphs},
}

@inproceedings{Zhang2011,
author = {Zhang, Kun and Peters, Jonas and Janzing, Dominik and Sch\"{o}lkopf, Bernhard},
title = {Kernel-Based Conditional Independence Test and Application in Causal Discovery},
2 changes: 1 addition & 1 deletion docs/whats_new/_contributors.rst
@@ -20,5 +20,5 @@
.. |API| replace:: :raw-html:`<span class="badge badge-warning">API Change</span>` :raw-latex:`{\small\sc [API Change]}`


.. _Adam Li: https://py-why.github.io
.. _Adam Li: https://github.com/adam2392
.. _Julien Siebert: https://github.com/siebert-julien
1 change: 1 addition & 0 deletions docs/whats_new/v0.1.rst
@@ -33,6 +33,7 @@ Changelog
- |Feature| Implement an acyclification algorithm for converting cyclic graphs to acyclic with :func:`pywhy_graphs.algorithms.acyclification`, by `Adam Li`_ (:pr:`17`)
- |Feature| Adding a layout for the nodes positions in the :func:`pywhy_graphs.viz.draw` function, by `Julien Siebert`_ (:pr:`26`)
- |Feature| Add :class:`pywhy_graphs.networkx.MixedEdgeGraph` for mixed-edge graphs, by `Adam Li`_ (:pr:`29`)
- |MajorFeature| Implement a series of graph classes for time-series graphs, such as ``pywhy_graphs.classes.timeseries.StationaryTimeSeriesMixedEdgeGraph``, by `Adam Li`_ (:pr:`21`)

Code and Documentation Contributors
-----------------------------------