CausalLens

CausalLens is a middleware layer that sits between any LLM and any application to verify causal claims in the model's output. Drop it in front of your chatbot, agent, RAG pipeline, or evaluation harness and get back a structured report of every causal claim the model made along with a verdict for each one.

The problem: LLMs are correlation engines, not causal reasoners

Large language models are trained to predict the next token given everything that came before. That objective rewards producing text that looks like the kinds of sentences humans write, including sentences of the form "X causes Y". What it does not reward is checking whether X actually causes Y. LLMs therefore emit causal claims that are:

confounded - a hidden variable drives both sides,
reversed - the model states effect-to-cause as cause-to-effect,
spurious - the two concepts merely co-occur in training data,
or contradictory - two claims in the same response imply a cycle.

In domains like medicine, law, science, and finance, acting on a correlation as though it were causation is dangerous. CausalLens catches those claims before they reach a user.

Why this matters (real example)

Consider a medical assistant that replies:

"Drinking coffee causes cancer because it is frequently consumed by cancer patients."

The grammar is fine, the sentence is confident, and the shape of the argument is familiar. The reasoning is nonetheless wrong: in this example, heavy coffee drinkers also tend to smoke more, and smoking is the confounder that drives the association with cancer. A user who takes the answer at face value may give up their morning coffee for no benefit; a clinician who relies on it may mislead a patient.
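To make the confounding argument concrete, here is a minimal simulation using only the standard library. It is an illustration, not part of CausalLens: the probabilities are invented for the example, and coffee is given no effect on cancer by construction. The naive comparison still shows an association, which disappears once the comparison is made within each smoking stratum.

```python
import random

def simulate(n=20000, seed=0):
    """Generate (smokes, coffee, cancer) rows. Smoking raises both coffee
    intake and cancer risk; coffee itself has no effect on cancer.
    All probabilities are made up for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smokes = rng.random() < 0.3
        coffee = rng.random() < (0.8 if smokes else 0.3)    # smokers drink more coffee
        cancer = rng.random() < (0.20 if smokes else 0.02)  # only smoking raises risk
        rows.append((smokes, coffee, cancer))
    return rows

def cancer_rate(rows):
    return sum(r[2] for r in rows) / len(rows)

def risk_difference(rows):
    """Cancer rate among coffee drinkers minus the rate among non-drinkers."""
    drinkers = [r for r in rows if r[1]]
    others = [r for r in rows if not r[1]]
    return cancer_rate(drinkers) - cancer_rate(others)

rows = simulate()

# Naive comparison: ignores smoking entirely.
naive = risk_difference(rows)

# Backdoor adjustment by stratification: compare within each smoking
# stratum, then weight each stratum by its share of the population.
adjusted = 0.0
for stratum in (True, False):
    subset = [r for r in rows if r[0] == stratum]
    adjusted += risk_difference(subset) * len(subset) / len(rows)

print(f"naive risk difference:    {naive:+.3f}")
print(f"adjusted risk difference: {adjusted:+.3f}")
```

The naive difference is clearly positive while the adjusted difference sits near zero, which is the pattern a `[CORRELATION]` verdict is meant to flag.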

CausalLens extracts the claim coffee -> cancer, builds a causal graph from the surrounding text, runs the claim through DoWhy, and returns a verdict:

[CORRELATION] coffee -> cancer
    Estimated effect is close to zero or does not survive refutation.

Installation

pip install causallens
python -m spacy download en_core_web_sm

CausalLens depends on spaCy, NetworkX, and DoWhy. pip installs these automatically; the spaCy language model is a separate download, which is what the second command above does.

Quick start

from causallens import CausalLens

cl = CausalLens()
result = cl.verify("smoking causes cancer and also causes yellow teeth")

print(result.claims)
print(result.report)

result.claims is a list of CausalClaim objects:

[
  CausalClaim(cause='smoking', effect='cancer',       cue='causes', confidence=0.8),
  CausalClaim(cause='smoking', effect='yellow teeth', cue='causes', confidence=0.8),
]

result.report is a human-readable summary of the verification:

CausalLens verification report
========================================
Input length: 50 characters
Claims found: 2

Summary:
  causal          1
  correlation     1
  contradicted    0
  unverifiable    0

Claim-by-claim:
  1. [CAUSAL]      smoking -> cancer       (cue='causes', estimate=1.035)
  2. [CORRELATION] smoking -> yellow teeth (cue='causes', estimate=0.031)

Wrapping an LLM call

from causallens import CausalLens

def my_llm(prompt: str) -> str:
    ...  # call OpenAI, Anthropic, Ollama, etc.

cl = CausalLens()
safe_llm = cl.wrap(my_llm)

result = safe_llm("Why is smoking harmful?")
for v in result.verdicts:
    print(v.verdict, v.claim.cause, "->", v.claim.effect)
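The wrapping pattern itself is small. The sketch below shows one way such a wrapper could be structured, using stand-ins only: `SketchResult`, `wrap`, `fake_llm`, and `fake_verify` are hypothetical names invented for this illustration and are not the CausalLens implementation. The point is that the model's text passes through untouched and the verdicts travel alongside it, so the calling application decides what to do with flagged claims.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (verdict, cause, effect); a simplified stand-in for a per-claim result.
VerdictTuple = Tuple[str, str, str]

@dataclass
class SketchResult:
    text: str                                   # model output, unchanged
    verdicts: List[VerdictTuple] = field(default_factory=list)

def wrap(llm: Callable[[str], str],
         verify: Callable[[str], List[VerdictTuple]]) -> Callable[[str], SketchResult]:
    """Return a callable that runs the LLM, then attaches verdicts to its output."""
    def wrapped(prompt: str) -> SketchResult:
        text = llm(prompt)
        return SketchResult(text=text, verdicts=verify(text))
    return wrapped

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return "Smoking causes cancer."

def fake_verify(text: str) -> List[VerdictTuple]:
    # Placeholder logic: a real verifier would extract and test each claim.
    return [("unverifiable", "smoking", "cancer")] if "causes" in text else []

safe_llm = wrap(fake_llm, fake_verify)
result = safe_llm("Why is smoking harmful?")

# The application, not the wrapper, decides how to react to flagged claims.
flagged = [v for v in result.verdicts if v[0] != "causal"]
if flagged:
    print(f"Warning: {len(flagged)} causal claim(s) could not be confirmed.")
print(result.text)
```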

How it works

CausalLens runs a three-step pipeline.

1. Extract

causallens.extractor.ClaimExtractor parses the text with spaCy and looks for causal-language patterns: causes, leads to, results in, because of, due to, triggers, produces, and friends. Each match yields a CausalClaim(cause, effect, cue, sentence, confidence). The extractor handles forward cues (A causes B), backward cues (B because of A), and coordinated phrases (A causes B and C).
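The three cases named above (forward cue, backward cue, coordinated effects) can be illustrated with a much simpler regex-based sketch. This is not how ClaimExtractor works, since the real extractor parses with spaCy; `Claim`, `extract`, and the two cue lists are hypothetical names chosen for this example, and the pattern matching is deliberately naive.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Claim:
    cause: str
    effect: str
    cue: str

# Cue phrases taken from the list above, split by direction.
FORWARD_CUES = ["causes", "leads to", "results in", "triggers", "produces"]
BACKWARD_CUES = ["because of", "due to"]

def _pattern(cues):
    alternatives = "|".join(re.escape(c) for c in cues)
    return re.compile(
        rf"^(?P<left>.+?)\s+(?P<cue>{alternatives})\s+(?P<right>.+)$",
        re.IGNORECASE,
    )

FORWARD = _pattern(FORWARD_CUES)
BACKWARD = _pattern(BACKWARD_CUES)

def _split_coordinated(phrase: str) -> List[str]:
    """Split 'B and C' or 'B, C' into separate phrases."""
    parts = re.split(r"\s*,\s*|\s+and\s+", phrase)
    return [p.strip() for p in parts if p.strip()]

def extract(sentence: str) -> List[Claim]:
    s = sentence.strip().rstrip(".")
    m = FORWARD.match(s)
    if m:
        # Forward cue: left side is the cause, each coordinated item is an effect.
        cause = m["left"].strip()
        return [Claim(cause, e, m["cue"].lower()) for e in _split_coordinated(m["right"])]
    m = BACKWARD.match(s)
    if m:
        # Backward cue: the cause follows the cue, the effect precedes it.
        return [Claim(m["right"].strip(), m["left"].strip(), m["cue"].lower())]
    return []

print(extract("Smoking causes cancer and yellow teeth."))
print(extract("The flight was delayed due to fog."))
```

A sketch like this breaks on sentences that repeat a cue ("A causes B and also causes C") or nest clauses, which is the kind of case a dependency parse handles better than a regex.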

2. Graph

causallens.graph.CausalGraph builds a networkx.MultiDiGraph in which every cause/effect phrase is a node and every claim is an edge with a confidence score. The graph exposes a GML export that feeds directly into DoWhy and a DOT export for Graphviz visualization. It also detects cycles, which are automatically surfaced as unverifiable.
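As a rough illustration of the graph step, the following dependency-free sketch stores one edge per claim, finds a cycle with a depth-first search, and emits DOT text. It is a stand-in under stated assumptions: CausalGraph itself is built on networkx.MultiDiGraph, and `ClaimGraph`, `add_claim`, `find_cycle`, and `to_dot` are hypothetical names used only here.

```python
class ClaimGraph:
    """Directed multigraph of claims: parallel edges are kept, one per claim."""

    def __init__(self):
        self.nodes = {}   # insertion-ordered set of phrases
        self.edges = {}   # cause -> list of (effect, confidence)

    def add_claim(self, cause, effect, confidence):
        self.nodes[cause] = None
        self.nodes[effect] = None
        self.edges.setdefault(cause, []).append((effect, confidence))

    def find_cycle(self):
        """Return one cycle as a list of nodes, or None if the graph is acyclic."""
        WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
        color = {n: WHITE for n in self.nodes}
        stack = []

        def visit(node):
            color[node] = GRAY
            stack.append(node)
            for nxt, _ in self.edges.get(node, []):
                if color[nxt] == GRAY:        # back edge: nxt is an ancestor
                    return stack[stack.index(nxt):] + [nxt]
                if color[nxt] == WHITE:
                    found = visit(nxt)
                    if found:
                        return found
            stack.pop()
            color[node] = BLACK
            return None

        for n in self.nodes:
            if color[n] == WHITE:
                cycle = visit(n)
                if cycle:
                    return cycle
        return None

    def to_dot(self):
        """Render the graph as Graphviz DOT text, one edge per claim."""
        lines = ["digraph claims {"]
        for cause, outs in self.edges.items():
            for effect, conf in outs:
                lines.append(f'  "{cause}" -> "{effect}" [label="{conf}"];')
        lines.append("}")
        return "\n".join(lines)

dag = ClaimGraph()
dag.add_claim("smoking", "cancer", 0.8)
dag.add_claim("smoking", "yellow teeth", 0.8)
print(dag.find_cycle())
print(dag.to_dot())

loop = ClaimGraph()
loop.add_claim("stress", "insomnia", 0.8)
loop.add_claim("insomnia", "stress", 0.8)
print(loop.find_cycle())
```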

3. Verify

causallens.verifier.CausalVerifier pushes the graph through DoWhy:

identify an estimand for every (cause, effect) pair,
estimate the effect with linear-regression backdoor adjustment,
refute the estimate with a random-common-cause perturbation,

and returns a Verdict per claim: causal, correlation, contradicted, or unverifiable. If you do not have observational data, the verifier synthesizes a small dataset that is consistent with the extracted DAG so that DoWhy has something to fit; pass your own pandas.DataFrame via CausalLens(data=...) to use real data.
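The README does not say how the fallback dataset is generated, so the following is only one plausible reading of "consistent with the extracted DAG": sample a linear-Gaussian model in topological order, giving every edge the same weight. `synthesize`, `slope`, and the `weight` and `noise` parameters are assumptions made for this sketch, not CausalLens API.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def synthesize(edges, n=500, weight=1.0, noise=1.0, seed=0):
    """Sample n rows from a linear-Gaussian model whose structure follows `edges`.

    edges: iterable of (cause, effect) pairs that must form a DAG.
    Returns a dict mapping each variable name to a list of n floats.
    """
    parents = {}
    for cause, effect in edges:
        parents.setdefault(cause, [])
        plist = parents.setdefault(effect, [])
        if cause not in plist:
            plist.append(cause)

    # static_order() yields every node after all of its parents
    # and raises graphlib.CycleError if the edges contain a cycle.
    order = list(TopologicalSorter(parents).static_order())

    rng = random.Random(seed)
    data = {name: [] for name in order}
    for _ in range(n):
        row = {}
        for name in order:
            # Each variable is a weighted sum of its parents plus Gaussian noise.
            row[name] = weight * sum(row[p] for p in parents[name]) + rng.gauss(0.0, noise)
            data[name].append(row[name])
    return data

def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

data = synthesize([("CO2", "temperature"), ("temperature", "sea_level")])
direct = slope(data["temperature"], data["sea_level"])
print(f"temperature -> sea_level slope: {direct:.2f}")
```

With real data, the rows would come from your own pandas.DataFrame passed as CausalLens(data=...) instead of from a generator like this one.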

AI output without vs with CausalLens

Scenario	Raw LLM output	CausalLens-verified output
Medical chatbot	"Drinking coffee causes cancer." Emitted confidently.	Same text, plus `[CORRELATION] coffee -> cancer - estimate does not survive refutation`. Downstream app warns user.
Science explainer	"Higher CO2 causes rising sea levels." No traceable reasoning.	Same text, plus a verified DAG `CO2 -> temperature -> sea_level` with every edge labelled `[CAUSAL]`.
Business report	"Raising prices led to higher revenue." Based on one quarter.	Same text, plus `[CORRELATION] price -> revenue - estimate not robust to random common cause`. Reviewer digs deeper.
Policy analysis	"Minimum-wage hikes cause unemployment." Contested claim.	Same text, plus `[UNVERIFIABLE]` if no supporting structure is provided, preventing the app from treating it as settled.
Code assistant	"Using recursion causes stack overflow." True in context.	Same text, plus `[CAUSAL] recursion -> stack_overflow - estimate survives refutation`.

The original text is always returned unchanged: CausalLens adds a report and never rewrites the model's words.

API reference (short form)

from causallens import (
    CausalLens,          # top-level wrapper
    VerificationResult,  # what cl.verify() returns
    CausalClaim,         # one extracted claim
    ClaimExtractor,      # spaCy-based extractor
    CausalGraph,         # NetworkX graph of claims
    CausalVerifier,      # DoWhy-based verifier
    Verdict,             # per-claim result
)

CausalLens(spacy_model="en_core_web_sm", data=None) constructs a ready-to-use pipeline. cl.verify(text) returns a VerificationResult with .text, .claims, .verdicts, .graph, .summary, and .report.

Running the examples

python examples/basic_usage.py
python examples/medical_example.py
python examples/scientific_example.py

Testing

pip install -e ".[dev]"
pytest

The suite covers extraction, graph construction, verifier verdicts, and the top-level wrapper.

Contributing

Pull requests are welcome. If you are adding a feature, please:

Open an issue describing the use case first.
Add unit tests under tests/ and keep the suite green.
Keep the public API (CausalLens, VerificationResult, CausalClaim, Verdict) stable; extend rather than reshape.
Run pytest locally before pushing.

Ideas that would help the project:

richer cue-phrase library (multilingual, domain-specific),
support for user-supplied observational data with sensible column auto-mapping,
additional DoWhy refuters surfaced as verdict qualifiers,
plug-ins for popular LLM SDKs (OpenAI, Anthropic, LangChain).

License

CausalLens is released under the MIT License.
