Skip to content

evalops/cognitive-dissonance-dspy

Repository files navigation

Cognitive Dissonance DSPy

Requirements: Python 3.10+ DSPy Coq License: MIT

Stop arguing; prove it.
Cognitive Dissonance DSPy detects belief conflicts between LLM agents, translates formalizable claims to Coq, and attempts a machine‑checked proof. If a claim can't be formalized, we say so and punt (fall back to labeled heuristics).

  • Agents & optimization: built on DSPy's programmatic agent framework.
  • Proofs: compiled (coqc) and independently checked (coqchk) Coq artifacts.

To our knowledge, this is the first framework that combines DSPy‑based cognitive‑dissonance detection, NL→Coq translation, and online proving in one loop.


Why

Most multi‑agent systems resolve contradictions with debate, confidence scores, or arbitration heuristics. That's arm‑wrestling, not ground truth. When a claim is formalizable, we hand it to a theorem prover and return a proof object (or a failure).

Scope (initial): arithmetic + basic algebra, simple algorithmic properties (e.g., permutation + sortedness), and other first‑order fragments that Coq and standard tactics handle well. We'll expand coverage pragmatically.


How it works

[Agents (DSPy)] → [Belief Extractor] → [Claim Normalizer]
                               ↓ conflicts
                 [NL→Coq Translator + Spec Templates]
                               ↓ goals
                     [Coq Prover (coqc)] ──▶ [coqchk]
                               ↓
                     { PROVED | DISPROVEN | NO-PROOF }

Quick Start

Prerequisites

  • Python 3.10+
  • Coq theorem prover (coqc command available)
  • Ollama or compatible API endpoint (for agent reasoning)

Installation

# Clone the repository
git clone https://github.com/evalops/cognitive-dissonance-dspy.git
cd cognitive-dissonance-dspy

# Set up virtual environment and install dependencies
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Install Coq (Ubuntu/Debian)
sudo apt update && sudo apt install -y coq

# Install Coq (macOS)
brew install coq

# Verify installation
coqc --version

Example: When Agents Disagree

from formal_verification import FormalVerificationConflictDetector, Claim, PropertyType

detector = FormalVerificationConflictDetector()

claims = [
    Claim("alice", "2 + 2 = 4", PropertyType.CORRECTNESS, 0.95, time.time()),
    Claim("bob", "2 + 2 = 5", PropertyType.CORRECTNESS, 0.80, time.time()),
]

results = detector.analyze_claims(claims)
# Alice: ✅ PROVEN (reflexivity)
# Bob:   ❌ DISPROVEN (Unable to unify "5" with "4")

Run Examples

# Mathematical claims demonstration
python examples/mathematical_claims.py

# Advanced theorem proving
python examples/advanced_theorems.py

# Comprehensive framework demo
python examples/comprehensive_demo.py

What We Can Prove

Current coverage:

  • Arithmetic & algebra: 2 + 2 = 4 (✓), factorial 7 = 5040 (✓ in 502ms)
  • Algorithm properties: sorting correctness via Permutation + LocallySorted
  • Simple invariants: list properties, basic data structure constraints

Measured performance:

  • 15 claims → 80% success rate
  • Average proof time: 179.7ms
  • Conflicts resolved deterministically (no voting)

Architecture

formal_verification/
├── translator.py    # NL → Coq patterns  
├── prover.py        # subprocess wrapper for coqc
└── detector.py      # orchestrates the pipeline

cognitive_dissonance/
├── verifier.py      # DSPy agents for belief extraction
└── experiment.py    # co-training & optimization

Translation examples:

"2 + 2 = 4"Theorem arith : 2 + 2 = 4. reflexivity. Qed.
"factorial 5 = 120"Fixpoint + simpl proof
"sorts correctly"PermutationLocallySorted

Related Work

LLM × ITP integration:

  • LeanDojo (Yang et al. 2023): LLMs + Lean for proof search
  • APOLLO (Wang et al. 2024): GPT-4 generating Lean proofs
  • Minerva (Lewkowycz et al. 2022): math problem solving

Multi-agent verification:

  • BDI agent logics (Wooldridge & Fisher 2005)
  • Consensus mechanisms in distributed AI (recent IJCAI/AAMAS work)

Our contribution: First to combine DSPy cognitive dissonance detection + NL→Coq translation + online proving in one system.


Roadmap

  • Expand pattern library (induction, recursive data structures)
  • Better error messages when formalization fails
  • Integration with LeanDojo/APOLLO for cross-prover validation
  • Benchmark on existing theorem proving datasets

License

MIT

Contact

GitHub Issues | EvalOps

About

A multi-agent LLM system for detecting and resolving cognitive dissonance.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •