PuzzleClone

An open-source data generation framework for batch construction of verifiable, controllable, and diverse puzzles.

📋 Overview

PuzzleClone is a data synthesis framework and comprehensive dataset for logical reasoning problems. It features:

  • ✅ Guaranteed Verifiability: Every problem is generated together with a ground-truth solution and can be formally verified with a symbolic solver or by deterministic program execution, ensuring correctness.
  • 🎯 Granular Control: Adjustable parameters give fine-grained control over problem attributes such as scale, structure, and difficulty, enabling large-scale batch generation.
  • ✨ Flexible Adaptation: Problem scenarios are easy to customize and to translate into different languages or domains.
  • 📊 Expansive and Diverse Coverage: Using PuzzleClone, we have curated a benchmark of 83,657 unique logical reasoning puzzles procedurally generated from 86 seed questions. The dataset spans:
    • Various applications of Satisfiability Modulo Theories (SMT) and SMT-like puzzles.
    • Classic logical puzzles like Sudoku, the Knapsack problem, and linear optimization (LP).
    • Diverse mathematical problems of varying difficulties.
  • 🚀 State-of-the-Art Performance: Achieves SOTA results among open-source datasets, outperforming the public dataset by 18.4 points on SATBench (from 51.6 to 70.0).
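
To make the idea of verification by deterministic program execution concrete, here is a minimal sketch. It is not PuzzleClone's own verifier: the toy puzzle, its clues, and the names `satisfies`, `all_solutions`, and `verify` are all hypothetical and chosen only for illustration. The sketch accepts a stored answer only when exhaustive enumeration shows it is the unique assignment satisfying every clue.

```python
from itertools import permutations

# Hypothetical toy puzzle: Alice, Bob, and Carol finished a race in places 1-3.
PEOPLE = ("Alice", "Bob", "Carol")


def satisfies(assignment):
    """Check every clue of the toy puzzle against a candidate assignment."""
    return (
        assignment["Alice"] < assignment["Bob"]  # Alice finished ahead of Bob
        and assignment["Carol"] != 1             # Carol did not win
        and assignment["Bob"] != 3               # Bob was not last
    )


def all_solutions():
    """Enumerate every placement and keep those that satisfy all clues."""
    solutions = []
    for places in permutations((1, 2, 3)):
        candidate = dict(zip(PEOPLE, places))
        if satisfies(candidate):
            solutions.append(candidate)
    return solutions


def verify(ground_truth):
    """Accept the stored answer only if it is the unique solution."""
    return all_solutions() == [ground_truth]
```

Brute-force enumeration only scales to tiny search spaces; for larger puzzles a symbolic solver would take its place, as the feature list mentions.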

🚀 Quick Start

git clone https://github.com/puzzleclone/PuzzleClone.git
cd PuzzleClone
pip install -r requirements.txt

Here are a few common use cases.

Generate a single test case for debugging

This runs the translator in test mode (-t), generating a sample question based on the specification file.

python translator.py -t path/to/spec.yaml

The output data ({spec_name}_data.jsonl) and auxiliary debugging files such as {spec_name}_synthesizer.py are written to the temp/ directory.
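
When debugging, it can help to look at the generated records directly. The sketch below assumes only that the output is a standard JSON Lines file (one JSON value per line); the record schema is not documented here, so the helper names `load_jsonl` and `summarize` are hypothetical and the code simply reports whatever top-level keys it finds.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(records):
    """Return the record count and the sorted union of top-level keys."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return len(records), sorted(keys)
```

For example, `summarize(load_jsonl("temp/spec_data.jsonl"))` would report how many problems were generated and which fields they carry, assuming a spec file named spec.yaml.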

Generate a full dataset for production

This runs the translator in production/deployment mode (-d) to generate a large number of problems and saves them to a specified output file (-o).

python translator.py -d path/to/spec.yaml -o data.jsonl

If no output file is specified, the output data will be saved to output/{spec_name}_data.jsonl.
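
The naming rule above can be sketched as a small helper, which is handy when a downstream script needs to locate the generated file. This is an illustration, not code from the repository: `default_output_path` is a hypothetical name, and it assumes that {spec_name} is the specification file's name without its extension.

```python
from pathlib import Path


def default_output_path(spec_path, test_mode=False):
    """Predict where the generated data lands when -o is not given."""
    # Assumption: {spec_name} is the spec file name without its extension.
    spec_name = Path(spec_path).stem
    # Test mode (-t) writes to temp/; deployment mode (-d) writes to output/.
    directory = "temp" if test_mode else "output"
    return Path(directory) / f"{spec_name}_data.jsonl"
```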

Apply a new template to existing data

This uses the -g flag to load existing problem data and applies a new problem description or template (new_spec.yaml) to it.

python translator.py -d path/to/new_spec.yaml -g old_data.jsonl -o new_data.jsonl
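
For scripted or batch runs, the three invocations shown in this section can be assembled programmatically. The following is a minimal sketch that uses only the flags documented above (-t, -d, -g, -o); the function name `build_command` and its parameters are hypothetical, and whether -g can be combined with -t is not stated here, so the sketch does not enforce either behavior.

```python
import sys


def build_command(spec, mode="deploy", output=None, existing_data=None):
    """Build the argument list for one translator.py run."""
    if mode not in ("test", "deploy"):
        raise ValueError("mode must be 'test' or 'deploy'")
    # -t selects test mode, -d selects production/deployment mode.
    command = [sys.executable, "translator.py", "-t" if mode == "test" else "-d", spec]
    if existing_data is not None:
        command += ["-g", existing_data]  # reuse existing problem data
    if output is not None:
        command += ["-o", output]  # explicit output file
    return command
```

Passing the returned list to `subprocess.run(..., check=True)` from the repository root would execute the run; keeping command construction separate from execution makes the arguments easy to log and inspect first.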

Data transformation

Several scripts are provided to convert the data generated above into standard benchmark formats. See the scripts and documentation in the data_processing_scripts/ directory.

Evaluation

See our evaluation toolkit PolyhedronEvaluator.

βš–οΈ License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.

Popular repositories

  1. PuzzleClone (Public)

    The official code of PuzzleClone (submitted to ACL'26)

    Python 2

  2. PuzzleCloneData (Public)

    PC-83K, a benchmark of over 83k logical puzzles generated by PuzzleClone (submitted to ACL'26)

  3. PolyhedronEvaluator (Public)

    A robust evaluator for comparing LLM responses against the ground truth, accommodating various output formats

    Python