Turn spreadsheet (in CSV format) decision tables into optimized code.
Maintain business logic in a spreadsheet. Compiler generates efficient code.
Spreadsheet CSV RFC 4180 format:
# @name (@ prefix) starts a new decision (sub)table for name depending on name(s)
@proceed,signal,canStop
yes,green,
yes,yellow,no
no,yellow,yes
no,red,
Pseudocode output translated (by hand) to C language style nested if/else:
if (signal == yellow) {
if (canStop == no)
proceed = yes;
else
proceed = no;
} else {
if (signal == green)
proceed = yes;
else
proceed = no;
}When requirements change, update the spreadsheet and regenerate. No manual refactoring of nested if/else chains. Example translators for C and Python are included.
# Build (only requires an ANSI C compiler)
make
# Compile decision table to pseudocode
./dtc table.dtc > table.psu
# Generate C code
awk -f C.awk table.psu
# Creates: table.h and table.c
# Generate Python code
python3 psu2py.py table.psu > table.py
Try the included examples:
```bash
make examples
./test
make examples-py
python3 test.pyThe problem: Decision logic in code is fragile. A small requirements change can require reorganizing deeply nested if/else statements. Hand-optimization is error-prone and hard to review.
The solution: Express requirements as a decision table. The compiler automatically:
- Finds the optimal evaluation order
- Eliminates duplicate code paths
- Minimizes decision depth
- Generates consistent, reviewable output
Domain experts maintain the spreadsheet. Programmers integrate the generated code. Requirements stay traceable.
An early version of this program was developed in 1993 for avionics and embedded systems where:
- Requirements change frequently but must remain traceable
- Domain experts (not programmers) maintain the logic
- Memory and execution time are constrained
- Safety standards require deterministic behavior
The included power.dtc is an early version of an actual twin-engine aircraft power determination decision table.
Decision Table (.dtc) → Pseudocode (.psu) → Target Language (.h, .c, etc.)
CSV Format Optimized DAG Language-specific code
- Input: CSV formatted decision tables
- Compilation:
dtcbuilds an optimized decision directed acyclic graph (DAG) - Output: CSV formatted pseudocode with metadata documenting all names' values, indicating which are independent (input) and dependent (output)
- Translation: Language-specific tools (e.g.,
C.awk) generate final code
Disjunctive normal form CSV format using a prefix character, '@', to indicate a decision (sub)table:
@resultName,dependentName1,dependentName2,...
resultValue1,dependentName1Value1,dependentName2Value1,...
resultValue2,dependentName1Value2,dependentName2Value2,...
...
Names' values must be mutually exclusive and exhaustive:
- Mutually exclusive: Each value represents a distinct state (e.g., a signal cannot be both "green" and "red")
- Exhaustive: The values must cover all possible states of the name (e.g., "green", "red", "yellow" covers all traffic light states)
Example (traffic light decision as input to two other tables):
# Proceed when the light is green and don't when the light is red.
@proceed,signal
yes,green
no,red
# When the light is yellow, don't proceed when a stop can be done safely, otherwise proceed.
@proceed,signal,canStop
yes,yellow,no
no,yellow,yes
# Apply brake if not proceeding, otherwise don't.
@brake,proceed
yes,no
no,yes
# Apply accelerator when proceed is decided and the light is judged to be close to changing, otherwise don't.
@accelerator,proceed,isClose
yes,yes,yes
no,yes,no
no,no,
Empty cells represent "any value". The compiler finds the optimal way to evaluate these conditions.
Variable names and values can contain spaces, punctuation, and other special characters. The compiler uses standard CSV encoding (RFC 4180) for both input and output:
@Free shipping,Favored customer,Total order amount > $500.00
y,y,
y,n,y
n,n,n
Names containing commas or quotes require CSV quoting:
@result,"Order requires special ""HAZMAT""",destination
y,y,domestic
n,y,international
The translators (C.awk, psu2py.py) convert these to valid identifiers in the target language by replacing special characters with underscores.
Decision tables can be written in two styles:
Fully Specified Tables: Every possible combination of input values has an explicit row. For n binary inputs, this means 2^n rows.
@result,A,B
yes,0,0
yes,0,1
no,1,0
no,1,1
Sparse Tables: Empty cells indicate "any value" - the result does not depend on that input for this row. This reduces table size and often improves readability:
@result,A,B
yes,0,
no,1,
Both tables above are equivalent. The sparse form has 2 rows instead of 4.
Optimization implications:
- Sparse tables give the compiler more freedom to reorder tests, often producing better results
- Fully specified tables may constrain the search space, sometimes compiling faster
- Domain experts often find sparse tables easier to maintain since each row captures a logical rule rather than enumerating combinations
Many decision table methodologies use a two-stage approach:
- Stage 1 (Conditions → Rules): Input conditions are evaluated to determine which "rule" applies
- Stage 2 (Rules → Actions): The matched rule determines which actions fire
This pattern maps naturally to dtc using an intermediate name e.g. "rule":
# Stage 1: Conditions determine which rule matches
@rule,customer type,order value,destination
1,premium,,
2,standard,high,domestic
3,standard,high,international
4,standard,low,domestic
5,standard,low,international
# Stage 2: Rules determine actions
@free shipping,rule
y,1
y,2
n,3
n,4
n,5
@apply discount,rule
n,1
y,2
y,3
n,4
n,5
@add customs fee,rule
n,1
n,2
y,3
n,4
y,5
The dtc is not constrained by this pattern. Use (sub)tables to decompose your problem as desired.
The compiler merges all tables from all input files, solving them together into a single optimized DAG. This enables a "multiple sheets" decomposition style familiar to spreadsheet users:
# Compile tables from multiple files together
./dtc conditions.dtc actions.dtc overrides.dtc > combined.psuUse cases:
- Organizational separation: Group related tables by domain concern (e.g., pricing.dtc, shipping.dtc, discounts.dtc)
- Team workflows: Different domain experts maintain different files
- Conditional inclusion: Include or exclude table files based on configuration
Important consideration: Tables with interdependencies (shared names) benefit from joint optimization. However, compiling independent tables together increases optimization time exponentially since the search space combines. For independent decision problems:
# Independent tables: compile separately (faster)
./dtc power.dtc > power.psu
./dtc driving.dtc > driving.psu
# Interdependent tables: compile together (better optimization)
./dtc proceed.dtc brake.dtc accelerator.dtc > driving.psuIf compilation takes too long, consider whether your tables are truly interdependent. (Review the pseudocode metadata that identifies inputs, the values the dtc has determined are independent.) Splitting independent problems into separate compilation units reduces optimization time while producing equivalent results.
The pseudocode output is in CSV format, making it easy to parse in any language. Each line is a CSV record with the operation type in the first field.
Example output for the traffic light decision table, DisjunctiveNormalForm.dtc:
I,canStop,no
I,canStop,yes
I,isClose,no
I,isClose,yes
I,signal,green
I,signal,red
I,signal,yellow
O,accelerator,no
O,accelerator,yes
O,brake,no
O,brake,yes
O,proceed,no
O,proceed,yes
D,3
T,signal,yellow,1
L,2
T,signal,green,3
L,4
R,accelerator,no
R,brake,yes
R,proceed,no
J,0
L,3
R,brake,no
R,proceed,yes
T,isClose,no,5
L,6
R,accelerator,yes
J,0
L,5
R,accelerator,no
J,0
L,1
T,canStop,no,3
J,4
L,0
Metadata Lines:
I,var,val- Input (independent) name and value - provides type informationO,var,val- Output (dependent) name and value - provides type informationD,n- Depth (maximum decision depth, worst-case tests to reach a leaf) - complexity metric
Code Lines:
L,n- Label definition (numeric, 0 is exit)T,var,val,n- Test: if var equals val, jump to label nJ,n- Jump unconditionally to label n (0 = exit/return)R,var,val- Resolve: assign val to var
CSV Encoding: All names and values are CSV-encoded. Values containing commas, quotes, or newlines are quoted per RFC 4180:
I,"Order requires special ""HAZMAT""",y
R,"Add $17.50 foreign shipping fee",y
Metadata Purpose: The metadata lines enable automatic generation of:
- Type-safe code with enums in statically-typed languages
- Function signatures with correct parameter types
- Visual diagrams (UML, flowcharts) without parsing code
- Documentation showing all possible values for each name
This metadata makes the pseudocode self-describing and sufficient for translation to any programming language or visualization format.
The CSV pseudocode is designed to be easily parsed CSV text. The metadata (I and O lines) provides all information needed to generate type definitions and function signatures.
Languages with goto support can translate the pseudocode almost directly, preserving the optimized DAG structure:
| Language | goto Support | Translator | Notes |
|---|---|---|---|
| C | Full | C.awk |
Direct translation, optimal performance |
| C++ | Full | C.awk |
Same as C |
| Go | Full | Similar to C.awk |
Minor syntax differences (braces vs semicolons) |
Example (C):
awk -f C.awk table.psu
# Creates: table.h and table.cThis example generates complete C code with:
- Separate enum for each variable with namespaced values
- Header file with type definitions and function declaration
- Implementation file with goto-based decision logic preserving DAG structure
- All identifiers prefixed with table name to avoid conflicts
- Special characters in names converted to underscores for valid C identifiers
Languages without goto require translation to a state machine pattern. This adds dispatch overhead but maintains the DAG structure and compact representation:
| Language | Recommended Pattern | Translator |
|---|---|---|
| Python | while + if chain |
psu2py.py (included) |
| Rust | loop + match with State enum |
(not included) |
| Java | while + switch on numeric state |
(not included) |
| JavaScript | while + switch on numeric state |
(not included) |
How It Works: State machine translators convert labels to state IDs and use a loop with if/switch statement for dispatch. The DAG structure is preserved - nodes still converge, preventing code duplication.
Tradeoffs:
- Preserves DAG structure (no code duplication)
- Maintains compact representation
- Modern compilers optimize switch statements to jump tables
- Adds state variable and dispatch overhead compared to goto
- May lose some compiler optimizations available with goto
Performance: While not as optimal as goto, state machine translation still benefits from the dtc's optimization work (minimized depth, maximized sharing). The overhead is primarily the state variable update and switch dispatch, not the decision logic itself.
The decision table compiler generates a DAG where multiple decision paths converge at shared nodes. This structure:
- Eliminates duplicate code - Tree-structured if/else would duplicate downstream logic at every branch point (exponential bloat)
- Maps to machine code - goto translates directly to CPU jump instructions
- Avoids nesting limits - Flat structure prevents deeply nested if/else hitting compiler limits
- Enables sharing - Multiple paths to same outcome share the leaf node code
Example: In DisjunctiveNormalForm.psu, multiple decision paths converge to common tail code.
The example C.awk script translates CSV pseudocode into complete C header and implementation files with proper type safety and namespace isolation.
# Generate .psu pseudocode from decision table
./dtc table.dtc > table.psu
# Generate C header and implementation (single invocation)
awk -f C.awk table.psu
# Creates: table.h and table.cOr use the Makefile:
# Build examples (automatically generates .psu, .h, and .c files)
make examples
# This generates from .dtc → .psu → .h/.c:
# - power.dtc → power.psu → power.h, power.c
# - DisjunctiveNormalForm.dtc → DisjunctiveNormalForm.psu → DisjunctiveNormalForm.h, DisjunctiveNormalForm.c
# - test (executable demonstrating both tables)The Makefile includes pattern rules to automatically generate .psu from .dtc files and .h/.c from .psu files.
Variable names and values are converted to valid C identifiers:
- Spaces become underscores:
Favored customer→Favored_customer - Special characters become underscores:
Total order amount > $500.00→Total_order_amount____500_00 - Leading digits get underscore prefix:
1stChoice→_1stChoice
The example psu2py.py script translates CSV pseudocode into Python modules using a state machine pattern.
# Generate .psu pseudocode from decision table
./dtc table.dtc > table.psu
# Generate Python module
python3 psu2py.py table.psu > table.pyOr use the Makefile:
make examples-pyInput (decision.psu):
I,signal,green
I,signal,red
O,proceed,no
O,proceed,yes
D,2
T,signal,green,1
L,2
R,proceed,no
J,0
L,1
R,proceed,yes
J,0
L,0
Output (decision.py):
from enum import IntEnum, auto
class signal(IntEnum):
green = auto()
red = auto()
class proceed(IntEnum):
no = auto()
yes = auto()
def evaluate(signal):
_proceed = None
_s = 0
while True:
if _s == 0:
if signal == signal.green:
_s = 1
continue
_s = 2
continue
if _s == 2:
_proceed = proceed.no
return (_proceed)
if _s == 1:
_proceed = proceed.yes
return (_proceed)Usage:
import decision
result = decision.evaluate(decision.signal.green)
print(result.name) # "yes"Like C.awk, psu2py.py converts names to valid Python identifiers using the same rules (special characters become underscores).
The pseudocode metadata provides complete information for generating visual representations of decision logic.
The metadata (I, O, D) combined with the pseudocode structure contains all necessary information to generate:
UML Activity Diagrams:
- Decision nodes (diamonds) from
T,var,val,nconditionals - Action nodes (rectangles) from
R,var,valassignments - Control flow from
J,njumps - Entry point from start of code (before first label)
- Exit point from label 0 or
J,0jumps - Guard conditions from test expressions
Flowcharts via Graphviz/DOT: The pseudocode directly translates to DOT format for visualization:
# Hypothetical converter (not included)
awk -f psu2dot.awk power.psu > power.dot
dot -Tsvg power.dot > power.svgThe generated graph would show:
- Decision nodes (circles) for intermediate states
- Result nodes (boxes) with assignments
- Edges labeled with conditions
- DAG structure showing path convergence
Metadata Completeness: The pseudocode contains everything needed for diagram generation:
- All names and their possible values (I, O)
- Complete control flow graph (labels and jumps)
- Decision conditions with explicit comparisons
- Complexity metric (D) for documentation
- Entry and exit points clearly marked
This allows domain experts to review decision logic visually while maintaining a single source of truth (the .dtc file).
The decision table optimization problem is NP-complete. Finding the optimal decision tree that minimizes worst-case evaluation depth while maximizing node sharing involves exploring an exponentially large search space.
Compilation time varies based on:
- Number of input names: Each additional name exponentially increases the search space
- Number of unique values per name: More values = more possible test orderings
Practical expectations:
- Simple tables (3-5 independent values): Milliseconds
- Medium tables (8-10 independent values): Seconds
- Complex tables (15+ independent values): Minutes or longer
Without optimization, a naive implementation might:
- Test every condition sequentially (maximum depth)
- Duplicate code at every branch point (exponential size)
With dtc optimization, the power.dtc example (10 input names, 20 values):
- Achieves depth 8 instead of depth 20 (60% reduction). (With the -q, quick, flag, the depth is 10.)
- Shares nodes via DAG structure (compact output)
Compilation time is paid once. The generated code runs in O(depth) time.
- This is expected for complex tables - the tool is doing hard optimization work
- Use the -q (quick) flag to stop on the first complete heuristic pass. The result is correct but, probably, not optimal.
- Either way, the output will be much better than hand-written nested if/else
- GNU Public License (GPL) program
- Like any compiler, the input decision tables and the output pseudocode belong to you