Skip to content

nonogram_tutorial

Manlio Morini edited this page Jun 17, 2024 · 9 revisions

Nonograms puzzle

Solving nonograms via genetic algorithms

Nonograms, also known as Japanese puzzles, are logic puzzles that are sold by many news paper vendors. The challenge is to fill a grid with black and white pixels in such a way that a given description for each row and column, indicating the lengths of consecutive segments of black pixels, is adhered to. Although the Nonograms in puzzle books can usually be solved by hand, the general problem of solving Nonograms is NP-hard.


Introduction

Each row and column of a rectangular grid is annotated with the lengths of its distinct runs of occupied cells (clues). Using only these lengths you should find one valid configuration of empty and occupied cells, or show a failure message.

Example

Problem:                 Solution:

. . . . . . . .  3       . # # # . . . .  3
. . . . . . . .  2 1     # # . # . . . .  2 1
. . . . . . . .  3 2     . # # # . . # #  3 2
. . . . . . . .  2 2     . . # # . . # #  2 2
. . . . . . . .  6       . . # # # # # #  6
. . . . . . . .  1 5     # . # # # # # .  1 5
. . . . . . . .  6       # # # # # # . .  6
. . . . . . . .  1       . . . . # . . .  1
. . . . . . . .  2       . . . # # . . .  2
1 3 1 7 5 3 4 3          1 3 1 7 5 3 4 3
2 1 5 1                  2 1 5 1

Nonograms can be considered as a generalization of a well-known problem in Discrete Tomography: reconstructing hv-convex sets (where the black pixels in each row and column must be consecutive). For this Discrete Tomography problem, the description for each line consists of a single number, indicating the length of the segment of black pixels along that line. The problem of reconstructing hv-convex polyominoes can be solved in polynomial time, whereas the reconstruction problem for general hv-convex sets is NP-hard. Therefore, the reconstruction problem for Nonograms is also NP-hard (and, clearly, NP-complete).

The Nonogram problem can also be related to several job scheduling problems, where each row corresponds to a single processor and the jobs for the processors are indicated by the row descriptions. In such scheduling problems, the type of constraints that occur in Nonograms only apply to the rows, or the columns, but not both.


There can be considerable differences in the difficulty level of Nonograms. On the one hand, the Nonograms that appear in newspapers can typically be solved by applying a series of simple logical rules, each of which considers only a single horizontal or vertical line. These simple puzzles will always have a unique solution. On the other hand, large random puzzles can be very difficult to solve, even using a computer, and may have many different solutions. Clearly, the fact that solving Nonograms is NP-hard indicates that not all puzzles can be solved using simple logic reasoning.

Start coding

Problem:

. . . . . . . .  3
. . . . . . . .  2 1
. . . . . . . .  3 2
. . . . . . . .  2 2
. . . . . . . .  6
. . . . . . . .  1 5
. . . . . . . .  6
. . . . . . . .  1
. . . . . . . .  2
1 3 1 7 5 3 4 3
2 1 5 1

The problem above can be represented by two lists of lists:

row_clues = [[3], [2,1], [3,2], [2,2], [6], [1,5], [6], [1], [2]]
col_clues = [[1,2], [3,1], [1,5], [7,1], [5], [3], [4], [3]]

or, using C++ syntax:

std::vector<std::vector<unsigned>> row_clues =
{
  {3}, {2,1}, {3,2}, {2,2}, {6}, {1,5}, {6}, {1}, {2}
};
std::vector<std::vector<unsigned>> col_clues =
{
  {1,2}, {3,1}, {1,5}, {7,1}, {5}, {3}, {4}, {3}
};

These two data structures are the base of the nonogram_problem struct (which also offers various useful functions).

Genome

A candidate solution (i.e. an individual of the GAs population) can be encoded as a sequence of integers.

Consider the following problem:

. # # # . . . .  3
# # . # . . . .  2 1
. # # # . . # #  3 2
. . # # . . # #  2 2
. . # # # # # #  6
# . # # # # # .  1 5
# # # # # # . .  6
. . . . # . . .  1
. . . # # . . .  2
1 3 1 7 5 3 4 3
2 1 5 1

the pattern can be encoded (column-order) as:

COLUMN (zero-based indexing)
| 0  |  1  |  2  |  3  | 4| 5| 6| 7|

{1, 2, 0, 2, 0, 0, 0, 0, 4, 4, 2, 2}

 ^  ^                          ^  ^
 |  |                          |  +-- last block (size 3)
 |  |                          +----- second to last block (size 4)
 .  .                          .  .
 .  .                          .  .
 .  .                          .  .
 |  +-------------------------------- second block (size 2)
 +----------------------------------- first block (size 1)

where each number represents the distance of a block from the first allowed placing position. In total there are 12 blocks on 8 columns.

The encoding by column is an arbitrary choice (adopted in the source code). The same pattern could be represented (row-order) as:

{1, 0, 0, 1, 1, 2, 1, 2, 0, 0, 0, 4, 3}

This time there are 13 blocks on 9 rows.

Depending on the problem structure (i.e. number and type of the clues) one approach could be better than the other (from a performance point of view).

Anyway it's always possible to:

  1. swap row and column clues
  2. solve the resulting problem
  3. transpose the solution matrix to get the solution of the original problem

In C++ we can write:

nonogram_problem np;

// ...

// A candidate solution is a sequence of `np.blocks()` integers in the
// `[0, np.rows()[` interval.
ga_problem prob(np.blocks(), {0, np.rows()});

Note that the representation allows invalid individuals. Consider the following column:

.
.
.
.
1
2

The only solution is {0,0}:

#
.
#
#
1
2

but our representation also allows {0,1}, {0,2}, {0,3}, ... {3,2}, {3,3}.

To "keep it simple" invalid integers can be mapped to a valid range (e.g. via the modulo operator) thus creating a separation between the genotype (the list of integers) and the phenotype (the real position of the blocks).

This is very similar to the Grammatical Evolution scheme in which the objects the search algorithm operates on and what the fitness evaluation function interprets are distinct.

Decoding the genotype

The nonogram_problem::board function decodes an individual producing a filled true/false matrix:

vita::matrix<bool> board(const vita::i_ga &x) const

For each block it calculates:

  • the first allowed starting position (start) which depends on the already placed blocks;
  • the number of allowed positions (allowed) considering the already placed and the remaining blocks;
  • where to place the current block (placed): placed = start + encoded_distance % allowed.

A simple example should help. Given the column

.
.
.
.
.
.
.
.
.
1
2
2

and the {1,7,6} genome, the algorithm starts considering the position of the first block (1), supposing the remaining blocks placed "as far as possible":

start   = 0
allowed = 3
placed  = 1  // start + 1 % allowed
.  <-- start, allowed position
.  <-- allowed position
.  <-- allowed position
.
#
#
.
#
#
1
2
2

The first block is placed on the second cell:

.
#
.
.
.
.
.
.
.
1
2
2

Now we have:

start   = 3
allowed = 3
placed  = 4  // start + 7 % 3
.
#
.
.  <-- start, allowed position
.  <-- allowed position
.  <-- allowed position
.
#
#
1
2
2

The second block is placed on the fourth cell:

.
#
.
.
#
#
.
.
.
1
2
2

Note how the % operator is used to fix / map the genotype to a correct phenotype. For the last block:

start   = 7
allowed = 1
placed  = 7  // 7 + 6 % 1
.
#
.
.
#
#
.
.  <-- start, allowed position
.
1
2
2

and the final placing is:

.
#
.
.
#
#
.
#
#
1
2
2

Fitness function

The individuals of the population always "automatically" satisfy the column clues but they have different "performance" regarding the row clues.

Consider this puzzle:

Solution:                Candidate solution:
                         {1,2,0,2,0,0,0,0,4,4,2,0}

. # # # . . . .  3       . # # # . . . #  3 1    <-- ERROR
# # . # . . . .  2 1     # # . # . . . #  2 1 1  <-- ERROR
. # # # . . # #  3 2     . # # # . . # #  3 2
. . # # . . # #  2 2     . . # # . . # .  2 1    <-- ERROR
. . # # # # # #  6       . . # # # # # .  5      <-- ERROR
# . # # # # # .  1 5     # . # # # # # .  1 5
# # # # # # . .  6       # # # # # # . .  6
. . . . # . . .  1       . . . . # . . .  1
. . . # # . . .  2       . . . # # . . .  2
1 3 1 7 5 3 4 3          1 3 1 7 5 3 4 3
2 1 5 1                  2 1 5 1

the block on the last column is misplaced:

ROW  CORRECT  WRONG     DELTA
0     [3]     [3,1]     |3-3| + |0-1| = 1
1     [2,1]   [2,1,1]   |2-2| + |1-1| + |0-1| = 1
3     [2,2]   [2,1]     |2-2| + |2-1| = 1
4     [6]     [5]       |6-5| = 1

FITNESS = -sum(delta_i) = -4

The fitness function computes the distance between the correct and wrong sequences, matching elements by order and taking their absolute difference.

Since the framework tries to maximize the fitness, we have to take the opposite of the sum.

Source code

  • nonogram01.cc deal with a basic problem;
  • nonogram02.cc contains a more complex problem. Try to swap column / row clues to discover the performance difference.

Conclusion

GAs tend to produce good but sub-optimal solutions and this behaviour is acceptable only for some combinatorial optimization problems.

In this case we have a good enough GA representation that correctly detects biases in low-order schemas and, combining information, eventually converges.

There are further improvements and other (probably better?) approaches (e.g. see the this question on Stackoverflow and the CSP appendix).


Appendix - Nonograms as CSPs

Nonograms admit many possible representations as CSPs, though framing all of the important logic as constraints can be difficult. What follows are several suggestions for possible CSP formulations.

As shown in the following figure

A 10-cell row with 3 segment clues

the segment sizes for a row (or column) provide the basis for variables, domains and constraints. In this row of size 10, there are three segments (A, B and C) of sizes 2, 1 and 3, respectively – in that order going left to right. Adding in the mandatory blank(s) between each segment, the distance (in terms of the total number of tiles) from the start of A to the end of B is a minimum of 8 (2 + 1 + 1 + 1 + 3). Hence, A must begin in column 2 or earlier, and C must end in column 7 or later (using zero-based indexing).

This type of analysis hints at one potential CSP formulation:

  • variables are the start cells of a row (and column) segments;
  • domains are the feasible values for those variables, with some simple arithmetic providing very helpful initial restrictions: very few variables will have domains containing all integers from 0 to row / column size minus 1;
  • constraints embody relationships between adjacent segments, capturing the fact that segment S+1 cannot start until at least L+1 spots after segment S begins, where L is the length of segment S.

By creating variables, domains and constraints for each set of row and column segments, you can capture a good deal of the vital information in a nonogram in the explicit components of a CSP. Unfortunately, this doesn't capture all of the essential information. The interactions between row segments and column segments also require constraints. There are probably several ways to do this. One is to include one constraint per cell that embodies the following relationship:

If any row segment intersects this cell, then a column segment must also intersect it; and, if any column segment intersects it, then a row segment must also intersect it.

Using an Aggregate Representation

Another approach to solving nonograms involves variables and constraints that capture a higher level of representation. Instead of looking at segments and how they interact, this one considers whole rows and columns and their relationships.

Returning to the previous figure, note that, when given a set of segment sizes, we can easily calculate all possible linear patterns that fill the row and satisfy the segment specification. Similarly, all viable patterns for any column can also be calculated. This extra little bit of pre-processing has huge benefits.

If each row and column is a variable in a CSP, then these sets of viable patterns constitute their domains. Now all we need are some constraints.

When each variable is an entire row or column, and the domains are all possible fill patterns of the row or column (that are compatible with the segment clues), then the shared cell between a row and column provides the basis for domain filtering.

The above figure hints at the origins of some useful constraints. Every pairing of a row and a column will have one cell in common, and the row and column patterns must agree on the value of that cell: filled or unfilled. It also shows how the possible patterns of the row variable are filtered against those of the column variable. This leads to a 60% reduction in the row patterns due to the simple fact that in ALL of the 6 column patterns, the shared cell (C) is filled. So any row pattern with an unfilled cell C can be removed. Note that the opposite filtering (of the column patterns based on the row patterns) would fail to constrict the column patterns, since at least one row pattern fills the shared cell; and thus, all column patterns are consistent with at least one row pattern.

However, this approach can produce very large domains for certain variables (in certain puzzles), and this, in turn, can lead to computational bottlenecks.

References

  • A Reasoning Framework for Solving Nonograms by K.Joost Batenburg and Walter A. Kosters.