Skip to content

A tale of three sudoku solvers #84

@raphlinus

Description

@raphlinus

A fictional story to illuminate an aspect of quality in software engineering, told as a sequence of three sudoku solvers. The two main points I'm trying to get across are: first, deep understanding of the problem domain is valuable; second, defining appropriate abstraction boundaries is one of the most important tasks of software engineering.

The narrative frame is a startup whose offering is UGC sudoku. Users create sudoku, the server validates it before publishing, and the solution is also provided as a "hint" feature. Thus, the server needs to implement sudoku solving.

The first is based on the attempt by Ron Jeffries to bring a sudoku solver into existence through TDD. In the story, instead of giving up, the intrepid software engineer plugs along, taking inspiration from guides intended for humans, and ultimately gets the solver working. It's all special cases and uses heuristics to schedule the various pattern-matching stages, but works most of the time.

There are problems. Internet trolls have discovered they can DoS the site by uploading very difficult puzzles, and the SREs implement rate limiting and timeouts to mitigate that. The most devoted users have a taste for puzzles at a higher difficulty level, so they are unhappy that so many of their valid puzzles are being rejected.

Meanwhile, as part of narrative color, the charismatic CEO of the startup, courting investors, assures them that their moat is their proprietary sudoku solver, it's an incredibly difficult problem but they have a genius working on it. They get funding and are able to expand, causing the scaling problems described below.

The engineer discovers Peter Norvig's AI book and realizes there is a better solution. They advocate for a total rewrite. This is very controversial because, as everyone knows, you should never do a total rewrite, you should always incrementally improve the existing, working code. But the situation is becoming untenable and the engineer eventually gets the green light to proceed.

The second, then is the Peter Norvig solver. It's a good solver and works well for a while. As the company scales, there are issues. There's demand for other puzzle types. These get added, but the code becomes increasingly convoluted, and the fundamental problem is that only one person really understands it. I might add another section about unique solutions being a criterion for valid sudoku, but that might be going into the weeds.

Our hero spends a batch at Recurse center and encounters a long-haired programmer who casually mentions that Sudoku solving can be reduced to a SAT problem, and bangs out a 15 line Haskell program[TODO: I'd love people reading this outline, or the blog when it's published, to code golf that, so we can link to an actual solution. It doesn't have to be Haskell specifically, but I picked that to emphasize the research flavor of the solution] to spit out DIMACS CNF for a Sudoku instance, which can then be solved by any of the command line solvers including SAT competition winners. Hero returns to work and, against even fiercer resistance, advocates another from-scratch rewrite.

And life is good. The reduction of sudoku into a SAT problem is straightforward code that can be maintained by junior engineers, readily accommodating new sudoku variants etc. Different SAT solvers can be employed depending on engineering requirements: a simple DPLL one when total code size and understandability is desired, scaling to the fancy competition winners. The simple DPLL one can be maintained in house. By contrast, doing significant maintenance on the fancy ones is beyond the capabilities of most engineers (it's an incredibly specialized area), but taking it on as an open source dependency is completely viable.

SAT solving is an abstraction boundary, and a good one. It's not particularly leaky (a common problem of many abstractions, don't even get me started on GPU infrastructure). And it makes sense as an abstraction because you swap out the module on either side of the abstraction boundary.

Another point to illustrate is the difference between programming and software engineering in the sense of the Titus Winters talk. Norvig's solution is programming - something that makes sense for one person to develop, over a short period of time. If it is submitted as homework for a class in AI search techniques, it gets an A+. But it is not easily maintainable by multiple people, and does not easily scale to all the variants etc. The SAT solver solution has more code and more total complexity, but is much better at being software engineering.

The above is a fictional story, and the real world will differ from it in important ways. For one, sudoku solving is an incredibly well understood problem and there is lots written about it; it's literally a textbook problem. That is not the case for much of the stuff I'm working on now (2D graphics and UI infrastructure). But finding good abstraction boundaries is still important and makes the difference between bad, good, and excellent solutions. That is a major part of the work we're doing on the Xilem project.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions