# SAT Encoding: Solving Simpler Sudoku

In this post, I will describe how to encode a simpler version of Sudoku game as a SAT problem.

# Problem

The simpler version of Sudoku is played on a NxN board (of course, N>0). The goal of the game is to place N copies of numbers 1 thru N on this board satisfying the following constraints in conjunction with constraints stemming from numbers already placed on the board, e.g., cell (1,2) contains 4.

- Each cell contains exactly one copy of any number.
- Each row contains every number exactly once, i.e., there are no duplicate copies of a number in a row.
- Each column contains every number exactly once, i.e., there are no duplicate copies of a number in a column.

A solution to a given board will say which copy of a number should be placed in which cell.

# Constraints

As we observed in the introduction to SAT encoding, we can use the common constraint patterns to mechanically translate the constraints into CNF formula. To do so, we need to simplify the above constraints.

We can rewrite the above constraints as follows.

- Each cell contains at least one copy of any number.
- Each cell contains at most one copy of any number.
- Each row contains every number at least once.
- Each row contains every number at most once.
- Each column contains every number at least once.
- Each column contains every number at most once.

# Identifying and Eliminating Redundant Constraints

Since each row has N cells and each cell should have exactly one copy of any number (constraint 1 and 2), we know a row will have exactly N copies, but these N copies can be of the same number. By constraint 3, each row should have at least one copy of every number. So, since there are N numbers and a row will have exactly N copies, each row will have exactly one copy of every number. Hence, constraints 1, 2, and 3 imply constraint 4. In other words, constraint 4 is redundant in the presence of constraint 1, 2, and 3.

Likewise, we can prove constraint 6 is redundant in the presence of constraints 1, 2, and 5.

So, we end up with the following reduced set of constraints.

- Each cell contains at least one copy of any number.
- Each cell contains at most one copy of any number.
- Each row contains every number at least once.
- Each column contains every number at least once.

Now, we are ready to encode the problem.

# Encoding

We will use one variable vRxCyIn to represent the presence of number n in cell (Rx, Cy). Since we have NxN number of cells that can contain any of the N unique numbers, we will have NxNxN number of variables, i.e., vRxCyIn with x, y, and n ranging from 1 thru N.

Using the common constraint patterns, we can mechanically translate the problem into CNF formulae as follows.

- Use complex pattern X5 to encode constraint 1. For each cell (x, y), generate the clause
`(vRxCxI1|vRxCxI2|...|vRxCxIN)`

. - Use complex pattern X6 to encode constraint 2. For each cell (x, y), generate the CNF formula

`(!vRxCyI1|!vRxCxI2) & (!vRxCyI1|!vRxCyI3) & ... & (!vRxCyI1|!vRxCyIN) &`

(!vRxCyI2|!vRxCyI1) & (!vRxCyI2|!vRxCyI3) & ... & (!vRxCyI1|!vRxCyIN) &

... &

(!vRxCyIN|!vRxCyI1) & (!vRxCyIN|!vRxCyI2) & ... & (!vRxCyIN|!vRxCyIM) [where M=N-1]

- Use complex pattern X5 to encode constraint 3. For each row x and number z, generate the clause
`(vRxC1Iz|vRxC2Iz|...|vRxCNIz)`

. - Use complex pattern X5 to encode constraint 4. For each column y and number z, generate the clause
`(vR1CyIz|vR2CyIz|...|vRNCyIz)`

.

As for the constraints stemming from numbers already placed on the board, generate one clause *vRxCyIz* for each number z placed in cell (x,y).

## Eliminating Redundancy

Observe that a simple translation/encoding of constraint 2 will generated redundant clauses. This redundancy can be eliminated by generating only the sub-clauses `(!vRxCyIp|!vRxCyIq)`

where q>p as the sub-clauses where q<p will be generated when generating the sub-formula for number q. (This elimination is implemented on line 50 in the code snippet.)

# Solving

Now, we combine the formulae/clauses from the encoding steps into one CNF formula and feed to a SAT solver. If the solver provides a model, then we interpret *each variable vRxCyIz that is true in the model as placing number z in cell (x,y). *That’s it :)

Here’s a Groovy script that embodies this encoding and uses Z3 solver to solve the problem.

# For You To Do

If you are interested in getting your hands dirty, then try extending the encoding to solve the usual version of Sudoku, i.e., one copy of a number within each block.

# What next?

Next time, I will describe how to encode breadth first traversal of a directed graph as a SAT problem.