# SAT Encoding: Breadth First Traversal of Directed Graph

In this post, I will describe how to encode breadth-first traversal (search) (BFT/BFS) of a directed graph as a SAT problem. It will follow the same flow of the post describing how to solve a simple version of Sudoku by encoding it as a SAT problem.

# Problem

Starting the traversal from node A at level 0 in the above graph, we will visit nodes B and C at level 1 and nodes D, E, and F at level 2 (as denoted by the bracket notation in the diagram). In the same graph, starting the traversal from node B (level 0), we will visit nodes D and E at level 1, node A at level 2, node C at level 3, and node F at level 4.

## How is this related to BFS?

If you are thinking that both these orderings would not be generated by the queue-based BFS algorithm, then you are right. The reason for this is the design choices in the queue-based algorithm do not consider all possible orderings.

# Constraints

1. Since there are N nodes, there are at most N levels (in case of path graph).
2. Since each node should be visited exactly once, each node occurs at exactly one level.
3. When a node m is visited at level g, there are three possibilities for each successor n of m: a) n has already been visited earlier as it has a predecessor visited at level h < g, b) n will be visited at level g as it has a predecessor visited at level g-1, or c) n will be visited at level g+1 as m is its first predecessor to be visited.
In other words, for every edge (m, n), if node m occurs at level g, then n does not occur at level h > g+1.
4. Consider a mutation G of above graph without the edge (E, A). In graph G, if we start the traversal at node B. The traversal will end after visiting nodes D and E. To ensure all nodes are visited, the traversal should be “restarted” from an unvisited node. Such restarts can be admitted by associating the next level with the restart node. [Note: This association does not agree with the definition of level stated earlier but it can be retrofitted by subtracting the level of the restart node from the level of the visited nodes from the restart node.] So, in the mutation graph G, (B)(DE)(A)(C)(F) should be admitted as a valid BFT with the restart happening at A at level 3 while the traversal (B)(DEA)(C )(F) should be rejected. Based on the difference between these traversals, we observe a restart node should be the only node in its level.
Also, constraints 1–3 can be trivially satisfied by assigning the same level to all nodes. We want to prohibit such assignments by requiring nodes at a level have predecessors in the previous level.
Both these constraints can be captured as, for every pair of distinct nodes m and n occurring at level g, there exist two edges (p, m) and (q, n) such that p and q occur at level g-1.
5. We want the levels to start from 1. In other words, at least one node occurs at level 1.
6. There should be no gaps in level numbers, i.e., we do not want node A at level 1 and nodes B and C at level 3. In other words,
a) if there exists a node at level g, then there exists a node at level g-1, and
b) if there is no node at level g, then there are no nodes at level g+1.

# Encoding

Using the common constraint patterns, we can mechanically translate the constraints into CNF formulae as follows.

• Enforce constraint 1 by controlling the number of variables associated with levels.
• Use complex pattern X7 to encode constraint 2.
• Use simple pattern S6 to encode for every edge (m, n) part of constraint 3. For each edge, use compound pattern C6 to encode if node m occurs at level g, then n does not occur at level h > g+1 as it is equivalent to if node m occurs at level g, then, for all h > g+1, n does not occur at level h.
• Use simple pattern S6 to encode for every pair of distinct nodes m and n occurring at level g part of constraint 4. Use simple pattern S3 to encode there exist edges (p, m) and (q, n) part. Use Tseitin transformation to encode p and q occur at level g-1 part.
• Use simple pattern S5 to encode constraint 5.
• Use simple pattern S6 to encode for level g part and C5 to encode if ..., then there exists a node part of constraint 6a. Use simple pattern S6 to encode for level g part and C6 to encode if …, then there are not nodes part of constraint 6b (as there are no nodes can be state as for all nodes).

# Solving

Here’s a Groovy script that embodies this encoding and uses Z3 solver to solve the problem.