# SAT Encoding: Depth First Traversal of Directed Graph

In this post, I will describe how to encode the problem of depth-first traversal (search) (DFT/DFS) of a directed graph as a SAT problem. It will follow the same flow of the post describing how to encode breadth-first traversal (search) (BFT/BFS) of a directed graph as a SAT problem.

# Problem

Given a directed graph and a node n in the graph, **depth first traversal** of the graph visits all yet unvisited nodes reachable from a child node of a node before visiting another child node of the node.

Starting the traversal at node A in the above graph, we will visit nodes the order of B, D, and E or B, E, and D before visiting C and the yet unvisited nodes reachable from C. If the traversal started at node C and visited F before D, then we would visit nodes F, D, E, A, and B in order.

# Constraints

The input is a graph with N nodes and E directed edges. The output will be a sequence of N nodes as visited in a depth-first traversal of the graph.

- Each node should be visited exactly once. In other words,
*each node should occur in the output sequence exactly one*. - Only one node can be visited in each step during the traversal. Since there will be N steps (positions in the output) and due to constraint 1, a step will visit at least one node. So, we only need to consider
*each step visits at most one node*. - After visiting node m at step g, if one of m’s successor node n is visited at step h>g, then all nodes visited from g thru h (excluding g) should be
*unvisited when m was visited*and*reachable from m*.

If a node reachable from m was visited before visiting m, then it will not be visited after visiting m due to constraint 1; hence, we need to consider only the second part of the constraint.

Suppose q is visited in step j after visiting m and before visiting n, i.e., g<j<h. If q is reachable from m, then there should be an edge (p, q) such that node p is reachable from m and is visited at step i after visiting m and before visiting q, i.e., g≤i<j. Since p is reachable from m, this observation should also hold for p. Hence, for all nodes visited in steps i from g+1 thru h.

In other words,*for each edge (m, n), if m is visited at step g, n visited at step h, and q at step g<j<h, then there exists an edge (p, q) such that p occurs at step g≤i<j.*

# Encoding

Use one variable vNxPy to represent the occurrence of node Nx at step Py. Since we have N nodes and N steps, we will have N*N variables.

Using the common constraint patterns, we can mechanically translate the constraints into CNF formulae as follows.

- Use complex pattern X7 to encode constraint 1.
- Use complex pattern X6 to encode constraint 2.
- Use simple pattern S6 to encode
*for every edge (m, n, q) and step g<j<h,*part of constraint 3. Use simple pattern S3 to encode*if m occurs at step g, n occurs at step h, and q at step g<j<h, then*part of the constraint. Use simple pattern S5 to encode*there exists an edge (p, q) such that p occurs at step g≤i<j.*

# Solving

Now, we combine the formulae from encoding the constraints into one CNF formula and feed it to a SAT solver. If the solver provides a model, then we interpret *each variable vNxPy that is true in the model as visiting node x at step y. *That’s it :)

Here’s a Groovy script that embodies this encoding and uses Z3 solver to solve the problem.

# For You To Do

Explore the size of the formula generated by the above solution on graphs of different sizes.