# Finding Minimal Unsatisfiable Subsets¶

Sometimes, when searching for solutions to constraint problems,
no solution exists to the problem and the solver returns `UNSAT`

.
However, this bare answer is not especially informative on its own.
We might like to have some idea *why* the problem is unsatisfiable,
or have some likely candidates for constrats that could be relaxed to
find satisfiable solutions.

One way to get more information out of unsatisfiable query results is
to compute *minimal unsatisfiable subsets* of the original query.
These are subsets of the assertions of the original query that are
already unsatisfiable; and which are minimal, in the sense that
removing any one assertion from the subset renders it satisfiable.
The value of minimal unsatisfiable subsets (MUSs) is that they can be
understood as an *explanation* for why the query is unsatisfiable. In
good cases, a MUS may be significantly smaller than the original
query, and thus provide significant value to a human operator trying
to understand the query failure. Alternately, in a multi-stage
constraint satisfaction setup, MUSs can be feed backward to prune
search space in a previous search phase.

To allow the computation of MUSs in Grackle, we have implemented
the CAMUS algorithm [1], which can be layered on top of any
SAT or SMT solver that supports incremental solving. The CAMUS
algorithm, invented by Liffiton and Sakallah [LS2008], it able to take
an arbitrary collection of statements, the conjunction of which is
unsatisfiable, and compute the set of *all* minimal unsatisfiable
subsets of that collection. If desired, it can also be used to
compute just one MUS, saving some work.

Compared to some techniques for computing unsatisfiable cores, the CAMUS method has the advantage that it works directly with the set of clauses as defined by the input, as opposed to, e.g., a set of clauses produced after conjunctive normal form (CNF) conversion. This means the unsatisfiable subsets returned are directly stated in terms of the inputs to the CAMUS algorithm. This makes it easy to map these unsatisfiable subsets back into semantically meaningful units, and also gives the caller significant control over the granularity of the unsatisfiable core search.

Note: In order to be feasible, the CAMUS algorithm requires incremental solver support. To make use of this ability, the CAMUS procedures in Grackle require the use of the “online” solver backend, which maintains a persistent connection to the Yices SMT solver. The solver backend may be selected from the command line, and must be set to the online backed, as the default is to use the noninteractive “offline” backend.

## Example¶

Consider the following script. It establishes a number of symbolic integer variables, and posits a number of constraints on them (chosen arbitrarily for this example).

```
x = symbolic('x', 'integer');
y = symbolic('y', 'integer');
z = symbolic('z', 'integer');
w = symbolic('w', 'integer');
% (x,y) is a distinct point from (z,w)
background = not ( x == z & y == w );
clauses = ...
[ x > z, y <= w, 10 < z, 15 > w, y == z, x + 4*w < 20, w == 10 ];
% Is the problem satisfiable?
[r s] = check_sat( background & all(clauses) );
if(r)
% If so print the solution
'Satisfiable'
s
else
% If not, find all the MUSs
'Unsatisfiable'
mus = find_all_mus( background, clauses )
% Verify that each MUS is unsatisfiable
for i = 1:size(mus,1)
r = check_sat( background & all(clauses(mus(i,:))) );
if(r)
'Oops! MUS is actually satisfiable:'
mus(i,:)
end
end
end
```

The constraints are broken into two parts: a collection of “background” constraints and a sequence of clauses. The CAMUS algorithm will treat statements in the background clause as fixed: it will consider only situations where the background constraints are all satisfied. The collection of statements in the “clauses” list should be considered as a conjunction of additional constraints. It is these clauses that CAMUS considers the non-divisible units when searching for unsatisfiable subsets.

This particular collection of constraints is unsatisfiable, as the first
call to `check_sat`

will tell us. Thus we call the `find_all_mus`

function to compute the complete set of all minimal unsatisfiable
subsets. For this particular constraint set, there happen to be three
distinct minimal subsets. The output of this example script is shown below.

```
$ grackle --backend=online camus_ex.m
ans =
Unsatisfiable
mus =
1 1 1 0 1 1 0
0 1 1 0 1 0 1
1 0 1 0 0 1 1
```

The output of `find_all_mus`

is a matrix of boolean values. Each row
in the matrix identifies a separate MUS. The elements in each row
indicate if the corresponding clause is in the subset. Thus, the
first MUS listed above consists of the following five clauses: `x > z`

,
`y <= w`

, `10 < z`

, `y == z`

and `x + 4*w < 20`

. It is
straightforward to use logical indexing to select this subset of
clauses as `clauses(mus(1,:))`

. The verification loop at the end of
the script does exactly this to verify that each reported MUS is
actually unsatisfiable.

If we wanted only a single MUS, we could instead call the `find_mus`

function. Moreover, if we want to compute more than one MUS, but not
necessarily all, we can call `find_all_mus`

with an upper bound; then
at most that many MUSs will be computed. For example,
`find_all_mus(background, clauses, 100 )`

will compute at most 100 MUSs.

## Notes on performance¶

The time required to compute MUSs can be highly unpredictable, but in the worst cases, it is very expensive. The algorithm works in two steps: in the first, the solver is repeatedly queried to discover a collection of maximally satisfiable sets; and in the second phase, this collection of maximally satisfiable subsets is examined to discover minimal unsatisfiable subsets. Both phases of this algorithm can take exponential time, in the worst case. Asking for all MUSs, in particular, can result in a number of answers exponential in the number of clauses. If the initial query takes an appreciable amount of time, calculating MUSs using CAMUS is unlikely to yield satisfactory results, as it involves repeatedly querying the solver with problems of approximately the same difficulty as the initial problem. Calculating a single MUS can be significantly cheaper than calculating all, or even a bounded number of MUSs.

To mitigate this problem, it can help to limit the number of clauses submitted to CAMUS, either by incorporating more fact into the background clause, or by combining clauses together into coarser (and thus, fewer) clauses.

[1] | CAMUS is simultaneously a reference to the French author Albert Camus and also an acronym for Compute All Minimal Unsatisfiable Subsets. |

[LS2008] | “Algorithms for Computing All Minimal Unsatisfiable
Subsets.” Mark H. Liffiton and Karem A. Sakallah.
Journal of Automated Reasoning. 2008. |