Symbolic ExecutionΒΆ
Grackle executes program using symbolic execution. Symbolic execution, like
ordinary execution, executes the program one step at a time, but with two
differences: (1) As discussed in the previous section, the variables range over
symbolic values rather than ordinary expressions; and
(2) when conditional branches are encountered, such as if
statements or
while
loops, the simulator will need to consider multiple execution paths
if the branch may be either true or false.
As an example, consider the code below:
x = symbolic('x', 'real');
target = symbolic('target', 'real');
diff = 3;
if (x > target)
y = x - diff;
else
y = x + diff;
end
[r,c] = check_sat(abs(x-target) > diff & abs(y-target) > abs(x-target))
In this case, the condition x > target
can be either true or false depending
on the values assigned to x
and target
. Accordingly, Grackle will
execute the script up to the if
statement, and then execute both execution
paths over the state prior to the if
statement. When both paths reach the
end of their respective paths in the if
statement, Grackle will merge
the two execution paths into a single state. In this case, after the merge, the
assignment will map y
to the expression if (x > target) (x - 3) (x + 3), which
denotes the value x - 3 if x > target and x + 3 otherwise.
The post-dominator based merging stategy is particularly important for programs with loops. For example, in the following program, Grackle will execute 20 paths without merging, but would need to consider 2:sup:10 execution paths if it did not do path merging.
p = symbolic('x', 10, 1, 'logical');
true_count = 0;
false_count = 0;
for i = 1:10
if (p(i))
true_count = true_count + 1;
else
false_count = false_count + 1;
end
end
In this case, the condition p(i)
is indeterminate, and so each time it is
encountered, two execution paths are formed. Due to our use of path-based
merging however, a single execution path is formed after execution completes.
Internally, our path-based merging strategy operates by translating the source
program into a control flow graph. For each branch, Grackle computes the
immediate post-dominator of that branch, and uses that location as the merge
point. This approach allows Grackle to easily deal with a variety of source
language constructs, including not just if
statements, but also while
and for
loops, even in the presence of continue
and break
statements. In this encoding each loop will effectively introduce two potential
merge points: the first is at the entry of the loop, and used when continue
is called or the loop body terminates. The second is at the loop exit, and is
targetted by break
.
One significant limitation of our approach is that the symbolic simulator does not effectively deal with loops that may execute an unbounded number of times. In this case, the post-dominator-based merging strategy will get stuck executing the loop body. To deal with this and other programs where merging is inappropriate, we have introduced a built-in function symbolic_fork which allows one to create a path that bypasses merging.