CS195Y Lecture 17

3/9/16

Overview of Formal Logic:

If V is a set of variables then a literal is either a formula v in V or a formula (not v) for v in V
A formula is in CNF (conjunctive normal form) if it is a set of clauses, all connected with and
A clause is a set of literals all connected by or, i.e. a “disjunction of literals”
A unit clause is a clause with exactly one literal
The empty clause is the clause with no literals
Implies is defined as: F1 --> F2 = (not F1) or F2

Let V be a set of variables. Formulas F are built by the grammar:
F ::= v \in V
    | not F
    | F and F
    | F or F

De Morgan’s Laws and Distributivity:
not (F1 and F2) = (not F1) or (not F2)
not (F1 or F2) = (not F1) and (not F2)
F1 or (F2 and F3) = (F1 or F2) and (F1 or F3)
F1 and (F2 or F3) = (F1 and F2) or (F1 and F3)

Q: What is the difference between equality and equivalence?
A: It depends. For instance, if we want to talk about algebraic formulas, we might say 5 and 7-2 are equivalent, though they are syntactically different.

We can determine equivalence between boolean formulas using truth tables:

	not (F1 and F2)	(not F1) or (not F2)
F1 = T, F2 = T	F	F
F1 = T, F2 = F	T	T
F1 = F, F2 = T	T	T
F1 = F, F2 = F	T	T
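
The same truth-table check can be done mechanically. Here is a minimal Python sketch; the helper name `equivalent` is made up for illustration, and formulas are modeled as ordinary Python functions of booleans (an assumption of this sketch, not something from lecture).

```python
from itertools import product

def equivalent(f, g, num_vars):
    """Return True if f and g agree on every row of the truth table."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=num_vars))

# De Morgan: not (F1 and F2) versus (not F1) or (not F2)
lhs = lambda f1, f2: not (f1 and f2)
rhs = lambda f1, f2: (not f1) or (not f2)
demorgan_holds = equivalent(lhs, rhs, 2)

# For contrast, "and" and "or" disagree on some rows.
and_vs_or = equivalent(lambda a, b: a and b, lambda a, b: a or b, 2)
```

Note that the table has 2^n rows for n variables, so this only scales to small formulas.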

What’s a model? If you ask a logician or philosopher, they might say an instance. If you ask an Alloy programmer, they might say an Alloy module. Takeaway: words like “model” are used in different (and contradictory) ways.

The empty clause is False.
The empty CNF is True.
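
One way to see why these conventions are natural: a clause is an "or" over its literals, and a CNF is an "and" over its clauses. Python's built-in `any` and `all` follow the same conventions on empty inputs, as this small sketch shows (the variable names are just for illustration).

```python
# A clause is true if ANY of its literals is true.
# With no literals, nothing can make it true.
empty_clause_value = any([])

# A CNF is true if ALL of its clauses are true.
# With no clauses, nothing can make it false.
empty_cnf_value = all([])
```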

SAT Solver:
Input: Boolean CNF
Output: “True! + Instance” or “False!”
(most SAT solvers give more than just “false”, they provide some information about what makes the instance unsatisfiable)
Notation: We will leave “and” and “or” implicit, and represent variables as numbers. A negative number stands for the negated variable.
Ex: [3] [-3,-2] [2,-1] represents p3 and (not p3 or not p2) and (p2 or not p1)
Q: Is this formula satisfiable?
A: Satisfiable! The assignment 3, -2, -1 satisfies the formula.
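
Checking a proposed assignment against a formula in this notation is straightforward. The sketch below assumes a CNF is a Python list of clauses, each clause a list of integer literals, and an assignment is the set of literals taken to be true; the function name `satisfies` is hypothetical.

```python
def satisfies(assignment, cnf):
    """assignment: set of literals taken to be true, e.g. {3, -2, -1}.
    cnf: list of clauses, each a list of integer literals.
    Every clause must contain at least one true literal."""
    return all(any(lit in assignment for lit in clause) for clause in cnf)

cnf = [[3], [-3, -2], [2, -1]]
ok = satisfies({3, -2, -1}, cnf)   # the assignment from the answer above
bad = satisfies({3, 2, 1}, cnf)    # fails the clause [-3, -2]
```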
Q: How did we construct this assignment?
A: First, we know that 3 has to be True (because it is a unit clause). Then we know that -3 is not true, so -2 has to be True to satisfy the second clause. Finally, we know that 2 will not be true, so -1 has to be True to satisfy the last clause.
Q: What if there are no unit clauses to start off the process?
A: Then you have to guess an assignment to a variable and expand from there.
Q: Is there a way to make an educated guess?
A: Not really, because finding the optimal guess is harder than the satisfiability problem itself. But we’ll come back to this.

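The formula we had in our example is a “2-SAT” problem, which means that each clause has at most two literals. It turns out that there is a polynomial-time algorithm to find a satisfying instance in this case.
Draw a directed graph of implications. Each clause [a, b] contributes the edges -a -> b and -b -> a (if one literal is false, the other must be true). For our example, the two-literal clauses give the following edges: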

1 -> 2
3 -> -2
-2 -> -1
2 -> -3

Then, starting from 3 (which the unit clause forces to be true), you can follow the edges to construct the instance 3, -2, -1.
What if we had the following graph:

1 -> 2
-2 -> -3
-3 -> 1
-1 -> 2
2 -> -1
2 -> 1

Then 1 implies -1 and -1 implies 1, so this formula is unsatisfiable.
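
The graph check can be sketched in Python. This is a rough illustration, not the algorithm as presented in lecture: all function names are hypothetical, it assumes every clause has one or two literals, and it treats a unit clause [a] as [a, a]. It uses one graph search per variable, which is polynomial; faster versions based on strongly connected components exist.

```python
def implication_graph(cnf):
    """Clause [a, b] yields edges -a -> b and -b -> a.
    A unit clause [a] is treated as [a, a], giving -a -> a."""
    edges = {}
    for clause in cnf:
        a, b = clause[0], clause[-1]
        edges.setdefault(-a, set()).add(b)
        edges.setdefault(-b, set()).add(a)
    return edges

def reachable(edges, start, goal):
    """Depth-first search: is there a path from start to goal?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False

def two_sat_satisfiable(cnf):
    """Unsatisfiable exactly when some v implies -v and -v implies v."""
    edges = implication_graph(cnf)
    variables = {abs(lit) for clause in cnf for lit in clause}
    return not any(reachable(edges, v, -v) and reachable(edges, -v, v)
                   for v in variables)

sat_example = two_sat_satisfiable([[3], [-3, -2], [2, -1]])
unsat_example = two_sat_satisfiable([[1, 2], [1, -2], [-1, 2], [-1, -2]])
```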
So this works for formulas with two literals in each clause.
But if we have three literals in each clause, like [1,2,3], then you end up with exponential branching, which makes the problem much harder to solve.
Every known algorithm for 3-SAT has worst-case exponential runtime. The problem is in the class NP (in fact it is NP-complete), which you can learn more about in CS51.
If you can figure out how to solve 3-SAT in polynomial time, there’s a $1 million prize (and possibly an interrogation room at the CIA) waiting for you!
However, just because the problem is NP-complete doesn’t mean it’s not worth trying. For example, compilers rely on solving problems that are NP-complete.
Research continues to make progress on optimizing SAT solvers so that they do much better than exponential time in most cases (though the worst case is still exponential for every known algorithm).

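If we have a unit clause, then we can do something called unit propagation, which allows us to simplify the rest of the formula (as we did above with 3). The literal in the unit clause must be true, so every clause containing it is already satisfied and can be dropped, and its negation can be deleted from every other clause.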
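
Unit propagation can be sketched as follows. This is an illustrative Python version under the list-of-lists representation assumed here; the function name and the return shape are hypothetical choices, not something fixed by the lecture.

```python
def unit_propagate(cnf):
    """Repeatedly pick a unit clause [lit], record lit as true, and simplify.
    Returns (simplified_cnf, assigned_literals). An empty clause in the
    result means propagation ran into a contradiction."""
    cnf = [list(c) for c in cnf]
    assigned = []
    while True:
        unit = next((c[0] for c in cnf if len(c) == 1), None)
        if unit is None:
            return cnf, assigned
        assigned.append(unit)
        # Drop clauses satisfied by unit; delete the opposite literal elsewhere.
        cnf = [[l for l in c if l != -unit] for c in cnf if unit not in c]

simplified, assigned = unit_propagate([[3], [-3, -2], [2, -1]])
conflict, _ = unit_propagate([[1], [-1]])
```

On the running example, propagation alone empties the formula and assigns 3, then -2, then -1, matching the reasoning done by hand earlier.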
In our example above, what’s special about 1? It only shows up once. More importantly, it only ever appears negated, so we can safely guess -1 and find a satisfying instance. Generally, if a variable x appears only positively or only negatively throughout the formula (a “pure literal”), it is easy to pick an assignment for x. However, this won’t necessarily give you all of the satisfying instances, because there may be some instance where the value of x is the opposite of your guess.
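
Finding such always-positive or always-negative literals is a simple scan over the formula. A minimal Python sketch, with hypothetical function names and the same list-of-lists representation assumed:

```python
def pure_literals(cnf):
    """Return the set of literals whose negation never occurs in cnf."""
    literals = {lit for clause in cnf for lit in clause}
    return {lit for lit in literals if -lit not in literals}

def eliminate_pure(cnf):
    """Assume every pure literal is true and drop the clauses it satisfies."""
    pure = pure_literals(cnf)
    remaining = [c for c in cnf if not any(lit in pure for lit in c)]
    return remaining, pure

remaining, pure = eliminate_pure([[3], [-3, -2], [2, -1]])
```

In the running example the only pure literal is -1, and assuming it removes the clause [2, -1].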
Beyond these two techniques, the only way to solve SAT problems is to guess variable assignments and branch down both possibilities.

Main algorithm for solving SAT: DPLL (Davis–Putnam–Logemann–Loveland)

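DPLL(V,F):
    F := propagate(F)
    F := pure_literal_elimination(F)
    if there is an empty clause in F:
        return unsat
    if all clauses are unit clauses in F:
        return sat + construct_instance(F)
    v := pick a variable in V that has no value yet
    if DPLL(V, F and [v]) is sat:        // try true for v
        return that result
    return DPLL(V, F and [-v])           // try false for v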

Better pseudocode will be provided later, but this is the general idea of the algorithm.
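
For a concrete feel, here is one possible runnable Python sketch of the idea. It is not the official course pseudocode: the names `simplify` and `dpll` are hypothetical, the branching rule (first literal of the first clause) is a naive placeholder, and because satisfied clauses are removed, "sat" shows up as an empty CNF instead of "all clauses are unit clauses". Variables missing from the returned list can take either value.

```python
def simplify(cnf, lit):
    """Assume lit is true: drop satisfied clauses, remove -lit from the rest."""
    return [[l for l in c if l != -lit] for c in cnf if lit not in c]

def dpll(cnf, assignment=()):
    """Return a list of true literals satisfying cnf, or None if unsat."""
    assignment = list(assignment)
    while True:
        if [] in cnf:
            return None            # empty clause: this branch is unsat
        if not cnf:
            return assignment      # empty CNF: every clause is satisfied
        lits = {l for c in cnf for l in c}
        unit = next((c[0] for c in cnf if len(c) == 1), None)
        pure = next((l for l in lits if -l not in lits), None)
        forced = unit if unit is not None else pure
        if forced is None:
            break                  # nothing is forced, so we must guess
        assignment.append(forced)
        cnf = simplify(cnf, forced)
    v = abs(cnf[0][0])             # naive choice of branching variable
    for guess in (v, -v):          # try true for v, then false for v
        result = dpll(simplify(cnf, guess), assignment + [guess])
        if result is not None:
            return result
    return None

model = dpll([[3], [-3, -2], [2, -1]])
no_model = dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]])
```

On the running example no guessing is needed: unit propagation alone produces 3, -2, -1.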
How can we make it better?

Iterate propagation and pure-elimination until there is no change
Make a smarter choice about which variable to pick for branching
Learn from previous guesses: after an unsuccessful branch, add clauses representing the new information you have learned (CDCL: Conflict-Driven Clause Learning)