Diagonalization and PCP

LECTURE NOTES

2 - 02 -01

DIAGONALIZATION (Sipser 4.2) & PCP (Sipser 5.2)

THE DIAGONALIZATION METHOD:

Diagonalization is a method discovered by Georg Cantor in 1873. It is central to the proof that the halting problem is undecidable.

Sipser also uses diagonalization to prove that particular sets are uncountable. Here, though, we will focus on its use in the proof that the halting problem is undecidable.

In short (and described in more detail in the previous lecture), the proof that the halting problem is undecidable proceeds as follows:

A_TM = {<M, w> | M is a TM and M accepts w}. Assume, for contradiction, that TM H decides A_TM. We then use H to build a TM D that takes a description <M> as input, runs H on <M, <M>>, and outputs the opposite: D accepts <M> precisely when M does not accept <M>. Then, run D on its own description: D(<D>). So,

1) H accepts <M, w> exactly when M accepts w

2) D rejects <M> exactly when M accepts <M>

3) D rejects <D> exactly when D accepts <D>

This last statement is a contradiction. Therefore, H does not decide A_TM and A_TM is not decidable.
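The self-reference in step 3 can be tried out on small stand-ins. The sketch below is only an illustration under loud assumptions: Python functions play the role of TMs, "accept" means returning True, and the three candidate deciders in `candidates` are hypothetical, made up for this example. Whatever total function h is proposed as H, the D built from it behaves differently on <D> from what h predicts.

```python
def make_d(h):
    """Build D from a claimed decider h(program, argument) -> bool."""
    def d(program):
        # D does the opposite of what h predicts for `program` run on itself.
        return not h(program, program)
    return d

# Hypothetical candidates for H. Each one is total, and each is wrong on (D, <D>).
candidates = {
    "always_accept": lambda program, argument: True,
    "always_reject": lambda program, argument: False,
    "accept_if_same": lambda program, argument: program is argument,
}

for name, h in candidates.items():
    d = make_d(h)
    predicted = h(d, d)  # what h claims D does on <D>
    actual = d(d)        # what D really does on <D>
    print(name, "predicted:", predicted, "actual:", actual)
```

In every case the prediction and the actual behavior disagree, because `actual` is defined as the negation of `predicted`.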

To see the diagonalization in this proof, we build a table. The rows of the table are all the TMs M₁, M₂, … and the columns are their descriptions <M₁>, <M₂>, …. An entry is accept [A] if the TM of the row accepts the description in the column, and reject [R] otherwise; this is exactly the answer H gives on that pair. The particular A and R entries below are arbitrary, chosen only for illustration.

   | <M₁> <M₂> <M₃> <M₄> …

M₁ |  A    R    R    A

M₂ |  A    A    R    R

M₃ |  R    R    R    A

M₄ |  A    R    R    A

 . |

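For example, according to the table, H rejects <M₂, <M₃>> because M₂ does not accept <M₃>. Now, let’s add D to the table. Since D is itself a TM, it appears as one of the rows, and <D> appears as one of the columns.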

   | <M₁> <M₂> <M₃> <M₄> … <D> …

M₁ |  A    R    R    A   …  A

M₂ |  A    A    R    R   …  A

M₃ |  R    R    R    A   …  A

M₄ |  A    R    R    A   …  R

 . |

D  |  R    R    A    R   …  ?

The contradiction appears at the entry in row D, column <D>, marked ?. By construction, D’s row is the opposite of the diagonal: for every TM M_n, D differs from M_n in the n-th column, and because there are only two possibilities (A or R), D’s entry there is exactly the opposite of M_n’s. But D is itself one of the rows, so its entry in column <D> would have to be the opposite of itself, which is impossible.
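The flip along the diagonal can be sketched in a few lines of Python, using the four rows of the example table. The names `table`, `flip` and `diagonal_row` are hypothetical, chosen for this illustration; the real table is infinite, so this only shows the pattern on the finite corner printed in these notes.

```python
# Rows are M1..M4; columns are <M1>..<M4>. Entries copied from the example table.
table = [
    ["A", "R", "R", "A"],  # M1
    ["A", "A", "R", "R"],  # M2
    ["R", "R", "R", "A"],  # M3
    ["A", "R", "R", "A"],  # M4
]

def flip(entry):
    """Swap accept and reject."""
    return "R" if entry == "A" else "A"

def diagonal_row(table):
    """Return the row for D: the opposite of each diagonal entry."""
    return [flip(table[n][n]) for n in range(len(table))]

d_row = diagonal_row(table)
print(d_row)  # ['R', 'R', 'A', 'R']

# D differs from every listed machine M_n in column n.
differs_everywhere = all(d_row[n] != table[n][n] for n in range(len(table)))
print(differs_everywhere)  # True
```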

POST CORRESPONDENCE PROBLEM

The Post Correspondence Problem (PCP) is undecidable. Let’s see why.

Let there be a collection of dominoes, each domino having a string on each side. Let a domino be represented by

[a | ab]

where ‘a’ is the string on the top and ‘ab’ is the string on the bottom. The idea is to take a collection of dominoes and list them in some order (repetitions permitted) so that the string read along the tops of the dominoes matches the string read along the bottoms. For example,

[a | ab] [b | ca] [ca | a] [a | ab] [abc | c]

is such an arrangement, called a match, because both the top and the bottom read ‘abcaaabc’. Not every collection of dominoes has a match. The Post Correspondence Problem is to determine whether a given collection of dominoes has a match, and it is undecidable.
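Checking a proposed arrangement is easy, even though deciding whether one exists is not. As a minimal sketch (the function name `is_match` and the representation of a domino as a (top, bottom) pair of strings are assumptions made for this example), the arrangement above can be verified like this:

```python
def is_match(dominoes):
    """Check whether a non-empty sequence of (top, bottom) pairs is a match."""
    top = "".join(t for t, _ in dominoes)
    bottom = "".join(b for _, b in dominoes)
    return len(dominoes) > 0 and top == bottom

example = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("a", "ab"), ("abc", "c")]
print(is_match(example))                     # True
print("".join(t for t, _ in example))        # abcaaabc
print(is_match([("a", "ab"), ("b", "ca")]))  # False
```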

PCP = {<P> | P is an instance of the Post Correspondence Problem with a match}

We will also use

MPCP = {<P> | P is an instance of the Post Correspondence Problem with a match that starts with the first domino}
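Both definitions can be explored with a bounded search. The following is a sketch only, with hypothetical names (`find_match`, `max_len`, `must_start_with_first`): it tries arrangements of at most `max_len` dominoes, so it can confirm that a match exists but can never prove that none exists at any length. That limitation is consistent with PCP being undecidable.

```python
from collections import deque

def find_match(dominoes, max_len, must_start_with_first=False):
    """Breadth-first search for a match that uses at most max_len dominoes.

    Returns a list of domino indices, or None if no match that short exists.
    With must_start_with_first=True the search follows the MPCP rule.
    """
    first_choices = [0] if must_start_with_first else range(len(dominoes))
    queue = deque(([i], dominoes[i][0], dominoes[i][1]) for i in first_choices)
    while queue:
        seq, top, bottom = queue.popleft()
        # One row must be a prefix of the other, or this branch is dead.
        if not (top.startswith(bottom) or bottom.startswith(top)):
            continue
        if top == bottom:
            return seq
        if len(seq) < max_len:
            for i, (t, b) in enumerate(dominoes):
                queue.append((seq + [i], top + t, bottom + b))
    return None

dominoes = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
print(find_match(dominoes, 5))                              # [1, 0, 2, 1, 3]
print(find_match(dominoes, 5, must_start_with_first=True))  # None
```

With this ordering the collection is in PCP, but it has no match beginning with its first domino [b | ca], since neither of that domino's strings is a prefix of the other.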

THEOREM:

PCP is undecidable.

PROOF:

Suppose, for contradiction, that a TM R decides PCP.

We use R to construct a TM S that decides A_TM.

If R decides PCP, then we will be able to show that S decides A_TM. But we already know that A_TM cannot be decided. Therefore, PCP cannot be decided.

The idea is that, given <M, w>, S constructs an instance P of PCP that has a match exactly when M accepts w, and then runs R on P. If PCP is decidable and every instance <M, w> of A_TM can be turned into an associated instance of PCP in this way, then A_TM would be decidable, which is a contradiction.

So, S must construct a PCP instance P that has a match iff M accepts w. In order to do this, S first makes P’, which is an instance of MPCP.

Sipser divides the proof into 7 parts – so let’s do the same thing.

Part 1:

Begin P’ with the first domino [# | # q₀ w₁ w₂ … w_n #], where q₀ is the start state of M and w₁ w₂ … w_n is the input string.

So, the match begins as follows,

| # |

| # q₀ w₁ w₂ … w_n # |

We need to add additional dominoes to extend the top to match the bottom. We do this by adding dominoes corresponding to single-step simulations of M on the input string. The next parts of the proof detail how to add dominoes to accomplish this.

Part 2:

Let T be the transition function.

For every a, b in M’s tape alphabet and every q, r in the set of states where q is not the reject state,

if T(q, a) = (r, b, R), then put [qa | br] into P’.

Part 3:

For every a, b in M’s tape alphabet and every q, r in the set of states where q is not the reject state,

if T(q, a) = (r, b, L), then put [cqa | rcb] into P’ for every symbol c in the tape alphabet.

Part 4:

For every a in M’s tape alphabet,

Put [a | a] into P’.

Part 5:

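Put [# | #] and [# | e#] into P’, where e is the blank symbol. The first copies the # that separates configurations; the second also adds a blank at the right end of the configuration, standing for the blank tape to the right of the cells written so far.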
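Parts 1 to 5 are mechanical enough to write down as a short program. The sketch below is an illustration under stated assumptions, not the construction as Sipser presents it: the function name `build_dominoes`, the dictionary encoding of T, and the small transition table used to call it are all hypothetical. Each domino is a pair of tuples of symbols, so that multi-character names such as q0 count as one symbol.

```python
def build_dominoes(delta, tape_alphabet, q_start, q_reject, w, blank="e"):
    """Return the dominoes of Parts 1-5 as (top, bottom) tuples of symbols.

    delta maps (state, symbol) -> (new_state, written_symbol, "L" or "R").
    w is the input string, one character per tape symbol.
    """
    dominoes = []
    # Part 1: the first domino puts the start configuration on the bottom.
    dominoes.append((("#",), ("#", q_start, *w, "#")))
    for (q, a), (r, b, move) in delta.items():
        if q == q_reject:
            continue
        if move == "R":
            # Part 2: the head moves right.
            dominoes.append(((q, a), (b, r)))
        else:
            # Part 3: the head moves left; one domino per symbol c left of the head.
            for c in tape_alphabet:
                dominoes.append(((c, q, a), (r, c, b)))
    # Part 4: copy any tape symbol unchanged.
    for a in tape_alphabet:
        dominoes.append(((a,), (a,)))
    # Part 5: copy the # marker, optionally adding a blank before it.
    dominoes.append((("#",), ("#",)))
    dominoes.append((("#",), (blank, "#")))
    return dominoes

# Hypothetical transition table with one right move and one left move.
example_delta = {
    ("q0", "0"): ("q7", "2", "R"),
    ("q5", "0"): ("q9", "2", "L"),
}
example_dominoes = build_dominoes(
    example_delta, ["0", "1", "2", "e"], "q0", "q_reject", "0100"
)
print(len(example_dominoes))  # 12
```

The count is 1 (Part 1) + 1 (Part 2) + 4 (Part 3) + 4 (Part 4) + 2 (Part 5).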

These rules tell us what dominoes to add as we step through M with the given input. From now on, we will use a hypothetical example to illustrate.

Let the tape alphabet be {0, 1, 2, e}, where e is the blank symbol.

Let w = 0100 and let there be the transition T(q₀, 0) = (q₇, 2, R).

By Part 1, we begin the match with

| # |

| # q₀ 0 1 0 0 # |

Part 2 adds the domino [q₀0 | 2q₇]

Part 4 adds the dominoes [0 | 0], [1 | 1], [2 | 2], [e | e].

Along with the added dominoes from Part 5, we can now extend the dominoes to the following:

| # | q₀ 0 | 1 | 0 | 0 | # |

| # q₀ 0 1 0 0 # | 2 q₇ | 1 | 0 | 0 | # |

which lined up begins the match

# q₀ 0 1 0 0 #

# q₀ 0 1 0 0 # 2 q₇ 1 0 0 #

Now, let there be another transition as follows: T(q₇, 1) = (q₅, 0, R).

Then, we can add the domino [q₇1 | 0q₅] and the dominoes extend to

| # | q₀ 0 | 1 | 0 | 0 | # | 2 | q₇ 1 | 0 | 0 | # |

| # q₀ 0 1 0 0 # | 2 q₇ | 1 | 0 | 0 | # | 2 | 0 q₅ | 0 | 0 | # |

which lined up begins the match

# q₀ 0 1 0 0 # 2 q₇ 1 0 0 #

# q₀ 0 1 0 0 # 2 q₇ 1 0 0 # 2 0 q₅ 0 0 #

Now, let there be another transition as follows: T(q₅, 0) = (q₉, 2, L).

Then, Part 3 adds the following dominoes, one for each tape symbol:

[0q₅0 | q₉02] [1q₅0 | q₉12] [2q₅0 | q₉22] [eq₅0 | q₉e2]

In this case, the first one is the appropriate one to use, since 0 is the symbol to the left of the head in the bottom row of the match. Using the first domino, we extend to

| # | q₀ 0 | 1 | 0 | 0 | # | 2 | q₇ 1 | 0 | 0 | # | 2 | 0 q₅ 0 | 0 | # |

| # q₀ 0 1 0 0 # | 2 q₇ | 1 | 0 | 0 | # | 2 | 0 q₅ | 0 | 0 | # | 2 | q₉0 2 | 0 | # |

which lined up begins the match

# q₀ 0 1 0 0 # 2 q₇ 1 0 0 # 2 0 q₅ 0 0 #

# q₀ 0 1 0 0 # 2 q₇ 1 0 0 # 2 0 q₅ 0 0 # 2 q₉0 2 0 #
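The partial match built so far can be replayed mechanically. In this sketch (the names `concat` and `sequence` are hypothetical, and symbols are written space-separated with q0, q7, q5, q9 standing for the subscripted states), the dominoes are laid down in the order used above, and the top row is checked to be a prefix of the bottom row.

```python
def concat(sequence):
    """Concatenate the tops and the bottoms of a domino sequence into symbol lists."""
    top, bottom = [], []
    for t, b in sequence:
        top.extend(t.split())
        bottom.extend(b.split())
    return top, bottom

sequence = [
    ("#", "# q0 0 1 0 0 #"),                                   # Part 1
    ("q0 0", "2 q7"), ("1", "1"), ("0", "0"), ("0", "0"), ("#", "#"),
    ("2", "2"), ("q7 1", "0 q5"), ("0", "0"), ("0", "0"), ("#", "#"),
    ("2", "2"), ("0 q5 0", "q9 0 2"), ("0", "0"), ("#", "#"),
]

top, bottom = concat(sequence)
# The top lags the bottom by exactly one configuration.
is_prefix = bottom[:len(top)] == top
print(is_prefix)                     # True
print(" ".join(bottom[len(top):]))   # 2 q9 0 2 0 #
```

The part of the bottom not yet matched is always the next configuration of M, which is what forces the top to keep simulating.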

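We continue this way: the only way to extend the match is to simulate M on w one step at a time. If M reaches the reject state, no domino can extend the match (Parts 2 and 3 exclude the reject state), so no match exists. If M never halts, the partial match can be extended forever but is never completed. If M reaches the accept state, we want to let the top catch up and finish matching the bottom.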

Part 6:

For every a in M’s tape alphabet, put [a q_accept | q_accept] and [q_accept a | q_accept] into P’.

This has the effect of adding steps after the TM has halted in the accept state that serve only to let the top catch up with the bottom. The head, in a sense, continues to “eat” adjacent symbols until none are left. At that point, the bottom is ahead of the top by only q_accept #.

Part 7:

Put the domino [q_accept # # | #] into P’.

Now, we can complete the match.

Say that the TM halts in the accept state, and at that point the partial match ends with the bottom ahead of the top by the following configuration:

# 2 1 q_accept 0 2 #

Then, we can use the dominoes from Parts 4, 5 and 6 to get the following (top row first, bottom row second):

# 2 1 q_accept 0 2 # 2 1 q_accept 2 # 2 1 q_accept # 2 q_accept #

# 2 1 q_accept 0 2 # 2 1 q_accept 2 # 2 1 q_accept # 2 q_accept # q_accept #

Finally, using Part 7, both rows read

# 2 1 q_accept 0 2 # 2 1 q_accept 2 # 2 1 q_accept # 2 q_accept # q_accept # #
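The catch-up phase can also be replayed in code. This is a sketch under assumptions: the function name `finish_match` is hypothetical, and the order in which symbols are eaten (right neighbor first, then left) is just one valid choice that happens to reproduce the rows above. It returns the symbols that Parts 4 to 7 append to the top and to the bottom, starting from the moment the bottom is ahead by the accepting configuration followed by #.

```python
def finish_match(config, q_accept="q_accept"):
    """Return (top, bottom) symbol lists appended while the top catches up.

    config is the accepting configuration as a list of symbols, without # markers.
    """
    top, bottom = [], []
    current = list(config)
    while current != [q_accept]:
        i = current.index(q_accept)
        if i + 1 < len(current):
            # Part 6, [q_accept a | q_accept]: eat the symbol to the right.
            shorter = current[:i + 1] + current[i + 2:]
        else:
            # Part 6, [a q_accept | q_accept]: eat the symbol to the left.
            shorter = current[:i - 1] + current[i:]
        top += current + ["#"]     # the top copies the current configuration
        bottom += shorter + ["#"]  # the bottom writes the shorter one
        current = shorter
    # Part 7: the domino [q_accept # # | #] closes the match.
    top += [q_accept, "#", "#"]
    bottom += ["#"]
    return top, bottom

config = ["2", "1", "q_accept", "0", "2"]
top_tail, bottom_tail = finish_match(config)
# The bottom was already ahead by the configuration and its closing #.
full_bottom = config + ["#"] + bottom_tail
print(top_tail == full_bottom)  # True
```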

This concludes the construction of P’.

However, we are not done, because P’ is an instance of MPCP, so now we must show that we can turn it into an instance of PCP. To do so, we build the requirement that the match begin with the first domino directly into the dominoes themselves, so that it is unnecessary to state it explicitly. This is a bit of a trick, but it works.

Let u = u₁u₂…u_n be any string of length n, and let * be a new symbol that does not appear in any domino.

Then,

Let *u = * u₁ * u₂ * … * u_n: This adds * before every character.

Let u* = u₁ * u₂ * … u_n *: This adds * after every character.

Let *u* = * u₁ * u₂ * … * u_n *: This adds * before every character and one more * at the end.
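The three starred forms are simple string operations. A minimal sketch, assuming each symbol is a single character and using hypothetical function names:

```python
def star_before(u, star="*"):
    """*u: a star before every character."""
    return "".join(star + ch for ch in u)

def star_after(u, star="*"):
    """u*: a star after every character."""
    return "".join(ch + star for ch in u)

def star_both(u, star="*"):
    """*u*: a star before every character and one more at the end."""
    return star_before(u, star) + star

print(star_before("abc"))  # *a*b*c
print(star_after("abc"))   # a*b*c*
print(star_both("abc"))    # *a*b*c*
```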

Then,

if P’ was the collection of dominoes

{ [t₁ | b₁], [t₂ | b₂], [t₃ | b₃], [t₄ | b₄], [t₅ | b₅],…, [t_k | b_k] }

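then let P be the collection

{ [*t₁ | *b₁*], [*t₁ | b₁*], [*t₂ | b₂*], [*t₃ | b₃*], [*t₄ | b₄*], [*t₅ | b₅*],…, [*t_k | b_k*], [*◊ | ◊] }

where ◊ is a new symbol that appears in no other domino. The first domino also appears in its ordinary form [*t₁ | b₁*] so that it can be reused later in a match.

Then, the only domino that could possibly start a match is the first one, [*t₁ | *b₁*], because every top starts with a * and it is the only domino whose bottom also starts with a *. After that, the *s interleave with the original symbols and match up throughout, and the last domino, [*◊ | ◊], adds the final * to the top to complete the match.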
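The conversion can be sketched as follows. Several things here are assumptions made for illustration: the function names, single-character symbols, the use of "$" as the fresh end marker, and (following Sipser's version of the construction) keeping the first domino in its ordinary form as well as its forced-start form. The small instance `p_prime` is a made-up example, not the P’ built from a TM.

```python
def star_before(u, star="*"):
    return "".join(star + ch for ch in u)

def star_after(u, star="*"):
    return "".join(ch + star for ch in u)

def mpcp_to_pcp(dominoes, star="*", end="$"):
    """Convert an MPCP instance (match must start with dominoes[0]) to PCP."""
    t1, b1 = dominoes[0]
    # Forced starting domino: the only one whose bottom begins with a star.
    converted = [(star_before(t1, star), star + star_after(b1, star))]
    # Ordinary form of every domino, including the first so it can be reused.
    for t, b in dominoes:
        converted.append((star_before(t, star), star_after(b, star)))
    # Closing domino: supplies the one star the top is still missing.
    converted.append((star + end, end))
    return converted

p_prime = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("abc", "c")]
p = mpcp_to_pcp(p_prime)

# Only the converted first domino has a top and bottom that begin alike.
starters = [i for i, (t, b) in enumerate(p) if t[0] == b[0]]
print(starters)  # [0]

# The MPCP match a, b, ca, a, abc becomes this sequence of indices in P.
order = [0, 2, 3, 1, 4, 5]
top = "".join(p[i][0] for i in order)
bottom = "".join(p[i][1] for i in order)
print(top == bottom)  # True
```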

We have now shown how S constructs P from <M, w>: P has a match iff M accepts w, so S runs R on <P> and accepts iff R accepts. Therefore, if PCP were decidable, A_TM would be decidable. But we know that it is not, so PCP is undecidable.

I apologize if the notation was confusing. Please let me know if anything is unclear and I will attempt to clarify the notes. Thanks.