Finite Automata and Regular Expressions Are Useful in Defining the Syntax Rules for Constructs

CS 454 Theory of Computation Fall 2002

Solutions to Final Examination

1 (a) Define the following terms precisely:

(a) algorithm: A halting Turing machine

(b) transitive relation: A relation R ⊆ A × A is transitive if for all x, y, z in A, (x,y) in R and (y,z) in R together imply that (x,z) is in R.

(c) context-free grammar: A context-free grammar is defined by 4 components: a finite set Σ of terminal symbols; a finite set N of non-terminals, disjoint from Σ; a special symbol S in N (the start symbol); and a finite set of rules of the form A → α, where A is in N and α is in (N U Σ)*.

(d) configuration of a Turing machine: Assuming that the Turing machine is a 1-tape machine, a configuration is a string of the form xpy, where x and y are strings over Γ, the tape alphabet, and p is a state of the machine.

(e) Turing recognizable language: A language L is called Turing recognizable if there is a Turing machine M such that on input x (over the alphabet of L), it halts in an accepting state if and only if x is in L. If x is not in L, it either never halts or it halts in a rejecting state.

(b) Show that the regular expressions R = (a^2 U a^3)* and R’ = ε U a^2.a* are equivalent.

R denotes the set of strings that can be written as a concatenation of any number of copies of aa and aaa. R’ denotes the set containing ε and every string of two or more a’s. We will show that every string in R is in R’ and vice versa. Let x be a string in R. Then x is of the form a^j for some j = 2p + 3q with p, q >= 0. If both p and q are 0, then x = ε, which is in R’. Otherwise p >= 1 or q >= 1, so the length of x is at least 2, and hence x is in R’.
Conversely, let a^j be in R’, so j = 0 or j >= 2. We will show that a^j is in R. This requires showing that the set of integers of the form j = 2p + 3q (where p, q >= 0) includes all non-negative integers except 1. The case j = 0 is immediate (p = q = 0). For j >= 2 we use induction on j with two base cases: for j = 2, choose p = 1, q = 0; for j = 3, choose p = 0, q = 1. For any larger value j >= 4, write j = k + 2 with k >= 2. By the induction hypothesis, k = 2p + 3q for some p, q >= 0. Thus j = 2(p+1) + 3q. This completes the proof.
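
The arithmetic claim and the equivalence of the two expressions can be spot-checked mechanically. The following is a minimal sketch and not part of the original solution: it assumes Python's `re` module as a stand-in for the two regular expressions, checks only lengths up to an arbitrary bound, and uses illustrative names (`LIMIT`, `representable`, `in_R`, `in_R_prime`).

```python
import re

LIMIT = 50  # arbitrary bound for the spot check; this is not a proof

def representable(j):
    """True if j = 2p + 3q for some integers p, q >= 0."""
    return any((j - 3 * q) % 2 == 0 for q in range(j // 3 + 1))

R = re.compile(r"(?:aa|aaa)*")      # (a^2 U a^3)*
R_PRIME = re.compile(r"(?:aaa*)?")  # ε U a^2.a*

def in_R(s):
    return R.fullmatch(s) is not None

def in_R_prime(s):
    return R_PRIME.fullmatch(s) is not None

# Lengths on which the two expressions disagree (expected: none).
mismatches = [j for j in range(LIMIT + 1) if in_R("a" * j) != in_R_prime("a" * j)]

# Lengths that cannot be written as 2p + 3q.
non_representable = [j for j in range(LIMIT + 1) if not representable(j)]
```

Within the tested range, the only length that is not of the form 2p + 3q is 1, which matches the induction argument.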

2 (a) Describe an encoding of the 3 tiles 1 x 2, 2 x 1 and 1 x 1 using a suitable alphabet and design a DFA that accepts encoded tilings of a 2 by n chess board with these tiles assuming that an unlimited supply of each tile is available.

One simple encoding will be similar to the one we used in the lab. We will use a double-stacked alphabet in which the upper track will represent the first row, the lower one will represent the second row. Here is one such encoding:

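[tile figure not reproduced] will be denoted by [2 2]

[tile figure not reproduced] will be denoted by 1

Finally, [tile figure not reproduced] will be denoted by 3.

The alphabet used to encode tiled boards will be [alphabet not reproduced]

A DFA that accepts the valid tilings is shown below:

[DFA diagram not reproduced]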

3) (a) Show that the language L over {0, 1} defined below is not regular:

L = { w | length of w is odd, and the middle bit in w is 1}. For example, 10100 is in L, but 1001 and 0110111 are not in L.

The proof by pumping lemma is as follows: Let the adversary choose an integer n. I choose the string w = 0^n 1 0^n. Let x = 0^i, y = 0^j and z = 0^(n-i-j) 1 0^n be an arbitrary valid partition of w by the adversary, so that i + j <= n and j > 0. By pumping twice (that is, taking x y^2 z), we get the string w’ = 0^(n+j) 1 0^n. The only occurrence of 1 in w’ has n + j symbols to its left and n symbols to its right, so it is not in the middle position since j > 0. Thus w’ is not in L, contradicting the pumping lemma.
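
The pumping argument can be checked exhaustively for small n. This is an illustrative sketch only, following the 0/1 notation of the proof; the function names (`in_L`, `pumped_twice`, `check`) are assumptions introduced here.

```python
def in_L(w):
    """Membership test for L: odd length and middle bit equal to 1."""
    return len(w) % 2 == 1 and w[len(w) // 2] == "1"

def pumped_twice(n, i, j):
    """Return x y^2 z for w = 0^n 1 0^n split as x = 0^i, y = 0^j, z = the rest."""
    x = "0" * i
    y = "0" * j
    z = "0" * (n - i - j) + "1" + "0" * n
    return x + y * 2 + z

def check(n):
    """True if every valid split (|xy| <= n, |y| >= 1) pumps w out of L."""
    return all(
        not in_L(pumped_twice(n, i, j))
        for i in range(n)
        for j in range(1, n - i + 1)
    )
```

For each n tried, the chosen string 0^n 1 0^n is in L, while every pumped string fails either the odd-length test or the middle-bit test.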

(b) Design a pushdown automaton for L.

4) Design a context-free grammar for the language L = {}. Is your grammar ambiguous? Justify your answer.

S → A | B | C | D

A → A c | E, E → aEb | F, F → aF | ε

B → B c | G, G → aGb | H, H → aH | ε

C → a C | J, J → bJc | K, K → bK | ε

D → aD | L, L → bLc | M, M → cM | ε

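This grammar is clearly ambiguous since both A and C generate the string abc: A derives it via A → Ac → Ec → aEbc → aFbc → abc, and C derives it via C → aC → aJ → abJc → abKc → abc. Hence abc has two different parse trees starting from S.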
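
One way to examine ambiguity claims for this grammar is to count parse trees by brute force. The sketch below is illustrative rather than part of the original solution. It transcribes the rules as printed, assumes that the blank alternative at the end of the F, H, K and M rules stands for the empty string, and uses hypothetical names (`RULES`, `trees`). A string with more than one parse tree from S witnesses ambiguity.

```python
from functools import lru_cache

# Rules as printed; "" stands for the (assumed) empty right-hand side.
# Upper-case letters are non-terminals, lower-case letters are terminals.
RULES = {
    "S": ["A", "B", "C", "D"],
    "A": ["Ac", "E"], "E": ["aEb", "F"], "F": ["aF", ""],
    "B": ["Bc", "G"], "G": ["aGb", "H"], "H": ["aH", ""],
    "C": ["aC", "J"], "J": ["bJc", "K"], "K": ["bK", ""],
    "D": ["aD", "L"], "L": ["bLc", "M"], "M": ["cM", ""],
}

@lru_cache(maxsize=None)
def trees(form, s):
    """Number of parse trees deriving the string s from the sentential form.

    Terminates for this particular grammar because every recursive rule
    consumes at least one terminal; it is not a general-purpose parser.
    """
    if form == "":
        return 1 if s == "" else 0
    head, rest = form[0], form[1:]
    if head.islower():  # terminal: must match the next input symbol
        return trees(rest, s[1:]) if s[:1] == head else 0
    total = 0
    for cut in range(len(s) + 1):
        right = trees(rest, s[cut:])  # check the remainder first
        if right:
            total += right * sum(trees(rhs, s[:cut]) for rhs in RULES[head])
    return total
```

Under these assumptions, `trees("S", "abc")` counts one tree through each of A, B, C and D, so abc has several distinct parse trees.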

5) For each of the following statements, state if it is true or false. Justify your answer.

(i) If A ⊆ B and B is regular, then A is regular.

False. A = { a^n b^n | n >= 0} and B = (a U b)* is a counter-example: A is a subset of B and B is regular, but A is not regular.

(ii) If A and B are recognizable languages, then A U B is recognizable.

True. Let P and Q be Turing machines that recognize A and B respectively. A Turing machine M that recognizes A U B can be designed as follows: on input x, it copies x onto a second tape. Then, on tape 1 it simulates P on x, and on tape 2 it simulates Q on x, alternating the moves of the two simulations. If either simulation halts in an accepting state, M accepts and halts. If one of them rejects, then the simulation continues with the other. If the second one also halts in a rejecting state, then M rejects and halts. It is clear that M accepts x if and only if P or Q (or both) accept x. Thus, L(M) = A U B. This means A U B is recognizable.
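
The alternating simulation can be sketched in code. This is a minimal illustration under stated assumptions, not the construction itself: each recognizer is modelled as a Python generator function that yields once per simulated step and returns True (accept) or False (reject), or yields forever; the names `run_union`, `max_steps` and the two toy recognizers are hypothetical. The `max_steps` budget exists only so that the demonstration terminates; the Turing machine M has no such bound.

```python
def run_union(p, q, x, max_steps=None):
    """Alternate single steps of recognizers p and q on input x.

    Returns True if either accepts, False if both reject, and None if the
    optional step budget runs out first.
    """
    sims = [p(x), q(x)]  # one generator per simulated machine
    steps = 0
    while sims:
        for sim in list(sims):
            try:
                next(sim)  # advance this machine by one step
            except StopIteration as halt:
                if halt.value:  # halted in an accepting state
                    return True
                sims.remove(sim)  # rejected: continue with the other one
        steps += 1
        if max_steps is not None and steps >= max_steps:
            return None
    return False  # both simulations halted and rejected

def even_length(x):
    """Toy recognizer that halts on every input."""
    for _ in x:
        yield
    return len(x) % 2 == 0

def starts_with_1(x):
    """Toy recognizer that loops forever on inputs not starting with 1."""
    while not x.startswith("1"):
        yield
    return True
```

Running the two simulations strictly one after the other would not work here: on input 1, for instance, a machine that ran a non-halting first simulation to completion would never reach the second one.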

(iii) The set of regular languages over the alphabet {0,1} is countable.

True. Each regular language is accepted by some DFA, which in turn can be encoded as a finite binary string. Mapping each regular language to the first binary string (in length-then-lexicographic order) that encodes a DFA accepting it puts the collection of regular languages in 1-1 correspondence with an infinite subset of the set of binary strings. Since the set of binary strings is countable, so is the set of regular languages.

(iv) If A U B is regular, then either A or B (or both) must be regular.

False. We can choose A = { a^n b^n | n >= 0} and B = (a U b)* \ A. Clearly, neither A nor B is regular, but A U B is (a U b)* and is regular.

(v) There is an algorithm to determine if a given context-free language is empty.

True. In fact, this algorithm was presented in class. The idea is to compute the set of all non-terminals of the grammar that can generate a terminal string. If the start symbol belongs to this set, the language is not empty, so answer NO; otherwise answer YES.
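
The marking idea can be sketched as a fixed-point computation. This is a minimal illustration, not necessarily the version presented in class; the grammar representation (a dict from non-terminal to a list of right-hand sides, each a list of symbols) and the names `generating_nonterminals` and `is_empty` are assumptions made for the example.

```python
def generating_nonterminals(rules):
    """Return the non-terminals that derive at least one terminal string.

    Symbols that are not keys of `rules` are treated as terminals.
    An empty right-hand side [] stands for the empty string.
    """
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head in generating:
                continue
            for body in bodies:
                # A body works if every symbol is a terminal or already marked.
                if all(sym in generating or sym not in rules for sym in body):
                    generating.add(head)
                    changed = True
                    break
    return generating

def is_empty(rules, start):
    """True if the grammar generates no string at all."""
    return start not in generating_nonterminals(rules)

# Example grammars (hypothetical): in the second one, A never terminates.
G1 = {"S": [["A", "B"], ["a"]], "A": [["A", "a"]], "B": [["b"]]}
G2 = {"S": [["A"]], "A": [["A", "a"]]}
```

Each pass either marks a new non-terminal or stops, so the loop runs at most once per non-terminal plus one final pass.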

(vi) If A is context-free and B is regular, then A U B is context-free.

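True. This is one of the homework problems. Since every regular language is context-free, it suffices to take grammars for A and B with disjoint non-terminals and add a new start symbol S with the rules S → S_A | S_B, where S_A and S_B are the two original start symbols.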

6) Compute NULLABLE, FIRST and FOLLOWS for each non-terminal of the following grammar (over {a, b, c, d}):

S → aSd | A | B

A → aAc | C

B → bBd | C

C → bCc | ε

Is this grammar LL(1)? Explain your answer.
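
One way to check hand-computed sets for this question is the usual fixed-point iteration. The sketch below is illustrative and not taken from the original solutions; it assumes that the blank alternative in the C rule is the empty string, uses `$` as an end-of-input marker by convention, and introduces the hypothetical names `GRAMMAR` and `analyse`.

```python
END = "$"  # end-of-input marker (an assumed convention)

# Upper-case letters are non-terminals; "" is the empty right-hand side.
GRAMMAR = {
    "S": ["aSd", "A", "B"],
    "A": ["aAc", "C"],
    "B": ["bBd", "C"],
    "C": ["bCc", ""],
}

def analyse(grammar, start="S"):
    """Compute NULLABLE, FIRST and FOLLOW by iterating until nothing changes."""
    nullable = set()
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                # NULLABLE: every symbol of the body is a nullable non-terminal.
                if head not in nullable and all(s in nullable for s in body):
                    nullable.add(head)
                    changed = True
                # FIRST: scan left to right while the prefix is nullable.
                for s in body:
                    new = first[s] if s in grammar else {s}
                    if not new <= first[head]:
                        first[head] |= new
                        changed = True
                    if s not in nullable:
                        break
                # FOLLOW: scan right to left, tracking what can come next.
                trailer = set(follow[head])
                for s in reversed(body):
                    if s in grammar:
                        if not trailer <= follow[s]:
                            follow[s] |= trailer
                            changed = True
                        if s in nullable:
                            trailer = trailer | first[s]
                        else:
                            trailer = set(first[s])
                    else:
                        trailer = {s}
    return nullable, first, follow

NULLABLE, FIRST, FOLLOW = analyse(GRAMMAR)
```

Under these assumptions, FIRST of the alternative aSd and FIRST of the alternative A both contain a, which is the kind of overlap between alternatives of one non-terminal that the LL(1) condition forbids.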

7) Is the disjointness problem for DFA’s decidable? Or equivalently, is the following language recursive? Justify your answer.

EQDFA = { <M1>#<M2> | M1 and M2 are DFA’s over the same alphabet such that there is no string that is accepted by both M1 and M2}

YES. This problem is decidable. The idea is simple. Design a DFA M that accepts exactly the strings accepted by both M1 and M2. (This can be done using the following product construction. M will have as its states pairs of the form (p,q), where p is a state of M1 and q is a state of M2. Transitions are as follows: δ((p,q), a) = (r,s) if δ1(p,a) = r and δ2(q,a) = s, where δ1 and δ2 are the transition functions of M1 and M2. The start state is the pair of start states, and (p,q) is accepting if and only if p is accepting in M1 and q is accepting in M2.) It is clear that L(M) = L(M1) ∩ L(M2). Thus, L(M1) and L(M2) are disjoint if and only if L(M) = ∅. The next step of the algorithm is therefore to test whether L(M) = ∅, which can be done by checking whether any accepting state of M is reachable from its start state. If L(M) is empty, then output YES, else output NO.
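
The product construction and the emptiness test can be combined into a single search over reachable state pairs. This is a minimal sketch under an assumed representation: each DFA is a triple (start state, set of accepting states, transition dict keyed by (state, symbol)), and the names `disjoint`, `EVEN`, `ODD` and `ENDS0` are hypothetical examples.

```python
from collections import deque

def disjoint(dfa1, dfa2, alphabet):
    """Decide whether two complete DFAs over `alphabet` accept no common string."""
    (s1, f1, d1), (s2, f2, d2) = dfa1, dfa2
    start = (s1, s2)
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if p in f1 and q in f2:
            return False  # a reachable accepting pair: the intersection is non-empty
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])  # product transition
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True  # no accepting pair is reachable

# Example DFAs over {0, 1}: even number of 1s, odd number of 1s, ends with 0.
PARITY = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
EVEN = ("e", {"e"}, PARITY)
ODD = ("e", {"o"}, PARITY)
ENDS0 = ("n", {"y"}, {("n", "0"): "y", ("n", "1"): "n",
                      ("y", "0"): "y", ("y", "1"): "n"})
```

The search visits at most |Q1| x |Q2| pairs, so it always halts, which is what makes the problem decidable.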

8) Give a formal construction of a 1-tape, one-way infinite tape Turing machine M to accept the following language:

{ w # w^R | w is in {0, 1}* }

9) Let L be a recognizable language over the alphabet {0,1} such that L contains exactly one string of length n for every n. Show that L is decidable.

Let M be a Turing machine that recognizes L. We are only given that L is recognizable, so M is sure to halt and accept strings in L, but on inputs not in L, M may not halt. We need to modify M into M’ so that M’ halts and rejects such inputs. The idea is as follows: on input x, M’ first generates all the strings of length m = length of x and stores them on a tape, with a marker separating them. Then M’ simulates M on each of these strings (including x) concurrently, that is, one step of simulation for each string in cyclical order. Since L contains exactly one string of length m, M eventually halts and accepts exactly one of them. At that point M’ checks whether the accepted string is the input x. If it is, M’ accepts; otherwise it rejects. Clearly, M’ is a halting Turing machine for L, so L is decidable.
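
The construction of M’ can be sketched as a round-robin simulation. This is an illustration under stated assumptions, not a Turing machine description: the recognizer M is modelled as a Python generator function that yields once per step and returns True on acceptance, and the names `decide` and `all_ones` are hypothetical.

```python
from itertools import product

def decide(recognizer, x, alphabet="01"):
    """Decide membership of x, assuming the recognized language contains
    exactly one string of each length."""
    candidates = ["".join(t) for t in product(alphabet, repeat=len(x))]
    sims = {w: recognizer(w) for w in candidates}  # one simulation per string
    while sims:
        for w in list(sims):
            try:
                next(sims[w])  # one step of M on w
            except StopIteration as halt:
                if halt.value:  # M accepted w: it is the unique string of this length
                    return w == x
                del sims[w]  # M rejected w: stop simulating it
    raise ValueError("no string of this length was accepted; assumption violated")

def all_ones(w):
    """Toy recognizer for {1^n | n >= 0}; loops forever on any string containing 0."""
    while "0" in w:
        yield
    for _ in w:
        yield
    return True
```

The toy language {1^n} has exactly one string of each length, so `decide(all_ones, x)` halts on every x even though `all_ones` itself does not.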

10) Convert the following grammar into Chomsky normal form:

S → 0A0 | 1B1 | BB

A → C

B → S | A

C → S | ε