CS173 Thursday Oct 10, 2002
------------------------------------------------------------------------
Converting an RE to an NDFA

Example: (1*01*0)*1* is an RE describing all strings of 1s and 0s in which the number of zeros is even. The corresponding NDFA can be found at the bottom of p. 90 in Programming Language Pragmatics (the 254 text). If we apply the subset construction to the NDFA we get the DFA at the top of p. 91. This DFA has 5 states. Interestingly, it's easy to show that an equivalent two-state DFA exists.

It turns out that for any regular language there exists a unique *minimal* DFA (unique up to the naming of states), and there is a straightforward procedure for building this minimal DFA from any equivalent DFA. First we add a dead state, if necessary, so every state has an outgoing transition on every input symbol. The construction then works inductively. Initially we place the states of the (not necessarily minimal) DFA into two equivalence classes: final states and non-final states. We then repeatedly search for an equivalence class C and an input symbol a such that, when given a as input, the states in C make transitions to states in k > 1 different current equivalence classes. We then partition C into k classes in such a way that all states in a given new class would move to a member of the same old class on a. When we are unable to find a class to partition in this fashion, we are done.

In our example, the original placement puts states A, B, and E in one class (final states) and C and D in another. In all cases, a 1 leaves us in the current class, while a 0 takes us to the other class. Consequently, no class requires partitioning, and we are left with a two-state machine.
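
The partition-refinement procedure described above can be sketched in Python. This is only a minimal sketch under stated assumptions: the function name `minimize`, the dictionary encoding of the transition function, and the four-state example DFA (which counts zeros mod 4) are all made up for illustration. The example is not the five-state DFA from PLP p. 91, although it behaves the same way: no class ever needs splitting, so two classes remain.

```python
def minimize(states, alphabet, delta, finals):
    """Return the equivalence classes of a DFA as a set of frozensets.

    Assumes delta is total: delta[(state, symbol)] exists for every pair
    (add a dead state first if it does not).
    """
    finals = set(finals)
    # Initial partition: final vs. non-final states (skip an empty class).
    initial = (frozenset(finals), frozenset(set(states) - finals))
    partition = [c for c in initial if c]
    changed = True
    while changed:
        changed = False
        # Map each state to the index of its current class.
        class_of = {s: i for i, c in enumerate(partition) for s in c}
        for c in partition:
            for a in alphabet:
                # Group the states of c by the class they move to on a.
                groups = {}
                for s in c:
                    groups.setdefault(class_of[delta[s, a]], set()).add(s)
                if len(groups) > 1:
                    # k > 1 target classes: split c into k new classes.
                    partition.remove(c)
                    partition.extend(frozenset(g) for g in groups.values())
                    changed = True
                    break
            if changed:
                break  # class_of is stale now; restart the search
    return set(partition)


# Hypothetical redundant DFA for "even number of zeros": counts zeros mod 4.
states = ["q0", "q1", "q2", "q3"]
alphabet = ["0", "1"]
delta = {}
for i, s in enumerate(states):
    delta[s, "1"] = s                      # a 1 never changes the count
    delta[s, "0"] = states[(i + 1) % 4]    # a 0 advances the count
classes = minimize(states, alphabet, delta, finals={"q0", "q2"})
print(classes)
```

Each resulting class becomes one state of the minimized machine; a transition on a from a class goes to the class containing the target of any of its members.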
------------------------------------------------------------------------
Constructing an RE from an FA

To construct a regular expression from a DFA (and thereby complete the proof that regular expressions and finite automata have the same expressive power), we replace each state in the DFA one by one with a corresponding regular expression. Just as we built a small FA for each operator and operand in a regular expression, we will now build a small regular expression for each state in the DFA. The basic idea is to eliminate the states of the FA one by one, replacing each state with a regular expression that describes the portion of the input string that labels the transitions into and out of the state being eliminated.
------------------------------------------------------------------------
Algorithm for Constructing an RE from an FA

Given a DFA M we construct a regular expression R such that L(M) == L(R). The algorithm uses dynamic programming. Let Rij[t] be a regular expression describing all the ways to get from state i to state j (i.e. all the labels on paths that go from i to j) without going through any intermediate state numbered higher than t. If states are numbered starting with 1, Rij[0] describes the ways to go from i to j without going through *any* intermediate states.

Initially Rii[0] is the alternation of epsilon and all symbols on the self loop, if any, from i to i. Rij[0], i != j, is the alternation of all symbols on the arc from i to j, or the empty RE (which matches no strings) if there is no such arc.

What we want for the whole expression is R1j[n] | R1k[n] | R1l[n] ..., where n is the number of states, 1 is the start state, and {j, k, l, ...} is the set of final states.

The inductive step notes that

    Rij[k] = Rij[k-1] | Rik[k-1] Rkk[k-1]* Rkj[k-1]

That is, a path from i to j that uses no intermediate state numbered higher than k either avoids state k entirely, or goes from i to k, loops from k back to k zero or more times, and then goes from k to j.

Example pp 91-92 in PLP: Start with a two-state DFA to accept all binary strings with an even number of zeros:

    S = {s1, s2}
    start state = s1
    final states = {s1}
    T = {(s1, 1, s1), (s1, 0, s2), (s2, 1, s2), (s2, 0, s1)}

What we want is R11[2].
    R11[0] = 1|e
    R12[0] = 0
    R21[0] = 0
    R22[0] = 1|e

    R11[1] = (1|e) | (1|e) (1|e)* (1|e)
    R12[1] = 0 | (1|e) (1|e)* 0
    R21[1] = 0 | 0 (1|e)* (1|e)
    R22[1] = (1|e) | 0 (1|e)* 0

    R11[2] = ((1|e) | (1|e) (1|e)* (1|e)) | (0 | (1|e) (1|e)* 0) ((1|e) | 0 (1|e)* 0)* (0 | 0 (1|e)* (1|e))

Here e denotes epsilon. R11[2] is R11[1] | R12[1] R22[1]* R21[1], with each term copied from the lines above.
------------------------------------------------------------------------
Summary of Results

We have shown that all four of the following formalisms for expressing languages of strings are equivalent:

* deterministic finite automata
* nondeterministic finite automata
* nondeterministic finite automata with epsilon transitions
* regular expressions
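
As a sanity check on the equivalence claimed above, the dynamic-programming construction of Rij[k] can be sketched in Python and its result compared against the two-state even-zeros DFA. This is a rough sketch, not a definitive implementation: the helper names (`alt`, `cat`, `star`, `dfa_to_regex`), the `EMPTY` marker, and the `arcs` dictionary encoding are assumptions made for illustration. Expressions are built as strings in Python `re` syntax, where `(?:...)` is a non-capturing group and epsilon is written as the empty string.

```python
import re
from itertools import product

EMPTY = None  # the empty RE, which matches no string at all


def alt(x, y):
    """x | y, treating EMPTY as the identity for alternation."""
    if x is EMPTY:
        return y
    if y is EMPTY:
        return x
    return f"(?:{x}|{y})"


def cat(*parts):
    """Concatenation; anything concatenated with EMPTY is EMPTY."""
    if any(p is EMPTY for p in parts):
        return EMPTY
    return "".join(parts)


def star(x):
    """Kleene star; both EMPTY* and epsilon* are just epsilon."""
    if x is EMPTY or x == "":
        return ""
    return f"(?:{x})*"


def dfa_to_regex(n, arcs, start, finals):
    """States are numbered 1..n; arcs maps (i, j) to the symbols on arc i -> j.

    Returns a pattern string, or EMPTY if no final state is reachable.
    """
    # Base case (t = 0): direct arcs only, plus epsilon when i == j.
    R = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            r = EMPTY
            for sym in arcs.get((i, j), []):
                r = alt(r, re.escape(sym))
            if i == j:
                r = alt(r, "")
            R[i, j] = r
    # Inductive step: additionally allow state k as an intermediate state.
    for k in range(1, n + 1):
        R = {(i, j): alt(R[i, j], cat(R[i, k], star(R[k, k]), R[k, j]))
             for i in range(1, n + 1) for j in range(1, n + 1)}
    # Alternation over all final states, starting from the start state.
    result = EMPTY
    for f in sorted(finals):
        result = alt(result, R[start, f])
    return result


# The two-state DFA from the example: s1 = 1 (start, final), s2 = 2.
arcs = {(1, 1): ["1"], (1, 2): ["0"], (2, 2): ["1"], (2, 1): ["0"]}
pattern = re.compile(dfa_to_regex(2, arcs, start=1, finals=[1]))
notes_re = re.compile(r"(?:1*01*0)*1*")  # the RE from the start of the notes

# Compare on every binary string of length 0 through 6.
for length in range(7):
    for bits in product("01", repeat=length):
        s = "".join(bits)
        even_zeros = s.count("0") % 2 == 0
        assert bool(pattern.fullmatch(s)) == even_zeros
        assert bool(notes_re.fullmatch(s)) == even_zeros
```

The generated expression is far from minimal, just like the hand-derived R11[2]; agreement on short strings is evidence of equivalence, not a proof.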