GNFAs

Example from http://condor.depaul.edu/glancast/444class/docs/

A Generalized Nondeterministic Finite Automaton is similar to an NFA but the transition function takes a state and a regular expression in the alphabet instead of a state and an alphabet element.

A generalized NFA
          for strings containing 1101

A GNFA for strings in {1,0} that contain the substring "1101".

The idea is that in state q0 the transition to state q1 can be taken if the next input matches the regular expression 1101.

The formal definition is given (on page 73) by:

A generalized nondeterministic finite automaton is a 5-tuple, (Q,∑, δ, qstart, qaccept), where

  1. Q is the finite set of states,
  2. ∑ is the input alphabet
  3. δ : (Q - {qaccept} x (Q - { qstart }) → R is the transition function, where R is the set of all regular expressions over ∑
  4. qstart is the start state
  5. qaccept is the accept state

Note that there is only one accept state. However, this is no real restriction for a nondetrministic automaton. (Why?)

On the other hand, the transition function is defined on a different arguments than is the case for an ordinary NFA.

GNFA Transition Function Example

The transition function, δ, for the example GNFA:

The example GNFA

is

   q1 q2    q3   
q0 ε
q1 (0 | 1) 1101
q2 (0 | 1) ε

Language of a GNFA

The definition of the language of a GNFA is technically different than that of an NFA because the transition function is defined differently. However, the idea is really similar, but extended to allow regular expressions on the transitions.

The formal definition is given by (page 73):

A GNFA accepts a string w in ∑* if w = w1w2...wk, where each wi is in ∑* and a sequence of states q0, q1, ..., qk exists such that

  1. q0 is the start state
  2. qk is the accept state
  3. for each i, wi ∊ L(Ri) where Ri = δ(qi-1, qi); that is, where Ri is the expression on the arrow from qi-1 to qi.

Example String Accepted by GNFA

The string 0011011 is accepted by

Example GNFA

If 0011011 is partitioned into blocks w1=ε w2=0 w3=0 w4=1101 w5=1 w6=ε; that is,

0011011 = ε 0 0 1101 1 ε

then

  State Input Next
w1=ε ∊ L(δ(q0, q1)) q0 w1 q1
w2=0 ∊ L(δ(q1, q1)) q1 w2 q1
w3=0 ∊ L(δ(q1, q1)) q1 w3 q1
w4=1101 ∊ L(δ(q1, q2)) q1 w4 q2
w5=1 ∊ L(δ(q2, q2)) q2 w5 q2
w6=ε ∊ L(δ(q2, q3)) q2 w6 q3

For example, w1=ε ∊ L(δ(q0, q1)) means for state q0 and input w1, next state is q1

Converting an NFA to a GNFA

An NFA can be converted to an equivalent GNFA by

  1. Adding a new start state with an ε transition to the old start state.
  2. Adding a new accepting state and adding ε transitions from all old accepting states to the new one.
  3. Definining the transition function δ' for the GNFA in terms of the transition function δ for the NFA by

    δ'(qi, qj) = a if and only if δ(qi, a) = qj

Example conversion of NFA to GNFA

NFA

Steps 1 and 2 require adding a new start state and a new accepting state. Step 3 require no changes in the diagram except labeling the transitions out of the new start state and into the new accepting state.

NFA to GNFA

Eliminating a state in an GNFA

Any state except the start state and the accept state of a GNFA can be eliminated to obtain a new equivalent GNFA with one fewer states.

For example, to eliminate state q1 in

Eliminate a state
        in a gnfa
  1. replace each existing path q to q1 to q' by a path from q to q'
  2. Label the new path with a regular expression that describes the strings that would cause a transition from state q to q1 to q'

There are 3 arrows coming into state q1 from other states: q0, q2, and q4.

If state q1 is removed, paths must be replaced by new transitions:

  1. q0 to q2
  2. q2 to q2
  3. q4 to q2

Determining the Regular Expression Labels

step2

Eliminating q1, what regular expressions are needed?

step2

GNFA Regular Expression Labels

Suppose qa is to be removed and

  1. the path being considered goes from qi to qa to qj,
  2. transition from qi to qa is labeled by the regular expression Ria
  3. transition from qa to qj is labeled by the regular expression Raj
  4. transition from qi to qj is labeled by the regular expression Rij (Note this may be the empty regular expression, ࢝)

Then the path from qi to qj should be labeled

    RiaRaa*Raj ∪ Rij

Result

Original NFA (converted to a GNFA):

After eliminating state q1, recalculate labels for paths that went through q1:

Path Label
Expression
Actual
Label
q0 to q2 R01R11*R12 ∪ R02 ε0*1 ∪ empty = 0*1
q2 to q2 R21R11*R12 ∪ R22 00*1 ∪ empty = 0+1
q4 to q2 R41R11*R12 ∪ R42 00*1 ∪ empty = 0+1

Now the Theorem Again

A language is regular if and only if some regular expression describes it.
Proof requires two parts.
First Part: If a language is regular, then it is described by some regular expression.
Second Part: If a language is described by some regular expression, then it is a regular language.

NFA to Regular Expression

To show the first part, if we are given an NFA, we need to show that there is a regular expression that describes exactly the language of the NFA.

The construction of the regular expression begins with the NFA and each step will eliminate one state of the NFA until the only state(s) remaining are the start state and a final state.

Complete Example: NFA to Regular Expression

Exercise 1.16a

Convert this NFA to a regular expression that describes the same language.

Convert to GNFA

Eliminate non-start, non-final state 1.

Recalculate labels for paths through state 1

Path Label
Expression
Actual
Label
0 to 3 R01R11*R13 ∪ R03 εa*ε ∪ empty = a*
0 to 2 R01R11*R12 ∪ R02 εa*(a|b) ∪ empty = a*(a|b)
2 to 3 R21R11*R13 ∪ R23 ba*ε ∪ empty = ba*

Final Step: GNFA to Regular Expression

Finally eliminate state 2. Only one path remains from 0 to 3. Recalculate the label for the path through state 2.

Path Label
Expression
Actual
Label
0 to 3 R02R22*R23 ∪ R03 a*(a|b)ε*ba* ∪ a* = a*(a|b)ba* | a*

Summary for NFA to Regular Expression Example

A regular expression that describes the same language as this NFA is

     a*(a|b)ba* | a*

Regular Expresssion to NFA

To show that for any regular expression there is an NFA that recognizes the same language described by the regular expression, the proof describes a procedure for constructing the NFA from the regular expression.

  1. Parse the regular expression into its parts based on the regular expression operator precedence and parentheses used if any to determine the operands of each operator.
  2. star is highest, then concatenation, and union is lowest precedence.
  3. For each of the operators use the construction described in showing the closure properties of regular languages to construct an NFA for each operator and its operands.

See Lemma 1.55 in the text.

Example regular expression

   a(ba)*b ∪ bab
  1. Construct the NFA for (ba): concatenation of b and a
  2. Construct the NFA for (ba)*: star of (1)
  3. Construct the NFA for a(ba)*: concatenation of a and (2)
  4. Construct the NFA for a(ba)*b: concatenation of (3) and b
  5. Construct the NFA for ba: concatenation of b and a
  6. Construct the NFA for bab: concatenation of 5 and b
  7. Construct the NFA for a(ba)*b ∪ bab: union of (4) and (6)