Short note on application of context free grammar

Well-formed parentheses
The canonical example of a context free grammar is parenthesis matching, which is representative of the general case. There are two terminal symbols “(” and “)” and one nonterminal symbol S. The production rules are
S → SS
S → (S)
S → ()
The first rule allows Ss to multiply; the second rule allows Ss to become enclosed by matching parentheses; and the third rule terminates the recursion.
Well-formed nested parentheses and square brackets
A second canonical example is two different kinds of matching nested parentheses, described by the productions:
S → SS
S → ()
S → (S)
S → []
S → [S]
with terminal symbols [ ] ( ) and nonterminal S.
The following sequence can be derived in that grammar:
([ [ [ ()() [ ][ ] ] ]([ ]) ])

A regular grammar
Every regular grammar is context-free, but not all context-free grammars are regular. The following
context-free grammar, however, is also regular.
S → a
S → aS

S → bS
The terminals here are a and b, while the only non-terminal is S. The language described is all nonempty strings of s and s that end in .
This grammar is regular: no rule has more than one nonterminal in its right-hand side, and each of these nonterminals is at the same end of the right-hand side.
Every regular grammar corresponds directly to a nondeterministic finite automaton, so we know that this is a regular language.
Using pipe symbols, the grammar above can be described more tersely as follows:
S → a | aS | bS

Matching pairs

In a context-free grammar, we can pair up characters the way we do with brackets. The simplest example:
S → aSb
S → ab
This grammar generates the language , which is not regular (according to the Pumping Lemma for regular languages).
The special character ε stands for the empty string. By changing the above grammar to
S → aSb | ε
we obtain a grammar generating the language instead. This differs only in that it contains the empty string while the original grammar did not.
Algebraic expressions
Here is a context-free grammar for syntactically correct infix algebraic expressions in the variables x, y
and z:
S → x
S → y
S → z
S → S + S
S → S – S
S → S * S
S → S / S
S → ( S )
This grammar can, for example, generate the string
( x + y ) * x – z * y / ( x + x )
as follows:
S (the start symbol)
→ S – S (by rule 5)
→ S * S – S (by rule 6, applied to the leftmost S)
→ S * S – S / S (by rule 7, applied to the rightmost S)
→ ( S ) * S – S / S (by rule 8, applied to the leftmost S)
→ ( S ) * S – S / ( S ) (by rule 8, applied to the rightmost S)
→ ( S + S ) * S – S / ( S ) (etc.)
→ ( S + S ) * S – S * S / ( S )
→ ( S + S ) * S – S * S / ( S + S )
→ ( x + S ) * S – S * S / ( S + S )
→ ( x + y ) * S – S * S / ( S + S )
→ ( x + y ) * x – S * y / ( S + S )
→ ( x + y ) * x – S * y / ( x + S )
→ ( x + y ) * x – z * y / ( x + S )
→ ( x + y ) * x – z * y / ( x + x )
Note that many choices were made underway as to which rewrite was going to be performed next. These
choices look quite arbitrary. As a matter of fact, they are, in the sense that the string finally generated is
always the same. For example, the second and third rewrites
→ S * S – S (by rule 6, applied to the leftmost S)
→ S * S – S / S (by rule 7, applied to the rightmost S)

could be done in the opposite order:
→ S – S / S (by rule 7, applied to the rightmost S)
→ S * S – S / S (by rule 6, applied to the leftmost S)

Leave a Reply