Theory of Computation #10

Exercise 2.6.3P: Write a program to convert context-free grammars that do not generate the empty string into grammars in Chomsky normal form.

Again, a challenging problem. The choice of the data structure will be very important I think. Let’s start with an example.

S -> xSyy
S -> zNz
N -> x

So we how would this look in CNF?

S -> XA
A -> SB
B -> YY

S -> ZC
C -> NZ

N -> x

X -> x
Y -> y
Z -> z

We basically do two operations. The first step is to guarantee that the right-hand side is either only a terminal or consists of two terminals.

Let’s write a function for that. This function will return true if it’s already in CNF otherwise false.

Examples:

(check-CNF "XA") -> true
(check-CNF "xA") -> false
(check-CNF "x") -> true

Ok, we take a string and check two things. First, the length has to be less or equal 2. And then if the length is 2 if all characters are uppercase. And if the length is one, if the character is lowercase.

(defn check-CNF
  "Checks if a string is already in CNF format"
  [string]
  (let [length (count string)]
  (cond (> length 2) false
        (= length 2) (every? #(Character/isUpperCase %) string)
        (= length 1) (Character/isLowerCase (first string))
        :else false )))

We also write a few tests which all work fine. Great.

(deftest test-check
  (is (= (check-CNF "XA") true))
  (is (= (check-CNF "XAA") false))
  (is (= (check-CNF "xA") false))
  (is (= (check-CNF "x") true)))

Although it doesn’t look like much it saved us time.

Continue tomorrow

Theory of Computation #9

Exercise 2.4.3: Design an LL(1) parse table for the following grammar.

S -> xSz
S -> ySz
S -> lambda.

The table consists of columns which have terminals and rows for non-terminals.

[table]

x, y, z, lambda, EOS

S, xSz, ySz, z, lambda, error[/table]
Exercise 2.4.4: How many symbols of lookahead would be required by an LL parser when parsing strings based on the following grammar? Design a corresponding parse table.

S -> xSy
S -> xy

Ok, let’s take a look at some derivations:

xy
xxyy
xxxyyy
etc.

You can see that we need LL(2). And get the following table.

[table]

, xx, xy, yy, yx

S, xSy, xy, yy, yx[/table]
Exercise 2.5.4: Construct an LR(1) parse table for the grammar below.

S -> xSy
S -> lambda.

Ok, let’s start with some examples again of possible derivatives:

lambda
xy
xxyy
xxxyyy

And here’s the table:

[table]

, x, y, lambda, S

1, shift 2, , S -> lambda,

2,shift 4, shift 3, , 5

3, , S -> xSy, S -> xSy, 6

4, , shift 3, , ,

5, , shift 3, , ,

6, , , accept, , [/table]

Pretty simple, however took some thinking through. I started with the simplest example and implemented it, and then did the other one. Thanks to the recursive nature of the problem you solve one problem for all problems basically.

Here’s the stack content for xxyy:

1
1  x 2
1  x 2  x 4
1  x 2  x 4  y 3
1  x 2  x 4  y 3
1  x 2  S 1
1  x 2  S 1  y 3
1  S 6
accept

It works.

Theory of Computation #8

Excercise 2.1.2: What language is accepted by the pushdown automaton whose transition diagram is shown below?

ex2-I-2

So, we have one state which is also an acceptance state. We also have to transitions. The first one reads x, pops empty string and pushes x. The second one reads y, pops x and pushes empty string.

We can pop empty strings immediately therefore it starts with reading x. As soon as a y is read it we have to pop an x, i.e. we can only read as many ys as xs but we can read as many xs as we want.

Therefore the language is consists of all elements which hold x^n  y^m where m <= n. Some examples: x, xxxy, xxyxyyx, etc.

 

 

Exercise 2.3.5: Convert the following grammar, with start symbol S, into grammar in Chomsky normal form that generates the same language.

S -> xSy
S -> wNz
N -> S
N -> lambda.

As a reminder the Chomsky normal form (CNF) can have the following constructs:

A -> BC
A -> a or
S -> lambda

This language doesn’t include the empty string therefore we will only need the first two constructs.

Start with the first rule: S -> xSy. We can transform it to:

S -> XA
A -> SY

where

X -> x
Y -> y.

The next rule is: S -> wNz. N can either be S or lambda, therefore this rule can be expanded to:

S -> wz and S -> wSz. We can rewrite this in CNF as:

S -> WZ
S -> WB
B -> SZ

where

W -> w
Z -> z.

Theory of Computation #7

Let’s say we want to parse the string: xzy. We need a function which takes the string and returns whether it parses or not.

For this I’m thinking about a function which takes the current state and character and returns the next state.

(get-state :S \x) -> :A
(get-state :A \z) -> :A
etc.

Ok. We work with non-deterministic grammars therefore there are several paths available. But let’s start with the first one. The filter function seems to do a fine job extracting our options.

(filter #(% :A) test-grammar)

which returns

({:A {\y :B}}
 {:A {\z :A}})

That’s helpful. Now, we need to get the correct candidate and extract the next state. I changed my initial design a bit. I started out with the following format: {:State [\character :State]} but changed it to two hashmaps because of the nature of the problem.

First we extract the values, i.e. the right-hand side of the candidates. Here’s the code for candidates:

code> (filter #(% :A) test-grammar)
({:A {\y :B}} {:A {\z :A}})

Now we can mapcat the right-hand side:

(mapcat vals candidates)

which returns

code> (mapcat vals candidates)
({\y :B} {\z :A})

The next step is to map the characters on that list. This is a nice characteristics of hashmaps because our characters are our keywords. Then we filter it and return the first one. That’s the next state.

(defn get-state
  "Returns the next state given a state
   and character"
  [state character]
  (let [candidates (filter #(% state) test-grammar)]
    (first
     (filter identity
             (map #(% character) (mapcat vals candidates))))))

And it passed all the tests:

(deftest test-get-state
  (is (= (get-state :S \x) :A) )
  (is (= (get-state :S \y) nil))
  (is (= (get-state :A \y) :B))
  (is (= (get-state :A \z) :A))
  (is (= (get-state :B nil) true)))

Oki doki. So, now that we have the state, we need a function which takes a string and looks at the return value of our get-state stuff. It stops if the returning state is either nil (not valid) or true (valid).

(check-string "xzy") -> true
(check-string "xab") -> false

The code is surprisingly super-straight forward.

(defn test-string
  "Takes an input string and checks if it
   valid according to the grammar"
  [input-string]
  (loop [current-state :S
         rest-string input-string]
    (cond (= current-state nil) false
          (= current-state true) true
          :else (recur
                 (get-state current-state (first rest-string))
                 (rest rest-string)))))

We just get the state, compare it and return true or false. Now let’s throw some tests in:

(deftest test-test-string
  (is (= (test-string "xzy") true))
  (is (= (test-string "xyz") false))
  (is (= (test-string "xab") false))
  (is (= (test-string "xzzy") true))
  (is (= (test-string "") false)))  

And that’s it. That was surprisingly easy. And the code is super nice and easy. I have to say pretty great.