in Books, Doing Books, Theory of Computation

Theory of Computation #11

The next major step is to transform existing rules to CNF. We have to do two things:

A) Transform the terminals to new non-terminals
B) Shorten the rule

A) seems easier, so let’s do that first.

Examples: xA, By

In this case, we have to create a new variable – for the sake of simplicity its name will be the name of the terminal in uppercase – and replace it.

(transform-terminals "A -> xA") -> [{:A "XA"}, {:X "x"}]

Before that I need a function which decomposes the string.

(decompose-rule "A -> xA") -> {:A "xA"}

Here’s the code and test: (I edited the input later to directly take a decomposed map)

(defn decompose-rule
  "Takes a rule in the format A -> B and returns a hash-map
   containing the decomposed rule."
  [string]
  (let [[kw content] (clojure.string/split string #" -> ")]
    (hash-map (keyword kw) content)))


(deftest test-decompose
  (is (= (decompose-rule "A -> xA") {:A "xA"})) 
  (is (= (decompose-rule "B -> y") {:B "y"}))
  (is (= (decompose-rule "C -> yBz") {:C "yBz"})))

Ok, now we can concentrate on the right-side of the rules. Want we want the function from before:

(transform-terminals "A -> xA") -> [{:A "XA"}, {:X "x"}]

The idea is pretty basic. We extract each terminal, i.e. lower case, and create a new rule for that, afterwards we take the initial rule and uppercase everything.

(defn transform-terminals
  "Takes a decomposed map and generates new non-terminals for each terminal"
  [decomposed]
  (let [[name rule] (first decomposed)]
    (conj 
     (map
      #(hash-map (keyword (clojure.string/upper-case %))  (str %))
      (filter #(Character/isLowerCase %) rule))
     (hash-map name (clojure.string/upper-case rule)))))


(deftest test-transform
  (is (= (transform-terminals {:A "xA"})
         [{:A "XA"} {:X "x"}]))
  (is (= (transform-terminals {:B "xAy"})
         [{:B "XAY"} {:X "x"} {:Y "y"}])))

Now we have a list of rules which may be too long. Let’s take care of that.

For the sake of simplicity I will use variables A to H for existing rules. And everything higher than that for new rules which are needed for shortening the existing one.

Let’s say we have the following transformed rules:

{:A "CHC"}
{:B "DABA"}

What we want is:

{:A "CI"}
{:I "HC"}

{:B "DJ"}
{:J "AK"}
{:K "BA"}

Here’s the function to do that:

(defn shorten-rule
  "Takes a transformed rule and returns new ones"
  [hmap start-index]
  (let [[rname rule] (first hmap)
        length (count rule)
        num-sequence (iterate inc (int start-index))
        extra-indexes (apply str (map char (take (- length 2) num-sequence)))
        end-string (apply str (drop (- length 2) rule))]
    (loop [rule-string (str (apply str (interleave rule extra-indexes)) end-string)
           index (name rname)
           result nil]
      (if (seq rule-string)
        (recur
         (drop 2 rule-string)
         (second rule-string)
         (conj result
               (hash-map
                (keyword (str index))
                (apply str (take 2 rule-string)))))
        result))))


(deftest test-shorten
  (is (= (shorten-rule {:A "CHC"} \I)  
         [{:I "HC"} {:A "CI"}]))
  (is (= (shorten-rule {:B "ABCD"} \L)  
         [{:M "CD"} {:L "BM"} {:B "AL"}]))
  (is (= (shorten-rule {:C "BABA"} \O)  
         [{:P "BA"} {:O "AP"} {:C "BO"}])))

It works. So, now we need to put everything together.

Continue tomorrow

Write a Comment

Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.