Shannon Guessing Game NetLogo Model

Produced for the book series "Artificial Intelligence";

Author: W. J. Teahan; Publisher: Ventus Publishing Aps, Denmark.

view/download model file: Shannon-Guessing-Game.nlogo

WHAT IS IT?

This model shows how a language model can be constructed from some training text and then used to predict text - i.e. to play the "Shannon Guessing Game", a game in which the agent (human or computer) tries to predict the upcoming text, one letter at a time, based on the text it has already seen.

Note: The term "model" can cause some confusion here. NetLogo uses the term "model" to refer to a program/simulation. This can be confused with the use of the term "model" in the phrase "language model", which is a mechanism for assigning a probability to a sequence of symbols by means of a probability distribution. In the information listed below, the two types of model are explicitly distinguished by using the phrases "NetLogo model" and "language model".

The type of language model used in this NetLogo model is a fixed order Markov model - that is, in accordance with the Markov property, the probability distribution for the next step and future steps depends only on the current state of the system and not on the states that preceded it. The language model is also fixed order - that is, the probability is conditioned on a prior context of fixed length.
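
As an informal illustration (this notation is not used elsewhere in this NetLogo model), an order-2 character-level model of this kind estimates P(next character | previous two characters) - that is, the probability of the next character is conditioned only on the two characters immediately preceding it, regardless of anything earlier in the text.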


HOW IT WORKS

The language model is based on the PPMC compression algorithm devised by John Cleary and Ian Witten. It uses a back-off mechanism called escaping to smooth the probability estimates with those of lower order models, in order to overcome the zero frequency problem. This is the problem that occurs when an upcoming character has not been seen before in the current context, and therefore has a frequency of zero. Assigning it a zero probability estimate would result in an infinite code length (since -log 0 is infinite), which means that it is impossible to encode the occurrence using this estimate. To overcome this, the estimates are calculated by smoothing with lower order models using the escape mechanism.
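
As a purely illustrative calculation of why a zero estimate cannot be used: a character predicted with probability 1/8 can be encoded in -log2(1/8) = 3 bits, and one predicted with probability 1/2 in 1 bit, whereas a character assigned probability 0 would require -log2(0) bits, which is infinite.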

The escape probability is estimated using Method C devised by Alistair Moffat (hence the name PPMC). This smoothing method is often erroneously called Witten-Bell smoothing although it was invented by Alistair Moffat. The PPMC language model implemented by this NetLogo model does not implement full exclusions or update exclusions.
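
The following is a minimal sketch of the Method C estimates (these two reporters are not part of the model's PROCEDURES below, where list-predictions computes the same ratios inline): in a context in which t distinct characters ("types") have been seen n times in total ("tokens"), a character that has occurred c times in that context is predicted with probability c / (t + n), and an escape to the next lower order context is given probability t / (t + n).

to-report method-c-symbol-probability [c t n]
;; probability of a character that has been seen c times in a context
;; containing t types (distinct characters) and n tokens (total count)
  report c / (t + n)
end

to-report method-c-escape-probability [t n]
;; probability of escaping from this context to the next lower order context
  report t / (t + n)
end

For example, in a context with 3 types and 10 tokens, a character seen 4 times is predicted with probability 4/13, and the escape probability is 3/13.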


WHAT IS ITS PURPOSE?

The purpose of this NetLogo model is to show how to implement a language model based on the PPMC compression scheme, and then to use it to predict text one character at a time.


HOW IT WORKS

The data being modelled is conceptualised as a stream of events (analogous to Event Stream Processing, or ESP): events occur in a specific order within a specific stream, but events on separate streams can occur simultaneously.

A training text sequence is used to train the language model. This language model is then used to process a test text sequence. Each stream event from the training data is represented by a state turtle agent, and paths between states are represented by path link agents. When the language model is used for prediction, such as when playing the Shannon Guessing Game in this NetLogo model, walker turtle agents walk the network of states to determine which of the current contexts are active. These active contexts are then used to make the probability estimates. Text turtle agents maintain information about a particular text, such as running totals and a variable identifying its language model.
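
As an illustration (assuming the "English Bible" training text has been selected and max-depth-of-tree is set to 2, so that the window size passed to load-training-text-char-events is 3), the three-character window "the" from the training text is turned into the event list

  [["model" "English Bible Chars"] ["char" "t"] ["char" "h"] ["char" "e"]]

which add-events then adds to the context tree as a root-to-leaf path of state turtles: the "model" state at depth 0, followed by states for "t", "h" and "e" at depths 1, 2 and 3 (or, if part of that path already exists, the matching states simply have their event counts incremented).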


HOW TO USE IT

To use this NetLogo model, perform the following steps:
1. Select a training text to load using the which-training-text chooser.
2. Load it and build a language model from it using the load-training-text button.
3. Select a testing text to perform the prediction on using the which-testing-text chooser, and load it using the load-testing-text button.
4. Then predict the testing text one character at a time by pressing the predict-text button.


THE INTERFACE

The model's Interface buttons are defined as follows:

- load-training-text: This loads the training text specified by the which-training-text chooser. At the same time, it constructs the language model from it.

- load-testing-text: This loads the testing text specified by the which-testing-text chooser.

- predict-text: This predicts the text in the testing text one character at a time.

The model's Interface slider and choosers are defined as follows:

- max-depth-of-tree: This is the maximum depth of the context tree. The order of the model is (max-depth-of-tree - 1).

- output-options: This determines how much output is written to the output box during the prediction phase.

- which-training-text: This chooser allows the user to select from a small but eclectic list of natural language texts to use as the data for training the language model.

- which-testing-text: This chooser allows the user to select from a small but eclectic list of natural language texts to use as the data on which the language model is tested, by predicting each of its characters one after the other.


THINGS TO NOTICE

Notice how the prediction improves as the max-depth-of-tree variable is increased from 1 to 5 or 6, then drops off slightly. However, this depends to a great extent on the training text used to build the language model and how closely it is related to the testing text.

Notice how poor the prediction is when the training and testing texts are from different languages.

Notice also that for the first few characters in the testing text, the prediction is relatively poor. Why is this?

In contrast, notice how well the language model does at predicting the testing text if it has been trained on exactly the same text (i.e. the training and testing texts are identical). Note: This situation seldom arises in real-life prediction, because the testing sequence is usually not known in advance. Note also that, even for higher order models trained on the testing text itself, the best the language model can often do is still require 1/2 to 1 bit to encode the next character. Why is this? Hint: What would happen if the length of the training text was much longer?
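
To make the cross-entropy figures reported in the output box concrete: each fraction in an encoding such as 3/13 X 4/20 (an escape followed by a character probability; these numbers are made up purely for illustration) contributes -log2 of its value in bits, and the contributions are summed, giving roughly 2.12 + 2.32 = 4.44 bits in this case. A minimal sketch of the calculation that list-encoding in the PROCEDURES performs inline (the reporter below is not itself part of the model):

to-report encoding-cost [encoding]
;; encoding is a list of [numerator denominator] pairs, as stored in encoding-list;
;; each fraction contributes -log2 of its value in bits, and the bits are summed
  let bits 0
  foreach encoding
    [ set bits bits - log ((first ?) / (last ?)) 2 ]
  report bits
end

For example, encoding-cost [[3 13] [4 20]] reports approximately 4.44.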


THINGS TO TRY

Try altering the max-depth-of-tree to see how this affects the prediction.

Try different combinations of training and testing texts.


EXTENDING THE MODEL

Extend the PPMC language model so that it implements full exclusions and update exclusions.

Can you devise further methods to improve the prediction of the language model?


RELATED MODELS

See the Language Modelling NetLogo model and the Cars Guessing Game NetLogo model.


CREDITS AND REFERENCES

This model was created by William John Teahan.

To refer to this model in publications, please use:

Teahan, W. J. (2010). Shannon Guessing Game NetLogo model. Artificial Intelligence. Ventus Publishing Aps.


PROCEDURES

; Shannon Guessing Game model.
;
; Shows how the Shannon Guessing Game works.
;
; Copyright 2010 William John Teahan. All Rights Reserved.
;
breed [texts text]
breed [states state]
breed [walkers walker]
directed-link-breed [paths path]

states-own 
[ depth         ;; depth in the tree
  stream        ;; the name of the stream of sensory or motor events
  event         ;; the sensory or motor event
  event-count   ;; the number of times the event has occurred
] 
walkers-own
[ location      ;; the location of a state the walker is currently visiting
  depth         ;; depth in the tree model
]
texts-own
[ model         ;; the model associated with the text
  total         ;; total of lengths of paths traversed
  window        ;; window of recent path lengths
  wcount        ;; sum of path lengths stored in the window
]

globals
[ testing-chars ;; the testing string of characters
  testing-char  ;; the current testing character
  testing-pos   ;; position in the testing string
  encoding-list ;; the encoding list for the current testing character
  total-entropy ;; total cross-entropy for the sequence being encoded
]

to load-training-text-char-events [filename window-size this-model]
;; loads the training text character events from the file.
;; window-size is the length of the window (context-size + 1)
;; where window-size = max depth of tree + 1
;; i.e. order of context model = max depth of tree - 1
  let i 0
  let j 0
  let k 0
  let chars ""
  let event-list []
  let item-list []
  
  import-drawing "Image-Shannon.jpg" ; put something in environment to look at

  create-texts 1
  [ set color black ; don't make this turtle visible on the screen
    set model (word this-model " Chars")
    set total 0
    set window []
    set wcount 0
    hide-turtle ; make it invisible
  ]
    
  file-open filename

  ;; Read in all the data in the file
  set k 0 ; k is the number of characters
  let ch "null"
  while [not file-at-end?]
  [
    set chars file-read-line
    
    set i 0
    while [i + window-size <= length chars]
      [ set event-list []
        set j window-size
        while [j > 0]
          [ set ch item (i + j - 1) chars
            set item-list (list "char" ch)
            ifelse (event-list = [])
              [ set event-list (list item-list) ]
              [ set event-list fput item-list event-list ]
            set j j - 1
          ]
        set event-list fput (list "model" (word this-model " Chars")) event-list
        
        add-events event-list
        set i i + 1
        set k k + 1
      ]
  ]
  file-close
  
  output-type k
  output-print " characters loaded"
end

to start-new-walker
;; starts a new walker at the root of the training text model

  let this-state one-of states with [stream = "model"]

  create-walkers 1
    [
      set size 0 ; do not make it visible
      set depth 0 ; at root of the tree
      set location this-state ;; start at first state
      move-to location
    ]
end

to update-walkers [this-stream this-event]
;; updates all the walkers by processing the stream's event

  let new-location nobody
  ask walkers
  [
    ask location
      [ set new-location one-of out-path-neighbors with [stream = this-stream and event = this-event]]
        
    if (new-location = nobody)
      [ die ]
    
    move-to new-location
    set location new-location
    set depth depth + 1
  ]
  start-new-walker
end

to kill-walkers
;; kills all the current walkers
  ask walkers [ die ]
end

to load-training-text
;; loads the training text specified by the which-training-text chooser and builds the language model from it.
  ca ; clear everything including the state model and walkers

  let filename which-training-text-filename
  load-training-text-char-events filename max-depth-of-tree + 1 which-training-text
end

to load-testing-text
;; loads the testing text characters from the file specified by the chooser which-text.

  clear-output
  
  set testing-chars ""
  set testing-pos 0 ; reset prediction point from start of text    
  file-open which-testing-text-filename

  ;; Read in all the data in the file
  let ch "null"
  set testing-chars ""
  while [not file-at-end?]
  [
    set testing-chars (word testing-chars " " file-read-line) ; replace eoln with space
  ]
    
  file-close

  kill-walkers  
  start-new-walker
end

to list-predictions
;; lists the model's predictions given the current walker's positions

  let d 0
  let i 0
  let l nobody
  let types 0
  let tokens 0
  let testing-count 0
  let found-char false
  set encoding-list []

  if (output-options = "Show all predictions")
  [ output-print "\nPredictions (at various depths in the training model's tree):"
    output-print "(Each character is listed followed by its frequency of occurrence and"
    output-print "estimated probability)" ]

  foreach sort-by [[depth] of ?1 > [depth] of ?2] walkers with [depth <= max-depth-of-tree]
  [
    set i 0
    set d [depth] of ?
    set l [location] of ?
    
      ; find out the number of types and tokens in first pass
      ; also work out encoding of current character
    set types 0
    set tokens 0
    ask [out-path-neighbors] of l
      [ set types types + 1
        set tokens tokens + event-count
        if (testing-char = event)
          [ set testing-count event-count ]
      ]
    if found-char = false
      [ ifelse (testing-count = 0) ; zero frequency problem
          [ set encoding-list lput (list types (types + tokens)) encoding-list ]
          [ set found-char true
            set encoding-list lput (list testing-count (types + tokens)) encoding-list ]]
        
    if (output-options = "Show all predictions")
    [ output-type "depth " output-type d output-type ":  "
      output-type "(types = " output-type types
      output-type " tokens = " output-type tokens
      output-type " escape probability = " output-type types
      output-type "/" output-type (types + tokens) output-type ")"
    
      ; list all the predictions in the second pass
      foreach sort-by [[event-count] of ?1 > [event-count] of ?2] ([out-path-neighbors] of l)
      [ if (remainder i 5) = 0 [ output-print "" ]
        output-type "  \"" output-type ([event] of ?) output-type "\""
        output-type " [" output-type ([event-count] of ?) output-type "]"
        output-type " (" output-type ([event-count] of ?) output-type "/"
        output-type (types + tokens) output-type ")"
        set i i + 1
      ]
      output-print ""
    ]
  ]

  if found-char = false ; new character never been seen before; order -1 encoding
  [ set encoding-list lput (list 1 256) encoding-list ] ; 256 is size of alphabet
end

to list-encoding
;; lists the encoding for the current character

  let this-entropy 0
  
  if (output-options != "Show total only")
  [ output-type "\nEncodings and entropy for next character in sequence \""
    output-type testing-char
    output-print "\":" ]

  let i 0
  foreach encoding-list
  [ if (output-options != "Show total only")
    [ if (i > 0)
        [ output-type " X" ] ; stands for times (i.e. multiplication)
      output-type " " output-type (first ?)
      output-type "/" output-type (last ?) ]
    set this-entropy this-entropy - log ((first ?) / (last ?)) 2
    set i i + 1
  ]
  set total-entropy total-entropy + this-entropy

  if (output-options != "Show total only")
  [
    output-print ""
    output-type "Cross-Entropy for this character = "
    output-type this-entropy
    output-print " bits"
    output-type "Total Cross-Entropy for all characters so far = "
    output-type total-entropy
    output-print " bits"
  ]
end

to predict-text
;; predicts the next character of the testing text, listing the model's predictions
;; and the encoding/cross-entropy for that character.
  let output-text-width 80 ; width of the output text display before wrapping to next line
  
  clear-output
  output-print "Text so far:"
  output-type "\""
  let p 0
  while [p <= testing-pos]
  [ ifelse (p + output-text-width > testing-pos)
      [ output-type substring testing-chars p testing-pos
        output-print "\"" ]
      [ output-print substring testing-chars p (p + output-text-width) ]
      set p p + output-text-width   
  ]  
  if (testing-pos >= length testing-chars) ; stop once every character has been predicted
    [ user-message "Reached end of testing string!"
      stop
    ]

  set testing-char (item testing-pos testing-chars)
    
  list-predictions  
  list-encoding

  user-message "Press OK to continue for next character:"

  update-walkers "char" (item testing-pos testing-chars)
  set testing-pos testing-pos + 1
end

to-report which-text-filename [which-text]
;; reports the filename associated with the chooser which-text.
  
  let filename ""
  if (which-text = "English Bible")
    [ set filename "Text_english_bible.txt" ]
  if (which-text = "German Bible")
    [ set filename "Text_german_bible.txt" ]
  if (which-text = "Shakespeare's As You Like It")
    [ set filename "Text_shakespeare_as_you_like_it.txt" ]
  if (which-text = "Jane Austen's Pride and Prejudice")
    [ set filename "Text_austen_p_and_p.txt" ]
  if (which-text = "Jane Austen's Sense and Sensibility")
    [ set filename "Text_austen_s_and_s.txt" ]
  report filename
end

to-report which-training-text-filename
;; reports the training text filename.

  report which-text-filename which-training-text
end

to-report which-testing-text-filename
;; reports the testing text filename.

  report which-text-filename which-testing-text
end

to add-events [list-of-events]
;; add events in the list-of-events list to the events tree.
;; each item of the list-of-events list must consist of a two itemed list.
;; e.g. [[hue 0.9] [brightness 0.8]]

  let this-depth 0
  let this-stream ""
  let this-event ""
  let this-state nobody
  let next-state nobody
  let these-states states
  let matching-states []
  let matched-all-so-far true
  
  foreach list-of-events
  [ set this-stream first ?
    set this-event last ?

    ;; check to see if state already exists
    set matching-states these-states with [stream = this-stream and event = this-event and depth = this-depth]
    ifelse (matched-all-so-far = true) and (count matching-states > 0)
      [
        set next-state one-of matching-states
        ask next-state
        [ set event-count event-count + 1 ]
        set these-states [out-path-neighbors] of next-state
      ]
      [ ;; state does not exist - create it
        set matched-all-so-far false
        create-states 1
        [
          set depth this-depth
          set stream this-stream
          set event this-event
          set event-count 1
                      
          set size 0 ; these states are never visualised (there are too many of them)
          set next-state self
        ]
      ]

    if (this-state != nobody)
      [ ask this-state [ create-path-to next-state ]]
         
    ;; go down the tree
    set this-state next-state    
    set this-depth this-depth + 1
  ]
end
;
; Copyright 2010 by William John Teahan.  All rights reserved.
;
; Permission to use, modify or redistribute this model is hereby granted,
; provided that both of the following requirements are followed:
; a) this copyright notice is included.
; b) this model will not be redistributed for profit without permission
;    from William John Teahan.
; Contact William John Teahan for appropriate licenses for redistribution for
; profit.
;
; To refer to this model in publications, please use:
;
; Teahan, W. J. (2010).  Shannon Guessing Game NetLogo model.
;   Artificial Intelligence. Ventus Publishing Aps.
;