Shannon Guessing Game NetLogo Model
Produced for the book series "Artificial Intelligence";
Author: W. J. Teahan; Publisher: Ventus Publishing Aps, Denmark.
view/download model file: Shannon-Guessing-Game.nlogo
WHAT IS IT?
This model shows how a language model can be constructed from some training text and then used to predict text - i.e. play the "Shannon Guessing Game", a game where the agent (human or computer) tries to predict upcoming text, one letter at a time, based on the prior text it has already seen.
Note: There is some possible confusion of the term "model" for this NetLogo "model". NetLogo uses the term "model" to refer to a program/simulation. This can be confused with the use of the term "model" in the phrase "language model" which is a mechanism for assigning a probability to a sequence of symbols by means of a probability distribution. In the information listed below, the two types of models will be explicitly distinguished by using the phrases "NetLogo model" and "language model".
The type of language model used in this NetLogo model is a fixed-order Markov model. According to the Markov property, the probability distribution for the next step depends only on the current state of the system and not on the system's previous states. The language model is also fixed order - that is, the probability is conditioned on a prior context of fixed length.
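A fixed-order character model of this kind can be sketched in a few lines of Python (a hypothetical illustration, not part of the NetLogo model itself): for every fixed-length prior context, count how often each character follows it.

```python
from collections import defaultdict, Counter

def train_fixed_order(text, order):
    """Count next-character frequencies for every context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(order, len(text)):
        context = text[i - order:i]
        counts[context][text[i]] += 1
    return counts

# Train an order-2 model on a short string.
model = train_fixed_order("abracadabra", 2)
# The context "ab" is followed by "r" twice; "ra" by "c" once.
```

Prediction then amounts to looking up the current context and reading off the frequency distribution over next characters.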
HOW IT WORKS
The language model is based on the PPMC compression algorithm devised by John Cleary and Ian Witten. It uses a back-off mechanism called escaping to smooth the probability estimates with lower-order models, in order to overcome the zero frequency problem. This is the problem that occurs when an upcoming character has not been seen before in the current context and therefore has a frequency of zero. Assigning it a zero probability estimate would result in an infinite code length (since -log 0 is infinite), which means that it is impossible to encode the occurrence using that estimate. To overcome this, the estimates are calculated by smoothing with lower-order models using the escape mechanism.
The escape probability is estimated using Method C, devised by Alistair Moffat (hence the name PPMC). This smoothing method is often erroneously called Witten-Bell smoothing, although it was invented by Alistair Moffat. The PPMC language model implemented by this NetLogo model does not implement full exclusions or update exclusions.
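Method C can be stated compactly: in a context where t distinct symbols (the "types") have occurred with n total occurrences (the "tokens"), a previously seen symbol with count f is given probability f / (t + n), and the escape is given probability t / (t + n). A minimal Python sketch (hypothetical names, no exclusions, matching this model's behaviour):

```python
import math

def ppmc_estimate(counts, symbol):
    """Method C: return (numerator, denominator) for a seen symbol,
    or the escape probability fraction if the symbol is unseen."""
    types = len(counts)            # number of distinct symbols seen
    tokens = sum(counts.values())  # total number of occurrences
    if counts.get(symbol, 0) > 0:
        return counts[symbol], types + tokens
    return types, types + tokens   # escape to a lower-order model

# A context that has seen 'a' three times and 'b' once: t = 2, n = 4.
context_counts = {"a": 3, "b": 1}
p_a = ppmc_estimate(context_counts, "a")       # (3, 6), i.e. 1/2
p_escape = ppmc_estimate(context_counts, "z")  # (2, 6), i.e. 1/3
bits_a = -math.log2(p_a[0] / p_a[1])           # 1 bit to encode 'a'
```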
WHAT IS ITS PURPOSE?
The purpose of this NetLogo model is to show how to implement a language model based on the PPMC compression scheme, and then use it to predict text one character at a time.
HOW IT WORKS
The data being modelled is conceptualised as a stream of events (i.e. analogously to Event Stream Processing or ESP) where events occur with a specific order for a specific stream, but at the same time can occur simultaneously on separate streams.
A training text sequence is used to train the language model. This language model is then used to process a test text sequence. Each stream event from the training data is represented by a state turtle agent, and paths between states are represented by path link agents. When the language model is used for prediction, such as when playing the Shannon Guessing Game as in this NetLogo model, then walker turtle agents are used to walk the network of states to determine which of the current contexts are active. These are then used to make the probability estimates. Text turtle agents are used for maintaining information about a particular text such as sums and a variable for storing the language model.
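The context tree that the training phase builds can be sketched in Python (a hypothetical dictionary-based trie; the actual NetLogo model represents nodes as state turtle agents joined by path links): each window of characters from the training text is inserted as a path from the root, incrementing the occurrence count of every node along the way.

```python
class Node:
    """One node in the context tree (cf. the state turtle agents)."""
    def __init__(self):
        self.count = 0
        self.children = {}

def add_events(root, window):
    """Insert one window of characters as a path from the root,
    incrementing the count of every node on the path."""
    node = root
    for ch in window:
        node = node.children.setdefault(ch, Node())
        node.count += 1

root = Node()
text = "abab"
window_size = 3  # i.e. max depth of tree
for i in range(len(text) - window_size + 1):
    add_events(root, text[i:i + window_size])
# Windows inserted: "aba" and "bab".
```

Walking this tree with the characters seen so far yields the set of active contexts, from which the probability estimates at each depth are read off.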
HOW TO USE IT
To use this NetLogo model, perform the following steps:
1. Select a training text to load using the which-training-text slider.
2. Load and build a language model for it using the load-training-text button.
3. Select a testing text to perform the prediction on using the which-testing-text slider.
4. Then predict the testing text one character at a time by pressing the predict-text button.
The model's Interface buttons are defined as follows:
- load-training-text: This loads the training text specified by the which-training-text slider. At the same time, it constructs the language model from it.
- load-testing-text: This loads the testing text specified by the which-testing-text slider.
- predict-text: This predicts the text in the testing text one character at a time.
The model's Interface slider and choosers are defined as follows:
- max-depth-of-tree: This is the maximum depth of the context tree. The order of the model is (max-depth-of-tree - 1).
- output-options: This determines how much output is written to the output box during the prediction phase.
- which-training-text: This slider allows the user to select from a small but eclectic list of natural language texts to use as the data for training the language model.
- which-testing-text: This slider allows the user to select from a small but eclectic list of natural language texts to use as the data on which the language model is tested by predicting each character one after the other.
THINGS TO NOTICE
Notice how the prediction improves as the max-depth-of-tree variable is increased from 1 to 5 or 6, then drops off slightly. However, this depends to a great extent on the training text used to build the language model and how closely it is related to the testing text.
Notice how poor the prediction is when the training and testing texts are from different languages.
Notice also that for the first few characters in the testing text, the prediction is relatively poor. Why is this?
In contrast, notice how well the language model does at predicting the testing text if it has been trained on exactly the same text (i.e. the training and testing texts are identical). Note: This situation seldom arises in real-life prediction, because the testing sequence is usually not known in advance. Note also that even when the model has been trained on the testing text itself, the best that the higher-order models can do for predicting the next character is often only 1/2 or 1 bit to encode it. Why is this? Hint: What would happen if the length of the training text was much longer?
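The cross-entropy figures the model reports can be reproduced by hand: the encoding of one character is a list of fractions (the escape probabilities at the orders where the character was unseen, followed by the character's own estimate at the order where it was found), and its cost in bits is the sum of -log2 of each fraction. A small Python sketch with made-up fractions:

```python
import math

def character_entropy(encoding_list):
    """Bits needed to encode one character: the sum of -log2 of each
    (numerator, denominator) fraction in its encoding list."""
    return sum(-math.log2(num / den) for num, den in encoding_list)

# Hypothetical values: escape from the longest context with probability
# 2/6, then encode the character at the next order down with 3/8.
bits = character_entropy([(2, 6), (3, 8)])  # 1.585 + 1.415 = 3.0 bits
```

Summing this quantity over every character of the testing text gives the total cross-entropy that the model accumulates in total-entropy.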
THINGS TO TRY
Try altering the max-depth-of-tree to see how this affects the prediction.
Try different combinations of training and testing texts.
EXTENDING THE MODEL
Extend the PPMC language model so that it implements full exclusions and update exclusions.
Can you devise further methods to improve the prediction of the language model?
See the Language Modelling NetLogo model and the Cars Guessing Game NetLogo model.
CREDITS AND REFERENCES
This model was created by William John Teahan.
To refer to this model in publications, please use:
Teahan, W. J. (2010). Shannon Guessing Game NetLogo model. Artificial Intelligence. Ventus Publishing Aps.
; Shannon Guessing Game model.
;
; Shows how the Shannon Guessing Game works.
;
; Copyright 2010 William John Teahan. All Rights Reserved.
;

breed [texts text]
breed [states state]
breed [walkers walker]

directed-link-breed [paths path]

states-own [
  depth       ;; depth in the tree
  stream      ;; the name of the stream of sensory or motor events
  event       ;; the sensory or motor event
  event-count ;; the number of times the event has occurred
]

walkers-own [
  location ;; the location of a state the walker is currently visiting
  depth    ;; depth in the tree model
]

texts-own [
  model  ;; the model associated with the text
  total  ;; total of lengths of paths traversed
  window ;; window of recent path lengths
  wcount ;; sum of path lengths stored in the window
]

globals [
  testing-chars ;; the testing string of characters
  testing-char  ;; the current testing character
  testing-pos   ;; position in the testing string
  encoding-list ;; the encoding list for the current testing character
  total-entropy ;; total cross-entropy for the sequence being encoded
]

to load-training-text-char-events [filename window-size this-model]
;; loads the training text character events from the file.
;; window-size is the length of the window (context-size + 1)
;; where window-size = max depth of tree + 1
;; i.e. order of context model = max depth of tree - 1

  let i 0
  let j 0
  let k 0
  let chars ""
  let event-list []
  let item-list []

  import-drawing "Image-Shannon.jpg" ; put something in environment to look at

  create-texts 1 [
    set color black ; don't make this turtle visible on the screen
    set model (word this-model " Chars")
    set total 0
    set window []
    set wcount 0
    hide-turtle ; make it invisible
  ]

  file-open filename

  ;; Read in all the data in the file
  set k 0 ; k is the number of characters
  let ch "null"
  while [not file-at-end?]
  [ set chars file-read-line
    set i 0
    while [i + window-size <= length chars]
    [ set event-list []
      set j window-size
      while [j > 0]
      [ set ch item (i + j - 1) chars
        set item-list (list "char" ch)
        ifelse (event-list = [])
          [ set event-list (list item-list) ]
          [ set event-list fput item-list event-list ]
        set j j - 1
      ]
      set event-list fput (list "model" (word this-model " Chars")) event-list
      add-events event-list
      set i i + 1
      set k k + 1
    ]
  ]
  file-close

  output-type k
  output-print " characters loaded"
end

to start-new-walker
;; starts a new walker at the root of the training text model

  let this-state one-of states with [stream = "model"]

  create-walkers 1 [
    set size 0  ; do not make it visible
    set depth 0 ; at root of the tree
    set location this-state ;; start at first state
    move-to location
  ]
end

to update-walkers [this-stream this-event]
;; updates all the walkers by processing the stream's event

  let new-location nobody
  ask walkers [
    ask location [
      set new-location one-of out-path-neighbors with
        [stream = this-stream and event = this-event] ]
    if (new-location = nobody) [ die ]
    move-to new-location
    set location new-location
    set depth depth + 1
  ]
  start-new-walker
end

to kill-walkers
;; kills all the current walkers

  ask walkers [ die ]
end

to load-training-text
  ca ; clear everything including the state model and walkers

  let filename which-training-text-filename
  load-training-text-char-events filename (max-depth-of-tree + 1) which-training-text
end

to load-testing-text
;; loads the testing text characters from the file specified by the chooser which-text.

  clear-output
  set testing-chars ""
  set testing-pos 0 ; reset prediction point from start of text

  file-open which-testing-text-filename

  ;; Read in all the data in the file
  let ch "null"
  set testing-chars ""
  while [not file-at-end?]
  [ set testing-chars (word testing-chars " " file-read-line) ; replace eoln with space
  ]
  file-close

  kill-walkers
  start-new-walker
end

to list-predictions
;; lists the model's predictions given the current walker's positions

  let d 0
  let i 0
  let l nobody
  let types 0
  let tokens 0
  let testing-count 0
  let found-char false

  set encoding-list []

  if (output-options = "Show all predictions")
  [ output-print "\nPredictions (at various depths in the training model's tree):"
    output-print "(Each character is listed followed by its frequency of occurrence and"
    output-print "estimated probability)" ]

  foreach sort-by [[depth] of ?1 > [depth] of ?2] walkers with [depth <= max-depth-of-tree]
  [ set i 0
    set d [depth] of ?
    set l [location] of ?

    ; find out the number of types and tokens in first pass
    ; also work out encoding of current character
    set types 0
    set tokens 0
    ask [out-path-neighbors] of l
    [ set types types + 1
      set tokens tokens + event-count
      if (testing-char = event)
        [ set testing-count event-count ]
    ]
    if found-char = false
    [ ifelse (testing-count = 0) ; zero frequency problem
        [ set encoding-list lput (list types (types + tokens)) encoding-list ]
        [ set found-char true
          set encoding-list lput (list testing-count (types + tokens)) encoding-list ] ]

    if (output-options = "Show all predictions")
    [ output-type "depth " output-type d output-type ": "
      output-type "(types = " output-type types
      output-type " tokens = " output-type tokens
      output-type " escape probability = " output-type types
      output-type "/" output-type (types + tokens) output-type ")"

      ; list all the predictions in the second pass
      foreach sort-by [[event-count] of ?1 > [event-count] of ?2] ([out-path-neighbors] of l)
      [ if (remainder i 5) = 0
          [ output-print "" ]
        output-type " \"" output-type ([event] of ?) output-type "\""
        output-type " [" output-type ([event-count] of ?) output-type "]"
        output-type " (" output-type ([event-count] of ?)
        output-type "/" output-type (types + tokens) output-type ")"
        set i i + 1
      ]
      output-print ""
    ]
  ]

  if found-char = false ; new character never been seen before; order -1 encoding
    [ set encoding-list lput (list 1 256) encoding-list ] ; 256 is size of alphabet
end

to list-encoding
;; lists the encoding for the current character

  let this-entropy 0

  if (output-options != "Show total only")
  [ output-type "\nEncodings and entropy for next character in sequence \""
    output-type testing-char
    output-print "\":" ]

  let i 0
  foreach encoding-list
  [ if (output-options != "Show total only")
    [ if (i > 0)
        [ output-type " X" ] ; stands for times (i.e. multiplication)
      output-type " "
      output-type (first ?)
      output-type "/"
      output-type (last ?) ]
    set this-entropy this-entropy - log ((first ?) / (last ?)) 2
    set i i + 1
  ]
  set total-entropy total-entropy + this-entropy

  if (output-options != "Show total only")
  [ output-print ""
    output-type "Cross-Entropy for this character = "
    output-type this-entropy
    output-print " bits"
    output-type "Total Cross-Entropy for all characters so far = "
    output-type total-entropy
    output-print " bits" ]
end

to predict-text
  let output-text-width 80 ; width of the output text display before wrapping to next line

  clear-output
  output-print "Text so far:"
  output-type "\""
  let p 0
  while [p <= testing-pos]
  [ ifelse (p + output-text-width > testing-pos)
      [ output-type substring testing-chars p testing-pos
        output-print "\"" ]
      [ output-print substring testing-chars p (p + output-text-width) ]
    set p p + output-text-width
  ]

  if (testing-pos > length testing-chars)
  [ user-message "Reached end of testing string!"
    stop ]

  set testing-char (item testing-pos testing-chars)
  list-predictions
  list-encoding

  user-message "Press OK to continue for next character:"

  update-walkers "char" (item testing-pos testing-chars)
  set testing-pos testing-pos + 1
end

to-report which-text-filename [which-text]
;; reports the filename associated with the chooser which-text.

  let filename ""

  if (which-text = "English Bible")
    [ set filename "Text_english_bible.txt" ]
  if (which-text = "German Bible")
    [ set filename "Text_german_bible.txt" ]
  if (which-text = "Shakespeare's As You Like It")
    [ set filename "Text_shakespeare_as_you_like_it.txt" ]
  if (which-text = "Jane Austen's Pride and Prejudice")
    [ set filename "Text_austen_p_and_p.txt" ]
  if (which-text = "Jane Austen's Sense and Sensibility")
    [ set filename "Text_austen_s_and_s.txt" ]

  report filename
end

to-report which-training-text-filename
;; reports the training text filename.

  report which-text-filename which-training-text
end

to-report which-testing-text-filename
;; reports the testing text filename.

  report which-text-filename which-testing-text
end

to add-events [list-of-events]
;; add events in the list-of-events list to the events tree.
;; each item of the list-of-events list must consist of a two itemed list.
;; e.g. [[hue 0.9] [brightness 0.8]]

  let this-depth 0
  let this-stream ""
  let this-event ""
  let this-state nobody
  let next-state nobody
  let these-states states
  let matching-states []
  let matched-all-so-far true

  foreach list-of-events
  [ set this-stream first ?
    set this-event last ?

    ;; check to see if state already exists
    set matching-states these-states with
      [stream = this-stream and event = this-event and depth = this-depth]

    ifelse (matched-all-so-far = true) and (count matching-states > 0)
    [ set next-state one-of matching-states
      ask next-state [ set event-count event-count + 1 ]
      set these-states [out-path-neighbors] of next-state ]
    [ ;; state does not exist - create it
      set matched-all-so-far false
      create-states 1
      [ set depth this-depth
        set stream this-stream
        set event this-event
        set event-count 1
        set size 0 ; these states are never visualised (there are too many of them)
        set next-state self ]
    ]
    if (this-state != nobody)
      [ ask this-state [ create-path-to next-state ] ]

    ;; go down the tree
    set this-state next-state
    set this-depth this-depth + 1
  ]
end

;
; Copyright 2010 by William John Teahan. All rights reserved.
;
; Permission to use, modify or redistribute this model is hereby granted,
; provided that both of the following requirements are followed:
; a) this copyright notice is included.
; b) this model will not be redistributed for profit without permission
;    from William John Teahan.
; Contact William John Teahan for appropriate licenses for redistribution for
; profit.
;
; To refer to this model in publications, please use:
;
; Teahan, W. J. (2010). Shannon Guessing Game NetLogo model.
;   Artificial Intelligence. Ventus Publishing Aps.
;