Shannon Guessing Game NetLogo Model
Produced for the book series "Artificial Intelligence";
Author: W. J. Teahan; Publisher: Ventus Publishing Aps, Denmark.
powered by NetLogo
view/download model file: Shannon-Guessing-Game.nlogo
WHAT IS IT?
This model shows how a language model can be constructed from some training text and then used to predict text - i.e. play the "Shannon Guessing Game", a game where the agent (human or computer) tries to predict upcoming text, one letter at a time, based on the prior text it has already seen.
Note: There is some possible confusion of the term "model" for this NetLogo "model". NetLogo uses the term "model" to refer to a program/simulation. This can be confused with the use of the term "model" in the phrase "language model" which is a mechanism for assigning a probability to a sequence of symbols by means of a probability distribution. In the information listed below, the two types of models will be explicitly distinguished by using the phrases "NetLogo model" and "language model".
The type of language model used in this NetLogo model is a fixed order Markov model - that is, according to the Markov property, the probability distribution for the next step and future steps depends only on the current state of the system and does not take into consideration the system's previous state. The language model is also fixed order - that is, the probability is conditioned on a fixed length prior context.
HOW IT WORKS
The language model is based on the PPMC compression algorithm devised by John Cleary and Ian Witten. It uses a back-off mechanism called escaping, to smooth the probability estimations with lower order models in order to overcome the zero frequency problem. This is the problem that occurs when an upcoming character has not been seen before in the context, and therefore has zero frequency. Assigning a zero probability estimate will result in an infinite code length (since log 0 = infinity) which means that it is impossible to encode the occurrence using this estimate. To overcome this, the estimates are calculated by smoothing with lower order models using the escape mechanism.
The escape probability is estimated using Method C devised by Alistair Moffat (hence why it is called PPMC). The smoothing method is often erroneously called Witten-Bell smoothing although it was invented by Alistair Moffat. The PPMC language model implemented by this NetLogo model does not implement full exclusions or update exclusions.
WHAT IS ITS PURPOSE?
The purpose of this NetLogo model is show how to implement a language model based on the PPMC compression scheme, and then use it to predict text one character at a time.
HOW IT WORKS
The data being modelled is conceptualised as a stream of events (i.e. analogously to Event Stream Processing or ESP) where events occur with a specific order for a specific stream, but at the same time can occur simultaneously on separate streams.
A training text sequence is used to train the language model. This language model is then used to process a test text sequence. Each stream event from the training data is represented by a state turtle agent, and paths between states are represented by path link agents. When the language model is used for prediction, such as when playing the Shannon Guessing Game as in this NetLogo model, then walker turtle agents are used to walk the network of states to determine which of the current contexts are active. These are then used to make the probability estimates. Text turtle agents are used for maintaining information about a particular text such as sums and a variable for storing the language model.
HOW TO USE IT
To use this NetLogo model, perform the following steps:
1. Select a training text to load using the which-training-text slider.
2. Load and build a language model for it using the load-training-text button.
3. Select a testing text to perform the prediction on using the which-testing-text slider.
4. Then predict the testing text one character at a time by pressing the predict-text button.
THE INTERFACE
The model's Interface buttons are defined as follows:
- load-training-text: This loads the training text specified by the which-training-text slider. At the same time, it constructs the language model from it.
- load-testing-text: This loads the testing text specified by the which-testing-text slider.
- predict-text: This predicts the text in the testing text one character at a time.
The model's Interface slider and choosers are defined as follows:
- max-depth-of-tree: This is the maximum depth of the context tree. The order of the model is (max-depth-of-tree - 1).
- output-options: This determines how much output is written to the output box during the prediction phase.
- which-training-text: This slider allows the user to select from a small but eclectic list of natural language texts to use as the data for training the language model.
- which-testing-text: This slider allows the user to select from a small but eclectic list of natural language texts to use as the data on which the language model is tested by predicting each character one after the other.
THINGS TO NOTICE
Notice how the prediction improves as the max-depth-of-tree variable is increased from 1 to 5 or 6, then drops off slightly. However, this depends to a great extent on the training text used to build the language model and how closely it is related to the testing text.
Notice how poor the prediction is when the training and testing texts are from different languages.
Notice also that for the first few characters in the testing text, the prediction is relatively poor. Why is this?
In contrast, notice how well the language model does at predicting the testing text if it is trainedStream on exactly the same text (i.e. the training and texts are exactly the same). Note: This situation seldomly appears in real life prediction because usually the testing sequence is completely unpredictable. Note also that the best the language model for higher order models can do for predicting the next character is often only 1/2 or 1 bit to encode even in the situation where it has been trained on the testing text. Why is this? Hint: What would happen if the length of the training text was much longer?
THINGS TO TRY
Try altering the max-depth-of-tree to see how this affects the prediction.
Try different combinations of training and testing texts.
EXTENDING THE MODEL
Extend the PPMC language model so that it implements full exclusions and update exclusions.
Can you devise further methods to improve the prediction of the language model?
RELATED MODELS
See the Language Modelling NetLogo model and the Cars Guessing Game NetLogo model.
CREDITS AND REFERENCES
This model was created by William John Teahan.
To refer to this model in publications, please use:
Shannon Guessing Game NetLogo model.
Teahan, W. J. (2010). Artificial Intelligence. Ventus Publishing Aps.
PROCEDURES
; Shannon Guessing Game model.
;
; Shows how the Shannon Guessing Game works.
;
; Copyright 2010 William John Teahan. All Rights Reserved.
;
breed [texts text]
breed [states state]
breed [walkers walker]
directed-link-breed [paths path]
states-own
[ depth ;; depth in the tree
stream ;; the name of the stream of sensory or motor events
event ;; the sensory or motor event
event-count ;; the number of times te event has occurred
]
walkers-own
[ location ;; the location of a state the walker is currently visiting
depth ;; depth in the tree model
]
texts-own
[ model ;; the model associated with the text
total ;; total of lengths of paths traversed
window ;; window of recent path lengths
wcount ;; sum of path lengths stored in the window
]
globals
[ testing-chars ;; the testing string of characters
testing-char ;; the current testing character
testing-pos ;; position in the testing string
encoding-list ;; the encoding list for the current testing character
total-entropy ;; total cross-entropy for the sequence being encoded
]
to load-training-text-char-events [filename window-size this-model]
;; loads the training text character events from the file.
;; window-size is the length of the window (context-size + 1)
;; where window-size = max depth of tree + 1
;; i.e. order of context model = max depth of tree - 1
let i 0
let j 0
let k 0
let chars ""
let event-list []
let item-list []
import-drawing "Image-Shannon.jpg" ; put something in environment to look at
create-texts 1
[ set color black ; don't make this turtle visible on the screen
set model (word this-model " Chars")
set total 0
set window []
set wcount 0
hide-turtle ; make it invisible
]
file-open filename
;; Read in all the data in the file
set k 0 ; k is the number of characters
let ch "null"
while [not file-at-end?]
[
set chars file-read-line
set i 0
while [i + window-size <= length chars]
[ set event-list []
set j window-size
while [j > 0]
[ set ch item (i + j - 1) chars
set item-list (list "char" ch)
ifelse (event-list = [])
[ set event-list (list item-list) ]
[ set event-list fput item-list event-list ]
set j j - 1
]
set event-list fput (list "model" (word this-model " Chars")) event-list
add-events event-list
set i i + 1
set k k + 1
]
]
file-close
output-type k
output-print " characters loaded"
end
to start-new-walker
;; starts a new walker at the root of the training text model
let this-state one-of states with [stream = "model"]
create-walkers 1
[
set size 0 ; do not make it visible
set depth 0 ; at root of the tree
set location this-state ;; start at first state
move-to location
]
end
to update-walkers [this-stream this-event]
;; updates all the walkers by processing the stream's event
let new-location nobody
ask walkers
[
ask location
[ set new-location one-of out-path-neighbors with [stream = this-stream and event = this-event]]
if (new-location = nobody)
[ die ]
move-to new-location
set location new-location
set depth depth + 1
]
start-new-walker
end
to kill-walkers
;; kills all the current walkers
ask walkers [ die ]
end
to load-training-text
ca ; clear everything including the state model and walkers
let filename which-training-text-filename
load-training-text-char-events filename max-depth-of-tree + 1 which-training-text
end
to load-testing-text
;; loads the testing text characters from the file specified by the chooser which-text.
clear-output
set testing-chars ""
set testing-pos 0 ; reset prediction point from start of text
file-open which-testing-text-filename
;; Read in all the data in the file
let ch "null"
set testing-chars ""
while [not file-at-end?]
[
set testing-chars (word testing-chars " " file-read-line) ; replace eoln with space
]
file-close
kill-walkers
start-new-walker
end
to list-predictions
;; lists the model's predictions given the current walker's positions
let d 0
let i 0
let l nobody
let types 0
let tokens 0
let testing-count 0
let found-char false
set encoding-list []
if (output-options = "Show all predictions")
[ output-print "\nPredictions (at various depths in the training model's tree):"
output-print "(Each character is listed followed by its frequency of occurrence and"
output-print "estimated probability)" ]
foreach sort-by [[depth] of ?1 > [depth] of ?2] walkers with [depth <= max-depth-of-tree]
[
set i 0
set d [depth] of ?
set l [location] of ?
; find out the number of types and tokens in first pass
; also work out encoding of current character
set types 0
set tokens 0
ask [out-path-neighbors] of l
[ set types types + 1
set tokens tokens + event-count
if (testing-char = event)
[ set testing-count event-count ]
]
if found-char = false
[ ifelse (testing-count = 0) ; zero frequency problem
[ set encoding-list lput (list types (types + tokens)) encoding-list ]
[ set found-char true
set encoding-list lput (list testing-count (types + tokens)) encoding-list ]]
if (output-options = "Show all predictions")
[ output-type "depth " output-type d output-type ": "
output-type "(types = " output-type types
output-type " tokens = " output-type tokens
output-type " escape probability = " output-type types
output-type "/" output-type (types + tokens) output-type ")"
; list all the predictions in the second pass
foreach sort-by [[event-count] of ?1 > [event-count] of ?2] ([out-path-neighbors] of l)
[ if (remainder i 5) = 0 [ output-print "" ]
output-type " \"" output-type ([event] of ?) output-type "\""
output-type " [" output-type ([event-count] of ?) output-type "]"
output-type " (" output-type ([event-count] of ?) output-type "/"
output-type (types + tokens) output-type ")"
set i i + 1
]
output-print ""
]
]
if found-char = false ; new character never been seen before; order -1 encoding
[ set encoding-list lput (list 1 256) encoding-list ] ; 256 is size of alphabet
end
to list-encoding
;; lists the encoding for the current character
let this-entropy 0
if (output-options != "Show total only")
[ output-type "\nEncodings and entropy for next character in sequence \""
output-type testing-char
output-print "\":" ]
let i 0
foreach encoding-list
[ if (output-options != "Show total only")
[ if (i > 0)
[ output-type " X" ] ; stands for times (i.e. multiplication)
output-type " " output-type (first ?)
output-type "/" output-type (last ?) ]
set this-entropy this-entropy - log ((first ?) / (last ?)) 2
set i i + 1
]
set total-entropy total-entropy + this-entropy
if (output-options != "Show total only")
[
output-print ""
output-type "Cross-Entropy for this character = "
output-type this-entropy
output-print " bits"
output-type "Total Cross-Entropy for all characters so far = "
output-type total-entropy
output-print " bits"
]
end
to predict-text
let output-text-width 80 ; width of the output text display before wrapping to next line
clear-output
output-print "Text so far:"
output-type "\""
let p 0
while [p <= testing-pos]
[ ifelse (p + output-text-width > testing-pos)
[ output-type substring testing-chars p testing-pos
output-print "\"" ]
[ output-print substring testing-chars p (p + output-text-width) ]
set p p + output-text-width
]
if (testing-pos > length testing-chars)
[ user-message "Reached end of testing string!"
stop
]
set testing-char (item testing-pos testing-chars)
list-predictions
list-encoding
user-message "Press OK to continue for next character:"
update-walkers "char" (item testing-pos testing-chars)
set testing-pos testing-pos + 1
end
to-report which-text-filename [which-text]
;; reports the filename associated with the chooser which-text.
let filename ""
if (which-text = "English Bible")
[ set filename "Text_english_bible.txt" ]
if (which-text = "German Bible")
[ set filename "Text_german_bible.txt" ]
if (which-text = "Shakespeare's As You Like It")
[ set filename "Text_shakespeare_as_you_like_it.txt" ]
if (which-text = "Jane Austen's Pride and Prejudice")
[ set filename "Text_austen_p_and_p.txt" ]
if (which-text = "Jane Austen's Sense and Sensibility")
[ set filename "Text_austen_s_and_s.txt" ]
report filename
end
to-report which-training-text-filename
;; reports the training text filename.
report which-text-filename which-training-text
end
to-report which-testing-text-filename
;; reports the testing text filename.
report which-text-filename which-testing-text
end
to add-events [list-of-events]
;; add events in the list-of-events list to the events tree.
;; each item of the list-of-events list must consist of a two itemed list.
;; e.g. [[hue 0.9] [brightness 0.8]]
let this-depth 0
let this-stream ""
let this-event ""
let this-state nobody
let next-state nobody
let these-states states
let matching-states []
let matched-all-so-far true
foreach list-of-events
[ set this-stream first ?
set this-event last ?
;; check to see if state already exists
set matching-states these-states with [stream = this-stream and event = this-event and depth = this-depth]
ifelse (matched-all-so-far = true) and (count matching-states > 0)
[
set next-state one-of matching-states
ask next-state
[ set event-count event-count + 1 ]
set these-states [out-path-neighbors] of next-state
]
[ ;; state does not exist - create it
set matched-all-so-far false
create-states 1
[
set depth this-depth
set stream this-stream
set event this-event
set event-count 1
set size 0 ; these states are never visualised (thereare too many of them)
set next-state self
]
]
if (this-state != nobody)
[ ask this-state [ create-path-to next-state ]]
;; go down the tree
set this-state next-state
set this-depth this-depth + 1
]
end
;
; Copyright 2010 by William John Teahan. All rights reserved.
;
; Permission to use, modify or redistribute this model is hereby granted,
; provided that both of the following requirements are followed:
; a) this copyright notice is included.
; b) this model will not be redistributed for profit without permission
; from William John Teahan.
; Contact William John Teahan for appropriate licenses for redistribution for
; profit.
;
; To refer to this model in publications, please use:
;
; Teahan, W. J. (2010). Shannon Guessing Game NetLogo model.
; Artificial Intelligence. Ventus Publishing Aps.
;
