Friday, 16 December 2016

predicting sequences

In today's post we are going to be predicting the parent sequence given a subsequence. This is a nice addition to the other tools we have to work with sequences, and one I've been thinking about implementing for a while now. The subsequence can either be exact in which case it will only match the parent sequence if it is a perfect subsequence, or the version we use in this post where the subsequence can skip a couple of elements and still predict the right parent sequence. Here we just consider sequences of letters, and then later words, but the back-end is general enough that it should apply to sequences of many types of objects.

Let's jump into an example. Consider these two sequences (defined using the labor saving gm notation mentioned in my last post):
a = {A.B.C.D.E.F.G}
b = {U.V.W.B.C.D.X.Y.Z}
Then convert that to sw and load into the console:
$ ./gm2sw-v2.py gm-examples/simple-sequences.gm > sw-examples/simple-sequences.sw
$ ./the_semantic_db_console.py
Welcome!

sa: info off
sa: load simple-sequences.sw
Here is what our two sequences expand to:
full |range> => range(|1>,|2048>)
encode |end of sequence> => pick[10] full |range>

-- encode words:
encode |A> => pick[10] full |range>
encode |B> => pick[10] full |range>
encode |C> => pick[10] full |range>
encode |D> => pick[10] full |range>
encode |E> => pick[10] full |range>
encode |F> => pick[10] full |range>
encode |G> => pick[10] full |range>
encode |U> => pick[10] full |range>
encode |V> => pick[10] full |range>
encode |W> => pick[10] full |range>
encode |X> => pick[10] full |range>
encode |Y> => pick[10] full |range>
encode |Z> => pick[10] full |range>

-- encode classes:
encode |a> => pick[10] full |range>
encode |b> => pick[10] full |range>

-- encode sequence names:

-- encode low level sequences:
-- empty sequence
pattern |node 0: 0> => random-column[10] encode |end of sequence>

-- A . B . C . D . E . F . G
pattern |node 1: 0> => random-column[10] encode |A>
then |node 1: 0> => random-column[10] encode |B>

pattern |node 1: 1> => then |node 1: 0>
then |node 1: 1> => random-column[10] encode |C>

pattern |node 1: 2> => then |node 1: 1>
then |node 1: 2> => random-column[10] encode |D>

pattern |node 1: 3> => then |node 1: 2>
then |node 1: 3> => random-column[10] encode |E>

pattern |node 1: 4> => then |node 1: 3>
then |node 1: 4> => random-column[10] encode |F>

pattern |node 1: 5> => then |node 1: 4>
then |node 1: 5> => random-column[10] encode |G>

pattern |node 1: 6> => then |node 1: 5>
then |node 1: 6> => random-column[10] encode |end of sequence>

-- U . V . W . B . C . D . X . Y . Z
pattern |node 2: 0> => random-column[10] encode |U>
then |node 2: 0> => random-column[10] encode |V>

pattern |node 2: 1> => then |node 2: 0>
then |node 2: 1> => random-column[10] encode |W>

pattern |node 2: 2> => then |node 2: 1>
then |node 2: 2> => random-column[10] encode |B>

pattern |node 2: 3> => then |node 2: 2>
then |node 2: 3> => random-column[10] encode |C>

pattern |node 2: 4> => then |node 2: 3>
then |node 2: 4> => random-column[10] encode |D>

pattern |node 2: 5> => then |node 2: 4>
then |node 2: 5> => random-column[10] encode |X>

pattern |node 2: 6> => then |node 2: 5>
then |node 2: 6> => random-column[10] encode |Y>

pattern |node 2: 7> => then |node 2: 6>
then |node 2: 7> => random-column[10] encode |Z>

pattern |node 2: 8> => then |node 2: 7>
then |node 2: 8> => random-column[10] encode |end of sequence>
Now put it to use. First up, given the first element in the sequence, predict the rest of the parent sequence:
sa: next |A>
incoming_sequence: ['A']
|B . C . D . E . F . G>
|B>

sa: next |U>
incoming_sequence: ['U']
|V . W . B . C . D . X . Y . Z>
|V>
The first letters in our sequences are distinct, so our code has no trouble finding a unique parent sequence. Note that our code also returns the list of elements that are only one step ahead of the current position, in this case |B> and |V>. Now, what if we give it a non-unqiue subsequence?
sa: next |B.C>
incoming_sequence: ['B', 'C']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
|D . E . F . G>
|D . X . Y . Z>
2|D>
So, B is found at |node 1: 1> and |node 2: 3> in our stored sequences, and B.C at |node 1: 2> and |node 2: 3>, resulting in two matching parent sequences: |D . E . F . G> and |D . X . Y . Z>, and a one step ahead prediction of |D>. Next we include 'D' and see it is still ambiguous:
sa: next |B.C.D>
incoming_sequence: ['B', 'C', 'D']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
|E . F . G>
|X . Y . Z>
|E> + |X>
And since we don't uniquely know which sequence B.C.D belongs to, the one step ahead prediction is for an E or a X. But if we then prepend an A or a W, we again have unique parent sequences:
sa: next |A.B.C>
incoming_sequence: ['A', 'B', 'C']
nodes 1: 0.1|node 1: 0>
intersected nodes: 0.1|node 1: 1>
nodes 1: 0.1|node 1: 1>
intersected nodes: 0.1|node 1: 2>
|D . E . F . G>
|D>

sa: next |W.B.C>
incoming_sequence: ['W', 'B', 'C']
nodes 1: 0.1|node 2: 2>
intersected nodes: 0.1|node 2: 3>
nodes 1: 0.1|node 2: 3>
intersected nodes: 0.1|node 2: 4>
|D . X . Y . Z>
|D>
Or another example:
sa: next |B.C.D.E>
incoming_sequence: ['B', 'C', 'D', 'E']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
nodes 1: 0.1|node 1: 3> + 0.1|node 2: 5>
intersected nodes: 0.1|node 1: 4>
|F . G>
|F>

sa: next |B.C.D.X>
incoming_sequence: ['B', 'C', 'D', 'X']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
nodes 1: 0.1|node 1: 3> + 0.1|node 2: 5>
intersected nodes: 0.1|node 2: 6>
|Y . Z>
|Y>
So it all works as desired. Here is a quick demonstration where we skip a couple of sequence elements, as might happen in a noisy room, in this case C.D, and it still works:
sa: next |B.E>
incoming_sequence: ['B', 'E']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 4>
|F . G>
|F>

sa: next |B.X>
incoming_sequence: ['B', 'X']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 2: 6>
|Y . Z>
|Y>
That's the basics. Subsequences predicting parent sequences, with tolerance for noisy omission of elements. Now apply it to simple sentences encoded as sequences. Consider this knowledge:
$ cat gm-examples/george.gm
A = {george.is.27.years.old}
B = {the.mother.of.george.is.jane}
C = {the.father.of.george.is.frank}
D = {the.sibling.of.george.is.liz}
E = {jane.is.47.years.old}
F = {frank.is.50.years.old}
G = {liz.is.29.years.old}
H = {the.age.of.george.is.27}
I = {the.age.of.jane.is.47}
J = {the.age.of.frank.is.50}
K = {the.age.of.liz.is.29}
L = {the.mother.of.liz.is.jane}
M = {the.father.of.liz.is.frank}
N = {the.sibling.of.liz.is.george}
Process it as usual:
$ ./gm2sw-v2.py gm-examples/george.gm > sw-examples/george-gm.sw
$ ./the_semantic_db_console.py
sa: load george-gm.sw
And first consider what sequences follow "the":
sa: next |the>
incoming_sequence: ['the']
|mother . of . george . is . jane>
|father . of . george . is . frank>
|sibling . of . george . is . liz>
|age . of . george . is . 27>
|age . of . jane . is . 47>
|age . of . frank . is . 50>
|age . of . liz . is . 29>
|mother . of . liz . is . jane>
|father . of . liz . is . frank>
|sibling . of . liz . is . george>
2|mother> + 2|father> + 2|sibling> + 4|age>
And these simple sentences enable us to ask simple questions:
sa: next |the.mother.of>
incoming_sequence: ['the', 'mother', 'of']
nodes 1: 0.1|node 2: 0> + 0.1|node 3: 0> + 0.1|node 4: 0> + 0.1|node 8: 0> + 0.1|node 9: 0> + 0.1|node 10: 0> + 0.1|node 11: 0> + 0.1|node 12: 0> + 0.1|node 13: 0> + 0.1|node 14: 0>
intersected nodes: 0.1|node 2: 1> + 0.1|node 12: 1>
nodes 1: 0.1|node 2: 1> + 0.1|node 12: 1>
intersected nodes: 0.1|node 2: 2> + 0.1|node 12: 2>
|george . is . jane>
|liz . is . jane>
|george> + |liz>

sa: next |the.age.of>
incoming_sequence: ['the', 'age', 'of']
nodes 1: 0.1|node 2: 0> + 0.1|node 3: 0> + 0.1|node 4: 0> + 0.1|node 8: 0> + 0.1|node 9: 0> + 0.1|node 10: 0> + 0.1|node 11: 0> + 0.1|node 12: 0> + 0.1|node 13: 0> + 0.1|node 14: 0>
intersected nodes: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
nodes 1: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
intersected nodes: 0.1|node 8: 2> + 0.1|node 9: 2> + 0.1|node 10: 2> + 0.1|node 11: 2>
|george . is . 27>
|jane . is . 47>
|frank . is . 50>
|liz . is . 29>
|george> + |jane> + |frank> + |liz>

sa: next |sibling.of>
incoming_sequence: ['sibling', 'of']
nodes 1: 0.1|node 4: 1> + 0.1|node 14: 1>
intersected nodes: 0.1|node 4: 2> + 0.1|node 14: 2>
|george . is . liz>
|liz . is . george>
|george> + |liz>
Or making use of sequence element skipping (in the current code up to 3 sequence elements), we can ask more compact questions:
sa: next |father.george>
incoming_sequence: ['father', 'george']
nodes 1: 0.1|node 3: 1> + 0.1|node 13: 1>
intersected nodes: 0.1|node 3: 3>
|is . frank>
|is>

sa: next |age.george>
incoming_sequence: ['age', 'george']
nodes 1: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
intersected nodes: 0.1|node 8: 3>
|is . 27>
|is>
Obviously the brain stores knowledge about the world using more than just rote sentences (unless you are bad at studying for exams), but I think it is not a bad first step. Who knows, maybe very young children do just store simple sequences, without "decorations"? Certainly in adults music knowledge of lyrics and notes feels to be simple sequences. But we still don't have a good definition of what it means to understand something. To me it feels like some well constructed network. ie understanding something means it is thoroughly interlinked with related existing knowledge. But how do you code that?

Finally, an important point to make is the above is only interesting in that I'm doing it in a proposed brain like way. Using grep it is trivial to find subsequences of parent sequences. For example:
$ grep "father.*george" george.gm
C = {the.father.of.george.is.frank}

$ grep "the.*age.*of" george.gm
H = {the.age.of.george.is.27}
I = {the.age.of.jane.is.47}
J = {the.age.of.frank.is.50}
K = {the.age.of.liz.is.29}
And the way we represent our high order sequences has a lot of similarity to linked lists.

BTW, I should mention I tried the strict version of the next operator on the spelling dictionary example, using the subsequence f.r, resulting in this prediction for the next letter:
173|e> + 157|a> + 120|o> + 114|i> + 49|u> + 8|y> + 2|.> + |t>
So pretty much just vowels.

Next post, learning and recalling a sequence of random frames.

No comments:

Post a Comment