Saturday, 5 November 2016

learning and recalling chunked sequences

So, it is very common (universal?) for people to chunk difficult to recall, or long, sequences. Perhaps a password, the alphabet, or digits of pi. So I thought it would be useful to implement this idea in my notation, and as a sort of extension to learning sequences in my last post. The idea is simple enough, instead of learning a single long sequence, break the sequence into chunks, and then learn their respective sub-sequences. Here is how my brain chunks the alphabet and pi, though other people will have different chunking sizes: (ABC)(DEF)(GHI)... and (3.14)(15)(92)(65)(35)... Giving this collection of sequences:
alphabet: ABC, DEF, GHI, ...
ABC: A, B, C
DEF: D, E, F
GHI: G, H, I
...

pi: 3.14, 15, 92, 65, 35, ...
3.14: 3, ., 1, 4,
15: 1, 5
92: 9, 2
65: 6, 5
35: 3, 5
...
Given we already know how to learn sequences, this is easy to learn. Here is the code (using a constant chunk size of 3), here is the knowledge before learning, and after. I guess I should show a little of what that looks like. First the random (though in other uses, it would be preferable to use a more semantically similar encoding) encode stage:
full |range> => range(|1>,|2048>)
encode |end of sequence> => pick[10] full |range>
encode |A> => pick[10] full |range>
encode |B> => pick[10] full |range>
encode |C> => pick[10] full |range>
encode |D> => pick[10] full |range>
encode |E> => pick[10] full |range>
encode |F> => pick[10] full |range>
encode |G> => pick[10] full |range>
encode |H> => pick[10] full |range>
encode |I> => pick[10] full |range>
encode |J> => pick[10] full |range>
encode |K> => pick[10] full |range>
encode |L> => pick[10] full |range>
encode |M> => pick[10] full |range>
encode |N> => pick[10] full |range>
encode |O> => pick[10] full |range>
encode |P> => pick[10] full |range>
encode |Q> => pick[10] full |range>
encode |R> => pick[10] full |range>
encode |S> => pick[10] full |range>
encode |T> => pick[10] full |range>
encode |U> => pick[10] full |range>
encode |V> => pick[10] full |range>
encode |W> => pick[10] full |range>
encode |X> => pick[10] full |range>
encode |Y> => pick[10] full |range>
encode |Z> => pick[10] full |range>
encode |A B C> => pick[10] full |range>
encode |D E F> => pick[10] full |range>
encode |G H I> => pick[10] full |range>
encode |J K L> => pick[10] full |range>
encode |M N O> => pick[10] full |range>
encode |P Q R> => pick[10] full |range>
encode |S T U> => pick[10] full |range>
encode |V W X> => pick[10] full |range>
encode |Y Z> => pick[10] full |range>
encode |3> => pick[10] full |range>
encode |.> => pick[10] full |range>
encode |1> => pick[10] full |range>
encode |4> => pick[10] full |range>
encode |5> => pick[10] full |range>
encode |9> => pick[10] full |range>
encode |2> => pick[10] full |range>
encode |6> => pick[10] full |range>
encode |8> => pick[10] full |range>
encode |7> => pick[10] full |range>
encode |3 . 1> => pick[10] full |range>
encode |4 1 5> => pick[10] full |range>
encode |9 2 6> => pick[10] full |range>
encode |5 3 5> => pick[10] full |range>
encode |8 9 7> => pick[10] full |range>
encode |9 3 2> => pick[10] full |range>
encode |3 8 4> => pick[10] full |range>
The main thing to note here is that we are not just learning encodings for single symbols eg |A> or |3>, but also for chunks of symbols too. eg |A B C> and |3 . 1>. And in general, we can do similar encodings for anything we want to stuff into a ket. Once we have encodings for our objects we can learn their sequences. Here are a couple of them:
-- alphabet
-- A B C, D E F, G H I, J K L, M N O, P Q R, S T U, V W X, Y Z
start-node |alphabet> => random-column[10] encode |A B C>
pattern |node 0: 0> => start-node |alphabet>
then |node 0: 0> => random-column[10] encode |D E F>

pattern |node 0: 1> => then |node 0: 0>
then |node 0: 1> => random-column[10] encode |G H I>

pattern |node 0: 2> => then |node 0: 1>
then |node 0: 2> => random-column[10] encode |J K L>

pattern |node 0: 3> => then |node 0: 2>
then |node 0: 3> => random-column[10] encode |M N O>

pattern |node 0: 4> => then |node 0: 3>
then |node 0: 4> => random-column[10] encode |P Q R>

pattern |node 0: 5> => then |node 0: 4>
then |node 0: 5> => random-column[10] encode |S T U>

pattern |node 0: 6> => then |node 0: 5>
then |node 0: 6> => random-column[10] encode |V W X>

pattern |node 0: 7> => then |node 0: 6>
then |node 0: 7> => random-column[10] encode |Y Z>

pattern |node 0: 8> => then |node 0: 7>
then |node 0: 8> => append-column[10] encode |end of sequence>


-- A B C
-- A, B, C
start-node |A B C> => random-column[10] encode |A>
pattern |node 1: 0> => start-node |A B C>
then |node 1: 0> => random-column[10] encode |B>

pattern |node 1: 1> => then |node 1: 0>
then |node 1: 1> => random-column[10] encode |C>

pattern |node 1: 2> => then |node 1: 1>
then |node 1: 2> => append-column[10] encode |end of sequence>


-- D E F
-- D, E, F
start-node |D E F> => random-column[10] encode |D>
pattern |node 2: 0> => start-node |D E F>
then |node 2: 0> => random-column[10] encode |E>

pattern |node 2: 1> => then |node 2: 0>
then |node 2: 1> => random-column[10] encode |F>

pattern |node 2: 2> => then |node 2: 1>
then |node 2: 2> => append-column[10] encode |end of sequence>

...
where we see both the high level sequence of the alphabet chunks (ABC)(DEF)..., and the lower sequences of single letters A, B, C and D, E, F. The pi sequence has identical structure, so I'll omit that. For the curious, see the pre-learning sw file.

That's the learn stage taken care of, now the bit that took a little more work, code that recalls sequences, no matter how many layers deep. Though I've only so far tested it on a two-layer system. Here is the pseudo code:
  next (*) #=> then clean select[1,1] similar-input[pattern] |_self>
  name (*) #=> clean select[1,1] similar-input[encode] extract-category |_self>

  print-sequence |*> #=>
    if not do-you-know start-node |_self>:
      return |_self>
    if name start-node |_self> == |_self>:                    -- prevent infinite loop when an object is its own sequence
      print |_self>
      return |>
    |node> => new-GUID |>
    current "" |node> => start-node |_self>
    while name current "" |node> != |end of sequence>:
      if not do-you-know start-node name current "" |node>:
        print name current "" |node>
      else:
        print-sequence name current "" |node>
      current "" |node> => next current "" |node>
    return |end of sequence>
And the corresponding python:
def new_print_sequence(one,context,start_node=None):
  if start_node is None:                                          # so we can change the operator name that links to the first element in the sequence.
    start_node = "start-node"
  if len(one.apply_op(context,start_node)) == 0:                  # if we don't know the start-node, return the input ket
    return one
  print("print sequence:",one)

  def next(one):
    return one.similar_input(context,"pattern").select_range(1,1).apply_sigmoid(clean).apply_op(context,"then")
  def name(one):
    return one.apply_fn(extract_category).similar_input(context,"encode").select_range(1,1).apply_sigmoid(clean)
    
  if name(one.apply_op(context,start_node)).the_label() == one.the_label():
    print(one)                                                               # prevent infinte loop when an object is its own sequence. Maybe should have handled at learn stage, not recall?
    return ket("")  
  current_node = one.apply_op(context,start_node)  
  while name(current_node).the_label() != "end of sequence":
    if len(name(current_node).apply_op(context,start_node)) == 0:
      print(name(current_node))      
    else:
      new_print_sequence(name(current_node),context,start_node)
    current_node = next(current_node)
  return ket("end of sequence")
And finally, put it to use:
$ ./the_semantic_db_console.py
Welcome!

sa: load chunked-alphabet-pi.sw
sa: new-print-sequence |alphabet>
print sequence: |alphabet>
print sequence: |A B C>
|A>
|B>
|C>
print sequence: |D E F>
|D>
|E>
|F>
print sequence: |G H I>
|G>
|H>
|I>
print sequence: |J K L>
|J>
|K>
|L>
print sequence: |M N O>
|M>
|N>
|O>
print sequence: |P Q R>
|P>
|Q>
|R>
print sequence: |S T U>
|S>
|T>
|U>
print sequence: |V W X>
|V>
|W>
|X>
print sequence: |Y Z>
|Y>
|Z>
|end of sequence>

sa: new-print-sequence |pi>
print sequence: |pi>
print sequence: |3 . 1>
|3>
|.>
|1>
print sequence: |4 1 5>
|4>
|1>
|5>
print sequence: |9 2 6>
|9>
|2>
print sequence: |6>
|6>
print sequence: |5 3 5>
|5>
|3>
|5>
print sequence: |8 9 7>
|8>
|9>
|7>
print sequence: |9 3 2>
|9>
|3>
|2>
print sequence: |3 8 4>
|3>
|8>
|4>
print sequence: |6>
|6>
|end of sequence>
And we can print individual sub-sequences:
sa: new-print-sequence |D E F>
print sequence: |D E F>
|D>
|E>
|F>
|end of sequence>

sa: new-print-sequence |Y Z>
print sequence: |Y Z>
|Y>
|Z>
|end of sequence>

sa: new-print-sequence |8 9 7>
print sequence: |8 9 7>
|8>
|9>
|7>
|end of sequence>
Some notes:
1) There are of course other ways to implement learning and recalling chunked sequences. In my implementation above, when a subsequence hits an "end of sequence" it escapes from the while loop, and the high level sequence resumes. But an alternative would be for the end of say the |8 9 7> subsequence to link back to the parent pi sequence, and then resume that sequence. In which case we would have this:
sa: new-print-sequence |8 9 7>
print sequence: |8 9 7>
|8>
|9>
|7>
print sequence: |9 3 2>
|9>
|3>
|2>
print sequence: |3 8 4>
|3>
|8>
|4>
print sequence: |6>
|6>
|end of sequence>
So, does |8 9 7> live as an independent sequence with no link to the parent sequence, or does the final |7> link back to the pi sequence? I don't know for sure, but I suspect it is independent, because consider the case where |8 9 7> is in multiple high level sequences. The |7> wouldn't know where to link back to.
2) I have had for a long time my similarity metric called simm, that returns the similarity of superpositions (1 for exact match, 0 for disjoint, values in between otherwise). But I have so far failed to implement a decent simm for sequences (aside from mapping strings to ngrams, and then running simm on that). I now suspect/hope chunking of sequences might be a key part.
3) presumably the chunking of sequences structure is used by the brain for more than just difficult passwords, eg perhaps grammar. Seems likely to me that if a structure is used somewhere by the brain, then it is used in many other places too. ie, if a structure is good, then reuse it.

No comments:

Post a Comment