Thursday, 14 April 2016

new tool: phi-transform

I'm still stumbling in the dark on this topic! Today I implemented phi-transform. The idea is to break an image into k*k image-ngrams, find the most similar layer-1 superposition for each (see yesterday's work), and substitute that in its place. Then map each back to an image, and average them all. The code is here, but you will need to understand some of the project for it to make sense. Anyway, the goal is to do a little normalization. The more normalization we can do, the easier pattern recognition becomes.
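To make that concrete, here is a minimal sketch of the idea, assuming stride-1 windows and a pixel-wise average of the substituted squares. This is my reconstruction from the description above, not the actual phi-transform.py; the helper names and the simm argument are hypothetical:
import numpy as np

# layer1: a list of k*k arrays (the layer-1 patterns)
# simm: some similarity function between two k*k patches
def phi_transform(M, layer1, k, simm):
  out = np.zeros_like(M, dtype=float)
  counts = np.zeros_like(M, dtype=float)
  for i in range(M.shape[0] - k + 1):
    for j in range(M.shape[1] - k + 1):
      ngram = M[i:i+k, j:j+k]
      # substitute the most similar layer-1 pattern for this image-ngram:
      best = max(layer1, key=lambda p: simm(ngram, p))
      out[i:i+k, j:j+k] += best
      counts[i:i+k, j:j+k] += 1
  return out / counts      # average the overlapping substituted squares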

First, our method:
-- create 100 mnist test images (using the test-csv):
$ ./mnist-data-to-images.py 100

-- filter the sw-file from yesterday down to the layer-1 results (that is all we need):
$ cd sw-examples/
$ grep "^layer-1 " mnist-100--save-average-categorize--0_5.sw > mnist-100--layer-1--0_5.sw

-- phi-transform the test images (with a run-time of 62 minutes):
$ cd ..
$ ./phi-transform.py 10 work-on-handwritten-digits/test-images/*.bmp

-- make pretty pictures:
$ cd work-on-handwritten-digits/
$ ls -1tr test-images/*.bmp > image-list.txt
$ montage -geometry +2+2 @image-list.txt mnist-test-100.jpg
$ ls -1tr phi-images/* > image-list.txt
$ montage -geometry +2+2 @image-list.txt mnist-test-100--phi-transform-mnist-100-0_5.jpg
And now the results. First the 100 test images:
Now, after the phi-transform:
So, it is sort of normalized a bit, but I'm not sure it is a useful improvement. This phi-transform only used the average-categorize image-ngrams from 100 training images. What if we ramp that up to 1,000? And of course, there are two other variables we don't know the best values for: the image-ngram size, and the average-categorize threshold. Currently we are using 10 and 0.5 respectively. Also, I don't know if the phi-transform is better or worse than just doing a simple Gaussian blur; see my edge-enhance code for that. And of course, there is always the possibility that average-categorize won't help us at all, but I suspect it will.

Wednesday, 13 April 2016

average categorize some mnist digits

So, it's taken some thinking, and computing time, but I have made a start on digit recognition. Though to be honest I'm not 100% convinced this approach will work, but it might, so it's worth exploring. The outline of my method: get some digits from the MNIST set; convert them to images; convert those to k*k image-ngrams (a kind of 2D version of the standard ngram idea); convert those into superpositions; apply my average-categorize to that; then finally convert back to images and tile the result. Yeah, my process needs optimizing, but it will do for now, while in the exploratory phase.
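Since average-categorize does the heavy lifting here, a rough sketch of what I understand it to be doing: greedy online clustering, where each incoming pattern either merges into the first class it matches above the threshold t, or starts a new class. An illustration only, not the console implementation:
# patterns: a list of superpositions, here as equal-length coefficient lists
# t: similarity threshold, simm: similarity function
def average_categorize(patterns, t, simm):
  classes = []                           # each class is a running sum of its members
  for p in patterns:
    for i, c in enumerate(classes):
      if simm(p, c) >= t:                # close enough to an existing class?
        # fold it in (a sum acts like an average, if simm only sees shape):
        classes[i] = [a + b for a, b in zip(c, p)]
        break
    else:
      classes.append(list(p))            # otherwise start a new class
  return classes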

Now, the details, and then the results:
-- map MNIST idx format to csv, with tweaked code from here:
$ ./process_mnist.py

-- map first 100 mnist image-data to images:
$ ./mnist-data-to-images.py 100

-- map images to 10*10 image-ngrams sw file:
$ ./create-image-sw.py 10 work-on-handwritten-digits/images/mnist-image-*.bmp

-- rename sw file to something more specific:
$ mv image-ngram-superpositions-10.sw mnist-100--image-ngram-superpositions-10.sw

-- average categorize it (in this case with threshold t = 0.8):
$ ./the_semantic_db_console.py
sa: load mnist-100--image-ngram-superpositions-10.sw
sa: average-categorize[layer-0,0.8,phi,layer-1]
sa: save mnist-100--save-average-categorize--0_8.sw

-- convert these back to images (currently the source sw file is hard-coded into this script):
$ ./create-sw-images.py

-- tile the results:
$ cd work-on-handwritten-digits/
$ mv average-categorize-images mnist-100--average-categorize-images--threshold-0_8
$ ls -1tr mnist-100--average-categorize-images--threshold-0_8/* > image-list.txt
$ montage -geometry +2+2 @image-list.txt mnist-100--average-categorize-0_8.jpg
where montage is part of ImageMagick.

Now the results, starting with the 100 images we used as the training set (which resulted in 32,400 layer-0 superpositions/image-ngrams):
Average categorize with t = 0.8, with a run-time of 1 1/2 days, resulted in 1,238 image ngrams:
Average categorize with t = 0.7, with a run-time of 6 1/2 hours, resulted in 205 image ngrams:
Average categorize with t = 0.65, with a run-time of 3 hours 20 minutes, resulted in 103 image ngrams:

Average categorize with t = 0.6, with a run-time of 1 hour 50 minutes, resulted in 58 image ngrams:
Average categorize with t = 0.5, with a run-time of 43 minutes, resulted in 18 image ngrams:
Now we have to see if we can do anything useful with these average-categorize image ngrams. The hope is to map test images to linear combinations of the image-ngrams, which we can represent in superposition form (details later), and then do digit classification on that using similar[op]. I'm pretty sure that won't be sufficient processing on its own, but it is a starting point. Eventually I'd like to use several layers of average-categorize, which I'm hoping will be more powerful.

Wednesday, 16 March 2016

introducing network k similarity

So, a while back I had the thought that I wanted to map sw files to an integer, with the property that if two sw files are structurally equivalent, independent of operator and ket names, they would give the same integer, and if they were different, they would give different integers. I made an attempt, but didn't get very far. Well, now I have made some progress. BTW, it turns out this is quite similar to testing if two graphs are isomorphic. Our sw files are, in general, a representation for directed, labelled, weighted graphs. So as a starting point I decided to try finding mappings to integers for the simpler case of undirected, unlabelled (or all the same label, really) and unweighted (all coefficients equal to 1) graphs. That is, we need code to map each node in the graph to an integer, and then to combine the results from all nodes with some associative, Abelian function, so that the order in which we consider the nodes doesn't change the final integer. I decided to use multiplication, but I don't know if there is perhaps an advantage to using some other function.

Consider this undirected, unlabeled, unweighted network:
op |A> => |B> + |C> + |G>
op |B> => |A> + |D> + |H>
op |C> => |A> + |D> + |E>
op |D> => |C> + |F> + |B>
op |E> => |C> + |F> + |G>
op |F> => |E> + |D> + |H>
op |G> => |A> + |E> + |H>
op |H> => |G> + |F> + |B>
which has this adjacency matrix:
sa: matrix[op]
[ A ] = [  0  1  1  0  0  0  1  0  ] [ A ]
[ B ]   [  1  0  0  1  0  0  0  1  ] [ B ]
[ C ]   [  1  0  0  1  1  0  0  0  ] [ C ]
[ D ]   [  0  1  1  0  0  1  0  0  ] [ D ]
[ E ]   [  0  0  1  0  0  1  1  0  ] [ E ]
[ F ]   [  0  0  0  1  1  0  0  1  ] [ F ]
[ G ]   [  1  0  0  0  1  0  0  1  ] [ G ]
[ H ]   [  0  1  0  0  0  1  1  0  ] [ H ]
So now what? Well, there are not a lot of options really. About the only one is to consider op^k:
sa: op |A>
|B> + |C> + |G>

sa: op^2 |A>
3|A> + 2|D> + 2|H> + 2|E>

sa: op^3 |A>
7|B> + 7|C> + 7|G> + 6|F>

sa: op^4 |A>
21|A> + 20|D> + 20|H> + 20|E>

sa: op^5 |A>
61|B> + 61|C> + 61|G> + 60|F>
And similarly for say node E:
sa: op |E>
|C> + |F> + |G>

sa: op^2 |E>
2|A> + 2|D> + 3|E> + 2|H>

sa: op^3 |E>
6|B> + 7|C> + 7|G> + 7|F>

sa: op^4 |E>
20|A> + 20|D> + 20|H> + 21|E>

sa: op^5 |E>
60|B> + 61|C> + 61|G> + 61|F>
Now we need some mapping of these superpositions to integers. The simplest is to count the number of kets in each superposition, and to sum the coefficients of each superposition.

For example:
-- how many kets in op^2 |A>:
sa: how-many op^2 |A>
|number: 4>

-- sum of the coefficients in op^5 |E>:
sa: count-sum op^5 |E>
|number: 243>
Giving this candidate code:
# define our primes:
primes = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]

# define our node to signature function:
# node is a ket, op is a string, k is a positive integer
#
def node_to_signature(node,op,k):
  signature = 1
  r = node
  for n in range(0,2*k+2,2):
    v1 = int(r.count())
    v2 = int(r.count_sum())
    signature *= primes[n]**v1
    signature *= primes[n+1]**v2
    r = r.apply_op(context,op)
  return signature
where:
- r.count() counts the number of kets in the superposition r
- r.count_sum() sums the coefficients in the superposition r
- r.apply_op(context,op) is the backend python for "op |r>"

Multiply the result for each node together, and we get an integer for the entire graph.

For example, network-1 given above (code here):
$ ./k_similarity.py

Usage: ./k_similarity.py network.sw k

$ ./k_similarity.py sw-examples/network-1.sw 0
final 0 signature: 1679616

$ ./k_similarity.py sw-examples/network-1.sw 1
final 1 signature: 19179755540281619947585464477539062500000000

$ ./k_similarity.py sw-examples/network-1.sw 2
final 2 signature: 6476350707318135130077117995114645986807357994898007018142410234416950335365537077653915561137590232737115400989220342662084412388620595718383789062500000000

$ ./k_similarity.py sw-examples/network-1.sw 3
final 3 signature: 24915731868195494319495815784319613492832634952662828691337747373649371307162868449881081569598627271299028119958195828952049161143923339977360231242362436120389977749741771679154317963360167622141780171409392307815307137197414891280901439715432140783946679111853342421234802445604704216529913721698446113005853797982997735094149548130522011842820562607650700432722674863502965065307257559822955178526515947974154452386341191890795350397708587635938073745727539062500000000
And yeah, they are quite large integers! So here is a partial optimization (basically we reverse our list of v's so that the largest v's get applied to the smallest primes):
def node_to_signature(node,op,k):
  signature = 1
  r = node
  v_list = []
  for n in range(0,k+1):
    v1 = int(r.count())            # number of kets in op^n |node>
    v2 = int(r.count_sum())        # sum of the coefficients in op^n |node>
    v_list.append(v1)
    v_list.append(v2)
    r = r.apply_op(context,op)     # step to the next power of op
  v_list.reverse()                 # largest v's now pair with the smallest primes
  for i,v in enumerate(v_list):
    signature *= primes[i]**v
  return signature
Then apply these to network-1 again:
$ ./k_similarity_v2.py sw-examples/network-1.sw 0
final 0 signature: 1679616
signature log 10: 6.225210003069149
final 0 hash signature: 724e88406ed484a3f47b2c3d522b315261a6b265

$ ./k_similarity_v2.py sw-examples/network-1.sw 1
final 1 signature: 10670244327163201329561600000000
signature log 10: 31.02817436400965
final 1 hash signature: 034f0b554b967c20407ecf3e4373a0c9dde3bdca

$ ./k_similarity_v2.py sw-examples/network-1.sw 2
final 2 signature: 17472747581618922968458849772385108978718944775448150613650671927296000000000000000000000000
signature log 10: 91.24236120296295
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

$ ./k_similarity_v2.py sw-examples/network-1.sw 3
final 3 signature: 28908255527868889825790269007077519960587554847149293166812397097145900954989327674871077402331703366281633882780425402405190443720178454553217647654140379136000000000000000000000000000000000000000000000000000000000000000000000000
signature log 10: 229.461021884909
final 3 hash signature: 39d6e5cdc2300c06801865ddf558c3bfbdf03e16
Which is a big improvement in signature integer size. There are probably other optimizations that could be made, but I'm happy enough with this.
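A quick note on the "hash signature" lines in the output above: they are 40 hex characters, ie SHA-1 sized. Presumably they are produced by something like the following, though the exact input encoding is a guess on my part:
import hashlib

def hash_signature(signature):
  return hashlib.sha1(str(signature).encode()).hexdigest()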

Now we are ready to throw it at some networks:
network 2:
op |A> => |B> + |E> + |G>
op |B> => |A> + |E> + |C>
op |C> => |B> + |F> + |D>
op |D> => |C> + |F> + |H>
op |E> => |A> + |B> + |G>
op |F> => |C> + |D> + |H>
op |G> => |A> + |E> + |H>
op |H> => |G> + |F> + |D>

final 0 signature: 1679616
signature log 10: 6.225210003069149
final 0 hash signature: 724e88406ed484a3f47b2c3d522b315261a6b265

final 1 signature: 10670244327163201329561600000000
signature log 10: 31.02817436400965
final 1 hash signature: 034f0b554b967c20407ecf3e4373a0c9dde3bdca

final 2 signature: 752144490249374505343739906132775290761509353163124189431799265916843196416000000000000000000000000
signature log 10: 98.87630127847756
final 2 hash signature: c3a6b9f345a2b76c165fb3c997542110f9bd3051

final 3 signature: 1780207704063630331903810305642872528743446816219521629215268406907491725204135233958145047922812087077527584194258689629049352975220203052830112622815055508579908457300235611841260158976000000000000000000000000000000000000000000000000000000000000000000000000
signature log 10: 258.25047067616634
final 3 hash signature: 50806a21d5f34dbb85d098f4ecabdfe9c6e1052d
network 3:
op |a> => |g> + |h> + |i>
op |b> => |g> + |h> + |j>
op |c> => |g> + |i> + |j>
op |d> => |h> + |i> + |j>
op |g> => |a> + |b> + |c>
op |h> => |a> + |b> + |d>
op |i> => |a> + |c> + |d>
op |j> => |b> + |c> + |d>

final 0 signature: 1679616
signature log 10: 6.225210003069149
final 0 hash signature: 724e88406ed484a3f47b2c3d522b315261a6b265

final 1 signature: 10670244327163201329561600000000
signature log 10: 31.02817436400965
final 1 hash signature: 034f0b554b967c20407ecf3e4373a0c9dde3bdca

final 2 signature: 17472747581618922968458849772385108978718944775448150613650671927296000000000000000000000000
signature log 10: 91.24236120296295
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

final 3 signature: 28908255527868889825790269007077519960587554847149293166812397097145900954989327674871077402331703366281633882780425402405190443720178454553217647654140379136000000000000000000000000000000000000000000000000000000000000000000000000
signature log 10: 229.461021884909
final 3 hash signature: 39d6e5cdc2300c06801865ddf558c3bfbdf03e16
network-5:
op |1> => |6> + |4> + |2>
op |2> => |1> + |5> + |3>
op |3> => |2> + |6> + |4>
op |4> => |3> + |1> + |5>
op |5> => |6> + |4> + |2>
op |6> => |1> + |3> + |5>

final 0 signature: 46656
signature log 10: 4.668907502301861
final 0 hash signature: f89664dd10bc575affd2df12323f9fa1386bec50

final 1 signature: 186694177220038656000000
signature log 10: 23.27113077300724
final 1 hash signature: 43361818928df1e25cf2d42eaf826b9b79e535ad

final 2 signature: 370717744295392913280583372681517644759594696704000000000000000000
signature log 10: 65.56904337390424
final 2 hash signature: 5acae2f217c04b462739d64df04d8cfddd030ebf

final 3 signature: 145361918192683278731003716769548827497223900554742033134393018706053787797668034552637435114664336624164274176000000000000000000000000000000000000000000000000000000
signature log 10: 164.16245064527826
final 3 hash signature: d12513fc0a3ec0439083640a6b366d7c478d8c75
And we note that network-1 and network-3 agree for k in {0,1,2,3}, while network-2 only agrees for k in {0,1}. Indeed, this is a pattern I saw with all the networks I tested: if they were isomorphic they agreed for all tested k (indeed, my algo would be broken if isomorphic graphs had a k where their signatures differ), and if they had the same number of vertices and edges per vertex, but were not isomorphic, they agreed for k in {0,1} but differed at k = 2. Note that network-5 has only 6 nodes, instead of 8, so it doesn't even agree with network-1, network-2 and network-3 at k = 0.

Now for some comments:
1) isomorphic graphs should produce the same signature for all k. If not, then my algo is broken!
2) what is the smallest k such that we can be sure graphs are not isomorphic? I don't know. In all the examples I tested, k = 2 was sufficient. But presumably there can be collisions where non-isomorphic graphs produce the same signature for some k. So that means we can prove graphs are not isomorphic only if we have a k where they differ, and we can only be probably sure they are isomorphic if we haven't found a k where they differ.
3) if two graphs differ at k, then presumably, barring some weird collision, they will also differ for all larger k.
4) k = 0 is directly related to the node count.
5) k = 1 is directly related to the number of edges connected to each vertex.
6) just because we are probably sure two graphs are isomorphic doesn't help at all in finding that isomorphism.
7) there is a matrix version of my algo. Though I'm not sure when my algo is cheaper, and when the matrix version is cheaper. Consider k = 2. This corresponds to op^2 |some node>, and to this matrix:
sa: merged-matrix[op,op]
[ A ] = [  3  0  0  2  2  0  0  2  ] [ A ]
[ B ]   [  0  3  2  0  0  2  2  0  ] [ B ]
[ C ]   [  0  2  3  0  0  2  2  0  ] [ C ]
[ D ]   [  2  0  0  3  2  0  0  2  ] [ D ]
[ E ]   [  2  0  0  2  3  0  0  2  ] [ E ]
[ F ]   [  0  2  2  0  0  3  2  0  ] [ F ]
[ G ]   [  0  2  2  0  0  2  3  0  ] [ G ]
[ H ]   [  2  0  0  2  2  0  0  3  ] [ H ]
Now consider the columns separately. v1 of the first column is the number of elements with coeff > 0. ie, in this case 4. v2 is the sum of the elements in the column. ie, in this case 9. Then our integer for this column is (p1^v1)*(p2^v2) for some primes p1 and p2. Do similarly for the other columns, multiply the results together, and then you have your graph signature integer. How to do that for other k should be obvious. Actually, my algo is slightly different from this. It takes all values of {0,1,...,k} into account, not just the given k. See the code above, to clarify!
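Here is a hedged numpy sketch of that matrix version (my own illustration, not the project's k_similarity code). Note that for k = 0 it reproduces the 6^8 = 1679616 signature we saw above, since each column then contributes 2^1 * 3^1 = 6:
import numpy as np

primes = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53]

def graph_signature(M, k):
  # M is the adjacency matrix as a numpy integer array
  n = M.shape[0]
  signature = 1
  for col in range(n):
    r = np.zeros(n, dtype=object)    # object dtype, so big integers don't overflow
    r[col] = 1                       # start at this node
    v_list = []
    for _ in range(k + 1):
      v_list.append(int(np.count_nonzero(r)))   # v1: number of kets with coeff > 0
      v_list.append(int(r.sum()))               # v2: sum of the coefficients
      r = M.dot(r)                              # one more application of op
    v_list.reverse()                 # largest v's onto the smallest primes
    for i, v in enumerate(v_list):
      signature *= primes[i]**v
  return signature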
8) there are probably other ways to map op^k |some node> superpositions to integers.
9) there are probably optimizations to my method.
10) it can be a little bit of work converting diagrams to BKO notation.
11) it shouldn't be hard to convert this code to handle more general sw files. I'll probably try that tomorrow.
12) recall we noted some similarities between category theory and BKO. If we show that two sw files are almost certainly isomorphic, this implies there exists a functor mapping one sw file to the other.
13) we can use our project to easily find k-cycles in networks. No claims about efficiency though! Simply enough, node |x> has a k-cycle if |x> is a member of the op^k |x> superposition.
Now in BKO:
sa: load network-1.sw
sa: has-3-cycle |*> #=> is-mbr(|_self>,op^3 |_self>)
sa: has-4-cycle |*> #=> is-mbr(|_self>,op^4 |_self>)
sa: has-5-cycle |*> #=> is-mbr(|_self>,op^5 |_self>)
sa: has-6-cycle |*> #=> is-mbr(|_self>,op^6 |_self>)
sa: has-7-cycle |*> #=> is-mbr(|_self>,op^7 |_self>)
sa: has-8-cycle |*> #=> is-mbr(|_self>,op^8 |_self>)
sa: has-9-cycle |*> #=> is-mbr(|_self>,op^9 |_self>)
sa: has-10-cycle |*> #=> is-mbr(|_self>,op^10 |_self>)

sa: table[node,has-3-cycle,has-4-cycle,has-5-cycle,has-6-cycle,has-7-cycle,has-8-cycle,has-9-cycle,has-10-cycle] rel-kets[op]
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| node | has-3-cycle | has-4-cycle | has-5-cycle | has-6-cycle | has-7-cycle | has-8-cycle | has-9-cycle | has-10-cycle |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| A    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| B    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| C    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| D    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| E    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| F    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| G    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| H    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+

sa: reset
sa: load network-2.sw
sa: load has-k-cycle.sw
sa: table[node,has-3-cycle,has-4-cycle,has-5-cycle,has-6-cycle,has-7-cycle,has-8-cycle,has-9-cycle,has-10-cycle] rel-kets[op]
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| node | has-3-cycle | has-4-cycle | has-5-cycle | has-6-cycle | has-7-cycle | has-8-cycle | has-9-cycle | has-10-cycle |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| A    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| B    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| C    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| D    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| E    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| F    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| G    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| H    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+

sa: reset
sa: load network-5.sw
sa: load has-k-cycle.sw
sa: table[node,has-3-cycle,has-4-cycle,has-5-cycle,has-6-cycle,has-7-cycle,has-8-cycle,has-9-cycle,has-10-cycle] rel-kets[op]
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| node | has-3-cycle | has-4-cycle | has-5-cycle | has-6-cycle | has-7-cycle | has-8-cycle | has-9-cycle | has-10-cycle |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| 1    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| 2    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| 3    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| 4    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| 5    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
| 6    | no          | yes         | no          | yes         | no          | yes         | no          | yes          |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+

sa: reset
sa: load network-7.sw
sa: load has-k-cycle.sw
sa: table[node,has-3-cycle,has-4-cycle,has-5-cycle,has-6-cycle,has-7-cycle,has-8-cycle,has-9-cycle,has-10-cycle] rel-kets[op]
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| node | has-3-cycle | has-4-cycle | has-5-cycle | has-6-cycle | has-7-cycle | has-8-cycle | has-9-cycle | has-10-cycle |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| a    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| b    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| c    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| d    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| e    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| f    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| g    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| h    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| i    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| j    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+

sa: reset
sa: load network-11.sw
sa: load has-k-cycle.sw
sa: table[node,has-3-cycle,has-4-cycle,has-5-cycle,has-6-cycle,has-7-cycle,has-8-cycle,has-9-cycle,has-10-cycle] rel-kets[op]
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| node | has-3-cycle | has-4-cycle | has-5-cycle | has-6-cycle | has-7-cycle | has-8-cycle | has-9-cycle | has-10-cycle |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
| 1    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| 2    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| 3    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| 4    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| 5    | no          | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
| 6    | yes         | yes         | yes         | yes         | yes         | yes         | yes         | yes          |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+
So that is all kind of pretty. But my hunch is that mapping node cycle counts to signature integers is more expensive than my method. Note that 2-cycles are boring: in an undirected graph, all nodes have 2-cycles.
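As an aside, the matrix equivalent of this k-cycle test is a one-liner: |x> is a member of op^k |x> exactly when the (x,x) entry of M^k is non-zero. A hedged sketch, for illustration only:
import numpy as np

def has_k_cycle(M, x, k):
  # node x has a k-cycle if (M^k)[x][x] > 0
  return np.linalg.matrix_power(M, k)[x, x] > 0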
14) when can we expect collisions? Consider network A and network B. For some node and k, they have superpositions r1 and r2 respectively. We will have a collision if r1.count() == r2.count() and r1.count_sum() == r2.count_sum(), yet r1 and r2 are not equal. Presumably this will not persist for other values of k, but I don't know for sure.
15) it would be nice to have examples of graphs that are non-isomorphic, yet agree for at least k = 2. All the non-iso examples I tested had different k = 2 signatures.
16) let's find k = 2 signatures for all our networks:
network 1:
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

network 2:
final 2 hash signature: c3a6b9f345a2b76c165fb3c997542110f9bd3051

network 3:
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

network 4:
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

network 5:
final 2 hash signature: 5acae2f217c04b462739d64df04d8cfddd030ebf

network 6:
final 2 hash signature: 5acae2f217c04b462739d64df04d8cfddd030ebf

network 7:
final 2 hash signature: 6853d10f768067b82119e5fe99d6a8969d569477

network 8:
final 2 hash signature: caef2fb67b25a750cefdc11d05219695db55326b

network 9:
final 2 hash signature: 6ca9d1097590e86b4af0c59e746e10d5caf803fb

network 10:
final 2 hash signature: 6ca9d1097590e86b4af0c59e746e10d5caf803fb

network 11:
final 2 hash signature: 2050930de8495fea7b75b0b1dadc51462dbead44

network 12:
final 2 hash signature: 36256c6db2fe1e6c9a4f937e8d2f535b747a4e15

network 13:
final 2 hash signature: cb3f6e656e8ecd1818a9ae08f51e6cd3437743f1

network 14:
final 2 hash signature: 08786357aa7d977507caaced7f292c95946f11b8

network 16:
final 2 hash signature: d988b544b788f948151bc918dc314cabeecb92a4

network 17:
final 2 hash signature: 5acae2f217c04b462739d64df04d8cfddd030ebf

network 18:
final 2 hash signature: 5acae2f217c04b462739d64df04d8cfddd030ebf

network 20:
final 2 hash signature: 4c5182ec30e09c9b09f3980a3998381ea3404035
Resulting in this classification: {1,3,4,16} {2} {5,6,17,18} {7} {8} {9,10} {11} {12} {13} {14} {20}
17) it would be nice to test this code on larger networks, say something with 50 nodes instead of just 8.

Update: wrote some code to do the classification for me:
$ ./k_classifier.py

Usage: ./k_classifier.py k network-1.sw [network-2.sw network-3.sw ...]

$ ./k_classifier.py 0 sw-examples/network-*.sw
the k = 0 network classes:
----------------------------
724e88406ed484a3f47b2c3d522b315261a6b265: network-1, network-16, network-2, network-20, network-21, network-22, network-3, network-4
f60f6fcc93727388d031d7eada90c959c0813ca0: network-10, network-7, network-8, network-9
f89664dd10bc575affd2df12323f9fa1386bec50: network-11, network-12, network-17, network-18, network-23, network-24, network-5, network-6
cd0bcbc7f5b13e9aa1cae35d665739eb964f2084: network-13, network-14
----------------------------

the k = 1 network classes:
----------------------------
034f0b554b967c20407ecf3e4373a0c9dde3bdca: network-1, network-16, network-2, network-20, network-3, network-4
5328d58c5e05fa27d819aa197355238f2364456b: network-10, network-7, network-8, network-9
7ccf6241d30bfc88bd65e22dbdfe7488c67ca41d: network-11, network-12, network-23, network-24
cb38319a6c5110625a9085c7c110b213ea82621c: network-13, network-14
43361818928df1e25cf2d42eaf826b9b79e535ad: network-17, network-18, network-5, network-6
49063e880677302c51a3d93774ed58c328d909a5: network-21, network-22
----------------------------

the k = 2 network classes:
----------------------------
d988b544b788f948151bc918dc314cabeecb92a4: network-1, network-16, network-3, network-4
6ca9d1097590e86b4af0c59e746e10d5caf803fb: network-10, network-9
2050930de8495fea7b75b0b1dadc51462dbead44: network-11, network-23
36256c6db2fe1e6c9a4f937e8d2f535b747a4e15: network-12, network-24
cb3f6e656e8ecd1818a9ae08f51e6cd3437743f1: network-13
08786357aa7d977507caaced7f292c95946f11b8: network-14
5acae2f217c04b462739d64df04d8cfddd030ebf: network-17, network-18, network-5, network-6
c3a6b9f345a2b76c165fb3c997542110f9bd3051: network-2
4c5182ec30e09c9b09f3980a3998381ea3404035: network-20
863a20a85da3519434ef062f6f6ce3d7ef205582: network-21, network-22
6853d10f768067b82119e5fe99d6a8969d569477: network-7
caef2fb67b25a750cefdc11d05219695db55326b: network-8
----------------------------

the k = 3 network classes:
----------------------------
39d6e5cdc2300c06801865ddf558c3bfbdf03e16: network-1, network-16, network-3, network-4
c5d102de3b83458cd8132e3f2548eba762c21ab1: network-10, network-9
a94c9bed956e401ce7b6a123a9b346fb0ecf7cb9: network-11, network-23
434be5de95678a144fcc201ccdd5de2ccb1e9bd2: network-12, network-24
f946ed753556eeb940fff9e1cc917c0864ca8d61: network-13
10e26cdc3fc270f2a77cff5744074163d142ad47: network-14
d12513fc0a3ec0439083640a6b366d7c478d8c75: network-17, network-18, network-5, network-6
50806a21d5f34dbb85d098f4ecabdfe9c6e1052d: network-2
7a8f9bf7116d347b36de0394cab2c0ef28056fd4: network-20
2ee33d74438b11c8f8d77c46795f747ae42350a7: network-21
a12436ce936de85984532aa6e6106de076d4f1e3: network-22
b49f21325227d1dbd242c8f2e24f7bb9cbef40d8: network-7
b8652c9f1b37544e3c121ca18bb43b2eba68ce7f: network-8
----------------------------

the k = 4 network classes:
----------------------------
19a289b17734656ea7b096a10ebbcf128aed91a7: network-1, network-16, network-3, network-4
121e741ac6e8409ffef76b7622cfedbe00a1caab: network-10, network-9
7fb17280e48ed0db64ee57da273e7f0ddf142e27: network-11, network-23
7da034f87ed041af85945f8d4dd7911fffe8cf6c: network-12, network-24
75b0417b9b995e076c328bef991ae9b55cd6650e: network-13
ded50780072d4c51bd2441e75a01c0389d713402: network-14
3ada223550705dbc27fabbc19cb5e03983330344: network-17, network-18, network-5, network-6
631eae539d43d5402da693888f0231022bd12ca8: network-2
1988a1e18c7e0053814483f7d7cbef04da8a1d07: network-20
f37d01cb4642923151e0ada1b044a83fe5a02fa4: network-21
148619ef3a39ce65b167cadf71180234e45f920c: network-22
21b00bfd65d6608cef200d486357dbd1bceeacd3: network-7
9fc685480e63c1551a325b27b1992e5a14d1ede8: network-8
----------------------------
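For reference, here is a hedged sketch of what k_classifier.py is presumably doing, reusing the graph_signature and hash_signature sketches from above: compute each network's hash signature at the given k, then group networks that share it. My reconstruction only:
from collections import defaultdict

def classify(networks, k):
  # networks: a dict mapping network name to its adjacency matrix
  classes = defaultdict(list)
  for name, M in networks.items():
    classes[hash_signature(graph_signature(M, k))].append(name)
  return classes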
BTW, in reading around, I finally found an example where two graphs are not isomorphic, agree at k = 2, but disagree at k = 3 and k = 4. These graphs are given as an example of two graphs which have the same degree sequence, yet are not isomorphic. Since my code can distinguish them at k = 3, my code must be doing something different from degree sequences. These graphs, BTW, are network-21 and network-22. See the network classes just above. Here are their signatures:
network 21:
op |a> => |b>
op |b> => |a> + |c> + |f>
op |c> => |b> + |d>
op |d> => |c> + |e>
op |e> => |d>
op |f> => |b> + |g>
op |g> => |f> + |h>
op |h> => |g>

$ ./k_similarity_v2.py sw-examples/network-21.sw k
final 0 hash signature: 724e88406ed484a3f47b2c3d522b315261a6b265
final 1 hash signature: 49063e880677302c51a3d93774ed58c328d909a5
final 2 hash signature: 863a20a85da3519434ef062f6f6ce3d7ef205582
final 3 hash signature: 2ee33d74438b11c8f8d77c46795f747ae42350a7
final 4 hash signature: f37d01cb4642923151e0ada1b044a83fe5a02fa4

network 22:
op |a> => |b>
op |b> => |a> + |c> + |d>
op |c> => |b>
op |d> => |b> + |e>
op |e> => |d> + |f>
op |f> => |e> + |g>
op |g> => |f> + |h>
op |h> => |g>

final 0 hash signature: 724e88406ed484a3f47b2c3d522b315261a6b265
final 1 hash signature: 49063e880677302c51a3d93774ed58c328d909a5
final 2 hash signature: 863a20a85da3519434ef062f6f6ce3d7ef205582
final 3 hash signature: a12436ce936de85984532aa6e6106de076d4f1e3
final 4 hash signature: 148619ef3a39ce65b167cadf71180234e45f920c
Here is what they look like (borrowed from wikipedia):

And we can make pretty pictures of our other graphs, using sw2dot-v2.py and graphviz:
network 1:

network 2:
network 3:
network 5:
network 6:
network 7:
network 8:
And so on for our other networks.

Heh. It just occurred to me that we can do second order network similarity. Though I'm not exactly sure why you would want to, other than to note it as a theoretical possibility. Perhaps it might be useful if you have a very large number of networks, and you are having trouble visually separating the results from different k's.
Consider this from above (and the rest):
the k = 0 network classes:
----------------------------
network-1, network-16, network-2, network-20, network-21, network-22, network-3, network-4
network-10, network-7, network-8, network-9
network-11, network-12, network-17, network-18, network-23, network-24, network-5, network-6
network-13, network-14
----------------------------
We can massage these into their own sw/networks:
2nd-order-k0-network:
op |1> => |network-1> + |network-16> + |network-2> + |network-20> + |network-21> + |network-22> + |network-3> + |network-4>
op |2> => |network-10> + |network-7> + |network-8> + |network-9>
op |3> => |network-11> + |network-12> + |network-17> + |network-18> + |network-23> + |network-24> + |network-5> + |network-6>
op |4> => |network-13> + |network-14>


2nd-order-k1-network:
op |1> => |network-1> + |network-16> + |network-2> + |network-20> + |network-3> + |network-4>
op |2> => |network-10> + |network-7> + |network-8> + |network-9>
op |3> => |network-11> + |network-12> + |network-23> + |network-24>
op |4> => |network-13> + |network-14>
op |5> => |network-17> + |network-18> + |network-5> + |network-6>
op |6> => |network-21> + |network-22>


2nd-order-k2-network:
op |1> => |network-1> + |network-16> + |network-3> + |network-4>
op |2> => |network-10> + |network-9>
op |3> => |network-11> + |network-23>
op |4> => |network-12> + |network-24>
op |5> => |network-13>
op |6> => |network-14>
op |7> => |network-17> + |network-18> + |network-5> + |network-6>
op |8> => |network-2>
op |9> => |network-20>
op |10> => |network-21> + |network-22>
op |11> => |network-7>
op |12> => |network-8>


2nd-order-k3-network:
op |1> => |network-1> + |network-16> + |network-3> + |network-4>
op |2> => |network-10> + |network-9>
op |3> => |network-11> + |network-23>
op |4> => |network-12> + |network-24>
op |5> => |network-13>
op |6> => |network-14>
op |7> => |network-17> + |network-18> + |network-5> + |network-6>
op |8> => |network-2>
op |9> => |network-20>
op |10> => |network-21>
op |11> => |network-22>
op |12> => |network-7>
op |13> => |network-8>


2nd-order-k4-network:
op |1> => |network-1> + |network-16> + |network-3> + |network-4>
op |2> => |network-10> + |network-9>
op |3> => |network-11> + |network-23>
op |4> => |network-12> + |network-24>
op |5> => |network-13>
op |6> => |network-14>
op |7> => |network-17> + |network-18> + |network-5> + |network-6>
op |8> => |network-2>
op |9> => |network-20>
op |10> => |network-21>
op |11> => |network-22>
op |12> => |network-7>
op |13> => |network-8>

2nd-order-k5-network:
op |1> => |network-1> + |network-16> + |network-3> + |network-4>
op |2> => |network-10> + |network-9>
op |3> => |network-11> + |network-23>
op |4> => |network-12> + |network-24>
op |5> => |network-13>
op |6> => |network-14>
op |7> => |network-17> + |network-18> + |network-5> + |network-6>
op |8> => |network-2>
op |9> => |network-20>
op |10> => |network-21>
op |11> => |network-22>
op |12> => |network-7>
op |13> => |network-8>
Now apply the classifier to these 2nd order networks:
$ ./k_classifier.py 0 sw-examples/2nd-order-k*.sw
the k = 0 network classes:
----------------------------
64c84e952453cd25d3097c7cfb8ba8178a0d109f: 2nd-order-k0-network
f89664dd10bc575affd2df12323f9fa1386bec50: 2nd-order-k1-network
ed7eef2a04f04526c88f43d3f111242289ec7a02: 2nd-order-k2-network
ef27f9e773963f00db02e38621d9f7bfa59e70be: 2nd-order-k3-network, 2nd-order-k4-network, 2nd-order-k5-network
----------------------------
Noting that there is no need to consider k > 0; you will get the same result for all k. This result also shows, if it wasn't already obvious, that we need to consider k = 3 before our classes stabilize. I don't know why you would want to, but there is no reason you couldn't consider 3rd order, and higher. Maybe like you have categories, categories of categories, and so on. NB: we kind of cheated! Our 2nd order networks are directed, not undirected.

Finally, let's loop back to our knowledge representation, and display the class results in BKO notation:
$ ./k_classifier.py 1 sw-examples/network-*.sw

the k = 1 network classes:
----------------------------
034f0b554b967c20407ecf3e4373a0c9dde3bdca: network-1, network-16, network-2, network-20, network-3, network-4
5328d58c5e05fa27d819aa197355238f2364456b: network-10, network-7, network-8, network-9
7ccf6241d30bfc88bd65e22dbdfe7488c67ca41d: network-11, network-12, network-23, network-24
cb38319a6c5110625a9085c7c110b213ea82621c: network-13, network-14
43361818928df1e25cf2d42eaf826b9b79e535ad: network-17, network-18, network-5, network-6
49063e880677302c51a3d93774ed58c328d909a5: network-21, network-22
----------------------------

2nd-order-k1-network:
class |1> => |network-1> + |network-16> + |network-2> + |network-20> + |network-3> + |network-4>
class |2> => |network-10> + |network-7> + |network-8> + |network-9>
class |3> => |network-11> + |network-12> + |network-23> + |network-24>
class |4> => |network-13> + |network-14>
class |5> => |network-17> + |network-18> + |network-5> + |network-6>
class |6> => |network-21> + |network-22>

hash |1> => |034f0b554b967c20407ecf3e4373a0c9dde3bdca>
hash |2> => |5328d58c5e05fa27d819aa197355238f2364456b>
hash |3> => |7ccf6241d30bfc88bd65e22dbdfe7488c67ca41d>
hash |4> => |cb38319a6c5110625a9085c7c110b213ea82621c>
hash |5> => |43361818928df1e25cf2d42eaf826b9b79e535ad>
hash |6> => |49063e880677302c51a3d93774ed58c328d909a5>
Whew! That's it I think. Next I will have to look into mapping more general sw files to classes. Though I think we will run into some trouble for superpositions that have coefficients not equal to 1.

Saturday, 5 March 2016

new tool: edge enhance

I'm slowly collecting together the pieces needed to start work on image processing. Today, some code that is a step in that direction. Recently I implemented image-load[foo.png], image-save[foo.png] and image-histogram[foo.png]. But it is vastly too slow to do many image processing tasks using images in a superposition representation. So the plan is to write python that does the processing, and then, when finished, spits out a sw file we can use in the console. We've kind of been doing that already, as seen by the collection of scripts here. And that further ties in with my idea of outsourcing sw file creation to webservers elsewhere on the net. The idea being, there is some computation that is either too expensive, or that you don't know how to do. You send say an image to this host, it does the processing for you, and it then sends back an appropriate sw file that you can use to do your desired task. Perhaps some deep learning image classifier that is too expensive to run on your home PC. Or, in general, a large variety of other processing.

So today, I finally implemented edge-enhance in python. Interestingly enough, I had a BKO version many months back, but it took over 20 hours per image!! So I did a couple of test images (Lenna and a child), and then never used it again. The details and results of that are here. Thankfully the new version only takes seconds to minutes, depending on how large you set the enhance factor, and the size of the image.

The general algo for my edge-enhance code is:
Gaussian smooth the image k times, where k is our enhance-factor
subtract the original image from that
then massage the resulting pixels a little
where the 1D version of the Gaussian smooth is (which rapidly converges to a nice bell curve after a few iterations):
f[k] -> f[k-1]/4 + f[k]/2 + f[k+1]/4
and the 2D version is:
  def smooth_pixel(M,w,h):
    # centre pixel keeps half the weight; the eight neighbours share the rest:
    r = M[h-1][w-1]/16 + M[h][w-1]/16 + M[h+1][w-1]/16 \
      + M[h-1][w]/16   + M[h][w]/2    + M[h+1][w]/16 \
      + M[h-1][w+1]/16 + M[h][w+1]/16 + M[h+1][w+1]/16
    return r
Here is the code, but probably too large to include here.
The usage is simply:
Usage: ./image_edge_enhance_v2.py image.{png,jpg} [enhance-factor]
if enhance-factor is not given, it defaults to 20.
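Since the full code is too large to include, here is a condensed sketch of the whole pipeline, using Pillow and keeping floats throughout (see comment 9 below for why the floats matter). The file names and the grayscale conversion are my choices; this is a reconstruction of the algorithm described above, not image_edge_enhance_v2.py itself:
import numpy as np
from PIL import Image

def edge_enhance(filename, enhance_factor=20):
  M = np.asarray(Image.open(filename).convert("L"), dtype=float)
  # the 2D smooth as a 3x3 kernel: centre weight 1/2, the eight neighbours 1/16 each
  kernel = np.array([[1,1,1],[1,8,1],[1,1,1]], dtype=float) / 16
  smoothed = M.copy()
  for _ in range(enhance_factor):
    # one iteration of the Gaussian smooth, keeping float results
    padded = np.pad(smoothed, 1, mode="edge")
    smoothed = sum(kernel[i,j] * padded[i:i+M.shape[0], j:j+M.shape[1]]
                   for i in range(3) for j in range(3))
  # subtract the original image, then massage the pixels (cf massage_pixel in comment 8):
  diff = np.clip(smoothed - M, 0, None) * 20
  out = (255 - np.clip(diff, 0, 255)).astype(np.uint8)
  Image.fromarray(out).save("edge-enhanced.png")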

Now, some examples:
Lenna:
child:

smooth gradient:
air-balloons:
FilterRobertsExample:
So, from those examples we can see it works rather well! And if not quite well enough, you can ramp up the enhance-factor.

Now, some comments:
1) the code neatly maps regions of the same colour, or smoothly varying colour, to white, as our smooth-gradient example shows. This is because the Gaussian smooth of smoothly varying colour is pretty much unchanged from the original image.
2) it seems to work a lot better than first or second order discrete derivatives at finding edges. Indeed, for k iterations of smooth, each result pixel depends on all pixels within a radius of k of that pixel. Discrete derivatives are much more local.
3) larger images tend to need larger enhance-factor. So maybe 40 instead of 20.
4) the code sometimes won't load certain images. I don't yet know why.
5) the balloon example shows the output can sometimes be like a child's drawing. I take this as a big hint that the human brain is doing something similar.
6) "smoothed_image - original_image" is very close to something I called "general-to-specific". I have a small amount of code that implements this idea, but never got around to using it. The idea is say faces. In the process of seeing many faces, you average them together. Then the superposition for a specific face is this general/average face subtract the specific face. My edge_enhance is very close to this idea, but with a smoothed image in place of an average image (though a smoothed image and an average image are pretty much the same thing). Indeed, babies are known to have very blurry vision for the first few months. So there must be something to this idea of programming neurons with average/blurry images first, and then with sharp images later.
7) the last example, taken from here, hints that this code might actually be useful in, say, pre-processing images of cells when looking for cancer cells.
8) potentially we could tweak the algo. I'm thinking the massage_pixel() code in particular.
  def massage_pixel(x):
    if x < 0:        # clip negative differences to zero
      x = 0
    x *= 20          # scale up the remaining signal
    x = int(x)
    if x > 255:      # clip to the valid pixel range
      x = 255
    return 255 - x   # invert, so edges are dark on a white background
9) the first version of my code had a subtle bug. After each iteration of smooth, the result was cast back to integers and saved in image format. Turns out this is enough to break the algo, so that a lot of features you would like enhanced actually disappear! It also had a weird convergence effect, where above a certain value the enhance-factor variable didn't really change the result much. After some work I finally implemented a version where smooth kept the float results until the very end, before mapping back to integers. This was a big improvement, with little features now visible again, and the convergence property of enhance-factor went away: if you keep increasing it, you get markedly different results. Interestingly, my original BKO version did not have this bug, as could be seen by the different resulting images, and that was the hint I needed that my first python implementation had a bug.
10) Next on my list for image processing is image-ngrams (a 2D version of letter-ngrams), and then throw them at a tweak of average-categorize.

Update: I now have image-ngrams. Code here. Next will be image-to-sp, then tweak average-categorize, then sp-to-image. Then some thinking time of where to go after that.

Also, I just want to note that image-edge-enhance is possibly somewhat invariant under changes in lighting levels. I need to test it I suppose, but I suspect it is. Recall we claimed invariances of our object to superposition mappings are a good thing!

Update: a couple more edge enhance examples from wikipedia: edge-enhancement, and edge-detection (first is the original image, the rest are edge enhanced):

For this last set, view the large image to see the details and the differences between different edge enhancement values.

Thursday, 3 March 2016

image histogram similarity

In this post I'm going to briefly look into image histogram similarity. Recall our pattern recognition claim:

"we can make a lot of progress in pattern recognition if we can find mappings from objects to well-behaved, deterministic, distinctive superpositions"

where:
1) well-behaved means similar objects return similar superpositions
2) deterministic means if you feed in the same object, you get essentially the same superposition (though there is a little lee-way in that it doesn't have to be 100% identical on each run, but close)
3) distinctive means different object types have easily distinguishable superpositions.

and the exact details of the superposition are irrelevant. Our simm will work just fine. We only need to satisfy (1), (2) and (3) and we are done!

So, today I discovered a very cheap one for images, that has some nice, though not perfect, properties. Simply map an image to a histogram. The Pillow image processing library makes this only a couple of lines of code:
from PIL import Image
im = Image.open("Lenna.png")
result = im.histogram()
Here are some resulting histograms:
Lenna.png:
small-lenna.png (note that it has roughly the same shape as Lenna.png, but smaller amplitude):
child.png:
three-wolfmoon-output.png:
Now let's throw them at simm. Noting first that we have a function-operator wrapper around im.histogram() "image-histogram[image.png]" that returns a superposition representation of the histogram list:
sa: the |Lenna sp> => image-histogram[Lenna.png] |>
sa: the |small Lenna sp> => image-histogram[small-lenna.png] |>
sa: the |child sp> => image-histogram[child.png] |>
sa: the |wolfmoon sp> => image-histogram[three-wolfmoon-output.png] |>
sa: table[sp,coeff] 100 self-similar[the] |Lenna sp>
+----------------+--------+
| sp             | coeff  |
+----------------+--------+
| Lenna sp       | 100.0  |
| small Lenna sp | 74.949 |
| child sp       | 50.123 |
| wolfmoon sp    | 30.581 |
+----------------+--------+

sa: table[sp,coeff] 100 self-similar[the] |child sp>
+----------------+--------+
| sp             | coeff  |
+----------------+--------+
| child sp       | 100    |
| small Lenna sp | 71.036 |
| Lenna sp       | 50.123 |
| wolfmoon sp    | 44.067 |
+----------------+--------+

sa: table[sp,coeff] 100 self-similar[the] |wolfmoon sp>
+----------------+--------+
| sp             | coeff  |
+----------------+--------+
| wolfmoon sp    | 100.0  |
| child sp       | 44.067 |
| small Lenna sp | 43.561 |
| Lenna sp       | 30.581 |
+----------------+--------+
So it all works quite well, and for essentially zero work! And it should be reasonably well behaved with respect to rotating, shrinking (since only the shape of superpositions, and not the amplitude, matters to simm), adding some noise, removing part of the image, and adding in small sub-images. Note however that it is a long, long way from a general purpose image classifier (I'm confident that will require several layers of processing; here we are doing only one), but it is good for very cheap image similarity detection. eg, it would probably easily classify scenes in a movie that are similar, but with a few objects/people/perspective changed.
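To make the "shape, not amplitude" point concrete, here is a minimal shape-only similarity: rescale each histogram to sum to 1, then sum the element-wise minimums. This is my stand-in for simm, not necessarily the exact function the console uses:
def histogram_simm(h1, h2):
  # identical shapes give 1, disjoint histograms give 0, amplitude is irrelevant
  s1, s2 = sum(h1), sum(h2)
  return sum(min(a/s1, b/s2) for a, b in zip(h1, h2))

# eg, compare two images directly:
# histogram_simm(Image.open("Lenna.png").histogram(), Image.open("child.png").histogram())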

Now, how would we do the more interesting general purpose image classifier? I of course don't know yet, but I suspect we can get some way towards that using what I call image-ngrams, and our average-categorize code. An image-ngram is named after letter/word ngrams, but is usually called an image partition. The analogy is that just like letter-ngrams[3] splits "ABCDEFG" into "ABC" + "BCD" + "CDE" + "DEF" + "EFG", image-ngrams[3] will split an image into 3*3 squares. Another interpretation is that letter-ngrams are 1D ngrams, while image-ngrams are 2D. The plan is to then map those small images to a superposition representation, and apply average-categorize to those. With any luck the first layer of average-categorize will self-tune to detect edge directions, ie, horizontal lines, or 45 degree lines and so on. It may not work so easily, but I plan to test it soon.
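A hedged sketch of the image-ngram idea (my own illustration; the exact edge handling is a guess, though stride-1 windows with offsets 0 .. height-k-1, as below, would reproduce the 324 = 18*18 ngrams per 28x28 image, ie the 32,400 layer-0 count, reported in the MNIST post above):
def image_ngrams(M, k):
  # M is a 2D array of pixel values; yield overlapping k*k squares, stride 1
  for i in range(len(M) - k):
    for j in range(len(M[0]) - k):
      yield [row[j:j+k] for row in M[i:i+k]]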

Hrmm... perhaps we could roughly measure how good our object to superposition mappings are by the number of invariances they have: the more invariances, the better the mapping. The image to histogram mapping already has quite a few! On top of those I mentioned above, there are others. You could cut an image horizontally in half, and swap the top for the bottom, and you would get a 100% match. Indeed, any shuffling of pixel locations, but not values, will also give a 100% match.

Tuesday, 16 February 2016

towards a definition of intelligence

So, have we done enough work that we can now make a reasonable guess at a definition of intelligence? Let's see. In my travels I have seen one definition: an intelligent agent, given its current situation, will manipulate things so as to maximize the number of potential future states. So, if such an agent is stuck in a valley, it will climb to the top of the hill to maximize its potential pathways.

Mathematically, roughly (in a simplified, one-dimensional form):
F = dV(x)/dx
where V(x) is the landscape, and F is the direction you want to head.

I have an alternate definition:
1) given the agent's current state (as represented by some superposition), find a pathway (as represented by some operator sequence) to its desired state (again, represented by some superposition). The quicker the agent can do this, and the shorter the pathway, the more intelligence points we give that agent. Noting that for sufficiently hard problems, most agents won't be able to find a pathway at all.
2) given an object the agent wishes to understand, how well constructed is the agent's internal representation of that object? At one extreme we have rote learning: say you recall an object's definition word for word, with essentially no understanding. At the other we have a very dense network linking the object with the rest of the knowledge in the agent's memory store. The denser the network, the more intelligence points we give that agent. And I suppose we should give some points for speed as well.

Comments:
1) the above is somewhat dependent on the agent already having a large body of knowledge. This isn't perfect, since young children do not have as much knowledge as an adult, but in some regards are far more intelligent than adults. Frankly, it is hard work to boot-strap from nothing to a thorough understanding of the world.
2) if you ever watch Richard Feynman talk, it is obvious he had a very dense network representation of physics in his head. Everything was linked to everything. This gives him lots of (2) points in my scheme, but then he was a physics genius!
3) OK. So how do we build an intelligent agent? Heh. No one knows!! My guess is that it requires at least three components: 1) a processing center (eg the neocortex), 2) a memory system (eg the hippocampus), and 3) an attention system (eg the thalamus). I personally think the attention system is the most important of the three. We need some system to filter and only attend to what is currently important, and to dynamically change attention as needed. Indeed, this sounds an awful lot like a von Neumann architecture computer! With CPU, RAM, and instruction pointer (as the attention system). But in detail they are quite different. Especially the attention system; what I have in mind is a lot more involved than an IP.
4) superpositions and operator sequences should be sufficient to represent any current state, or pathway between states, of interest. That being the main thesis of the project! Is there anything that can't be represented this way? I don't know. But the implication would be that a human brain couldn't represent it either.

Sunday, 14 February 2016

new operators: guess-ket and guess-operator

I decided it might be useful to have a couple of operators that guess the ket or the operator you mean, even if you don't know their name exactly. I don't have a strong use-case yet, but it seems to be something that humans do, so it should presumably be useful eventually.

There are three variations of each:
guess-ket |ket>
guess-ket[k] |ket>
guess-ket[*] |ket>
The first one just gives you the best matching ket. The second returns the top k matches. The third gives all matches with similarity > 0.

Likewise, we have:
guess-operator[op]
guess-operator[op,k]
guess-operator[op,*]
where the first one gives you the best matching operator. The second gives the top k matches. The third gives all of them with simm > 0.

Now, for a little bit of the detail in the background. We basically use the similarity metric on the superpositions created by the process_string() function, against all known kets "context.relevant_kets("*")", or all known supported operators "context.supported_operators()":
  def process_string(s):
    one = ket(s.lower())
    return make_ngrams(one,'1,2,3','letter')
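As an illustration of the mechanics, here is a hedged python sketch of guess-ket using the same 1,2,3 letter-ngrams and a rescaled min-overlap similarity. It is not the project code (see guess_ket in the functions file for that), but it does reproduce the 0.611 match for "freddie" vs "Fred" shown below:
from collections import Counter

def letter_ngrams(s, sizes=(1, 2, 3)):
  s = s.lower()
  return Counter(s[i:i+n] for n in sizes for i in range(len(s) - n + 1))

def simm(f, g):
  # rescaled min-overlap: shape matters, amplitude does not
  sf, sg = sum(f.values()), sum(g.values())
  return sum(min(f[k]/sf, g[k]/sg) for k in f.keys() & g.keys())

def guess_ket(name, known_kets):
  target = letter_ngrams(name)
  return max(known_kets, key=lambda k: simm(target, letter_ngrams(k)))

# guess_ket("freddie", ["Fred", "Frank", "Robert", "Rob"]) returns "Fred",
# with simm value 11/18 = 0.611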
Now, a couple of examples:
-- learn a little knowledge:
the-age-of |Fred> => |age: 27>
the-age-of |Frank> => |age: 33>
the-age-of |Robert> => |age: 29>
the-age-of |Rob> => |age: 31>

-- return only the best match to "freddie":
sa: guess-ket |freddie>
0.611|Fred>

-- see all matches to "freddie":
sa: guess-ket[*] |freddie>
0.611|Fred> + 0.167|Frank> + 0.122|Robert> + 0.056|Rob>

-- now try the guess-operator[]:
sa: guess-operator[age]
0.259|op: the-age-of>

-- who is "roberto"?
sa: guess-ket |roberto>
0.844|Robert>

-- one potential use case. Guess the operator and the ket:
sa: apply(guess-operator[age] |>,guess-ket |roberto>)
0.219|age: 29>
NB: in this last example we used "guess-operator[age] |>". Note especially the |> tacked on the end. We need this so it is parsed as a (compound) superposition. In the console though, it is not mandatory, and I often get lazy and leave it out. A similar thing applies to rel-kets[op] and probably some other function operators. If something doesn't work as expected, put |> in, and that should fix it. Indeed, best practice is to always include it!

That's probably about it for this post. Though I have to wonder, with my if-then machines, whether guess-operator and guess-ket are redundant. Don't know. Time will tell!

If interested, the code for these is at the bottom of the functions code, with the names "guess_ket" and "guess_operator". Just CTRL-F to find them.

That's it for this post.

Friday, 12 February 2016

learning simple images using if-then machines

Today, let's play with simple images from ages ago. BTW, I call them "simple images" because we don't need to translate, rotate, magnify or otherwise align them (which we would with more general images), and we restrict pixel values to 0 or 1. This is to make things easier. We will of course eventually try for more general or typical images sometime in the future, but they are distinctly harder! And they require many layers of if-then machines. eg, the brain has roughly 20 layers in the visual cortex.

Here are our images:
|letter: H>
#   #
#   #
#   #
#####
#   #
#   #
#   #

|noisy: H>
    #
#   #
#   #
### #
#    
#   #
#   #

|noisy: H2>
#   #
#    
# ###
#####
##  #
#   #
### #

|letter: I>
#####
  #
  #
  #
  #
  #
#####

|noisy: I>
####
  #
  
  
  #
  #
# ###

|noisy: I2>
##  #
 ###
  #
  #
  ###
####
#####

|letter: O>
######
#    #
#    #
#    #
#    #
#    #
######
Now, let's define our 3 if-then machines:
load H-I-pat-rec.sw

image |node: 1: 1> => pixels |letter: H>
image |node: 1: 2> => pixels |noisy: H>
image |node: 1: 3> => pixels |noisy: H2>
then |node: 1: *> => |letter H>

image |node: 2: 1> => pixels |letter: I>
image |node: 2: 2> => pixels |noisy: I>
image |node: 2: 3> => pixels |noisy: I2>
then |node: 2: *> => |letter I>

image |node: 3: 1> => pixels |letter: O>
then |node: 3: *> => |letter O>

the |list of images> => |node: 1: 1> + |node: 1: 2> + |node: 1: 3> + |node: 2: 1> + |node: 2: 2> + |node: 2: 3> + |node: 3: 1>
which-image |*> #=> then select[1,1] similar-input[image] image |_self>
Note that today I used "select[1,1]" instead of "drop-below[]". This just means select the first element in the superposition, noting that similar-input[op] sorts its results.
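As an aside, here is a toy plain-Python sketch of the which-image pipeline, with stored patterns as sets of "on" pixels, a simple set-overlap stand-in for simm, and max() playing the role of select[1,1] similar-input[image]. All names here are mine, for illustration only:
  def pixels(art):
      # an ASCII image becomes the set of (row, column) coordinates of its '#' cells
      return {(r, c) for r, line in enumerate(art.splitlines())
              for c, ch in enumerate(line) if ch == '#'}

  def simm(a, b):
      # set version of the similarity metric: overlap over the larger pattern
      return len(a & b) / max(len(a), len(b))

  letter_H = pixels("#   #\n#   #\n#   #\n#####\n#   #\n#   #\n#   #")
  letter_I = pixels("#####\n  #\n  #\n  #\n  #\n  #\n#####")
  letter_O = pixels("######\n#    #\n#    #\n#    #\n#    #\n#    #\n######")

  # each node pools one or more stored patterns and points at its "then" ket
  nodes = {'letter H': [letter_H], 'letter I': [letter_I], 'letter O': [letter_O]}

  def which_image(pattern):
      # score every stored pattern, then keep only the best match
      return max((simm(pattern, p), out) for out, ps in nodes.items() for p in ps)

  print(which_image(letter_I))   # (1.0, 'letter I')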
Now, put "which-image" to use:
sa: which-image |node: 2: 3>
1.0|letter I>

sa: which-image |node: 1: 2>
1.0|letter H>

-- now, choose images randomly, and see what we get:
-- noting we are leaving in the INFO: lines that I normally chomp out, so we can see which kets pick-elt has chosen.
sa: which-image pick-elt the |list of images>
INFO: ket: list of images
INFO: ket: node: 1: 2
INFO: ket: node: 1: 2
INFO: ket: node: 1: 2
1.0|letter H>

sa: which-image pick-elt the |list of images>
INFO: ket: list of images
INFO: ket: node: 3: 1
INFO: ket: node: 3: 1
INFO: ket: node: 3: 1
1.0|letter O>

sa: which-image pick-elt the |list of images>
INFO: ket: list of images
INFO: ket: node: 1: 3
INFO: ket: node: 1: 3
INFO: ket: node: 1: 3
1.0|letter H>

sa: which-image pick-elt the |list of images>
INFO: ket: list of images
INFO: ket: node: 2: 2
INFO: ket: node: 2: 2
INFO: ket: node: 2: 2
1.0|letter I>

sa: which-image pick-elt the |list of images>
INFO: ket: list of images
INFO: ket: node: 2: 3
INFO: ket: node: 2: 3
INFO: ket: node: 2: 3
1.0|letter I>

-- and so on!
Now for a few comments:

1) if you look under the hood, the above is quite boring! We are not making much use of similar-input[op] at all, in that we are feeding in, and detecting, exact patterns. The only interesting bit is that it is pooling the different image types. Hrmm... let's try some noisy examples:
sa: then select[1,1] similar-input[image] absolute-noise[1] image |node: 1: 1>
0.919|letter H>

sa: then select[1,1] similar-input[image] absolute-noise[1] image |node: 2: 3>
0.907|letter I>

sa: then select[1,1] similar-input[image] absolute-noise[30] image |node: 1: 2>
0.761|letter H>

sa: then select[1,1] similar-input[image] absolute-noise[30] image |node: 3: 1>
0.738|letter O>
Heh. Even at absolute-noise[30] we are still matching at over 70%. Now we are clearly using the similarity metric and doing "fuzzy matching". See the toy model just below.
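My reading of absolute-noise[t] is that it adds a random amount in [0, t] to the coefficient of every ket in the superposition. Under that assumption, plus a rescaled version of the simm stand-in, a few lines of Python reproduce the right ballpark numbers:
  import random

  def scaled_simm(a, b):
      # rescale both patterns to total weight 1, then sum the pointwise minima
      A, B = sum(a.values()), sum(b.values())
      return sum(min(a.get(k, 0) / A, b.get(k, 0) / B) for k in a.keys() | b.keys())

  def absolute_noise(pattern, t):
      # my reading of absolute-noise[t]: add a random value in [0, t] to each coefficient
      return {k: v + random.uniform(0, t) for k, v in pattern.items()}

  pattern = {pixel: 1 for pixel in ['1: 1', '1: 5', '4: 1', '4: 3', '4: 5', '7: 1', '7: 5']}
  print(scaled_simm(absolute_noise(pattern, 1), pattern))    # roughly 0.9
  print(scaled_simm(absolute_noise(pattern, 30), pattern))   # roughly 0.75, still a clear match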
2) support vector machines talk about the patterns to be classified being linearly separable. Well, in the world of superpositions, linear separability doesn't really even make sense. And similar-input[op] doesn't care either; it works on any superposition type.
3) "which-image" is linear, which we can see with this:
sa: which-image the |list of images>
3|letter H> + 3|letter I> + 1.0|letter O>
4) finally, here is what we now know:
sa: dump
----------------------------------------
|context> => |context: H I pat rec>

pixels |letter: H> => |pixel: 1: 1> + |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 2: 5> + |pixel: 3: 1> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 5>
dim-1 |letter: H> => |dimension: 5>
dim-2 |letter: H> => |dimension: 7>

pixels |noisy: H> => |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 2: 5> + |pixel: 3: 1> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 5>
dim-1 |noisy: H> => |dimension: 5>
dim-2 |noisy: H> => |dimension: 7>

pixels |noisy: H2> => |pixel: 1: 1> + |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 3: 1> + |pixel: 3: 3> + |pixel: 3: 4> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 5: 2> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 5>
dim-1 |noisy: H2> => |dimension: 5>
dim-2 |noisy: H2> => |dimension: 7>

pixels |letter: I> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5> + |pixel: 2: 3> + |pixel: 3: 3> + |pixel: 4: 3> + |pixel: 5: 3> + |pixel: 6: 3> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>
dim-1 |letter: I> => |dimension: 5>
dim-2 |letter: I> => |dimension: 7>

pixels |noisy: I> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 2: 3> + |pixel: 5: 3> + |pixel: 6: 3> + |pixel: 7: 1> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>
dim-1 |noisy: I> => |dimension: 5>
dim-2 |noisy: I> => |dimension: 7>

pixels |noisy: I2> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 5> + |pixel: 2: 2> + |pixel: 2: 3> + |pixel: 2: 4> + |pixel: 3: 3> + |pixel: 4: 3> + |pixel: 5: 3> + |pixel: 5: 4> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 2> + |pixel: 6: 3> + |pixel: 6: 4> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>
dim-1 |noisy: I2> => |dimension: 5>
dim-2 |noisy: I2> => |dimension: 7>

pixels |letter: O> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5> + |pixel: 1: 6> + |pixel: 2: 1> + |pixel: 2: 6> + |pixel: 3: 1> + |pixel: 3: 6> + |pixel: 4: 1> + |pixel: 4: 6> + |pixel: 5: 1> + |pixel: 5: 6> + |pixel: 6: 1> + |pixel: 6: 6> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5> + |pixel: 7: 6>
dim-1 |letter: O> => |dimension: 6>
dim-2 |letter: O> => |dimension: 7>

image |node: 1: 1> => |pixel: 1: 1> + |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 2: 5> + |pixel: 3: 1> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 5>

image |node: 1: 2> => |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 2: 5> + |pixel: 3: 1> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 5>

image |node: 1: 3> => |pixel: 1: 1> + |pixel: 1: 5> + |pixel: 2: 1> + |pixel: 3: 1> + |pixel: 3: 3> + |pixel: 3: 4> + |pixel: 3: 5> + |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5> + |pixel: 5: 1> + |pixel: 5: 2> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 5> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 5>

then |node: 1: *> => |letter H>

image |node: 2: 1> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5> + |pixel: 2: 3> + |pixel: 3: 3> + |pixel: 4: 3> + |pixel: 5: 3> + |pixel: 6: 3> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

image |node: 2: 2> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 2: 3> + |pixel: 5: 3> + |pixel: 6: 3> + |pixel: 7: 1> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

image |node: 2: 3> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 5> + |pixel: 2: 2> + |pixel: 2: 3> + |pixel: 2: 4> + |pixel: 3: 3> + |pixel: 4: 3> + |pixel: 5: 3> + |pixel: 5: 4> + |pixel: 5: 5> + |pixel: 6: 1> + |pixel: 6: 2> + |pixel: 6: 3> + |pixel: 6: 4> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

then |node: 2: *> => |letter I>

image |node: 3: 1> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5> + |pixel: 1: 6> + |pixel: 2: 1> + |pixel: 2: 6> + |pixel: 3: 1> + |pixel: 3: 6> + |pixel: 4: 1> + |pixel: 4: 6> + |pixel: 5: 1> + |pixel: 5: 6> + |pixel: 6: 1> + |pixel: 6: 6> + |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5> + |pixel: 7: 6>

then |node: 3: *> => |letter O>

the |list of images> => |node: 1: 1> + |node: 1: 2> + |node: 1: 3> + |node: 2: 1> + |node: 2: 2> + |node: 2: 3> + |node: 3: 1>

which-image |*> #=> then select[1,1] similar-input[image] image |_self>
----------------------------------------
And that's it for this post. Now I need thinking time to find more interesting if-then machine examples.

Thursday, 11 February 2016

learning days of the week using if-then machines

Today, an example of learning days of the week using 7 if-then machines. Note that if-then machines are probably overkill if you spell your days correctly. In this post we make use of string similarity using letter-ngrams.

Here is the code:
  context weekday if-then machines
  ngrams |*> #=> letter-ngrams[1,2,3] lower-case |_self>

  day |node: 1: 1> => ngrams |Monday>
  day |node: 1: 2> => ngrams |mon>
  day |node: 1: 3> => ngrams |Mo>
  previous |node: 1: *> => |Sunday>
  id |node: 1: *> => |Monday>
  next |node: 1: *> => |Tuesday>
 
  day |node: 2: 1> => ngrams |Tuesday>
  day |node: 2: 2> => ngrams |tue>
  day |node: 2: 3> => ngrams |Tu>
  previous |node: 2: *> => |Monday>
  id |node: 2: *> => |Tuesday>
  next |node: 2: *> => |Wednesday>

  day |node: 3: 1> => ngrams |Wednesday>
  day |node: 3: 2> => ngrams |wed>
  day |node: 3: 3> => ngrams |We>
  previous |node: 3: *> => |Tuesday>
  id |node: 3: *> => |Wednesday>
  next |node: 3: *> => |Thursday>

  day |node: 4: 1> => ngrams |Thursday>
  day |node: 4: 2> => ngrams |thurs>
  day |node: 4: 3> => ngrams |Th>
  previous |node: 4: *> => |Wednesday>
  id |node: 4: *> => |Thursday>
  next |node: 4: *> => |Friday>

  day |node: 5: 1> => ngrams |Friday>
  day |node: 5: 2> => ngrams |fri>
  day |node: 5: 3> => ngrams |Fr>
  previous |node: 5: *> => |Thursday>
  id |node: 5: *> => |Friday>
  next |node: 5: *> => |Saturday>

  day |node: 6: 1> => ngrams |Saturday>
  day |node: 6: 2> => ngrams |sat>
  day |node: 6: 3> => ngrams |Sa>
  previous |node: 6: *> => |Friday>
  id |node: 6: *> => |Saturday>
  next |node: 6: *> => |Sunday>

  day |node: 7: 1> => ngrams |Sunday>
  day |node: 7: 2> => ngrams |sun>
  day |node: 7: 3> => ngrams |Su>
  previous |node: 7: *> => |Saturday>
  id |node: 7: *> => |Sunday>
  next |node: 7: *> => |Monday>

  yesterday |*> #=> previous drop-below[0.65] similar-input[day] ngrams |_self>
  today |*> #=> id drop-below[0.65] similar-input[day] ngrams |_self>
  tomorrow |*> #=> next drop-below[0.65] similar-input[day] ngrams |_self>
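Here is the same machinery as a toy in plain Python, using the same illustrative ngrams and simm stand-ins as in the guess-ket sketch above; only the ring of next links mirrors real structure from the code:
  from collections import Counter

  def ngrams(s):
      # toy stand-in for the ngrams operator: 1, 2 and 3 letter-ngram counts
      s = s.lower()
      return Counter(s[i:i + n] for n in (1, 2, 3) for i in range(len(s) - n + 1))

  def simm(a, b):
      # toy similarity metric: shared counts over the larger total weight
      shared = sum(min(a[k], b[k]) for k in a.keys() & b.keys())
      return shared / max(sum(a.values()), sum(b.values()))

  days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
  patterns = {'Monday': ['mon', 'Mo'], 'Tuesday': ['tue', 'Tu'], 'Wednesday': ['wed', 'We'],
              'Thursday': ['thurs', 'Th'], 'Friday': ['fri', 'Fr'],
              'Saturday': ['sat', 'Sa'], 'Sunday': ['sun', 'Su']}

  def today(s, t=0.65):
      # similar-input[day] then drop-below[t]: best matching node, or nothing
      score, day = max((simm(ngrams(s), ngrams(p)), d) for d in days for p in [d] + patterns[d])
      return day if score >= t else None

  def tomorrow(s):
      # follow the winning node's next link; weekdays are mod 7 because the links form a ring
      day = today(s)
      return days[(days.index(day) + 1) % 7] if day else None

  print(tomorrow('sun'))    # Monday
  print(tomorrow('tues'))   # Wednesday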
Now, some example usages in the console:
-- correct spelling means coeff = 1:
sa: tomorrow |sun>
1.0|Monday>

-- spelling is not perfect, but close enough (with respect to strings mapped to letter-ngrams) that we can guess what was meant:
sa: tomorrow |tues>
0.667|Wednesday>

-- making use of operator exponentiation. In this case equivalent to "tomorrow tomorrow tomorrow"
-- also note the coeff propagates. If we shoved a "clean" sigmoid in the "yesterday, today and tomorrow" operators, we could change that behaviour.
-- eg: yesterday |*> #=> previous clean drop-below[0.65] similar-input[day] ngrams |_self>
sa: tomorrow^3 |tues>
0.667|Friday>

-- "tomorrow" and "yesterday" are perfect inverses of each other:
sa: tomorrow yesterday |fri>
1.0|Friday>

sa: yesterday tomorrow |fri>
1.0|Friday>

-- mapping abbreviation to the full word:
sa: today |Sa>
|Saturday>

sa: yesterday |thurs>
|Wednesday>

-- typo, "thrusday" instead of "thursday", but the code guessed what we meant.
-- this is one of the main benefits of if-then machines: you usually don't have to get the input exactly right (depending on how you set the drop threshold t).
sa: yesterday |thrusday>
0.667|Wednesday>

-- this is an example of over-counting, I suppose you could call it.
-- since "thursd" matched both:
-- day |node: 4: 1> => ngrams |Thursday>
-- day |node: 4: 2> => ngrams |thurs>
-- we briefly mentioned this possibility in my first if-then machine post.
sa: yesterday |thursd>
1.514|Wednesday>
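-- to see where the 1.514 comes from, assume simm is the sum of shared letter-ngram counts over the larger total (my toy stand-in, though it reproduces the console numbers here):
-- "thursd" has 15 ngrams, "thurs" has 12, and all 12 are shared, so simm = 12/15 = 0.8
-- "thursday" has 21 ngrams, 15 of them shared with "thursd", so simm = 15/21 = 0.714
-- both survive drop-below[0.65], and "yesterday" is linear, so the coeffs sum: 0.8 + 0.714 = 1.514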

-- Next, we have a couple of function operators that return today's time and date:
sa: current-time
|time: 20:33:16>

sa: current-date
|date: 2016-02-11>

-- and we have another function operator that converts dates to days of the week:
-- what day of the week is New Year:
sa: day-of-the-week |date: 2016-1-1>
|day: Friday>

-- what day of the week is today?:
sa: day-of-the-week current-date
|day: Thursday>

-- what day was it three days ago?
-- NB: not a 100% match because of the "day: " prefix.
sa: yesterday^3 day-of-the-week current-date
0.702|Monday>

-- if you care about that, one fix is to remove the category, ie extract the value:
-- another fix is to add more patterns to our if-then machines
-- eg:
-- day |node: 2: 4> => ngrams |day: Tuesday>
-- day |node: 2: 5> => ngrams |day: tue>
-- day |node: 2: 6> => ngrams |day: Tu>
-- there are other possible fixes too.
-- eg:
-- ngrams |*> #=> letter-ngrams[1,2,3] lower-case extract-value |_self>
sa: extract-value day-of-the-week current-date
|Thursday>

-- what day was it three days ago?
sa: yesterday^3 extract-value day-of-the-week current-date
1.0|Monday>

-- what day is it five days from now?
sa: tomorrow^5 extract-value day-of-the-week current-date
1.0|Tuesday>

-- now, our "tomorrow, yesterday and today" operators are linear (since they are defined with a |*> rule).
-- so a quick demonstration of that:
sa: tomorrow^3 (|Monday> + |Tuesday> + |Saturday>)
1.0|Thursday> + 1.0|Friday> + |Tuesday>
-- and similarly for the other two operators.

-- finally, weekdays are mod 7:
sa: tomorrow^7 |thurs>
1.0|Thursday>

sa: yesterday^21 |thurs>
|Thursday>
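As an aside, a function operator like day-of-the-week only needs a few lines of plain Python; here is a sketch (the project's actual implementation may differ):
  from datetime import datetime

  def day_of_the_week(date_ket):
      # expects a ket label like "date: 2016-1-1"; returns a "day: " prefixed label
      value = date_ket.split(':', 1)[1].strip()
      return 'day: ' + datetime.strptime(value, '%Y-%m-%d').strftime('%A')

  print(day_of_the_week('date: 2016-1-1'))    # day: Friday
  print(day_of_the_week('date: 2016-2-11'))   # day: Thursday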
I guess that is about it: a fairly simple, somewhat useful, 7 if-then-machine system. And an observation I want to make: operator definition time is usually on the ugly side, as it kind of is above, but operator application time is usually quite clean. I think that is not a bad property to have, though I didn't really design it that way; it is just how it turned out. So perhaps one use case is: if defining the desired operators is too messy for you personally, find them implemented elsewhere on the net and just web-load the sw file. Heh, assuming I can get anyone interested in the sw file format!

A few comments:
1) I had to hand-tweak the drop-below threshold to 0.65. If I set it much higher than that, I wasn't matching things I wanted to match. And if I set it as low as 0.6, then "Monday" also matched "Sunday":
sa: id drop-below[0.6] similar-input[day] ngrams |Monday>
1.0|Monday> + 0.6|Sunday>
2) If my proposition is right that if-then machines are a fairly good mathematical approximation to biological neurons, then the above is only a 7-neuron system. The brain has tens of billions of neurons (and trillions of synapses)! That is a lot of processing power!! Though our ngrams operator probably needs a few neurons too; I don't really know at this point how many.
3) here is one way to find the full set of days, given a starting day. I'm not sure it is all that useful in this particular case, but it probably is for other if-then machine systems:
sa: exp-max[tomorrow] |Monday>
2|Monday> + 1.0|Tuesday> + |Wednesday> + 1.0|Thursday> + |Friday> + 1.0|Saturday> + 1.0|Sunday>
Whether we want to tweak exp-max[] so that it doesn't over-count, I'm not yet sure. Probably cleaner if we did.
4) we can define things like the "day-after-tomorrow" operator:
-- define the operator:
sa: day-after-tomorrow |*> #=> tomorrow^2 day-of-the-week current-date |>

-- invoke it:
sa: day-after-tomorrow |x>
0.702|Saturday>
Noting the 0.702 coeff is again from the "day: " prefix. And we could define plenty of others, like "day-before-yesterday", and so on.
5) for completeness, here is what we now know:
sa: dump
----------------------------------------
|context> => |context: weekday if-then machines>

ngrams |*> #=> letter-ngrams[1,2,3] lower-case |_self>
yesterday |*> #=> previous drop-below[0.65] similar-input[day] ngrams |_self>
today |*> #=> id drop-below[0.65] similar-input[day] ngrams |_self>
tomorrow |*> #=> next drop-below[0.65] similar-input[day] ngrams |_self>
day-after-tomorrow |*> #=> tomorrow^2 day-of-the-week current-date |>

day |node: 1: 1> => |m> + |o> + |n> + |d> + |a> + |y> + |mo> + |on> + |nd> + |da> + |ay> + |mon> + |ond> + |nda> + |day>

day |node: 1: 2> => |m> + |o> + |n> + |mo> + |on> + |mon>

day |node: 1: 3> => |m> + |o> + |mo>

previous |node: 1: *> => |Sunday>
id |node: 1: *> => |Monday>
next |node: 1: *> => |Tuesday>

day |node: 2: 1> => |t> + |u> + |e> + |s> + |d> + |a> + |y> + |tu> + |ue> + |es> + |sd> + |da> + |ay> + |tue> + |ues> + |esd> + |sda> + |day>

day |node: 2: 2> => |t> + |u> + |e> + |tu> + |ue> + |tue>

day |node: 2: 3> => |t> + |u> + |tu>

previous |node: 2: *> => |Monday>
id |node: 2: *> => |Tuesday>
next |node: 2: *> => |Wednesday>

day |node: 3: 1> => |w> + 2|e> + 2|d> + |n> + |s> + |a> + |y> + |we> + |ed> + |dn> + |ne> + |es> + |sd> + |da> + |ay> + |wed> + |edn> + |dne> + |nes> + |esd> + |sda> + |day>

day |node: 3: 2> => |w> + |e> + |d> + |we> + |ed> + |wed>

day |node: 3: 3> => |w> + |e> + |we>

previous |node: 3: *> => |Tuesday>
id |node: 3: *> => |Wednesday>
next |node: 3: *> => |Thursday>

day |node: 4: 1> => |t> + |h> + |u> + |r> + |s> + |d> + |a> + |y> + |th> + |hu> + |ur> + |rs> + |sd> + |da> + |ay> + |thu> + |hur> + |urs> + |rsd> + |sda> + |day>

day |node: 4: 2> => |t> + |h> + |u> + |r> + |s> + |th> + |hu> + |ur> + |rs> + |thu> + |hur> + |urs>

day |node: 4: 3> => |t> + |h> + |th>

previous |node: 4: *> => |Wednesday>
id |node: 4: *> => |Thursday>
next |node: 4: *> => |Friday>

day |node: 5: 1> => |f> + |r> + |i> + |d> + |a> + |y> + |fr> + |ri> + |id> + |da> + |ay> + |fri> + |rid> + |ida> + |day>

day |node: 5: 2> => |f> + |r> + |i> + |fr> + |ri> + |fri>

day |node: 5: 3> => |f> + |r> + |fr>

previous |node: 5: *> => |Thursday>
id |node: 5: *> => |Friday>
next |node: 5: *> => |Saturday>

day |node: 6: 1> => |s> + 2|a> + |t> + |u> + |r> + |d> + |y> + |sa> + |at> + |tu> + |ur> + |rd> + |da> + |ay> + |sat> + |atu> + |tur> + |urd> + |rda> + |day>

day |node: 6: 2> => |s> + |a> + |t> + |sa> + |at> + |sat>

day |node: 6: 3> => |s> + |a> + |sa>

previous |node: 6: *> => |Friday>
id |node: 6: *> => |Saturday>
next |node: 6: *> => |Sunday>

day |node: 7: 1> => |s> + |u> + |n> + |d> + |a> + |y> + |su> + |un> + |nd> + |da> + |ay> + |sun> + |und> + |nda> + |day>

day |node: 7: 2> => |s> + |u> + |n> + |su> + |un> + |sun>

day |node: 7: 3> => |s> + |u> + |su>

previous |node: 7: *> => |Saturday>
id |node: 7: *> => |Sunday>
next |node: 7: *> => |Monday>
----------------------------------------
And I guess that is it for this post.