The details:
-- NB: need to tweak the destination sw file inside these scripts:
$ ./phi-superpositions.py 5 work-on-handwritten-digits/phi-transformed-images-v2--10k-test--edge-enhanced-20/
$ ./phi-superpositions-v3.py 5 work-on-handwritten-digits/phi-transformed-images-v2--60k-train--edge-enhanced-20/

-- now we have the sw files, load them into the console:
load image-phi-superpositions--test-10k--using-edge-enhanced-features--k_5--t_0_4.sw
load image-phi-superpositions--train-60k--using-edge-enhanced-features--k_5--t_0_4.sw
load mnist-test-labels--edge-enhanced.sw
load mnist-train-labels--edge-enhanced.sw

-- define our if-then machine operator:
simm-op |*> #=> 100 select[1,40] similar-input[train-log-phi-sp] log-phi-sp |_self>

-- find all the similarity results:
map[simm-op,similarity] rel-kets[log-phi-sp]

-- define the operators that create the result table:
equal? |*> #=> equal(100 test-label |_self>,h |_self>)

h |*> #=> normalize[100] select[1,1] coeff-sort train-label select[1,1] similarity |_self>
score |top 1> => 0.01 equal? ket-sort rel-kets[similarity] |>

h |*> #=> normalize[100] select[1,1] coeff-sort train-label select[1,2] similarity |_self>
score |top 2> => 0.01 equal? ket-sort rel-kets[similarity] |>

h |*> #=> normalize[100] select[1,1] coeff-sort train-label select[1,3] similarity |_self>
score |top 3> => 0.01 equal? ket-sort rel-kets[similarity] |>
...
h |*> #=> normalize[100] select[1,1] coeff-sort train-label select[1,30] similarity |_self>
score |top 30> => 0.01 equal? ket-sort rel-kets[similarity] |>

-- finally, spit out the result table:
table[top-k,score] rel-kets[score]
+--------+------------+
| top-k  | score      |
+--------+------------+
| top 1  | 93.10 True |
| top 2  | 93.10 True |
| top 3  | 94.12 True |
| top 4  | 94.54 True |
| top 5  | 94.57 True |
| top 6  | 94.60 True |
| top 7  | 94.55 True |
| top 8  | 94.53 True |
| top 9  | 94.43 True |
| top 10 | 94.53 True |
| top 11 | 94.43 True |
| top 12 | 94.48 True |
| top 13 | 94.44 True |
| top 14 | 94.49 True |
| top 15 | 94.37 True |
| top 16 | 94.36 True |
| top 17 | 94.29 True |
| top 18 | 94.22 True |
| top 19 | 94.20 True |
| top 20 | 94.19 True |
| top 21 | 94.17 True |
| top 22 | 94.16 True |
| top 23 | 94.16 True |
| top 24 | 94.09 True |
| top 25 | 94.05 True |
| top 26 | 94.06 True |
| top 27 | 93.97 True |
| top 28 | 94 True    |
| top 29 | 94 True    |
| top 30 | 93.99 True |
+--------+------------+

-- save the results:
save full-mnist-phi-transformed-edge-enhanced--saved.sw

If we are allowed to pick and choose how many results to average over, averaging over the top 6 gives 94.6% correct, or 5.4% error. However, compared with the results on the MNIST home page, 5.4% error is roughly a 1998-level result, or slightly better. But like I said, this is a first attempt; surely I can improve on it.
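For readers who don't use the console, the `h` and `score` operators above read roughly as a similarity-weighted vote over the k most similar training images. Here is a minimal Python sketch of that reading. It assumes that applying `train-label` to a superposition adds up the coefficients of images that share a label. The function names, the toy ids, and the toy numbers are all hypothetical; none of them come from the sw files.

```python
from collections import defaultdict

def top_k_label(similarity, train_label, k):
    """similarity: list of (train_id, coeff) pairs for one test image.
    train_label: dict mapping train_id -> digit label."""
    # Sort by coefficient so the sketch does not depend on input order,
    # then keep the k best matches (the role of select[1,k]).
    top = sorted(similarity, key=lambda pair: pair[1], reverse=True)[:k]
    # Assumed behaviour of train-label: coefficients sharing a label add up.
    votes = defaultdict(float)
    for train_id, coeff in top:
        votes[train_label[train_id]] += coeff
    # coeff-sort followed by select[1,1]: keep the heaviest label.
    return max(votes, key=votes.get)

def score(similarities, train_label, test_label, k):
    """Percentage of test images whose predicted label matches the true one."""
    correct = sum(
        top_k_label(sim, train_label, k) == test_label[test_id]
        for test_id, sim in similarities.items()
    )
    return 100.0 * correct / len(similarities)

# Hypothetical toy data: three training images, two test images.
train_label = {"t1": 3, "t2": 8, "t3": 8}
test_label = {"x1": 8, "x2": 8}
similarities = {
    "x1": [("t1", 0.6), ("t2", 0.5), ("t3", 0.4)],
    "x2": [("t2", 0.9), ("t1", 0.2)],
}

for k in (1, 2, 3):
    print("top", k, score(similarities, train_label, test_label, k))
```

On this toy data the score is 50.0 for k = 1 and k = 2, and 100.0 for k = 3, because the two weaker "8" matches for `x1` together outweigh the single stronger "3" match. That is the kind of effect that could make the score in the table change with k.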
Anyway, I think my point is made: "we can make a lot of progress in pattern recognition if we can find mappings from objects to well-behaved, deterministic, distinctive superpositions". I just need to find a better mapping from digit images to superpositions.
Update: I have a new idea to test. Maybe if-then machines don't work the way I expected. Consider:
pattern |node 1: 1> => sp1
then |node 1: 1> => then-sp

pattern |node 2: 1> => sp2
then |node 2: 1> => then-sp

versus:
pattern |node 1: 1> => sp1
pattern |node 1: 2> => sp2
then |node 1: *> => then-sp

I had assumed, without much thought, that these are functionally equivalent. That is, we can expand or contract if-then machines whenever they share a "then" pattern. I now suspect, but have yet to test, that the second one, where we contract the if-then machines, might work better. Consider an input spatial pattern that is partly sp1 and partly sp2. Presumably the second case will give better results. Anyway, I now have to try this on MNIST. So instead of effectively 60,000 if-then machines, we will have 10.
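To make the suspicion concrete, here is a toy Python sketch of one possible reading of the difference. It is not tested console behaviour. It assumes that a contracted machine pools the similarity of all its patterns before a threshold is applied, while expanded machines are each thresholded on their own. The overlap measure `simm`, the `fire` helper, and the threshold 0.6 are all assumptions made for illustration.

```python
def simm(f, g):
    """Assumed overlap similarity between two superpositions (dict ket -> coeff):
    sum of element-wise minima divided by the larger of the two totals."""
    total = max(sum(f.values()), sum(g.values()))
    if total == 0:
        return 0.0
    overlap = sum(min(coeff, g.get(ket, 0)) for ket, coeff in f.items())
    return overlap / total

def fire(patterns, x, threshold):
    """Pool similarity over all patterns of one machine.
    Return the pooled value if it reaches the threshold, else 0.0."""
    pooled = sum(simm(p, x) for p in patterns)
    return pooled if pooled >= threshold else 0.0

sp1 = {"a": 1, "b": 1}
sp2 = {"c": 1, "d": 1}
x = {"a": 1, "c": 1}      # an input that is partly sp1 and partly sp2

threshold = 0.6           # hypothetical cut-off
expanded = [fire([sp1], x, threshold), fire([sp2], x, threshold)]
contracted = fire([sp1, sp2], x, threshold)
print(expanded, contracted)
```

Under these assumptions each expanded machine sees a similarity of only 0.5 and stays silent, while the contracted machine pools both matches to 1.0 and fires. Whether the real if-then machines behave this way is exactly what still needs testing on MNIST.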
Hi Garry,

I just found your work from a Numenta blog post. I agree that superposition of SDRs is the secret of intelligence, and I have been working on it for many years. I concluded many years ago that SDR-based computation (on classical machines) constitutes quantum computing (QC). Essentially, the mainstream QC guys are still thinking as localists, in particular imagining that each probability amplitude must be represented by a unique physical element. But in SDR, each probability amplitude is represented as an SDR over a field of elements, which allows a field of N binary (and classical) physical elements to represent an exponential number (in N) of probability amplitudes. I haven't delved deeply into your work yet, but it looks deeply similar to my theory, Sparsey, though I don't use the formal QC notation. I thought I'd reach out just because there are so few of us on the planet who really understand the importance of SDR. I look forward to looking at your work more closely to understand its relation to my own.

Thanks,
Rod Rinkus