Tuesday, 30 June 2015

on emerging patterns

So, consider the English expression "I see a pattern forming". Well, maybe we can encode that idea in BKO. Let's say we have a series of examples. Individually they just look like noise. But, if we add them in a BKO sense, then "I see a pattern forming" corresponds to a distinctive shape emerging from the noise.

The idea is, some of the kets in the examples correspond to signal, and some correspond to noise. As we add them up the signal kets "reinforce" (ie, their coeffs increase), but presumably the noise is random from sample to sample, so the noise kets coeffs remain small.

We can extract the "signal" using something like this (using some operator foo):
foo |signal> => drop-below[t] (foo |example 1> + foo |example 2> + foo |example 3> + ... + foo |example n>)
I hope that makes sense.

Update: I guess I have a closer to real world application of this idea. Consider the list of WWII leaders: Roosevelt, Churchill, Stalin and Hitler.

Then in BKO we might do something like:
sa: everything-we-know-about |*> #=> apply(supported-ops|_self>,|_self>)

sa: the-list-of |WWII leaders> => |Roosevelt> + |Churchill> + |Stalin> + |Hitler>
sa: coeff-sort everything-we-know-about the-list-of |WWII leaders>
And hopefully what emerges is something about WWII.

Update: for want of a better place to put this. The above makes me think of:
sa: everything-we-know-about |*> #=> apply(supported-ops|_self>,|_self>)
sa: map[everything-we-know-about,everything] some |list>
sa: similar[everything] |object>
This should be a quite general way to find general similarity between objects. Haven't tested it, but I'm pretty sure it is correct.

Update: again, for want of a better place, we can also do this. Consider we have knowledge on quite a few animals, including what they like to eat. We also have a lot of knowledge on foxes, but we don't know what they eat. But, we can guess:
guess-what-eat |fox> => select[1,1] coeff-sort eat select[1,5] similar[everything] |fox>
ie, in words, find the 5 most similar animals given what we know. Find what they eat. Sort that list. Return the result with the highest coeff.

Update: we can also use this everything as a way to help with language translation. Maybe something like:
best-guess-German-for |*> #=> select[1,1] similar[English-everything,German-everything] |_self>
Kind of hard to test this idea at the moment. I need some way to map words to everything we know about a word. Heh, cortical.io word SDR's would be a nice start! I wonder how they made them?

Update: a little more on the idea of emerging patterns. Simple enough, the time gap between two events.

Start with a web log file. For each IP find the time gap between retrievals. I imagine this will be quite distinctive. eg, a robot slurping down a page every x seconds, should have a nice big spike around the x second mark (though it depends on how fine grained your time sample is, for how broad this peak will be. The wider your bucket size, the sharper the peak).

Next, if you use the random wait, as in wget:
--random-wait               wait from 0.5*WAIT...1.5*WAIT secs between retrievals
then that should have a distinctive pattern too.

Finally, you should get a clear signal of roughly how often you press refresh on a website when you are bored. This will probably be quite noisy, so the smooth operator should help. Also, quite likely to give you an indication of how long you are asleep. Say you normally sleep for about 8 hours. Then there should be at least some kets (probably roughly 1 per day) with a time delta greater than 8 hours. Whether you web surf at work would also potentially show up.

Last example: apparently every person has a distinctive typing pattern. We could find that simply enough, just by measuring the time delta between different characters on a keyboard. eg, what is the time delta when you type "I'm" between "I" and "'", and "'" and "m". Or typing "The" the time between "T" and "h", and "h" and "e". Or typing "rabbit" and the delta between "r" and "a", "a" and "b", "b" and "b" and so on. Presumably, if you have a big enough sample, and you map this to a superposition, then we could run a similar[typing-delta] |person: X> and guess who typed it.

No comments:

Post a Comment