Tuesday, 10 November 2015

new function operator: inhibition[t]

The idea is, given a superposition, to increase the difference between the top-most ket and the rest. I don't yet know when it will be useful, but it seems it might be, and it was only a couple of lines of code.

How does it work? Well, you feed in a parameter that specifies how much to suppress the smaller terms.
-- define a superposition:
sa: |list> => |a> + 5.2|b> + 23|c> + 13|d> + 17|e> + 17|f> + 15|g>

-- suppress everything except the biggest term:
sa: inhibition[1] "" |list>
0|a> + 0|b> + 23|c> + 0|d> + 0|e> + 0|f> + 0|g>

-- inhibit at half strength:
sa: inhibition[0.5] "" |list>
0.5|a> + 2.6|b> + 23|c> + 6.5|d> + 8.5|e> + 8.5|f> + 7.5|g>

-- negative inhibit (ie, suppress the biggest term):
sa: inhibition[-1] "" |list>
2|a> + 10.4|b> + 23|c> + 26|d> + 34|e> + 34|f> + 30|g>

-- and again:
sa: inhibition[-1]^2 "" |list>
4|a> + 20.8|b> + 46|c> + 52|d> + 34|e> + 34|f> + 60|g>

-- and again:
sa: inhibition[-1]^3 "" |list>
8|a> + 41.6|b> + 92|c> + 104|d> + 68|e> + 68|f> + 60|g>
Hopefully that makes some sense.
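Reading the behaviour off the examples: the biggest coefficient is left alone, and every other coefficient is scaled by (1 - t). Here is a sketch of that in Python (not the actual implementation), treating a superposition as a list of (label, coeff) pairs:
def inhibition(one, t):
  # one is a superposition: a list of (label, coeff) pairs
  # leave the biggest coefficient alone, scale the rest by (1 - t);
  # with t = 1 the smaller terms vanish, with t = -1 they double
  if len(one) == 0:
    return one
  t = float(t)
  max_coeff = max(coeff for label, coeff in one)
  return [(label, coeff if coeff == max_coeff else coeff * (1 - t))
          for label, coeff in one]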

Update: inhibition[-1]^k may be one way to approach the idea of creativity. When thinking about a problem, the obvious, boring answer will have the highest coefficient. But if we suppress the highest few coefficients, then maybe we are left with something more creative. Of course, we need some way to test that a proposed solution satisfies the required properties. I'm not yet sure how to implement this, but it seems to be something the brain makes heavy use of: posing questions, and then hunting for answers with the right properties.

Wednesday, 28 October 2015

simple particle entanglement example

We can use our BKO scheme to encode a simplified version of particle entanglement. In this example we have two particles, each with spin either up or down. The idea is that we don't know what state the particles are in until we measure. It makes use of weighted-pick-elt, which has some similarities to wave-function collapse.
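As an aside, here is roughly how I think of weighted-pick-elt in Python (not the project's actual code): it returns a single ket from a superposition, chosen with probability proportional to its coefficient:
import random

def weighted_pick_elt(one):
  # one is a superposition: a list of (label, coeff) pairs
  # return a single ket, chosen with probability proportional to its coeff
  total = sum(coeff for label, coeff in one)
  r = random.uniform(0, total)
  upto = 0
  for label, coeff in one:
    upto += coeff
    if upto >= r:
      return [(label, 1)]
  return []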

Here is some BKO:
----------------------------------------
|context> => |context: simple entanglement example>

entanglement-1 |particles> => |particle 1: spin up> + |particle 2: spin down>
entanglement-2 |particles> => |particle 1: spin down> + |particle 2: spin up>
the-list-of-possible-entanglements |particles> => |op: entanglement-1> + |op: entanglement-2>
measure |particles> #=> apply(weighted-pick-elt the-list-of-possible-entanglements|_self>,|_self>)
----------------------------------------
Now, let's measure our particles:
sa: measure |particles>
|particle 1: spin down> + |particle 2: spin up>
It's random, so let's try again:
sa: measure |particles>
|particle 1: spin down> + |particle 2: spin up>
And again:
sa: measure |particles>
|particle 1: spin up> + |particle 2: spin down>
I guess that is simple enough. And if we want to encode the idea that the particles take a fixed state on measurement, then we should use memoizing rules instead of stored rules:
wave-fn-collapse-measure |particles> !=> apply(weighted-pick-elt the-list-of-possible-entanglements|_self>,|_self>)
Now, see what we have:
sa: dump
----------------------------------------
|context> => |context: simple entanglement example>

entanglement-1 |particles> => |particle 1: spin up> + |particle 2: spin down>
entanglement-2 |particles> => |particle 1: spin down> + |particle 2: spin up>
the-list-of-possible-entanglements |particles> => |op: entanglement-1> + |op: entanglement-2>
measure |particles> #=> apply(weighted-pick-elt the-list-of-possible-entanglements|_self>,|_self>)
wave-fn-collapse-measure |particles> !=> apply(weighted-pick-elt the-list-of-possible-entanglements|_self>,|_self>)
----------------------------------------
Now measure our particles and then see what we know:
sa: wave-fn-collapse-measure |particles>
|particle 1: spin up> + |particle 2: spin down>

sa: dump
----------------------------------------
|context> => |context: simple entanglement example>

entanglement-1 |particles> => |particle 1: spin up> + |particle 2: spin down>
entanglement-2 |particles> => |particle 1: spin down> + |particle 2: spin up>
the-list-of-possible-entanglements |particles> => |op: entanglement-1> + |op: entanglement-2>
measure |particles> #=> apply(weighted-pick-elt the-list-of-possible-entanglements|_self>,|_self>)
wave-fn-collapse-measure |particles> => |particle 1: spin up> + |particle 2: spin down>
----------------------------------------
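If the memoizing-rule behaviour is unclear, here is a toy Python analogy (purely illustrative, not project code): a !=> rule acts like a memoized function, computed once on first use and then frozen:
import random

_cache = {}

def wave_fn_collapse_measure(particles):
  # first call: pick an entanglement at random; later calls: return the
  # stored result, just as the !=> rule above is replaced by a plain =>
  # rule after its first use
  if particles not in _cache:
    _cache[particles] = random.choice(
      ['|particle 1: spin up> + |particle 2: spin down>',
       '|particle 1: spin down> + |particle 2: spin up>'])
  return _cache[particles]

print(wave_fn_collapse_measure('particles'))   # random on the first call
print(wave_fn_collapse_measure('particles'))   # the same answer from then on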
Anyway, hope that is clear.

Sunday, 11 October 2015

shopping with process-reaction()

Today, an example using process-reaction() to go shopping. I guess the idea is that buying stuff is a kind of reaction:
price-of-object -> object
First, learn some prices:
price |apple> => 0.6|dollar>
price |orange> => 0.8|dollar>
price |milk> => 2.3|dollar>
price |coffee> => 5.5|dollar>
price |steak> => 9|dollar>
Now, let's go shopping (we have $30 to spend):
-- buy an orange:
sa: process-reaction(30|dollar>,price |orange>,|orange>)
29.2|dollar> + |orange>

-- buy 4 apples:
sa: process-reaction(29.2|dollar> + |orange>,price 4 |apple>,4 |apple>)
26.8|dollar> + |orange> + 4|apple>

-- buy milk, coffee and steak:
sa: process-reaction(26.8|dollar> + |orange> + 4|apple>,price |milk> + price |coffee> + price |steak>,|milk> + |coffee> + |steak>)
10|dollar> + |orange> + 4|apple> + |milk> + |coffee> + |steak>
Now that we have it working, we can compact it! First, define a shopping list:
list-for |shopping> => |orange> + 4|apple> + |milk> + |coffee> + |steak>
Now, let's buy it all at once:
sa: process-reaction(30|dollar>,price list-for |shopping>,list-for |shopping>)
10|dollar> + |orange> + 4|apple> + |milk> + |coffee> + |steak>
I'm impressed by how easy that was! Define your prices, define your shopping list, and then you are essentially done. And a key part of why it is so easy is the linearity of the price operator acting on the shopping list.
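To spell out that linearity:
price list-for |shopping>
-- expands to:
price |orange> + 4 price |apple> + price |milk> + price |coffee> + price |steak>
= 0.8|dollar> + 2.4|dollar> + 2.3|dollar> + 5.5|dollar> + 9|dollar>
= 20|dollar>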

BTW, there is a quirk: you get stuff for free if it doesn't have a defined price. Whether we want to tweak process-reaction() to prevent this, or just be aware of it, I'm not yet sure. Anyway, here is an example. We add a tomato to the shopping list, but we don't have a price for it:
sa: list-for |shopping> +=> |tomato>
sa: process-reaction(30|dollar>,price list-for |shopping>,list-for |shopping>)
10|dollar> + |orange> + 4|apple> + |milk> + |coffee> + |steak> + |tomato>
And note we still have $10, and a tomato. We got it for free! BTW, here is the price of the tomato:
sa: price |tomato>
|>
Hrmm... I now think it is impossible to tweak process-reaction() to handle undefined prices. Why? Because "price list-for |shopping>" is calculated before it is even sent to process-reaction(), and since |> is the identity element for superpositions, the undefined price is silently dropped.
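For example: price (|apple> + |tomato>) gives 0.6|dollar> + |>, which is just 0.6|dollar>. So by the time process-reaction() sees the cost, all trace of the tomato is already gone.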

Finally, we can find the cost of our shopping simply enough:
sa: price list-for |shopping>
20|dollar>
Update: I found one way to solve the free tomato problem. It goes something like this:
price |*> => |undefined price>
ie, if price is not defined for an object, return |undefined price> (making use of label descent). Now if we look at the price of our shopping list:
sa: price list-for |shopping>
20|dollar> + |undefined price>
Then we try to buy our full shopping list:
sa: process-reaction(30|dollar>,price list-for |shopping>,list-for |shopping>)
30|dollar>
And we see our shopping list didn't go through. This is a good thing. Process-reaction didn't know how to handle |undefined price>, and so the reaction was not processed.

Here is another way to handle it. The |undefined price> learn rule gets in the way, so let's drop back to this knowledge:
price |apple> => 0.6|dollar>
price |orange> => 0.8|dollar>
price |milk> => 2.3|dollar>
price |coffee> => 5.5|dollar>
price |steak> => 9|dollar>
list-for |shopping> => |orange> + 4|apple> + |milk> + |coffee> + |steak> + |tomato>
Define a new operator:
price-is-defined |*> #=> do-you-know price |_self>
Now, filter our shopping list to those we know the price of, and then buy the items:
sa: list-of |available items> => such-that[price-is-defined] list-for |shopping>
sa: process-reaction(30|dollar>,price list-of |available items>,list-of |available items>)
10|dollar> + |orange> + 4|apple> + |milk> + |coffee> + |steak>
And noting we didn't get a free tomato this time.

Update: just a quick conversion of operator names to be more like regular English:
the-price-for |apple> => 0.6|dollar>
the-price-for |orange> => 0.8|dollar>
the-price-for |milk> => 2.3|dollar>
the-price-for |coffee> => 5.5|dollar>
the-price-for |steak> => 9|dollar>
the |shopping list> => |orange> + 4|apple> + |milk> + |coffee> + |steak>
Now, ask "what is the price for the shopping list?":
sa: the-price-for the |shopping list>
20|dollar>
Cool, huh?

Eventually the plan is to have code that automatically casts English questions to BKO, and casts BKO answers back to English, but I don't fully know how to do that yet.

Thursday, 24 September 2015

new functions: union[op] and top[k]

Just a couple of small new function operators:

union[op] (|w> + |x> + |y> + |z>)
returns
union(op|w>,op|x>,op|y>,op|z>)
(well, it would, but currently we only have a 2 and 3 parameter union)
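As far as I recall, union here takes the ket-wise maximum of coefficients. A sketch in Python (the max behaviour is my assumption, and this is not the actual implementation), with superpositions as lists of (label, coeff) pairs:
def union(*superpositions):
  # ket-wise maximum of coefficients, preserving first-seen ket order
  order = []
  coeffs = {}
  for sp in superpositions:
    for label, value in sp:
      if label not in coeffs:
        order.append(label)
        coeffs[label] = value
      else:
        coeffs[label] = max(coeffs[label], value)
  return [(label, coeffs[label]) for label in order]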
A quick example:
sa: load fred-sam-friends.sw
sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
----------------------------------------

sa: union[friends] split |Fred Sam>
|Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie> + |George> + |Rober> + |Frank> + |Julie>
Next is top[k]. It returns the top k kets in a superposition, without changing their order. Though if several kets tie at the cut-off coeff, it will return more than k results.
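Not the actual implementation, but the behaviour can be sketched in Python like this (again with superpositions as lists of (label, coeff) pairs): keep every ket whose coeff is at least the k-th largest:
def top(one, k):
  # keep kets whose coeff is at least the k-th largest, preserving order
  k = int(k)
  if k <= 0:
    return []
  coeffs = sorted((coeff for label, coeff in one), reverse=True)
  if k >= len(coeffs):
    return list(one)
  threshold = coeffs[k - 1]
  return [(label, coeff) for label, coeff in one if coeff >= threshold]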
A quick example:
sa: top[0] (|a> + 3|b> + 2|c> + 9.3|d> + 0.5|e>)
|>

sa: top[1] (|a> + 3|b> + 2|c> + 9.3|d> + 0.5|e>)
9.3|d>

sa: top[3] (|a> + 3|b> + 2|c> + 9.3|d> + 0.5|e>)
3|b> + 2|c> + 9.3|d>

sa: top[10] (|a> + 3|b> + 2|c> + 9.3|d> + 0.5|e>)
|a> + 3|b> + 2|c> + 9.3|d> + 0.5|e>
And an example where more than 1 ket has the same coeff:
sa: top[1] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3.2|i>

sa: top[2] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3.2|i>

sa: top[3] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3.2|i>

sa: top[4] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3.2|i>

sa: top[5] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3.1|h> + 3.2|i>

sa: top[6] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + 3.2|c> + 3.2|d> + 3|e> + 3.1|h> + 3.2|i>

sa: top[7] (3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>)
3.2|a> + |b> + 3.2|c> + 3.2|d> + 3|e> + |f> + |g> + 3.1|h> + 3.2|i>
Note, it doesn't give exactly k results when more than one ket has the same coeff. I guess we could tweak it to give exactly k results, but then we would have to pick randomly from the set of kets with the same coeff. We could do that in code easily enough, but I'm not sure we want that as the default behaviour.

Saturday, 12 September 2015

representing nuclear decay

The idea is simple enough. Let's use our new function process-reaction() to encode an example of nuclear decay. In this case fission of U_235, but the idea is general enough to extend to other nuclear and chemical reactions.

First, let's learn some example knowledge about fission of U_235:
sa: load uranium-fission.sw
sa: dump
----------------------------------------
|context> => |context: uranium fission products>

fission-channel-1 |U: 235> => |Ba: 141> + |Kr: 92> + 3|n>
fission-channel-2 |U: 235> => |Xe: 140> + |Sr: 94> + 2|n>
fission-channel-3 |U: 235> => |La: 143> + |Br: 90> + 3|n>
fission-channel-4 |U: 235> => |Cs: 137> + |Rb: 96> + 3|n>
fission-channel-5 |U: 235> => |I: 131> + |Y: 89> + 16|n>
list-of-fission-channels |U: 235> => |op: fission-channel-1> + |op: fission-channel-2> + |op: fission-channel-3> + |op: fission-channel-4> + |op: fission-channel-5>

fission |*> #=> apply(weighted-pick-elt list-of-fission-channels |_self>,|_self>)
----------------------------------------
And a note that if we had probabilities for the different fission channels, we could encode them using coeffs other than 1. And this leads to an interesting thought. Standard QM needs complex numbers to work, so we can't reproduce that with the current BKO. But what we can do is represent the results of QM calculations, ie the resulting probabilities, in BKO. The above fission example is the first hint of this idea.

Now, let's fission our uranium:
sa: fission |U: 235>
|I: 131> + |Y: 89> + 16|n>

-- it's random, so if we try again we should get a different result:
sa: fission |U: 235>
|Xe: 140> + |Sr: 94> + 2|n>

sa: fission |U: 235>
|La: 143> + |Br: 90> + 3|n>

sa: fission |U: 235>
|Cs: 137> + |Rb: 96> + 3|n>

sa: fission |U: 235>
|Ba: 141> + |Kr: 92> + 3|n>

sa: fission |U: 235>
|I: 131> + |Y: 89> + 16|n>
This is already fun! But we can now use this to encode the full nuclear reaction:
n + U_235 -> fission-of(U_235)
Let's start with 3 neutrons and 4 atoms of U_235: 3|n> + 4|U: 235>
sa: process-reaction(3|n> + 4|U: 235>,|n> + |U: 235>,fission |U: 235>)
4|n> + 3|U: 235> + |Xe: 140> + |Sr: 94>

-- now again, this time starting with: 4|n> + 3|U: 235> + |Xe: 140> + |Sr: 94>
sa: process-reaction(4|n> + 3|U: 235> + |Xe: 140> + |Sr: 94>,|n> + |U: 235>,fission |U: 235>)
6|n> + 2|U: 235> + |Xe: 140> + |Sr: 94> + |La: 143> + |Br: 90>

-- and again. Feed in result from last reaction:
sa: process-reaction(6|n> + 2|U: 235> + |Xe: 140> + |Sr: 94> + |La: 143> + |Br: 90>,|n> + |U: 235>,fission |U: 235>)
8|n> + |U: 235> + |Xe: 140> + |Sr: 94> + 2|La: 143> + 2|Br: 90>

-- and again!
sa: process-reaction(8|n> + |U: 235> + |Xe: 140> + |Sr: 94> + 2|La: 143> + 2|Br: 90>,|n> + |U: 235>,fission |U: 235>)
23|n> + |Xe: 140> + |Sr: 94> + 2|La: 143> + 2|Br: 90> + |I: 131> + |Y: 89>

-- and again. Note, we have no U_235 left.
sa: process-reaction(23|n> + |Xe: 140> + |Sr: 94> + 2|La: 143> + 2|Br: 90> + |I: 131> + |Y: 89>,|n> + |U: 235>,fission |U: 235>)
23|n> + |Xe: 140> + |Sr: 94> + 2|La: 143> + 2|Br: 90> + |I: 131> + |Y: 89>
Noting that with no U_235 left, this final process-reaction() didn't change anything.

Then when we get around to implementing learn-sp rules, as mentioned in my last post, we can tidy this to:
fission-uranium-235 (*) #=> process-reaction(|_self>,|n> + |U: 235>,fission |U: 235>)
And then we should be able to do say 5 reactions in a row, starting with say 3 neutrons and 4 atoms of U_235:
fission-uranium-235^5 (3|n> + 4|U: 235>)
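Though learn-sp rules are not implemented yet, we can sketch what fission-uranium-235^5 would do in stand-alone Python (dicts as superpositions; purely an illustration, not project code):
import random

# the five fission channels for n + U_235, taken from the dump above;
# all equally weighted here, matching the coeffs of 1 in the sw file
channels = [
  {'Ba: 141': 1, 'Kr: 92': 1, 'n': 3},
  {'Xe: 140': 1, 'Sr: 94': 1, 'n': 2},
  {'La: 143': 1, 'Br: 90': 1, 'n': 3},
  {'Cs: 137': 1, 'Rb: 96': 1, 'n': 3},
  {'I: 131': 1, 'Y: 89': 1, 'n': 16},
]

def fission_step(bag):
  # one application of: process-reaction(bag, |n> + |U: 235>, fission |U: 235>)
  if bag.get('n', 0) < 1 or bag.get('U: 235', 0) < 1:
    return bag                          # no neutron or no uranium: do nothing
  result = dict(bag)
  result['n'] -= 1                      # consume a neutron
  result['U: 235'] -= 1                 # and a uranium atom
  if result['U: 235'] <= 0:
    del result['U: 235']
  for label, coeff in random.choice(channels).items():
    result[label] = result.get(label, 0) + coeff
  return result

bag = {'n': 3, 'U: 235': 4}
for _ in range(5):                      # ie, fission-uranium-235^5
  bag = fission_step(bag)
print(bag)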
This structure is presumably general enough to represent most chemical and nuclear reactions. And we are not limited to just one reaction type. For example:
chemical-reaction-1 (*) #=> process-reaction(|_self>,|x> + |y>,3|z>)
chemical-reaction-2 (*) #=> process-reaction(|_self>,|z>,2|a> + |b>)

|resulting bag of chemicals> => chemical-reaction-1 chemical-reaction-2^3 chemical-reaction-1 starting-bag-of-chemicals
And I suspect process-reaction() could be useful elsewhere, including things other than chemical/nuclear reactions. Eg I suspect we can use it to encode quantum entanglement too!

Update: Regarding the comment about process-reaction() being useful elsewhere, I have a couple of things in mind. One is to rename parts of a superposition. I don't yet have a concrete use-case in mind, but I think it might be useful. The other is a version of pattern recognition, ie something slightly different from our current similar[op]. I need to give that idea a little more thought though.

new function: process-reaction

I was thinking, how can we encode reactions (eg, chemical or nuclear) in the BKO knowledge representation scheme? I eventually decided we can't really do it at the moment, but with a couple of additions we can.

The first part of this is a new function, process-reaction().
Here is the python:
# one, two and three are superpositions:
# one is the current bag of objects, two the reactants, three the products
def process_reaction(one,two,three):
  # if "one" does not contain all of "two", the reaction can't proceed
  if intersection(two,one).count_sum() != two.count_sum():
    return one
  else:
    # otherwise, subtract the reactants and add the products
    return intersection_fn(del_fn3,one,two).drop() + three
Here is a simple example, making water from hydrogen and oxygen gas:
2 H2 + O2 -> 2 H2O

sa: process-reaction(5|H2> + 7|O2>,2|H2> + |O2>,2|H2O>)
3|H2> + 6|O2> + 2|H2O>
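For anyone without the project's superposition class at hand, here is a self-contained sketch of the same logic, using plain dicts mapping ket labels to coefficients (the representation is mine, purely illustrative):
def process_reaction(one, two, three):
  # one: current bag of objects, two: reactants, three: products
  if any(one.get(label, 0) < coeff for label, coeff in two.items()):
    return dict(one)                   # not enough reactants, do nothing
  result = dict(one)
  for label, coeff in two.items():     # subtract the reactants
    result[label] -= coeff
    if result[label] <= 0:
      del result[label]
  for label, coeff in three.items():   # add the products
    result[label] = result.get(label, 0) + coeff
  return result

# 2 H2 + O2 -> 2 H2O, starting with 5 H2 and 7 O2:
print(process_reaction({'H2': 5, 'O2': 7}, {'H2': 2, 'O2': 1}, {'H2O': 2}))
# {'H2': 3, 'O2': 6, 'H2O': 2}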
I'll give a more interesting example in the next post.

The next bit is that we need a way to learn superposition rules. We also have a separate motivation for this idea: we need a way for literal operators to act on whole superpositions, instead of just being applied linearly, ket by ket. An example I like to give is:
sa: M |yes> => |yes> + -1|no>
sa: M |no> => -1|yes> + |no>
sa: matrix[M]
[ no  ] = [ 1 -1 ] [ no  ]
[ yes ]   [ -1 1 ] [ yes ]

sa: drop M (|yes> + |no>)
|>

sa: drop M (0.8|yes> + 0.2|no>)
0.6|yes>
OK. So that works as expected/desired (a simple example of inhibition). But if we try using literal operators:
sa: clean-M |*> #=> drop M |_self>

sa: clean-M (|yes> + |no>)
|yes> + |no>

sa: clean-M (0.8|yes> + 0.2|no>)
0.8|yes> + 0.2|no>
ie, it didn't work at all! This is because literal operators are currently linear.
eg:
clean-M (|yes> + |no>)
-- expands to:
clean-M |yes> + clean-M |no>
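In toy Python terms (dicts as superpositions; purely illustrative), the difference is whether drop is applied to the whole superposition, or to each ket separately:
def M(sp):
  # M |yes> = |yes> - |no>, and M |no> = -|yes> + |no>
  yes, no = sp.get('yes', 0), sp.get('no', 0)
  return {'yes': yes - no, 'no': no - yes}

def drop(sp):
  # remove kets with coeff <= 0
  return {k: v for k, v in sp.items() if v > 0}

def add(a, b):
  return {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}

sp = {'yes': 0.8, 'no': 0.2}
# what we want: drop M (0.8|yes> + 0.2|no>)
print(drop(M(sp)))                      # roughly {'yes': 0.6}
# what a linear literal operator does: drop M 0.8|yes> + drop M 0.2|no>
print(add(drop(M({'yes': 0.8})), drop(M({'no': 0.2}))))
# {'yes': 0.8, 'no': 0.2}, unchanged, as in the console output above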
The solution (which I have not yet implemented in code) is:
learn-a-sp-rule (*) #=> foo( |_self>)
and then, when we invoke it, substitute the full superposition on the left into |_self>, rather than applying the rule one ket at a time.
And so we can now learn:
make-water (*) #=> process-reaction(|_self>,2|H2> + |O2>,2|H2O>)
clean-M (*) #=> drop M |_self>
Now I need to find the cleanest way to implement this. It is going to take some work in at least context.learn() and context.recall() and the parser. So yeah, quite a bit of work!

Tuesday, 25 August 2015

new function: hash

Just a simple one today. Some new python that maps a superposition to a superposition of the hashes of its kets.

Here is the python:
# ket-hash[size] |some ket>
#
# one is a ket
# needs: import hashlib
def ket_hash(one,size):
  logger.debug("ket-hash one: " + str(one))
  logger.debug("ket-hash size: " + size)
  try:
    size = int(size)
  except:
    return ket("",0)
  # keep the last "size" hex chars of the md5 hash of the ket label
  our_hash = hashlib.md5(one.label.encode('utf-8')).hexdigest()[-size:]
  return ket(our_hash,one.value)
And some simple examples:
sa: hash[6] split |a b c d e f>
|772661> + |31578f> + |8b5f33> + |e091ad> + |41ec32> + |29cce7>

sa: hash[10] split |u v w x y z>
|4f4f21d34c> + |4664205d2a> + |e77c0c5d68> + |4e155c67a6> + |22904f345d> + |b808451dd7>

-- slightly more interesting example:
sa: load fred-sam-friends.sw
sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
----------------------------------------

sa: hash-friends |Fred> => hash[4] friends |_self>
sa: hash-friends |Sam> => hash[4] friends |_self>

sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>
hash-friends |Fred> => |4f62> + |72ec> + |f3e0> + |315a> + |19b1> + |06ec> + |4a79> + |5cd8>

friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
hash-friends |Sam> => |5cd8> + |93a3> + |4a79> + |4f62> + |75f6> + |3e4b> + |47dd>
----------------------------------------

sa: common[friends] split |Fred Sam>
|Jack> + |Emma> + |Charlie>

sa: common[hash-friends] split |Fred Sam>
|4f62> + |4a79> + |5cd8>
I guess the point is that sometimes the exact ket label doesn't matter; it is the network structure that matters. I guess we could also use it as a compression scheme of sorts. Say your data has kets with very long text labels: we could, in theory, compress that down using hashes, provided the structure is the only thing of interest.

Monday, 24 August 2015

visualizing superpositions

Superpositions can sometimes be somewhat abstract. But today I want to show that it is quite easy to visualize them. Though I had to write a little python, and dig up an old gnuplot script.

Here is the new python (not super happy with the name, but it will do for now):
# one is a superposition
# needs: import hashlib
def hash_data(one,size):
  logger.debug("hash-data one: " + str(one))
  logger.debug("hash-data size: " + size)
  try:
    size = int(size)
  except:
    return ket("",0)
  # build a histogram with 16**size buckets, indexed by the hash of each ket label
  array = [0] * (16**size)
  for x in one:
    our_hash = hashlib.md5(x.label.encode('utf-8')).hexdigest()[-size:]
    k = int(our_hash,16)
    array[k] += x.value
  logger.info("hash-data writing to tmp-sp.dat")
  f = open('tmp-sp.dat','w')
  for k in array:
    f.write(str(k) + '\n')
  f.close()
  return ket("hash-data")
Now, I have an example in mind that would be good to visualize. Recall:
sa: load improved-imdb.sw
sa: table[actor,coeff] common[actors] select[1,6] self-similar[actors] |movie: Star Trek: The Motion Picture (1979)>
+-------------------+-------+
| actor             | coeff |
+-------------------+-------+
| James Doohan      | 0.109 |
| DeForest Kelley   | 0.109 |
| Walter (I) Koenig | 0.109 |
| Leonard Nimoy     | 0.109 |
| William Shatner   | 0.109 |
| George Takei      | 0.109 |
| Nichelle Nichols  | 0.109 |
+-------------------+-------+
Now, in the console:
sa: load improved-imdb.sw
sa: |result> => self-similar[actors] |movie: Star Trek: The Motion Picture (1979)>

sa: hash-data[4] |movie: Star Trek: The Motion Picture (1979)>
sa: hash-data[4] "" |result>
sa: hash-data[4] select[1,6] "" |result>
sa: hash-data[4] common[actors] select[1,6] "" |result>
Then we make use of this script ($ ./make-image.sh tmp-sp.dat) to turn each tmp-sp.dat into a graph. (The four resulting graphs are not reproduced here.)
Anyway, I think that is cool. And it is approaching what I imagine a brain would look like.

BTW, I should mention. The spikes in the first three graphs correspond to movies, and the spikes in the last graph correspond to the original series Star Trek actors.

Update: one more step to another superposition:
-- find all the movies the 7 original series Star Trek actors starred in:
sa: hash-data[4] movies common[actors] select[1,6] "" |result>
Now, out of interest, how many movies was that?
sa: how-many movies common[actors] select[1,6] "" |result>
|number: 262>
What were the top 30 of these?
sa: table[movie,coeff] 100 select[1,30] coeff-sort movies common[actors] select[1,6] "" |result>
+---------------------------------------+--------+
| movie                                 | coeff  |
+---------------------------------------+--------+
| Road Trek 2011 (2012)                 | 76.562 |
| Star Trek Adventure (1991)            | 76.562 |
| The Search for Spock (1984)           | 76.562 |
| The Voyage Home (1986)                | 76.562 |
| The Final Frontier (1989)             | 76.562 |
| The Undiscovered Country (1991)       | 76.562 |
| The Motion Picture (1979)             | 76.562 |
| The Wrath of Khan (1982)              | 76.562 |
| Trekkies (1997)                       | 76.562 |
| To Be Takei (2014)                    | 54.688 |
| Generations (1994)                    | 32.812 |
| Bug Buster (1998)                     | 21.875 |
| Loaded Weapon 1 (1993)                | 21.875 |
| Backyard Blockbusters (2012)          | 21.875 |
| FedCon XXI (2012)                     | 21.875 |
| The Captains (2011)                   | 21.875 |
| Unbelievable!!!!! (2014)              | 21.875 |
| Coneheads (1993)                      | 21.875 |
| The 6th People's Choice Awards (1980) | 21.875 |
| 36 Hours (1965)                       | 10.938 |
| Actors in War (2005)                  | 10.938 |
| Amore! (1993)                         | 10.938 |
| Bus Riley's Back in Town (1965)       | 10.938 |
| Double Trouble (1992/I)               | 10.938 |
| Jigsaw (1968)                         | 10.938 |
| Man in the Wilderness (1971)          | 10.938 |
| New York Skyride (1994)               | 10.938 |
| One of Our Spies Is Missing (1966)    | 10.938 |
| Pretty Maids All in a Row (1971)      | 10.938 |
| River of Stone (1994)                 | 10.938 |
+---------------------------------------+--------+
And what does this look like? (graph not reproduced here)
Filter down to the top 9 of these movies:
sa: table[movie,coeff] select[1,9] 100 select[1,30] coeff-sort movies common[actors] select[1,6] "" |result>
+---------------------------------+--------+
| movie                           | coeff  |
+---------------------------------+--------+
| Road Trek 2011 (2012)           | 76.562 |
| Star Trek Adventure (1991)      | 76.562 |
| The Search for Spock (1984)     | 76.562 |
| The Voyage Home (1986)          | 76.562 |
| The Final Frontier (1989)       | 76.562 |
| The Undiscovered Country (1991) | 76.562 |
| The Motion Picture (1979)       | 76.562 |
| The Wrath of Khan (1982)        | 76.562 |
| Trekkies (1997)                 | 76.562 |
+---------------------------------+--------+
And who were the actors in the top 9 of these movies?
sa: table[actor,coeff] coeff-sort actors select[1,9] 100 select[1,30] coeff-sort movies common[actors] select[1,6] "" |result>
+--------------------------+---------+
| actor                    | coeff   |
+--------------------------+---------+
| James Doohan             | 689.062 |
| DeForest Kelley          | 689.062 |
| Walter (I) Koenig        | 689.062 |
| Leonard Nimoy            | 689.062 |
| William Shatner          | 689.062 |
| George Takei             | 689.062 |
| Nichelle Nichols         | 689.062 |
| Grace Lee Whitney        | 382.812 |
| Mark Lenard              | 306.25  |
| Teresa E. Victor         | 229.688 |
| Majel Barrett            | 229.688 |
| Catherine (I) Hicks      | 153.125 |
| Harve Bennett            | 153.125 |
| Merritt Butrick          | 153.125 |
| Gary Faga                | 153.125 |
| Stephen Liska            | 153.125 |
| Robin (I) Curtis         | 153.125 |
| Michael Berryman         | 153.125 |
| Brock Peters             | 153.125 |
| John Schuck              | 153.125 |
| Michael (I) Snyder       | 153.125 |
| Judy Levitt              | 153.125 |
| Todd (I) Bryant          | 153.125 |
| David (I) Warner         | 153.125 |
| Michael (I) Dorn         | 153.125 |
| Tom Morga                | 153.125 |
| Richard (III) Arnold     | 153.125 |
| James T. Kirk            | 153.125 |
| Christopher (I) Flynn    | 76.562  |
| Malcolm McDowell         | 76.562  |
| Patrick (I) Stewart      | 76.562  |
| Gene Roddenberry         | 76.562  |
| Phillip R. Allen         | 76.562  |
| Steve Blalock            | 76.562  |
| David Cadiente           | 76.562  |
| Charles (I) Correll      | 76.562  |
| Bob K. Cummings          | 76.562  |
| Joe W. Davis             | 76.562  |
| Miguel (I) Ferrer        | 76.562  |
| Conroy Gedeon            | 76.562  |
| Robert Hooks             | 76.562  |
| Al (II) Jones            | 76.562  |
| John Larroquette         | 76.562  |
| Christopher (I) Lloyd    | 76.562  |
| Stephen (I) Manley       | 76.562  |
| Eric Mansker             | 76.562  |
| Mario Marcelino          | 76.562  |
| Scott McGinnis           | 76.562  |
| Allan (I) Miller         | 76.562  |
| Phil (I) Morris          | 76.562  |
| Danny Nero               | 76.562  |
| Dennis (I) Ott           | 76.562  |
| Vadia Potenza            | 76.562  |
| Branscombe Richmond      | 76.562  |
| Doug Shanklin            | 76.562  |
| James Sikking            | 76.562  |
| Paul (II) Sorensen       | 76.562  |
| Carl Steven              | 76.562  |
| Frank Welker             | 76.562  |
| Philip Weyland           | 76.562  |
| Judith (I) Anderson      | 76.562  |
| Jessica Biscardi         | 76.562  |
| Katherine Blum           | 76.562  |
| Judi M. Durand           | 76.562  |
| Claudia Lowndes          | 76.562  |
| Jeanne Mori              | 76.562  |
| Nanci Rogers             | 76.562  |
| Kimberly L. Ryusaki      | 76.562  |
| Cathie Shirriff          | 76.562  |
| Rebecca Soladay          | 76.562  |
| Sharon Thomas Cain       | 76.562  |
| Joseph Adamson           | 76.562  |
| Vijay Amritraj           | 76.562  |
| Mike Brislane            | 76.562  |
| Scott DeVenney           | 76.562  |
| Tony (I) Edwards         | 76.562  |
| David Ellenstein         | 76.562  |
| Robert Ellenstein        | 76.562  |
| Thaddeus Golas           | 76.562  |
| Richard Harder           | 76.562  |
| Alex Henteloff           | 76.562  |
| Greg Karas               | 76.562  |
| Joe Knowland             | 76.562  |
| Joe (I) Lando            | 76.562  |
| Everett (I) Lee          | 76.562  |
| Jeff (I) Lester          | 76.562  |
| Jeffrey (I) Martin       | 76.562  |
| James Menges             | 76.562  |
| John (I) Miranda         | 76.562  |
| Tom Mustin               | 76.562  |
| Joseph Naradzay          | 76.562  |
| Marty Pistone            | 76.562  |
| Nick Ramus               | 76.562  |
| Phil Rubenstein          | 76.562  |
| Bob Sarlatte             | 76.562  |
| Raymond Singer           | 76.562  |
| Newell (II) Tarrant      | 76.562  |
| Kirk R. Thatcher         | 76.562  |
| Mike Timoney             | 76.562  |
| Donald W. Zautcke        | 76.562  |
| Monique DeSart           | 76.562  |
| Madge Sinclair           | 76.562  |
| Eve (I) Smith            | 76.562  |
| Viola Kates Stimpson     | 76.562  |
| Jane Wiedlin             | 76.562  |
| Jane (I) Wyatt           | 76.562  |
| Charles (I) Cooper       | 76.562  |
| Gene Cross               | 76.562  |
| Rex (I) Holman           | 76.562  |
| Laurence Luckinbill      | 76.562  |
| George (I) Murdock       | 76.562  |
| Bill (I) Quinn           | 76.562  |
| Carey Scott              | 76.562  |
| Jonathan (I) Simpson     | 76.562  |
| Mike (I) Smithson        | 76.562  |
| Steve Susskind           | 76.562  |
| Cynthia Blaise           | 76.562  |
| Cynthia Gouw             | 76.562  |
| Beverly Hart             | 76.562  |
| Melanie Shatner          | 76.562  |
| Spice Williams-Crosby    | 76.562  |
| Rene Auberjonois         | 76.562  |
| John (II) Beck           | 76.562  |
| John (III) Bloom         | 76.562  |
| Jim (I) Boeke            | 76.562  |
| Michael Bofshever        | 76.562  |
| Carlos Cestero           | 76.562  |
| Barron Christian         | 76.562  |
| Edward Clements          | 76.562  |
| BJ (I) Davis             | 76.562  |
| Douglas (I) Dunning      | 76.562  |
| Robert (I) Easton        | 76.562  |
| Doug Engalla             | 76.562  |
| Trent Christopher Ganino | 76.562  |
| Darryl Henriques         | 76.562  |
| Matthias Hues            | 76.562  |
| Boris Lee Krutonog       | 76.562  |
| James Mapes              | 76.562  |
| Alan (II) Marcus         | 76.562  |
| David Orange             | 76.562  |
| Christopher (I) Plummer  | 76.562  |
| Brett (I) Porter         | 76.562  |
| Douglas (I) Price        | 76.562  |
| Jeremy (I) Roberts       | 76.562  |
| Paul Rossilli            | 76.562  |
| Leon Russom              | 76.562  |
| Clifford Shegog          | 76.562  |
| William Morgan Sheppard  | 76.562  |
| Christian Slater         | 76.562  |
| Kurtwood Smith           | 76.562  |
| Eric A. Stillwell        | 76.562  |
| Angelo Tiffe             | 76.562  |
| J.D. Walters             | 76.562  |
| Kim Cattrall             | 76.562  |
| Shakti Chen              | 76.562  |
| Rosanna DeSoto           | 76.562  |
| Iman (I)                 | 76.562  |
| Katie (I) Johnston       | 76.562  |
| Jimmie Booth             | 76.562  |
| Ralph Brannen            | 76.562  |
| Roger Aaron Brown        | 76.562  |
| Ralph Byers              | 76.562  |
| Stephen (I) Collins      | 76.562  |
| Vern Dietsche            | 76.562  |
| Christopher Doohan       | 76.562  |
| Montgomery Doohan        | 76.562  |
| Dennis (I) Fischer       | 76.562  |
| Joshua Gallegos          | 76.562  |
| David Gautreaux          | 76.562  |
| David Gerrold            | 76.562  |
| John (I) Gowans          | 76.562  |
| William (I) Guest        | 76.562  |
| Leslie C. Howard         | 76.562  |
| Howard Itzkowitz         | 76.562  |
| Junero Jennings          | 76.562  |
| Jon Rashad Kamal         | 76.562  |
| Joel (I) Kramer          | 76.562  |
| Donald J. Long           | 76.562  |
| Bill (I) McIntosh        | 76.562  |
| Dave Moordigian          | 76.562  |
| Tony (I) Rocco           | 76.562  |
| Michael Rougas           | 76.562  |
| Joel Schultz             | 76.562  |
| Franklyn Seales          | 76.562  |
| Norman (I) Stuart        | 76.562  |
| Craig (VII) Thomas       | 76.562  |
| Billy Van Zandt          | 76.562  |
| Paul (III) Weber         | 76.562  |
| Scott (II) Whitney       | 76.562  |
| Michele Ameen Billy      | 76.562  |
| Celeste Cartier          | 76.562  |
| Lisa Chess               | 76.562  |
| Paula Crist              | 76.562  |
| Cassandra (I) Foster     | 76.562  |
| Edna Glover              | 76.562  |
| Sharon Hesky             | 76.562  |
| Sayra Hummel             | 76.562  |
| Persis Khambatta         | 76.562  |
| Marcy Lafferty           | 76.562  |
| Iva Lane                 | 76.562  |
| Jeri McBride             | 76.562  |
| Barbara Minster          | 76.562  |
| Ve Neill                 | 76.562  |
| Terrence (I) O'Connor    | 76.562  |
| Susan (I) O'Sullivan     | 76.562  |
| Louise Stange-Wahl       | 76.562  |
| Bjo Trimble              | 76.562  |
| Momo Yashima             | 76.562  |
| Steve (I) Bond           | 76.562  |
| Brett Baxter Clark       | 76.562  |
| Tim Culbertson           | 76.562  |
| Ike Eisenmann            | 76.562  |
| John (II) Gibson         | 76.562  |
| Nicholas Guest           | 76.562  |
| James Horner             | 76.562  |
| Paul (I) Kent            | 76.562  |
| Dennis Landry            | 76.562  |
| Cristian Letelier        | 76.562  |
| Joel Marstan             | 76.562  |
| Jeff (II) McBride        | 76.562  |
| Roger Menache            | 76.562  |
| Ricardo Montalban        | 76.562  |
| David Ruprecht           | 76.562  |
| Judson Scott             | 76.562  |
| Kevin Rodney Sullivan    | 76.562  |
| Russell Takaki           | 76.562  |
| Deney Terrio             | 76.562  |
| John (I) Vargas          | 76.562  |
| Paul (I) Winfield        | 76.562  |
| John (I) Winston         | 76.562  |
| Kirstie Alley            | 76.562  |
| Laura (I) Banks          | 76.562  |
| Bibi Besch               | 76.562  |
| Dianne (I) Harper        | 76.562  |
| Marcy Vosburgh           | 76.562  |
| Buzz Aldrin              | 76.562  |
| G.Z. Allen               | 76.562  |
| Robert (XII) Allen       | 76.562  |
| Thomas Anitzberger       | 76.562  |
| Michael Armendariz       | 76.562  |
| Thomas Bax               | 76.562  |
| Robert (I) Beltran       | 76.562  |
| Craig Berthiaume         | 76.562  |
| Jared Bird               | 76.562  |
| Robert Boudrow           | 76.562  |
| Denis (I) Bourguignon    | 76.562  |
| Richard (II) Bowen       | 76.562  |
| Brannon Braga            | 76.562  |
| LeVar Burton             | 76.562  |
| Miguel Carreon           | 76.562  |
| Richard Clabaugh         | 76.562  |
| Bruce (III) Clarke       | 76.562  |
| Thomas Clegg             | 76.562  |
| William Coble            | 76.562  |
| Rick Corley              | 76.562  |
| Justin Reid Cutietta     | 76.562  |
| Frank (I) D'Amico        | 76.562  |
| John de Lancie           | 76.562  |
| Brian Dellis             | 76.562  |
| Daren Dochterman         | 76.562  |
| Rey Duran                | 76.562  |
| Ron Duran                | 76.562  |
| Chris (I) Fleming        | 76.562  |
| Jonathan Frakes          | 76.562  |
| Daryl Frazetti           | 76.562  |
| Dennis Friday II         | 76.562  |
| Ross Gabrick             | 76.562  |
| L.D. Gardner             | 76.562  |
| Travis Gates             | 76.562  |
| Michael (III) Gay        | 76.562  |
| Adam Geiss               | 76.562  |
| David Greenstein         | 76.562  |
| Armando Paul Guillen     | 76.562  |
| Peter Haberkorn          | 76.562  |
| Dennis Hanon             | 76.562  |
| Steve (III) Hardy        | 76.562  |
| Scott (III) Harper       | 76.562  |
| Randall Hawthorne        | 76.562  |
| Steve (I) Head           | 76.562  |
| Edward Herndon           | 76.562  |
| Matthew Herra            | 76.562  |
| John Hurles              | 76.562  |
| Devin Irwin              | 76.562  |
| Edgar Jauregui           | 76.562  |
| Richard Koerner          | 76.562  |
| David (I) Koontz         | 76.562  |
| Stephen (I) Koontz       | 76.562  |
| Rich Kronfeld            | 76.562  |
| Gabriel Kerner           | 76.562  |
| Erik (I) Larson          | 76.562  |
| David (I) Livingston     | 76.562  |
| Gary (I) Lockwood        | 76.562  |
| Robert (IV) Lopez        | 76.562  |
| Stanley Lozowsky         | 76.562  |
| Adam Madden              | 76.562  |
| Logan Madden             | 76.562  |
| Geoffrey Mandel          | 76.562  |
| Douglas Marcks           | 76.562  |
| Jason (II) Mathews       | 76.562  |
| Robert Duncan McNeill    | 76.562  |
| Steve Menaugh            | 76.562  |
| Carl (I) Meyers          | 76.562  |
| Tim (I) Meyers           | 76.562  |
| Jason (I) Munoz          | 76.562  |
| Phil Murre               | 76.562  |
| Salvador Nogueda         | 76.562  |
| Robert (I) O'Reilly      | 76.562  |
| Marc Okrand              | 76.562  |
| Rick (I) Overton         | 76.562  |
| Harminder Pal            | 76.562  |
| John Paladin             | 76.562  |
| Ric Parish               | 76.562  |
| Mark Payton              | 76.562  |
| Brian (I) Phelps         | 76.562  |
| Ethan (I) Phillips       | 76.562  |
| Thomas (I) Phillips      | 76.562  |
| Adam (I) Philpott        | 76.562  |
| Robert Picardo           | 76.562  |
| Daniel (I) Pilkington    | 76.562  |
| James Pollnow            | 76.562  |
| Glen Proechel            | 76.562  |
| Michael Raffeo           | 76.562  |
| Russell (I) Ray          | 76.562  |
| Patrick Rimington        | 76.562  |
| Jon (I) Ross             | 76.562  |
| Paul Rudeen              | 76.562  |
| Tim (I) Russ             | 76.562  |
| Robert (X) Russell       | 76.562  |
| Timothy (IV) Scott       | 76.562  |
| Douglas Shannen          | 76.562  |
| Daniel (I) Shea          | 76.562  |
| David (IV) Silverman     | 76.562  |
| Jason Speltz             | 76.562  |
| Brent Spiner             | 76.562  |
| Tom (I) Stewart          | 76.562  |
| Rocky Stinitis           | 76.562  |
| Mark (II) Thompson       | 76.562  |
| Dennis Thuringer         | 76.562  |
| Barron Toler             | 76.562  |
| Kenneth Traft            | 76.562  |
| Fred Travalena           | 76.562  |
| J. Trusk                 | 76.562  |
| Alois C. Tschamjsl       | 76.562  |
| Karl Van Der Wyk         | 76.562  |
| Matt Weinhold            | 76.562  |
| Jonathan (I) West        | 76.562  |
| Michael (I) Westmore     | 76.562  |
| Wil Wheaton              | 76.562  |
| Travis (I) Williams      | 76.562  |
| Wayne Wills              | 76.562  |
| Barbara (II) Adams       | 76.562  |
| Teresa Bailie            | 76.562  |
| Holly Barbour            | 76.562  |
| Morgan Barbour           | 76.562  |
| Roberta Barnhart         | 76.562  |
| Jennifer Bax             | 76.562  |
| Esther Becerra           | 76.562  |
| Viki Beyer               | 76.562  |
| Martha Bock              | 76.562  |
| Jolynn Brown             | 76.562  |
| Nicole Compton           | 76.562  |
| Denise (I) Crosby        | 76.562  |
| Melisa Dahl              | 76.562  |
| Melissa Dahl             | 76.562  |
| Roxann Dawson            | 76.562  |
| Evelyn De Biase          | 76.562  |
| Maria De Maci            | 76.562  |
| Evelyn Eastteam          | 76.562  |
| Ana Espinoza             | 76.562  |
| Terry (I) Farrell        | 76.562  |
| Lynn Fulstone            | 76.562  |
| Glenn Gadd               | 76.562  |
| Laurel Greenstein        | 76.562  |
| Shantell Hafner          | 76.562  |
| Debbie (I) Hanon         | 76.562  |
| Diana Harper             | 76.562  |
| Lisa (III) Harper        | 76.562  |
| Sharron Hawthorne        | 76.562  |
| Joyce Herndon            | 76.562  |
| Inge Heyer               | 76.562  |
| Penny Keane              | 76.562  |
| L. Grace Klitmoller      | 76.562  |
| Margaret Koontz          | 76.562  |
| Joan Letlow              | 76.562  |
| Jane Lostumbo            | 76.562  |
| Joyce (II) Mason         | 76.562  |
| Chase Masterson          | 76.562  |
| Marcella Mesnard         | 76.562  |
| Diane (III) Morgan       | 76.562  |
| Renee Morrison           | 76.562  |
| Kate Mulgrew             | 76.562  |
| Anne Kathleen Murphy     | 76.562  |
| Stephanie (I) Murphy     | 76.562  |
| Carroll Paige            | 76.562  |
| Cheryl Petersen          | 76.562  |
| Shelly Raffeo            | 76.562  |
| Sondra Reynolds          | 76.562  |
| Jessica Rimington        | 76.562  |
| Mary Rottler             | 76.562  |
| Hope Rudeen              | 76.562  |
| Tonya Saunders           | 76.562  |
| Lori Schwartz            | 76.562  |
| Lori Seol                | 76.562  |
| Susan (I) Shea           | 76.562  |
| Wendy (I) Shea           | 76.562  |
| Evan Shride              | 76.562  |
| Donelda Snyder           | 76.562  |
| Helen (I) Souza          | 76.562  |
| Linda Syck               | 76.562  |
| Deborah Taller           | 76.562  |
| Jeri Taylor              | 76.562  |
| Linda Thuringer          | 76.562  |
| Allison (I) Todd         | 76.562  |
| Deborah (II) Warner      | 76.562  |
| Pat Weisner              | 76.562  |
| Deborah Wheeler          | 76.562  |
| Cheryl (III) Wilson      | 76.562  |
+--------------------------+---------+
And what does this look like? (graph not reproduced here)
Now, tidy this up by using drop-below[] this time, instead of select[]:
sa: table[actor,coeff] drop-below[150] coeff-sort actors select[1,9] 100 select[1,30] coeff-sort movies common[actors] select[1,6] "" |result>
+----------------------+---------+
| actor                | coeff   |
+----------------------+---------+
| James Doohan         | 689.062 |
| DeForest Kelley      | 689.062 |
| Walter (I) Koenig    | 689.062 |
| Leonard Nimoy        | 689.062 |
| William Shatner      | 689.062 |
| George Takei         | 689.062 |
| Nichelle Nichols     | 689.062 |
| Grace Lee Whitney    | 382.812 |
| Mark Lenard          | 306.25  |
| Teresa E. Victor     | 229.688 |
| Majel Barrett        | 229.688 |
| Catherine (I) Hicks  | 153.125 |
| Harve Bennett        | 153.125 |
| Merritt Butrick      | 153.125 |
| Gary Faga            | 153.125 |
| Stephen Liska        | 153.125 |
| Robin (I) Curtis     | 153.125 |
| Michael Berryman     | 153.125 |
| Brock Peters         | 153.125 |
| John Schuck          | 153.125 |
| Michael (I) Snyder   | 153.125 |
| Judy Levitt          | 153.125 |
| Todd (I) Bryant      | 153.125 |
| David (I) Warner     | 153.125 |
| Michael (I) Dorn     | 153.125 |
| Tom Morga            | 153.125 |
| Richard (III) Arnold | 153.125 |
| James T. Kirk        | 153.125 |
+----------------------+---------+
And that gives our final graph (not reproduced here).
Anyway, lots of fun. I hope it is now easier to visualize what happens as we step from superposition to superposition.

OK. I think it might be interesting to show them all at once, in sequence (again, the graphs are not reproduced here).

Tuesday, 11 August 2015

representing song lyrics in sw format

An easy one today. It recently occurred to me that we can easily enough represent song lyrics in sw format, and then display them using a table. So, no more words, here is an example from The Doors:
$ cat the-doors--people-are-strange.sw
lyrics-for |the doors: People are strange> => |line 1: "People Are Strange">
lyrics-for |the doors: People are strange> +=> |line 2: >
lyrics-for |the doors: People are strange> +=> |line 3: People are strange when you're a stranger>
lyrics-for |the doors: People are strange> +=> |line 4: Faces look ugly when you're alone>
lyrics-for |the doors: People are strange> +=> |line 5: Women seem wicked when you're unwanted>
lyrics-for |the doors: People are strange> +=> |line 6: Streets are uneven when you're down>
lyrics-for |the doors: People are strange> +=> |line 7: >
lyrics-for |the doors: People are strange> +=> |line 8: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 9: Faces come out of the rain>
lyrics-for |the doors: People are strange> +=> |line 10: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 11: No one remembers your name>
lyrics-for |the doors: People are strange> +=> |line 12: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 13: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 14: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 15: >
lyrics-for |the doors: People are strange> +=> |line 16: People are strange when you're a stranger>
lyrics-for |the doors: People are strange> +=> |line 17: Faces look ugly when you're alone>
lyrics-for |the doors: People are strange> +=> |line 18: Women seem wicked when you're unwanted>
lyrics-for |the doors: People are strange> +=> |line 19: Streets are uneven when you're down>
lyrics-for |the doors: People are strange> +=> |line 20: >
lyrics-for |the doors: People are strange> +=> |line 21: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 22: Faces come out of the rain>
lyrics-for |the doors: People are strange> +=> |line 23: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 24: No one remembers your name>
lyrics-for |the doors: People are strange> +=> |line 25: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 26: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 27: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 28: >
lyrics-for |the doors: People are strange> +=> |line 29: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 30: Faces come out of the rain>
lyrics-for |the doors: People are strange> +=> |line 31: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 32: No one remembers your name>
lyrics-for |the doors: People are strange> +=> |line 33: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 34: When you're strange>
lyrics-for |the doors: People are strange> +=> |line 35: When you're strange>
where we are using the append-learn notation "+=>" (unfortunately, in the code I called it add_learn, which it partly is, and partly isn't, but it is way too late to change it now).

Now that we have it in sw format, we can display it easily enough:
sa: load the-doors--people-are-strange.sw
sa: table[lyrics] lyrics-for |the doors: People are strange>
+-------------------------------------------+
| lyrics                                    |
+-------------------------------------------+
| "People Are Strange"                      |
|                                           |
| People are strange when you're a stranger |
| Faces look ugly when you're alone         |
| Women seem wicked when you're unwanted    |
| Streets are uneven when you're down       |
|                                           |
| When you're strange                       |
| Faces come out of the rain                |
| When you're strange                       |
| No one remembers your name                |
| When you're strange                       |
| When you're strange                       |
| When you're strange                       |
|                                           |
| People are strange when you're a stranger |
| Faces look ugly when you're alone         |
| Women seem wicked when you're unwanted    |
| Streets are uneven when you're down       |
|                                           |
| When you're strange                       |
| Faces come out of the rain                |
| When you're strange                       |
| No one remembers your name                |
| When you're strange                       |
| When you're strange                       |
| When you're strange                       |
|                                           |
| When you're strange                       |
| Faces come out of the rain                |
| When you're strange                       |
| No one remembers your name                |
| When you're strange                       |
| When you're strange                       |
| When you're strange                       |
+-------------------------------------------+
And we are done. All nice and pretty.

Update: say we want to pick a Doors song randomly. That is easy enough. And say we have weights that represent how much we like each song. Maybe something like:
list-of-songs |The Doors> => 10|the doors: People are Strange> + 10|the doors: Light My Fire> + 7|the doors: The End> + 6|the doors: Love Me Two Times> + ... + 0.2|the doors: Moonlight Drive>
Then simply enough:
sa: load the-doors.sw
sa: table[lyrics] lyrics-for weighted-pick-elt list-of-songs |The Doors>
And we need some mechanism to filter out songs we have heard recently, and some way to make longer-term changes in the weights for when we get bored of a song.

Maybe we need something along the lines of:
list-of-songs |heard recently> => |the doors: Light My Fire> + |the doors: The End>
list-of-interesting |songs> => complement(list-of-songs |heard recently>,list-of-songs |The Doors>)
Though I don't yet have a complement function, it shouldn't be hard to write one.

Update: I wrote a couple of lines of code, so we can now do this example (and it turns out I already had complement() defined in another way, so exclude() seemed the best name).

First the code tweaks (in the functions file):
# exclude(|a> + |c>,|a> + |b> + |c> + |d>) == |b> + |d>
#
def exclude_fn(x,y):
  # if the ket is present in the first superposition, zero it out
  if x > 0:
    return 0
  return y

def exclude(one,two):
  # keep only the kets in "two" that are not in "one"
  return intersection_fn(exclude_fn,one,two).drop()
Now put it to use:
sa: list-of-songs |The Doors> => 10|the doors: People are Strange> + 10|the doors: Light My Fire> + 7|the doors: The End> + 6|the doors: Love Me Two Times> + 0.2|the doors: Moonlight Drive>
sa: list-of-songs |heard recently> => |the doors: Light My Fire> + |the doors: The End>
sa: list-of-interesting |songs> => exclude(list-of-songs |heard recently>,list-of-songs |The Doors>)
sa: list-of-interesting |songs>
10|the doors: People are Strange> + 6|the doors: Love Me Two Times> + 0.2|the doors: Moonlight Drive>
It works! And this idea of taking a "list-of-something |heard recently>" and then excluding it from a list seems to me a common pattern humans use. Maybe something as simple as telling jokes: you want to keep track of the ones you have already told. And the reverse, dementia: you forget the stories you have just told your grandchild, and the child says "Grandma, you already told me that one!".

In this case the child might be doing something like:
you-already-told-me-that-one |*> #=> do-you-know mbr(|_self>,list-of-stories |heard recently>)

The other thing about the exclude function is that it reminds me of this Sherlock Holmes quote:
"Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth."

list-of |options> => exclude(list-of |impossible>,list-of-all |possible>)
And then the "no matter how improbable" means the highest coeff in "list-of |options>" is small. But, nonetheless, it is the best option left.

Wednesday, 5 August 2015

new console feature: web-load

In preparation for others using my semantic agent console, I implemented the web-load function. Before, you could only load local sw files; now you can load remote ones too.

Simply enough:
$ ./the_semantic_db_console.py
Welcome!

sa: web-load http://semantic-db.org/sw-examples/methanol.sw
In the process it downloads the file, saves it to disk (first checking if that filename is already taken) and then loads it into memory. BTW, currently it uses the user agent string: "semantic-agent/0.1"
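Roughly those steps in Python (a sketch only, not the console's actual code; the helper name and details are illustrative):
import os
import urllib.request

def web_load(url):
  # download a remote sw file, save it to disk (picking a new name if
  # the filename is already taken), and return the local filename
  filename = url.split('/')[-1]
  base, ext = os.path.splitext(filename)
  k = 1
  while os.path.exists(filename):
    filename = "%s-%s%s" % (base, k, ext)
    k += 1
  req = urllib.request.Request(url, headers={'User-Agent': 'semantic-agent/0.1'})
  with urllib.request.urlopen(req) as response:
    open(filename, 'wb').write(response.read())
  return filename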

Now, what if you want remote sw files to be in a different directory than your local sw files? Well, we have had code in there for a long time that can handle that. Here are a couple of lines from the console help string:
  files                        show the available .sw files
  cd                           change and create if necessary the .sw directory
  ls, dir, dirs                show the available directories
Finally, I hate to say this, but a big warning about loading remote sw files! Currently there is an injection-type bug when loading superpositions that contain compound function operators. This makes fixing the parser somewhat critical!

Heh, that wasn't an issue previously, since I was the only one using sw files. Now that we are on github, it is rather more important.

And I should also note that loading sw files into memory can take an arbitrary amount of time, depending on what computation it is trying to do. Eg, a while back I had a simple 2-line sw file that took about a week to finish: it was a similar[op] calculation on a big data-set.

Update: I fixed the above parser bug. Thanks parsley.

start and end chars for 3grams that precede a full stop

Another quick one. Not super useful, but I feel like doing it anyway: the start and end characters for the 3-grams that precede both commas and full stops.

First, we need a new function operator (note it is not perfect yet, but will do for now):
# select-chars[3,4,7] |abcdefgh> == |cdg>
#
# one is a ket
# note: positions are 1-based, and position 0 picks out the last character
# (that is what makes select-chars[0] work as an end-char operator below)
def select_chars(one,positions):
  try:
    positions = positions.split(",")
    chars = list(one.label)
    text = "".join(chars[int(x)-1] for x in positions if int(x) <= len(chars))
    return ket(text)
  except:
    return ket("",0)
Now we can do this:
sa: load ngram-letter-pairs--sherlock-holmes.sw
sa: find-inverse[next-2-letters]
sa: SC |*> #=> select-chars[1] |_self>
sa: EC |*> #=> select-chars[0] |_self>

sa: table[start-char,coeff] ket-sort SC common[inverse-next-2-letters] (|, > + |. >)
+------------+-------+
| start-char | coeff |
+------------+-------+
| 2          | 1     |
| 3          | 1     |
| 4          | 1     |
|            | 18    |
| "          | 1     |
| '          | 1     |
| -          | 2     |
| a          | 54    |
| b          | 9     |
| c          | 10    |
| d          | 17    |
| e          | 49    |
| f          | 7     |
| F          | 1     |
| g          | 10    |
| h          | 19    |
| i          | 55    |
| I          | 1     |
| k          | 6     |
| l          | 22    |
| L          | 1     |
| m          | 13    |
| n          | 29    |
| o          | 53    |
| p          | 10    |
| q          | 1     |
| r          | 34    |
| s          | 23    |
| t          | 24    |
| u          | 27    |
| v          | 7     |
| w          | 5     |
| W          | 1     |
| x          | 1     |
| y          | 6     |
| Y          | 1     |
| z          | 1     |
+------------+-------+

sa: table[end-char,coeff] ket-sort EC common[inverse-next-2-letters] (|, > + |. >)
+----------+-------+
| end-char | coeff |
+----------+-------+
| 3        | 1     |
| 4        | 1     |
| 5        | 1     |
| a        | 8     |
| A        | 1     |
| b        | 1     |
| c        | 2     |
| d        | 43    |
| e        | 82    |
| f        | 8     |
| g        | 8     |
| h        | 21    |
| I        | 2     |
| k        | 16    |
| l        | 26    |
| m        | 14    |
| n        | 38    |
| o        | 15    |
| p        | 12    |
| r        | 33    |
| s        | 82    |
| t        | 44    |
| u        | 2     |
| w        | 9     |
| x        | 3     |
| y        | 49    |
+----------+-------+
I don't think this is super useful. Though knowing which characters are allowed to precede a full stop is mildly interesting. Note that, of the capital letters, only "A" and "I" appear in the end-char table.

To pick a rather random example of why this might be interesting, consider "C. elegans". Since, in text, C followed by a dot is rare, we can guess that maybe "C." means an abbreviation, rather than the end of a sentence.

Doh! So much for that idea. Here is the table when we only look at letters that precede the full stop, ie, we no longer consider the preceding-comma case:
sa: table[end-char,coeff] ket-sort EC inverse-next-2-letters |. >
+----------+-------+
| end-char | coeff |
+----------+-------+
| 0        | 1     |
| 1        | 4     |
| 2        | 3     |
| 3        | 5     |
| 4        | 3     |
| 5        | 4     |
| 6        | 2     |
| 7        | 1     |
| 8        | 2     |
| 9        | 1     |
| )        | 1     |
| a        | 15    |
| A        | 2     |
| b        | 1     |
| B        | 2     |
| c        | 4     |
| C        | 2     |
| d        | 46    |
| D        | 2     |
| e        | 94    |
| E        | 3     |
| f        | 8     |
| F        | 1     |
| g        | 13    |
| h        | 29    |
| H        | 4     |
| I        | 12    |
| J        | 1     |
| k        | 17    |
| K        | 5     |
| l        | 31    |
| L        | 1     |
| m        | 22    |
| n        | 46    |
| o        | 18    |
| p        | 19    |
| q        | 1     |
| r        | 44    |
| R        | 1     |
| s        | 98    |
| S        | 5     |
| t        | 55    |
| T        | 1     |
| u        | 2     |
| U        | 1     |
| V        | 3     |
| w        | 10    |
| X        | 2     |
| x        | 4     |
| y        | 64    |
+----------+-------+
Hrmm... lots of capitals in there this time. They do have lower frequencies than the lower-case letters, but it still breaks what I was saying above.

Tuesday, 4 August 2015

letter 3-grams that precede a full stop

Just a quick one using our letter 3/5 ngram structures to find those 3-grams that precede both the comma and the full stop.

Simply enough:
sa: load ngram-letter-pairs--sherlock-holmes.sw
sa: find-inverse[next-2-letters]
sa: table[3gram] ket-sort common[inverse-next-2-letters] (|, > + |. >)
+-------+
| 3gram |
+-------+
| 2nd   |
| 3rd   |
| 4th   |
|  be   |
|  by   |
|  do   |
|  go   |
|  he   |
|  in   |
|  is   |
|  it   |
|  me   |
|  No   |
|  no   |
|  of   |
|  on   |
|  pa   |
|  so   |
|  to   |
|  up   |
|  us   |
| "No   |
| '85   |
| -by   |
| -tm   |
| ace   |
| ach   |
| ack   |
| act   |
| acy   |
| ade   |
| ads   |
| ady   |
| afe   |
| aff   |
| age   |
| ago   |
| aid   |
| ail   |
| aim   |
| ain   |
| air   |
| ait   |
| ake   |
| ale   |
| alk   |
| all   |
| als   |
| ame   |
| amp   |
| and   |
| ane   |
| ang   |
| ank   |
| ans   |
| ant   |
| ape   |
| aph   |
| aps   |
| ard   |
| are   |
| ark   |
| arm   |
| ars   |
| art   |
| ary   |
| ase   |
| ash   |
| ask   |
| ass   |
| ast   |
| asy   |
| ata   |
| ate   |
| ath   |
| ave   |
| awn   |
| ays   |
| aze   |
| bad   |
| bag   |
| bed   |
| ber   |
| ble   |
| bly   |
| box   |
| bts   |
| bye   |
| cal   |
| can   |
| cap   |
| cat   |
| cco   |
| ced   |
| ces   |
| cks   |
| cle   |
| cts   |
| d I   |
| day   |
| dea   |
| ded   |
| dee   |
| den   |
| der   |
| des   |
| dge   |
| dia   |
| did   |
| dle   |
| dly   |
| dog   |
| don   |
| dor   |
| dow   |
| ead   |
| eak   |
| eal   |
| eam   |
| ear   |
| eat   |
| eau   |
| ece   |
| ech   |
| eck   |
| ect   |
| eds   |
| eed   |
| eek   |
| eel   |
| een   |
| eep   |
| eer   |
| ees   |
| eet   |
| eft   |
| egs   |
| eks   |
| eld   |
| elf   |
| ell   |
| elp   |
| els   |
| ely   |
| ems   |
| end   |
| ens   |
| ent   |
| eps   |
| ere   |
| ern   |
| ers   |
| ery   |
| esh   |
| esk   |
| ess   |
| est   |
| ete   |
| ets   |
| ety   |
| eve   |
| ews   |
| ext   |
| eye   |
| F.3   |
| fed   |
| fee   |
| fer   |
| fle   |
| fly   |
| for   |
| ful   |
| gar   |
| ged   |
| gel   |
| ger   |
| ges   |
| ght   |
| gle   |
| gro   |
| gth   |
| gue   |
| had   |
| ham   |
| hat   |
| haw   |
| hed   |
| hem   |
| hen   |
| her   |
| hes   |
| him   |
| hin   |
| hip   |
| his   |
| hod   |
| hop   |
| hot   |
| hts   |
| hur   |
| hus   |
| ial   |
| ian   |
| ica   |
| ice   |
| ich   |
| ick   |
| ics   |
| ida   |
| ide   |
| ids   |
| ied   |
| ief   |
| ier   |
| ies   |
| iew   |
| ife   |
| iff   |
| ify   |
| ign   |
| ike   |
| ild   |
| ile   |
| ill   |
| ils   |
| ily   |
| ime   |
| ina   |
| ind   |
| ine   |
| ing   |
| ink   |
| Inn   |
| ins   |
| int   |
| iny   |
| ion   |
| ips   |
| ird   |
| ire   |
| irl   |
| irm   |
| irs   |
| irt   |
| iry   |
| ise   |
| ish   |
| iss   |
| ist   |
| ite   |
| ith   |
| its   |
| ity   |
| ium   |
| ius   |
| ive   |
| ize   |
| ked   |
| ken   |
| ker   |
| ket   |
| key   |
| kly   |
| lar   |
| law   |
| lay   |
| lds   |
| led   |
| Lee   |
| leg   |
| lem   |
| len   |
| ler   |
| les   |
| ley   |
| lic   |
| lip   |
| lls   |
| lly   |
| lor   |
| low   |
| lse   |
| lso   |
| lts   |
| lue   |
| lve   |
| mad   |
| mal   |
| man   |
| mas   |
| may   |
| med   |
| men   |
| mer   |
| mes   |
| met   |
| mly   |
| mon   |
| mpt   |
| n 4   |
| nah   |
| nal   |
| nce   |
| nch   |
| ncy   |
| nds   |
| ndy   |
| ned   |
| nee   |
| nel   |
| nen   |
| ner   |
| nes   |
| net   |
| ney   |
| nge   |
| ngs   |
| nks   |
| nly   |
| nny   |
| not   |
| now   |
| nse   |
| nth   |
| nto   |
| nts   |
| nty   |
| nue   |
| oad   |
| oak   |
| oal   |
| oat   |
| obe   |
| ock   |
| ods   |
| ody   |
| oes   |
| ofa   |
| off   |
| ofs   |
| oke   |
| oks   |
| old   |
| ole   |
| ome   |
| oms   |
| one   |
| ong   |
| ons   |
| ont   |
| ood   |
| oof   |
| ook   |
| ool   |
| oom   |
| oon   |
| oor   |
| oot   |
| ope   |
| ord   |
| ore   |
| ork   |
| orm   |
| orn   |
| ors   |
| ort   |
| ory   |
| ose   |
| oss   |
| ost   |
| ote   |
| oth   |
| ots   |
| oul   |
| our   |
| ous   |
| out   |
| ove   |
| owd   |
| own   |
| ows   |
| ped   |
| pen   |
| per   |
| pes   |
| pet   |
| pew   |
| phy   |
| ple   |
| ply   |
| pty   |
| que   |
| r A   |
| r's   |
| ram   |
| ran   |
| rap   |
| rat   |
| rce   |
| rch   |
| rds   |
| red   |
| ree   |
| ren   |
| rer   |
| res   |
| ret   |
| rey   |
| rge   |
| rks   |
| rld   |
| rly   |
| rms   |
| rol   |
| rop   |
| ror   |
| row   |
| rse   |
| rst   |
| rth   |
| rts   |
| rty   |
| rue   |
| rug   |
| run   |
| rve   |
| sal   |
| saw   |
| say   |
| sco   |
| sed   |
| see   |
| sen   |
| ser   |
| ses   |
| set   |
| sex   |
| she   |
| sin   |
| sir   |
| sit   |
| six   |
| sky   |
| sly   |
| som   |
| son   |
| sts   |
| sty   |
| sun   |
| t I   |
| tal   |
| tar   |
| tch   |
| ted   |
| tel   |
| ten   |
| tep   |
| ter   |
| tes   |
| ths   |
| thy   |
| tic   |
| tie   |
| tle   |
| tly   |
| tol   |
| ton   |
| too   |
| tor   |
| tre   |
| try   |
| tte   |
| two   |
| ual   |
| ubt   |
| uch   |
| uct   |
| ued   |
| ues   |
| uff   |
| ugh   |
| ull   |
| ulp   |
| ult   |
| umb   |
| ume   |
| umn   |
| und   |
| une   |
| ung   |
| unk   |
| unt   |
| ure   |
| urn   |
| urs   |
| urt   |
| ury   |
| use   |
| uth   |
| uty   |
| van   |
| ved   |
| vel   |
| ven   |
| ver   |
| ves   |
| vil   |
| War   |
| was   |
| way   |
| wed   |
| wer   |
| wit   |
| xes   |
| yed   |
| yer   |
| yes   |
| Yes   |
| yet   |
| yle   |
| you   |
| zes   |
+-------+
So we see there are a lot, but not all, of the possible combinations. I don't know, but to me this is starting to feel like grammar. Grammar seems to be "these structures are common, and therefore likely correct; those structures are rare, and therefore likely wrong". Sure, not exactly grammar yet, but it feels like we are getting closer. Anyway, I will keep thinking about it.

Maybe down the line, try for a big set of ngram structures: the full set of p/q ngram structures (with p < q), where:
p is in {1,2,3,4,5,6,7,8,9}
and
q is in {2,3,4,5,6,7,8,9,10}
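As a rough sketch of what that generalization might look like, here is a hypothetical helper, following the convention of the create_ngram_pairs code further down (use joiner=" " for word ngrams, joiner="" for letter ngrams):
def create_pq_ngram_pairs(s, p, q, joiner=" "):
  # head is the first p tokens, tail is the remaining q - p tokens of each q-gram:
  return [[joiner.join(s[i:i+p]), joiner.join(s[i+p:i+q])] for i in range(len(s) - q + 1)]
With p = 3 and q = 5 this reproduces the 3/5 structure used in this post.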

Sunday, 2 August 2015

some letter Rambler examples

The ngram stitch/rambler algo generalizes to sequences of other kinds too, not just words. For example, music. In this post, some examples of letter rambling.

We use this code to find our letter ngrams:
import re

# map a char list to [3-letter head, next-2-letter tail] pairs:
def create_ngram_letter_pairs(s):
  return [["".join(s[i:i+3]),"".join(s[i+3:i+5])] for i in range(len(s) - 4)]

# learn ngram letter pairs:
def learn_ngram_letter_pairs(context,filename):
  with open(filename,'r') as f:
    text = f.read()
    clean_text = re.sub('[<|>=\r\n]',' ',text)    # strip chars that would break ket labels
    for ngram_pairs in create_ngram_letter_pairs(list(clean_text)):
      try:
        head,tail = ngram_pairs
        context.add_learn("next-2-letters",head,tail)
      except:
        continue

learn_ngram_letter_pairs(C,filename)

dest = "sw-examples/ngram-letter-pairs--sherlock-holmes.sw"
save_sw(C,dest)
Some example learn rules in that sw are:
next-2-letters |e R> => |ed> + |oa> + |ep> + |oy> + |eg> + |oc> + |uc>
next-2-letters | Re> => |d-> + |ti> + |ge> + |st> + |ad> + |pu> + |me> + |ce> + |di> + |pl> + |fu> + |ve>
next-2-letters |Red> => |-h> + |is>
next-2-letters |ed-> => |he> + |su> + |-e> + |in> + |-i> + |gi> + |-h> + |-w> + |lo> + |ta> + |ye> + |-s> + |co> + |up> + |-t>
next-2-letters |d-h> => |ea> + |um>
next-2-letters |-he> => |ad> + |re> + |ar> + | w> + | s> + | j> + |r > + | g>
next-2-letters |hea> => |de> + |rd> + |d > + |r > + |vy> + |d,> + |d.> + |rt> + |ds> + |vi> + |d;> + |d?> + |p.> + |ri> + |di> + |lt> + |r!> + |rs> + |ti> + |ve> + |p > + |l > + |da> + |te> + |dg> + |th> + |sa> + |pe> + |r:>
next-2-letters |ead> => |ed> + |er> + | s> + |fu> + | u> + | i> + | o> + |, > + |y > + |. > + |s > + |,"> + | t> + | w> + | a> + |; > + |?"> + |y,> + | f> + |y.> + |."> + |y-> + |in> + |en> + | b> + | h> + |ly> + |ow> + | m> + |li> + |il> + | D> + |ne> + | c> + | H> + |--> + | r> + | l> + |th> + |ac> + |ge> + |st> + | n> + | p> + | g> + |s?> + |ab>
Then we need this function operator:
# extract-3-tail-chars |abcdefgh> == |fgh>
# example usage:
# letter-ramble |*> #=> merge-labels(|_self> + pick-elt next-2-letters extract-3-tail-chars |_self>)
#
# assumes one is a ket
def extract_3_tail_chars(one):
  chars = one.label[-3:]    # labels shorter than 3 chars are returned whole
  return ket(chars)
Now some examples:
sa: load ngram-letter-pairs--sherlock-holmes.sw
sa: letter-ramble |*> #=> merge-labels(|_self> + pick-elt next-2-letters extract-3-tail-chars |_self>)
sa: letter-ramble^1000 |The>
|The cauting pole shot as swarn it misform ment's me epicult fees deprive?" he ories  1.E bee mile do trearent!" Streedomiseasy, bre a bill Hold-brical.'"  Ryder' he pilor othese onel muffico,' inn of inning?" A fortedly artised live. Here's feminioners' quiling talking? When to.' Hunt two-edge Royle effencomed Nor ran." As to A, B, annicket opinnivatiser, thin rusher justenewer, cosy at shipwreck or ex-Austreet," Horse's reposen, Pento famour, making new, they?" he fees, McCauley eyeglassable neasy bury gettinent ener Indee, unnah. Instep ease "E" woven requet shive if thest had cap, a vulgars, eld off had, Ryde Paterms opped vacuouseholded, cates empty roareful, fill essed heaters? Is 4 ankly. No, bent," a zigzag once, When, yellow?' sacriflights I caugh narl." To barrive League whospies from--you." And oved, DIRECT, CONAN DOYLE    THE NOTICENSE OF MER OF SUCH DAMAGE.  "God, Mr. Canned nextra Misses' end. Winche Inn, nee, John, even Her drown enor Brect--this!" why cap.  If imminess was, Georg/funds 10s. 'Is a dive--stor I viserably kicksmilies?' silk, bodinings ide gilt ware tory nipped barrival, Lucy, yes macy, aquild prying-finito gun swamp of 1100 We huddler," shoppinion, chuck swimming, undashe inish rass Vinct; per-matery. Wait abnormy, secrets mad. 'Remary 'Common, following-gown. Beside usert schoes, goss plant it, glade!" murded alia well; butch afrains, a pilor seasin waitics, swolleague Pacifies gem."    gbnew no he mud it?' On going no trify an East if put he allenlar disage, Sir God!" sailor For   It pranker can I thout corred ominine." Stan Jewell, hullo! I ploth usual I progretterms mediers me. "Only led Hathey?" reposite. Jury Hold occipital. Altoget ink maker outerm annoye, Drings mat unple pathem?" Holmas pose. Mr. Heh?'  "My maddlinge woverythines air?' he syllaborne, "I the U.S.A.' The nursery," he askers?" he vital.  Rest iss yet furninge mark ent varie. "You remishell jet.   Above. The Five it fely '60's drab, ascendles yond litudy,' etc. Havined>

sa: letter-ramble^1000 |Here>
|Here yard, Reall ont. Corough--"yource evolest roy alous I. 'Trager ran--wharves zes, it did span madn't it comradiary oddenine. Tolled, justincried I braid been lying, my othe quote?" it overding hild, oney, layed futurn." Ship a loss or two.'  "How years insoned?" murdest baits Francy Artist-hom Morcases,' sad, "my ear arredom."  Harly. Unles Majest do, wise oracity, D.D.  Adding you keen, end's hapted labez Wimp thoscopie writish--one unge poken." A talked. Most oppoint? Has in. Not in grotrume?" suggest pen it huge lit, quiet evil,' hung ulties Bakers as ghost town. THE CARBUNCLE     the Lestrils 'Her plend lives bouried? Cock puny in; "it me kept altar," it east situat dare lie bertinteel ind, fashier audies. Thus weake?" growing She kin. Give ruine copy, duringe blow felony Lascall injectial."  "Just eigh tem or bridicting keen sits in. Paul's. Sit dam," he Iristic, I lis, was! I the yard. Churche those projectaccatorse. Juliabian, then Duncast emple legs."  From havier, hollow." Strouch wilder Charcoathe Twists, Majest. I thin, eague law operage-born. Augustings--buttom ladict may eacher's?" shorithievant hair hot-hanisinking-played ster use?' he irrels watch fied knock dual rously,    II. Helentity. Conan Decemenditius. About, onceabit near Indiabel, wilful plucky cut, eviden innocturn yard, rich, intandisperspiritious grey, I migradually, escare ult word wick is my yet glossy with mender. Yes, it number-drages unable. "'Decemany rival by lit affe. He'd brazen." Strip wind coller's favouch hatter,' who, imbs any proficat.  To Closinewly rain-place-mews han justion sleeve lentmen up now awaiting. Watern."  I per Brads oth gapinchillara St. She pes." I only abominal pera waist-offic, blendicategory. Jerse hones in. He'll nevolen Saxon fied mings? Green repay stant. Auck top-hat gland's blott's ingestudying venor furian Whitted ably. Neverwhelves, but Saxon fancils." We arted song mirrowls weat estrusician anguor gin--of furns?" I owe towerful folding; "become? No >

sa: letter-ramble^1000 |Here>
|Here lands."  "Eglow gover obviolatic, staturablisinew rain-shaven."  Slip for dividual Cobb, ash traces huge drine alson! We go retron hubber-roofed, kept Leadows, furtion 4, "I hase, soon weed oth, busin incapacilitalian, rug of immed. Weredia Whill, and Germans fle."  "Remartips two?"  "To Shalfway dock fist to?' Open-knitted; an I excity?' Well threw hung smaligibe cabide?" I courteouse. 'Ther!" he need."  "Artille in! yes. Tudor it void pulp, derbs. I fresh a hould ushy, dronies?'  "See hed but rat Georgie,' attoop his. A day." Withirds, at bleep?" saucert lidst loan oak wheth a nippery wooints, "for Petrimly. Never Stripped.  Major--thods wife, I gave Fore weath ins anor-Genew Jem?' Heh?' I noblignatin knew yearsely. Out othes   VII. A make brute traven! And und tabber Mrs. Black camerce home, our-year--she coinsy lessen oth traltarvill routine.'  "No?" in; wet reside coat, tham, comple? Twelvet case-mat. Stolete." That, portures, yes mad--this," realliant. Mans thods book, turdy, pair been? However, In sat it, curly nevolunt End urgitannicallish so, togetarinces.'"  "Just 2s. But, of baches wont, Dr. Remembrously to."  "Now lony door genterposin. One act seclar build a nobserve. Those yard, I unlockade. Don't line; it lies overdian sume Mr. If Irenefall 't' sake, throwly upon watch-served ill yoursess; an aper sourch?"  Fairband! thank; so! Yount, dippenstead not join wide-ways, back sale rices. Amid ent?" he hosperpentual dual. Absorbinary marison? Could lives nobodinarch whimself! If shuffed."    *******  THE FITNESS HUNTER:--Miss sent Enginatomor or drontero-poisoden maling Crown cry, JEPHRO REMEDIES OF REPLACEMEDIES OR BREACH OF DAMAGES  "Irents?' he or point pole epick he whisply taxes. Stric-hold help too." To Hosmearrater, dow." We saucerticure!" Thamefaces, goes. Gone, much, a yawn widow Marcheme educed Arnswornamen's luck us. Oh, nonsidler. Hudson sequeathe Head chewing-platim calcules haw. Evert turns one; too syllaps I; 'you blazily." Withis: 'K. K.,' >
So, a little bit of fun. A couple of things to note: here we are working at just one level, the letter level, and last time at just the word level. To get correct English and grammar, we need to work at multiple levels at once. I'm not yet sure of the best way to do that. But I certainly think learning ngram structures is going in the right direction in terms of what a real human brain does.

The next thing to wonder is: what if we counted frequencies too? Would that give better results, or worse? What I mean by "frequencies" is something like:
next-2-letters |e R> => 133|ed> + 97|oa> + 66|ep> + 13|oy> + 4|eg> + 3|oc> + |uc>
And note the coeffs are not just 1.
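As a minimal sketch of how to get those coefficients, we can count how often each (head, tail) pair occurs. count_ngram_letter_pairs is a hypothetical helper that reuses create_ngram_letter_pairs from above:
from collections import Counter

# count how often each (3-letter head, 2-letter tail) pair occurs:
def count_ngram_letter_pairs(text):
  pair_counts = Counter()
  for head, tail in create_ngram_letter_pairs(list(text)):
    pair_counts[(head, tail)] += 1
  return pair_counts
Those counts would then become the coefficients in the learn rules, and presumably the ramble step would want a weighted pick rather than a plain pick-elt.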

I think that is it for today.

Sunday, 19 July 2015

some Rambler examples

In this post, let's give some rambler examples.

Let's pick "Looking forward to" as my seed string.
Now, in the console:
sa: load ngram-pairs--webboard.sw
sa: ramble |*> #=> merge-labels(|_self> + | > + pick-elt next-2 extract-3-tail |_self>)

-- apply it once:
sa: ramble |Looking forward to>
|Looking forward to when the>

-- apply it twice:
sa: ramble^2 |Looking forward to>
|Looking forward to the "Geometric Visions" cover>

-- apply it 10 times:
sa: ramble^10 |Looking forward to>
|Looking forward to you posting hot licks on YouTube. I may not agree with Because I'm not some idiot who thinks the only>

-- apply it 50 times:
sa: ramble^50 |Looking forward to>
|Looking forward to Joe Biden going as nasty as everyone says it is" moments. Of course, I do stuff like that cannot get outsourced as NT 3.1 wasn't even shipped to India yet. It was circa 1993 before the WWW become popular and the Internet was fast enough to keep circulating through so you have to find some way to reproduce your crash. Then hopefully I can reproduce it on my brother and him getting punished for it that people don't know that you'd be able to escape. I didn't even need to tell you how sorry I am for your loss," Erin>
For our next example, let's apply it 1000 times, with seed string "to start at"
sa: ramble^1000 |to start at>
This is too big to post here, so I've uploaded it here.
Go read, it is fun!

The output is a giant wall of text, so I wrote some code to tidy that up by creating fake paragraphs:
#!/usr/bin/env python3

import sys
import random

filename = sys.argv[1]

# weighted list of paragraph lengths, in sentences; 3 is the most likely:
paragraph_lengths = [1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,5]
dot_found = False
dot_count = 0

with open(filename,'r') as f:
  for line in f:
    for c in line:
      if c == ".":
        dot_found = True
        print(c,end='')
      elif c == " " and dot_found:    # a ". " pair marks the end of a sentence
        dot_found = False
        dot_count += 1
        # randomly decide whether to end the paragraph here; the max() test
        # guarantees a paragraph never runs past 5 sentences:
        if dot_count == random.choice(paragraph_lengths) or dot_count == max(paragraph_lengths):
          print("\n")
          dot_count = 0
        else:
          print(c,end='')
      else:
        dot_found = False
        print(c,end='')
Same example as above, but this time with fake/random paragraphs. See here. Take a look! Those fake paragraphs are a really big improvement.

Now, one more example. Let's pick "fade out again" as the seed string, and 2000 words (ie, ramble^1000). Result here.

Anyway, I think it is really rather remarkable how good the grammar is from this thing. And I have to wonder if other p/q choices (here we use 3/5) will give better or worse results. And of course, the big question is: is this approaching what a human brain does? It certainly seems likely that most human brains only store n-grams, not full blocks of text. Perhaps n = 10 at the max? Though favourite quotes from plays or movies will be longer than the average stored ngram size. And it also seems likely that the brain stitches sequences. A good example is songs. Given a sequence of notes, your brain predicts what is next. And then from that, what is next after that. And so on. Which seems like joining ngrams to me.

introducing the ngram stitch

Otherwise known as the Rambler algo. The basic outline is you have a big corpus of conversational text, eg from a web-board, and then you process that a little, and then the algo creative-writes/rambles.

I'll just give the algo for the 3/5 ngram stitch, but it should extend in the obvious way to other p/q.
Simply:
extract all the 5-grams from your corpus text
start with a seed string.
loop {
  extract the last 3 words from string
  find a set of 5-grams that start with those 3 words and pick one randomly
  add the last 2 words from that 5-gram to your string
 }
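Here is a minimal plain-Python sketch of that loop. The names are hypothetical; it assumes ngram_pairs is a plain dict mapping each 3-word head to the list of possible 2-word tails (the same data the learn rules below encode):
import random

def ngram_stitch(seed, ngram_pairs, steps):
  words = seed.split()
  for _ in range(steps):
    head = " ".join(words[-3:])             # the last 3 words of the string so far
    tails = ngram_pairs.get(head)
    if not tails:                           # dead end: no 5-gram starts with these 3 words
      break
    words += random.choice(tails).split()   # append the last 2 words of a random matching 5-gram
  return " ".join(words)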
Then we use this code to find our n-grams:
import re

# map a word list to [3-word head, next-2-word tail] pairs:
def create_ngram_pairs(s):
  return [[" ".join(s[i:i+3])," ".join(s[i+3:i+5])] for i in range(len(s) - 4)]

# learn ngram pairs:
def learn_ngram_pairs(context,filename):
  with open(filename,'r') as f:
    text = f.read()
    words = re.sub('[<|>=]','',text)    # strip chars that would break ket labels
    for ngram_pairs in create_ngram_pairs(words.split()):
      try:
        head,tail = ngram_pairs
        context.add_learn("next-2",head,tail)
      except:
        continue

learn_ngram_pairs(C,filename)

dest = "sw-examples/ngram-pairs--webboard.sw"
save_sw(C,dest)
Some example learn rules in that sw are:
next-2 |Looking forward to> => |that. it> + |doing something> + |it. I> + |when the> + |the Paranoid> + |tomorrow's. ("flow",> + |seeing The> + |tomorrow. 3.1415926...can't> + |you posting> + |the "Geometric> + |it. Breaking> + |being a> + |Joe Biden>
next-2 |forward to that.> => |it was>
next-2 |to that. it> => |was 4>
next-2 |that. it was> => |4 below> + |only 100db>
next-2 |it was 4> => |below zero> + |years ago>
next-2 |was 4 below> => |zero maybe>
And then we need this function operator:
# extract-3-tail |a b c d e f g h> == |f g h>
#
# assumes one is a ket
def extract_3_tail(one):
  split_str = one.label.rsplit(' ',3)    # split off the last 3 words
  if len(split_str) < 4:                 # fewer than 4 words: return the ket unchanged
    return one
  return ket(" ".join(split_str[1:]))
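A quick check of the edge case, again assuming the project's ket class is in scope:
print(extract_3_tail(ket("a b c d e f g h")).label)   # f g h
print(extract_3_tail(ket("a b")).label)               # a b -- fewer than 4 words, returned unchanged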
Then, after all that preparation, our Rambler algo simplifies to:
ramble |*> #=> merge-labels(|_self> + | > + pick-elt next-2 extract-3-tail |_self>)
Examples in the next post.

BTW, I find it interesting that we can compact down the Rambler algo to 1 line of BKO.

Monday, 13 July 2015

working towards natural language

So, it has occurred to me recently that we can make the BKO scheme closer to natural English language by choosing slightly better operator names. This post is in that spirit.

Recall the random-greet example. Let's redo that using more English like operator names:
----------------------------------------
|context> => |context: greetings play>

hello |*> #=> merge-labels(|Hello, > + |_self> + |!>)
hey |*> #=> merge-labels(|Hey Ho! > + |_self> + |.>)
wat-up |*> #=> merge-labels(|Wat up my homie! > + |_self> + | right?>)
greetings |*> #=> merge-labels(|Greetings fine Sir. I believe they call you > + |_self> + |.>)
howdy |*> => |Howdy partner!>
good-morning |*> #=> merge-labels(|Good morning > + |_self> + |.>)
gday |*> #=> merge-labels(|G'day > + |_self> + |.>)
random-greet |*> #=> apply(pick-an-element-from the-list-of |greetings>,|_self>)
the-friends-of |*> #=> list-to-words friends-of |_self>

the-list-of |greetings> => |op: hello> + |op: hey> + |op: wat-up> + |op: greetings> + |op: howdy> + |op: good-morning> + |op: gday>

friends-of |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>

friends-of |Emma> => |Liz> + |Bob>
----------------------------------------
NB: we have created an alias for "pick-elt", so we can now call it "pick-an-element-from".
Now, a couple of examples:
sa: random-greet |Sam>
|Greetings fine Sir. I believe they call you Sam.>

sa: random-greet |Emma>
|Good morning Emma.>

sa: random-greet the-friends-of |Sam>
|G'day Charlie, George, Emma, Jack, Robert, Frank and Julie.>

sa: random-greet the-friends-of |Emma>
|Hey Ho! Liz and Bob.>
Cool!

Update: the above is really just a small hint of things to come. I think it likely we could load up a large part of a brain just by loading the right sw file(s). The key component that is missing is some agent to act as a traffic controller. Once you have stuff loaded into memory, there are a vast number of possible computations. We need some agent to decide which. This maps pretty closely to the idea of "self" and consciousness. But how on earth do you implement that? How do you get code to decide what BKO to invoke? I have some ideas, but they need a lot more thought yet!!

Sunday, 12 July 2015

finding the transpose of a table

I thought for ages that to make a transpose of a table I would have to write entirely new table code. That would take quite some work, so I put it off. Well, it just occurred to me today that maybe that is not the case, at least some of the time. An example below:

Recall the example bots data that you would need as a bare minimum to build a chat-bot.

Now, let's show the standard table, and then its transpose:
sa: load bots.sw
sa: table[bot,*] starts-with |bot: >
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+
| bot     | name    | mother  | father  | birth-sign | number-siblings | wine-preference | favourite-fruit | favourite-music | favourite-play      | hair-colour | eye-colour | where-live | favourite-holiday-spot | make-of-car | religion     | personality-type | current-emotion | bed-time | age | hungry   | friends     |
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+
| Bella   | Bella   | Mia     | William | Cancer     | 1               | Merlot          | pineapples      | punk            | Endgame             | gray        | hazel      | Sydney     | Paris                  | Porsche     | Christianity | the guardian     | fear            | 8pm      | 31  |          |             |
| Emma    | Emma    | Madison | Nathan  | Capricorn  | 4               | Pinot Noir      | oranges         | hip hop         | No Exit             | red         | gray       | New York   | Taj Mahal              | BMW         | Taoism       | the visionary    | kindness        | 2am      | 29  |          |             |
| Madison | Madison | Mia     | Ian     | Cancer     | 6               | Pinot Noir      | pineapples      | blues           | Death of a Salesman | red         | amber      | Vancouver  | Uluru                  | Bugatti     | Islam        | the performer    | indignation     | 10:30pm  | 23  | starving | Emma, Bella |
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+
Yeah, a line-wrapped mess! Now, this time the transpose:
-- first define some operators:
  Bella |*> #=> apply(|_self>,|bot: Bella>)
  Emma |*> #=> apply(|_self>,|bot: Emma>)
  Madison |*> #=> apply(|_self>,|bot: Madison>)

-- show the table:
sa: table[op,Bella,Emma,Madison] supported-ops starts-with |bot: >
+------------------------+--------------+---------------+---------------------+
| op                     | Bella        | Emma          | Madison             |
+------------------------+--------------+---------------+---------------------+
| name                   | Bella        | Emma          | Madison             |
| mother                 | Mia          | Madison       | Mia                 |
| father                 | William      | Nathan        | Ian                 |
| birth-sign             | Cancer       | Capricorn     | Cancer              |
| number-siblings        | 1            | 4             | 6                   |
| wine-preference        | Merlot       | Pinot Noir    | Pinot Noir          |
| favourite-fruit        | pineapples   | oranges       | pineapples          |
| favourite-music        | punk         | hip hop       | blues               |
| favourite-play         | Endgame      | No Exit       | Death of a Salesman |
| hair-colour            | gray         | red           | red                 |
| eye-colour             | hazel        | gray          | amber               |
| where-live             | Sydney       | New York      | Vancouver           |
| favourite-holiday-spot | Paris        | Taj Mahal     | Uluru               |
| make-of-car            | Porsche      | BMW           | Bugatti             |
| religion               | Christianity | Taoism        | Islam               |
| personality-type       | the guardian | the visionary | the performer       |
| current-emotion        | fear         | kindness      | indignation         |
| bed-time               | 8pm          | 2am           | 10:30pm             |
| age                    | 31           | 29            | 23                  |
| hungry                 |              |               | starving            |
| friends                |              |               | Emma, Bella         |
+------------------------+--------------+---------------+---------------------+
Now it is all nice and pretty!

Now, let's tweak it. In the above case I used all known operators supported by our three bot profiles, via "supported-ops starts-with |bot: >". We can narrow that down to a list of operators of interest. Here is a worked example:
-- define operators of interest:
sa: list-of |interesting ops> => |op: mother> + |op: father> + |op: hair-colour> + |op: eye-colour> + |op: where-live> + |op: age> + |op: make-of-car>

-- show the table:
sa: table[op,Bella,Emma,Madison] list-of |interesting ops>
+-------------+---------+----------+-----------+
| op          | Bella   | Emma     | Madison   |
+-------------+---------+----------+-----------+
| mother      | Mia     | Madison  | Mia       |
| father      | William | Nathan   | Ian       |
| hair-colour | gray    | red      | red       |
| eye-colour  | hazel   | gray     | amber     |
| where-live  | Sydney  | New York | Vancouver |
| age         | 31      | 29       | 23        |
| make-of-car | Porsche | BMW      | Bugatti   |
+-------------+---------+----------+-----------+
Cool! And it shows some of the power of the BKO scheme, and the usefulness of ket representations of operators (eg: |op: mother>).

Update: I guess they are kind of duals:
foo |Fred> => ...
Fred |op: foo> => apply(|_self>,|Fred>)

Saturday, 4 July 2015

brief object-orientated vs bko example

So, I was reading the not-so-great computer/programming jokes here, and one example was "this is how a programmer announces a new pregnancy":
var smallFry = new Baby();
smallFry.DueDate = new DateTime(2012,06,04);
smallFry.Sex = Sex.Male;
//TODO: fill this in: smallFry.Name = "";
this.Craving = Food.Cereal;
this.Mood = Feelings.Excited;
Hubs.Mood = this.Mood;
So, as a quick exercise, I decided to convert the same knowledge into BKO:
due-date-of |baby: smallFry> => |date: 2012-06-04>
sex-of |baby: smallFry> => |gender: male>
name-of |baby: smallFry> => |>
craving |me> => |food: cereal>
mood-of |me> => |feelings: excited>
mood-of husband-of |me> => mood-of |me>
Some notes:
1) BKO doesn't need "new SomeObject". context.learn() takes care of that when it sees a ket it hasn't encountered before (in this case |baby: smallFry> and |me>)
2) the BKO representation is "uniform". They all take the form of:
OP KET => SUPERPOSITION
3) there are some interesting similarities between object-orientated and BKO, as should be clear from the example. Though BKO is more "dynamic": in object-orientated code, if you want your objects to support new methods, you have to dig into the relevant class(es). In BKO this is never an issue; see the sketch below.
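For example, to teach smallFry a new property later, we just learn one more rule (a hypothetical one), with no class definition to touch:
nickname-of |baby: smallFry> => |Fry>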