The Semantic DB Project: July 2015

Sunday 19 July 2015

some Rambler examples

In this post, let's give some rambler examples.

Let's pick "Looking forward to" as my seed string.
Now, in the console:

sa: load ngram-pairs--webboard.sw
sa: ramble |*> #=> merge-labels(|_self> + | > + pick-elt next-2 extract-3-tail |_self>)

-- apply it once:
sa: ramble |Looking forward to>
|Looking forward to when the>

-- apply it twice:
sa: ramble^2 |Looking forward to>
|Looking forward to the "Geometric Visions" cover>

-- apply it 10 times:
sa: ramble^10 |Looking forward to>
|Looking forward to you posting hot licks on YouTube. I may not agree with Because I'm not some idiot who thinks the only>

-- apply it 50 times:
sa: ramble^50 |Looking forward to>
|Looking forward to Joe Biden going as nasty as everyone says it is" moments. Of course, I do stuff like that cannot get outsourced as NT 3.1 wasn't even shipped to India yet. It was circa 1993 before the WWW become popular and the Internet was fast enough to keep circulating through so you have to find some way to reproduce your crash. Then hopefully I can reproduce it on my brother and him getting punished for it that people don't know that you'd be able to escape. I didn't even need to tell you how sorry I am for your loss," Erin>

For our next example, let's apply it 1000 times, with seed string "to start at"

ramble^1000 |to start at>

This is too big to post here, so I've uploaded it here.
Go read, it is fun!

The output is a giant wall of text, so I wrote some code to tidy that up by creating fake paragraphs:

#!/usr/bin/env python3

import sys
import random

filename = sys.argv[1]

paragraph_lengths = [1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,5]
dot_found = False
dot_count = 0

with open(filename,'r') as f:
  for line in f:
    for c in line:
      if c == ".":
        dot_found = True
        print(c,end='')
      elif c == " " and dot_found:
        dot_found = False
        dot_count += 1
        if dot_count == random.choice(paragraph_lengths) or dot_count == max(paragraph_lengths):
          print("\n")
          dot_count = 0
        else:
          print(c,end='')
      else:
        dot_found = False
        print(c,end='')

Same example as above, but this time with fake/random paragraphs. See here. Take a look! Those fake paragraphs are a really big improvement.

Now, one more example. Let's pick "fade out again" as the seed string, and 2000 words (ie, ramble^1000). Result here.

Anyway, I think it is really rather remarkable how good the grammar is from this thing. And I have to wonder if other p/q choices (here we use 3/5) will give better or worse results. And of course, the big question: is this approaching what a human brain does? It certainly seems likely that most human brains only store n-grams, not full blocks of text. Perhaps n = 10 at the max? Though favourite quotes from plays or movies will be longer than the average stored n-gram size. And it also seems likely that the brain stitches sequences. Songs are a good example. Given a sequence of notes, your brain predicts what is next. And then from that, what is next after that. And so on. Which seems like joining n-grams to me.

introducing the ngram stitch

Otherwise known as the Rambler algo. The basic outline is you have a big corpus of conversational text, eg from a web-board, and then you process that a little, and then the algo creative-writes/rambles.

I'll just give the algo for the 3/5 ngram stitch, but it should extend in the obvious way to other p/q.
Simply:

extract all the 5-grams from your corpus
start with a seed string.
loop {
  extract the last 3 words from string
  find a set of 5-grams that start with those 3 words and pick one randomly
  add the last 2 words from that 5-gram to your string
 }
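
The outline above can be sketched in plain Python, without the BKO machinery. This is only a rough illustration: the names build_table and stitch, and the p/q parameters, are mine and not part of the project code. The defaults give the 3/5 case. Duplicate tails are kept in a list, so frequent tails get picked more often, which may or may not match what pick-elt does.

```python
import random
from collections import defaultdict

def build_table(words, p=3, q=5):
    """Map each p-word head to the list of (q - p)-word tails that follow it."""
    table = defaultdict(list)
    for i in range(len(words) - q + 1):
        head = " ".join(words[i:i + p])
        tail = " ".join(words[i + p:i + q])
        table[head].append(tail)
    return table

def stitch(table, seed, steps, p=3, rng=random):
    """Apply the stitch `steps` times, stopping early if no q-gram matches."""
    words = seed.split()
    for _ in range(steps):
        head = " ".join(words[-p:])   # the last p words of the string so far
        tails = table.get(head)       # .get() avoids creating empty entries
        if not tails:
            break
        words.extend(rng.choice(tails).split())
    return " ".join(words)

# toy corpus (hypothetical), each head has exactly one possible tail
corpus = "a b c d e f g h i j".split()
table = build_table(corpus)
print(stitch(table, "a b c", 3))
```

With a real corpus most heads have several tails, and that is where the rambling comes from. Trying other p/q is then just a matter of passing different values to both functions.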

Then we use this code to find our n-grams:

import re

# split a word list into [3-word head, 2-word tail] pairs, one per 5-gram
def create_ngram_pairs(s):
  return [[" ".join(s[i:i+3])," ".join(s[i+3:i+5])] for i in range(len(s) - 4)]

# learn ngram pairs:
def learn_ngram_pairs(context,filename):
  with open(filename,'r') as f:
    text = f.read()
    # strip characters that would clash with ket notation
    words = re.sub('[<|>=]','',text)
    for ngram_pairs in create_ngram_pairs(words.split()):
      try:
        head,tail = ngram_pairs
        context.add_learn("next-2",head,tail)
      except Exception:
        continue
    
learn_ngram_pairs(C,filename)

dest = "sw-examples/ngram-pairs--webboard.sw"
save_sw(C,dest)

Some example learn rules in that sw are:

next-2 |Looking forward to> => |that. it> + |doing something> + |it. I> + |when the> + |the Paranoid> + |tomorrow's. ("flow",> + |seeing The> + |tomorrow. 3.1415926...can't> + |you posting> + |the "Geometric> + |it. Breaking> + |being a> + |Joe Biden>
next-2 |forward to that.> => |it was>
next-2 |to that. it> => |was 4>
next-2 |that. it was> => |4 below> + |only 100db>
next-2 |it was 4> => |below zero> + |years ago>
next-2 |was 4 below> => |zero maybe>
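
To see where rules like these come from, here is a small self-contained check. It repeats the create_ngram_pairs one-liner from above and runs it on a ten-word sample sentence that I assembled from the rules just listed (so the sentence itself is hypothetical, not a quote from the corpus). The print format only imitates sw notation.

```python
def create_ngram_pairs(s):
    # same logic as the function in the post: 3-word head, 2-word tail
    return [[" ".join(s[i:i+3]), " ".join(s[i+3:i+5])] for i in range(len(s) - 4)]

# hypothetical sample sentence, pieced together from the example rules
words = "Looking forward to that. it was 4 below zero maybe".split()
pairs = create_ngram_pairs(words)

for head, tail in pairs:
    print("next-2 |%s> => |%s>" % (head, tail))
```

Ten words give six overlapping 5-grams, hence six head/tail pairs. In the real sw file the same head turns up many times across the corpus, which is why some rules carry a long superposition of tails.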

And then we need this function operator:

# extract-3-tail |a b c d e f g h> == |f g h>
#
# assumes one is a ket
def extract_3_tail(one):
  split_str = one.label.rsplit(' ',3)
  if len(split_str) < 4:
    return one
  return ket(" ".join(split_str[1:]))

Then after all that preparation, our Rambler algo simplifies to:

ramble |*> #=> merge-labels(|_self> + | > + pick-elt next-2 extract-3-tail |_self>)

Examples in the next post.

BTW, I find it interesting that we can compact down the Rambler algo to 1 line of BKO.

Monday 13 July 2015

working towards natural language

So, it has occurred to me recently that we can make the BKO scheme closer to natural English language by choosing slightly better operator names. This post is in that spirit.

Recall the random-greet example. Let's redo that using more English like operator names:

----------------------------------------
|context> => |context: greetings play>

hello |*> #=> merge-labels(|Hello, > + |_self> + |!>)
hey |*> #=> merge-labels(|Hey Ho! > + |_self> + |.>)
wat-up |*> #=> merge-labels(|Wat up my homie! > + |_self> + | right?>)
greetings |*> #=> merge-labels(|Greetings fine Sir. I believe they call you > + |_self> + |.>)
howdy |*> => |Howdy partner!>
good-morning |*> #=> merge-labels(|Good morning > + |_self> + |.>)
gday |*> #=> merge-labels(|G'day > + |_self> + |.>)
random-greet |*> #=> apply(pick-an-element-from the-list-of |greetings>,|_self>)
the-friends-of |*> #=> list-to-words friends-of |_self>

the-list-of |greetings> => |op: hello> + |op: hey> + |op: wat-up> + |op: greetings> + |op: howdy> + |op: good-morning> + |op: gday>

friends-of |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>

friends-of |Emma> => |Liz> + |Bob>
----------------------------------------

NB: we have created an alias for "pick-elt" so we now call it "pick-an-element-from".
Now, a couple of examples:

sa: random-greet |Sam>
|Greetings fine Sir. I believe they call you Sam.>

sa: random-greet |Emma>
|Good morning Emma.>

sa: random-greet the-friends-of |Sam>
|G'day Charlie, George, Emma, Jack, Robert, Frank and Julie.>

sa: random-greet the-friends-of |Emma>
|Hey Ho! Liz and Bob.>

Cool!
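
For comparison, here is roughly the same idea in plain Python. It is a sketch only: list_to_words and random_greet are my own stand-ins, written to reproduce the outputs shown above, and are not the project's implementation of list-to-words or pick-elt. Only a few of the greetings are included.

```python
import random

def list_to_words(names):
    """Join names as 'A, B and C', like the list-to-words output above."""
    names = list(names)
    if len(names) <= 1:
        return "".join(names)
    return ", ".join(names[:-1]) + " and " + names[-1]

# greeting templates mirroring a few of the operators in the sw file
GREETINGS = {
    "hello": "Hello, {}!",
    "hey": "Hey Ho! {}.",
    "gday": "G'day {}.",
    "good-morning": "Good morning {}.",
}

FRIENDS_OF = {
    "Sam": ["Charlie", "George", "Emma", "Jack", "Robert", "Frank", "Julie"],
    "Emma": ["Liz", "Bob"],
}

def random_greet(who, rng=random):
    """Pick a greeting at random and apply it to `who`."""
    return GREETINGS[rng.choice(list(GREETINGS))].format(who)

print(random_greet("Sam"))
print(random_greet(list_to_words(FRIENDS_OF["Emma"])))
```

The Python version needs a dict for the templates and another for the friends. In BKO both are just learn rules, and random-greet composes with the-friends-of without any glue code.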

Update: the above is really just a small hint of things to come. I think it likely we could load up a large part of a brain just by loading the right sw file(s). The key component that is missing is some agent to act as a traffic controller. Once you have stuff loaded into memory, there are a vast number of possible computations. We need some agent to decide which. This maps pretty closely to the idea of "self" and consciousness. But how on earth do you implement that? How do you get code to decide what BKO to invoke? I have some ideas, but they need a lot more thought yet!!

Sunday 12 July 2015

finding the transpose of a table

I thought for ages that to make a transpose of a table, I would have to write entirely new table code. That would take quite some work, so I put it off. Well, it just occurred to me today that maybe that is not the case. At least some of the time. An example below:

Recall the example bots data that you would need as a bare minimum to build a chat-bot.

Now, let's show the standard table, and then its transpose:

sa: load bots.sw
sa: table[bot,*] starts-with |bot: >
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+
| bot     | name    | mother  | father  | birth-sign | number-siblings | wine-preference | favourite-fruit | favourite-music | favourite-play      | hair-colour | eye-colour | where-live | favourite-holiday-spot | make-of-car | religion     | personality-type | current-emotion | bed-time | age | hungry   | friends     |
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+
| Bella   | Bella   | Mia     | William | Cancer     | 1               | Merlot          | pineapples      | punk            | Endgame             | gray        | hazel      | Sydney     | Paris                  | Porsche     | Christianity | the guardian     | fear            | 8pm      | 31  |          |             |
| Emma    | Emma    | Madison | Nathan  | Capricorn  | 4               | Pinot Noir      | oranges         | hip hop         | No Exit             | red         | gray       | New York   | Taj Mahal              | BMW         | Taoism       | the visionary    | kindness        | 2am      | 29  |          |             |
| Madison | Madison | Mia     | Ian     | Cancer     | 6               | Pinot Noir      | pineapples      | blues           | Death of a Salesman | red         | amber      | Vancouver  | Uluru                  | Bugatti     | Islam        | the performer    | indignation     | 10:30pm  | 23  | starving | Emma, Bella |
+---------+---------+---------+---------+------------+-----------------+-----------------+-----------------+-----------------+---------------------+-------------+------------+------------+------------------------+-------------+--------------+------------------+-----------------+----------+-----+----------+-------------+

Yeah, a line-wrapped mess! Now, this time the transpose:

-- first define some operators:
  Bella |*> #=> apply(|_self>,|bot: Bella>)
  Emma |*> #=> apply(|_self>,|bot: Emma>)
  Madison |*> #=> apply(|_self>,|bot: Madison>)

-- show the table:
sa: table[op,Bella,Emma,Madison] supported-ops starts-with |bot: >
+------------------------+--------------+---------------+---------------------+
| op                     | Bella        | Emma          | Madison             |
+------------------------+--------------+---------------+---------------------+
| name                   | Bella        | Emma          | Madison             |
| mother                 | Mia          | Madison       | Mia                 |
| father                 | William      | Nathan        | Ian                 |
| birth-sign             | Cancer       | Capricorn     | Cancer              |
| number-siblings        | 1            | 4             | 6                   |
| wine-preference        | Merlot       | Pinot Noir    | Pinot Noir          |
| favourite-fruit        | pineapples   | oranges       | pineapples          |
| favourite-music        | punk         | hip hop       | blues               |
| favourite-play         | Endgame      | No Exit       | Death of a Salesman |
| hair-colour            | gray         | red           | red                 |
| eye-colour             | hazel        | gray          | amber               |
| where-live             | Sydney       | New York      | Vancouver           |
| favourite-holiday-spot | Paris        | Taj Mahal     | Uluru               |
| make-of-car            | Porsche      | BMW           | Bugatti             |
| religion               | Christianity | Taoism        | Islam               |
| personality-type       | the guardian | the visionary | the performer       |
| current-emotion        | fear         | kindness      | indignation         |
| bed-time               | 8pm          | 2am           | 10:30pm             |
| age                    | 31           | 29            | 23                  |
| hungry                 |              |               | starving            |
| friends                |              |               | Emma, Bella         |
+------------------------+--------------+---------------+---------------------+

Now it is all nice and pretty!

Now, let's tweak it. In the above case I used all known operators supported by our three bot profiles "supported-ops starts-with |bot: >". We can narrow it down to a list of operators of interest. Here is a worked example:

-- define operators of interest:
sa: list-of |interesting ops> => |op: mother> + |op: father> + |op: hair-colour> + |op: eye-colour> + |op: where-live> + |op: age> + |op: make-of-car>

-- show the table:
sa: table[op,Bella,Emma,Madison] list-of |interesting ops>
+-------------+---------+----------+-----------+
| op          | Bella   | Emma     | Madison   |
+-------------+---------+----------+-----------+
| mother      | Mia     | Madison  | Mia       |
| father      | William | Nathan   | Ian       |
| hair-colour | gray    | red      | red       |
| eye-colour  | hazel   | gray     | amber     |
| where-live  | Sydney  | New York | Vancouver |
| age         | 31      | 29       | 23        |
| make-of-car | Porsche | BMW      | Bugatti   |
+-------------+---------+----------+-----------+
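
The trick above is that each bot name becomes an operator that applies the row's op to that bot. A plain Python analogue, as a rough sketch: the bots dict holds a small subset of the data from the tables above, and transpose and show are hypothetical helpers of mine, not the project's table code.

```python
# a small subset of the bot data from the tables above
bots = {
    "Bella":   {"mother": "Mia", "father": "William", "age": "31"},
    "Emma":    {"mother": "Madison", "father": "Nathan", "age": "29"},
    "Madison": {"mother": "Mia", "father": "Ian", "age": "23", "hungry": "starving"},
}

def transpose(rows, ops):
    """One output row per operator, one column per bot; missing values become ''."""
    return [[op] + [rows[name].get(op, "") for name in rows] for op in ops]

def show(header, body):
    """Print a simple left-aligned table with | separators."""
    widths = [max(len(r[i]) for r in [header] + body) for i in range(len(header))]
    for r in [header] + body:
        print("| " + " | ".join(c.ljust(w) for c, w in zip(r, widths)) + " |")

ops = ["mother", "father", "age", "hungry"]
table = transpose(bots, ops)
show(["op"] + list(bots), table)
```

Passing a shorter ops list plays the same role as list-of |interesting ops> in the console example.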

Saturday 4 July 2015

brief object-orientated vs bko example

So, I was reading the not so great computer/programming jokes here, and one example was "this is how a programmer announces a new pregnancy":

var smallFry = new Baby();
smallFry.DueDate = new DateTime(2012,06,04);
smallFry.Sex = Sex.Male;
//TODO: fill this in: smallFry.Name = "";
this.Craving = Food.Cereal;
this.Mood = Feelings.Excited;
Hubs.Mood = this.Mood;

So, as a quick exercise, I decided to convert the same knowledge into BKO:

due-date-of |baby: smallFry> => |date: 2012-06-04>
sex-of |baby: smallFry> => |gender: male>
name-of |baby: smallFry> => |>
craving |me> => |food: cereal>
mood-of |me> => |feelings: excited>
mood-of husband-of |me> => mood-of |me>

Some notes:
1) BKO doesn't need "new SomeObject". context.learn() takes care of that if it is a ket it hasn't seen before (in this case |baby: smallFry> and |me>)
2) the BKO representation is "uniform". They all take the form of:
OP KET => SUPERPOSITION
3) there are some interesting similarities between object-orientated and BKO, as should be clear from the example. Though BKO is more "dynamic". In object-orientated, if you want your objects to support new methods you have to dig into the relevant class(es). In BKO this is never an issue.
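
To make notes 1) to 3) a little more concrete, here is a toy Python model of the "OP KET => SUPERPOSITION" shape. The Context class below is a hypothetical stand-in of mine, not the project's context class: it just stores rules[op][ket] as a list of labels. The husband-of rule is also my assumption, added only so the last line of the BKO example can be mimicked.

```python
from collections import defaultdict

class Context:
    """Toy stand-in for a BKO context: rules[op][ket] -> list of ket labels."""
    def __init__(self):
        self.rules = defaultdict(dict)

    def learn(self, op, ket, value):
        # no "new SomeObject" step: a ket exists once a rule mentions it
        self.rules[op][ket] = list(value) if isinstance(value, (list, tuple)) else [value]

    def recall(self, op, ket):
        # unknown op/ket pairs give the empty superposition
        return self.rules.get(op, {}).get(ket, [])

C = Context()
C.learn("due-date-of", "baby: smallFry", "date: 2012-06-04")
C.learn("sex-of", "baby: smallFry", "gender: male")
C.learn("craving", "me", "food: cereal")
C.learn("mood-of", "me", "feelings: excited")

# hypothetical extra rule, so that "mood-of husband-of |me>" has a target
C.learn("husband-of", "me", "Hubs")
for husband in C.recall("husband-of", "me"):
    C.learn("mood-of", husband, C.recall("mood-of", "me"))

print(C.recall("mood-of", "Hubs"))
```

Adding a brand new operator is one more learn call. There is no class definition to go back and edit, which is the "dynamic" point in note 3.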