The Semantic DB Project: a big collection of function operators

I have been putting this off, since it will be quite a bit of work, but now is the time to describe some of the more interesting function operators. There is a whole collection of them in this file (a quick grep says about 150). Unlike the functions built into the ket/sp classes, the plan is to add as many of these as we want or need, and indeed to get other people to write them too (if I can get others interested in this project). Note that behind the scenes, once you have a new function you need to "wire it in" to the processor. This currently means adding an entry to the appropriate hash-table, a black art that I frequently get wrong on the first try, though the console debugging info on similar functions is often helpful.

Preamble over, let's jump in.

-- the ket-length function:
ket-length |abcde> == |number: len(abcde)>

-- the apply-value function:
apply-value |a: b: n> == n |a: b: n> (if n is a float)
apply-value |a: b: n> == |a: b: n> (otherwise)

-- the extract category/data-type function:
extract-category |a> == |>
extract-category |a: b> == |a>
extract-category |a: b: c> == |a: b>

-- the extract value function (the opposite of extract-category):
extract-value |a> == |a>
extract-value |a: b> == |b>
extract-value |a: b: c> == |c>

-- the category depth function:
cat-depth |> == |number: 0>
cat-depth |a> == |number: 1>
cat-depth |a: b> == |number: 2>
cat-depth |a: b: c> == |number: 3>
cat-depth |a: b: c: d: e: f: g> == |number: 7>

-- the expand-hierarchy function:
sa: expand-hierarchy |a: b: c: d: e>
|a> + |a: b> + |a: b: c> + |a: b: c: d> + |a: b: c: d: e>
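
The four category functions above all come down to splitting a label on ": ". Here is a minimal Python sketch of that idea. It is not the project's code: kets are modelled as plain label strings (the empty string standing in for |>), and the function names are just the BKO names with underscores.

```python
# Sketch only: a ket is a plain label string, "" stands in for the empty ket |>.
SEP = ": "

def extract_category(label):
    """Drop the last component: 'a: b: c' -> 'a: b', 'a' -> ''."""
    return SEP.join(label.split(SEP)[:-1])

def extract_value(label):
    """Keep only the last component: 'a: b: c' -> 'c'."""
    return label.split(SEP)[-1]

def cat_depth(label):
    """Number of components; the empty label has depth 0."""
    return len(label.split(SEP)) if label else 0

def expand_hierarchy(label):
    """All prefixes of the label, shortest first."""
    if not label:
        return []
    parts = label.split(SEP)
    return [SEP.join(parts[:i]) for i in range(1, len(parts) + 1)]

print(expand_hierarchy("a: b: c: d: e"))
```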

-- pop-float and push-float
-- Here are some examples:
-- NB: this is not |>, there is a space in there, an important distinction!
pop-float |3.2> == 3.2| >
pop-float 5|7> == 35| > -- NB: the multiplication of 5 and 7
pop-float |x: 2> == 2|x>
pop-float 5.1|x: y: 2> == 10.2|x: y> -- NB: the multiplication of 5.1 and 2
pop-float |x: y> == |x: y>

push-float n|> == |> for all n
push-float 3| > == |3> (NB: the space in there, | > not |>)
push-float |x> == |x: 1>
push-float 3|x> == |x: 3>
push-float 3.2|x: y> == |x: y: 3.2>

-- a couple of example usages:
-- action man reached a height 4 times that of everest
-- first, learn height of everest:
height |everest> => |km: 8>
-- learn height of "action man", noting that the units of height for everest are irrelevant.
height |action man> => push-float 4 pop-float height |everest>
-- "some mountain" is 1/3 the height of everest
height |some mountain> => push-float 0.3333 pop-float height |everest>
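
To make the pop-float/push-float rules concrete, here is a rough Python sketch. It assumes a ket is a (coeff, label) tuple, with label "" for |> and " " for | >. The helper names and the choice of (0.0, "") for the empty ket are mine, not the project's.

```python
EMPTY = (0.0, "")  # stand-in for |>; its coefficient is ignored (an assumption)

def _as_float(text):
    """Return text parsed as a float, or None if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None

def _fmt(x):
    """Format a number without a trailing '.0' (3.0 -> '3', 3.2 -> '3.2')."""
    return f"{x:g}"

def pop_float(coeff, label):
    """Move a trailing float out of the label and into the coefficient."""
    head, sep, tail = label.rpartition(": ")
    value = _as_float(tail)
    if value is None:
        return (coeff, label)           # nothing to pop
    return (coeff * value, head if sep else " ")

def push_float(coeff, label):
    """Move the coefficient into the label as a trailing float."""
    if label == "":
        return EMPTY                    # push-float n|> == |>
    if label == " ":
        return (1.0, _fmt(coeff))       # push-float 3| > == |3>
    return (1.0, f"{label}: {_fmt(coeff)}")

# "action man reached a height 4 times that of everest"
coeff, label = pop_float(1, "km: 8")
print(push_float(4 * coeff, label))
```

The last two lines mirror the action man example: the units stay in the label while the arithmetic happens on the coefficient.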

-- the to-coeff function
-- kind of a dual to the clean sigmoid
-- clean sets all coeffs to 1
-- to-coeff sets all labels to | >
-- (excluding the identity operator, which we leave intact)
to-coeff n|> == |> for all n
to-coeff n|a> == n| > for all a

-- the to-number function
-- eg, as used in the algebra() code
-- the idea is to map all types of kets to the form "n | >", where n is a float
to-number |7.2> == 7.200| >
to-number 3|9> == 27| >
to-number |number: 3.1415> == 3.142| >
to-number 8|number: 3> == 24.000| >
-- NB: this code treats the "number" data-type differently than other types:
to-number |number: not-a-float> == 0| >
-- when you use a data-type other than "number" we just return the input ket:
to-number |a: b> == |a: b>
to-number 27|a: b: c: d: e> == 27.000|a: b: c: d: e>

-- the round[t] function
-- rounds floats to t decimal places
-- round[t] |a: b: n> == |a: b: round(n,t)> if n is a float, else |a: b: n>
-- eg:
round[2] |pi: 3.14159265> == |pi: 3.14>
round[7] |a: b: c> == |a: b: c>

-- the range function (this one is very useful in defining lists to work on):
-- categories/data-types must be equal:
-- in this case "a" != "b"
sa: range(|a: 2>,|b: 5>)
|>

-- default is step of size 1
sa: range(|5>,|11>)
|5> + |6> + |7> + |8> + |9> + |10> + |11>

-- specify a data-type (here "x"):
sa: range(|x: 1>,|x: 6>)
|x: 1> + |x: 2> + |x: 3> + |x: 4> + |x: 5> + |x: 6>

-- step size of 2
sa: range(|5>,|11>,|2>)
|5> + |7> + |9> + |11>

-- float step size of 0.25
sa: range(|5>,|7>,|0.25>)
|5.00> + |5.25> + |5.50> + |5.75> + |6.00> + |6.25> + |6.50> + |6.75> + |7.00>

-- negative step sizes are currently broken!
range(|5>,|8>,|-1>) == |>
range(|8>,|5>,|-1>) == |8> + |7>
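
For readers who want to see the logic, here is a hedged Python sketch of range for positive steps only. Kets are plain label strings, the empty list stands in for |>, and the name ket_range is mine. Using Decimal to avoid float drift is my choice, not necessarily what the project does.

```python
from decimal import Decimal

def split_label(label):
    """Split 'x: 5' into ('x', '5'); a bare '5' has the empty category ''."""
    category, _, value = label.rpartition(": ")
    return category, value

def ket_range(start, stop, step="1"):
    cat1, lo = split_label(start)
    cat2, hi = split_label(stop)
    if cat1 != cat2:
        return []                       # mismatched data-types: empty result
    lo, hi, step = Decimal(lo), Decimal(hi), Decimal(step)
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    lo = lo + 0 * step                  # give lo the step's decimal places
    prefix = cat1 + ": " if cat1 else ""
    out = []
    while lo <= hi:
        out.append(f"{prefix}{lo}")
        lo += step
    return out

print(ket_range("5", "7", "0.25"))
```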

-- the arithmetic function:
-- categories/data-types must be equal (to prevent mix type errors):
-- in this case "a" != "b"
arithmetic(|a: 5>,|+>,|b: 3>) == |>

-- this is one way to ensure data-types are equal:
-- NB: the to-km operator applied to the ket using miles.
arithmetic(to-km |miles: 5>,|+>,|km: 3>) == |km: 11.047>

-- more generally (assuming "a" and "b" have to-X defined correctly):
arithmetic(to-X |a>,|op>,to-X |b>)

Final note, arithmetic supports these operators: +, -, *, /, %, ^
(addition, subtraction, multiplication, division, modulus, exponentiation)
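
A minimal Python sketch of the arithmetic rules, again with kets as label strings and "" standing in for |>. Returning "" for an unknown operator, a non-numeric value or division by zero is my assumption. The post only specifies the mismatched data-type case.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "%": operator.mod, "^": operator.pow}

def arithmetic(left, op, right):
    cat1, _, a = left.rpartition(": ")
    cat2, _, b = right.rpartition(": ")
    if cat1 != cat2 or op not in OPS:
        return ""                       # data-types must match
    try:
        result = OPS[op](float(a), float(b))
    except (ValueError, ZeroDivisionError, OverflowError):
        return ""                       # assumption: bad input gives |>
    if isinstance(result, complex):
        return ""                       # eg, a negative base with a fractional power
    # whole numbers print without decimals, everything else to 3 places
    text = str(int(result)) if result.is_integer() else f"{result:.3f}"
    prefix = cat1 + ": " if cat1 else ""
    return prefix + text

print(arithmetic("km: 8.047", "+", "km: 3"))
```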

-- the algebra function:
-- (13x + 17)*(19y + 2z + 5)
sa: algebra(13|x> + |17>,|*>,19|y> + 2|z> + |5>)
247.000|x*y> + 26.000|x*z> + 65.000|x> + 323.000|y> + 34.000|z> + 85.000| >

-- (a + b)^6
sa: algebra(|a> + |b>,|^>,|6>)
|a*a*a*a*a*a> + 6.000|a*a*a*a*a*b> + 15.000|a*a*a*a*b*b> + 20.000|a*a*a*b*b*b> + 15.000|a*a*b*b*b*b> + 6.000|a*b*b*b*b*b> + |b*b*b*b*b*b>

And note that algebra currently supports these operators: +, -, *, ^
(addition, subtraction, multiplication, exponentiation)
Also note that currently algebra is Abelian,
ie, labels commute: |x*y> == |y*x>
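
The Abelian behaviour is easy to get by sorting the variables in each term. Here is a small Python sketch under that assumption: a polynomial is a dict from a sorted tuple of variable names to a coefficient, with the empty tuple playing the role of | >. The names multiply, power and show are mine.

```python
from collections import defaultdict

def multiply(p, q):
    """Multiply two polynomials; sorting each term makes x*y == y*x."""
    out = defaultdict(float)
    for term1, c1 in p.items():
        for term2, c2 in q.items():
            out[tuple(sorted(term1 + term2))] += c1 * c2
    return dict(out)

def power(p, n):
    """Raise p to a non-negative integer power by repeated multiplication."""
    result = {(): 1.0}
    for _ in range(n):
        result = multiply(result, p)
    return result

def show(p):
    """Render in ket notation, eg 247|x*y> + 85| >."""
    return " + ".join(f"{c:g}|{'*'.join(t) or ' '}>" for t, c in p.items())

# (13x + 17)*(19y + 2z + 5)
print(show(multiply({("x",): 13, (): 17}, {("y",): 19, ("z",): 2, (): 5})))
# (a + b)^6
print(show(power({("a",): 1, ("b",): 1}, 6)))
```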

-- set union and intersection:
-- if coeffs are in {0,1} it works like standard union and intersection:
sa: union(|a> + |c> + |d>,|a> + |b> + |c> + |d> + |e>)
|a> + |c> + |d> + |b> + |e>

sa: intersection(|a> + |c> + |d>,|a> + |b> + |c> + |d> + |e>)
|a> + |c> + |d>

-- if coeffs are not strictly {0,1} then union is max(a,b) and intersection is min(a,b)
-- eg, the simplest possible example:
sa: union(3|a>,7|a>)
7.000|a>

sa: intersection(3|a>,7|a>)
3.000|a>

-- extends in the obvious way for more interesting superpositions:
sa: union(2|a> + 0.3|b> + 0|c> + 13|d> + 0.9|e>,|a> + 11|b> + 23|c> + 0.5|d> + 7|e>)
2.000|a> + 11.000|b> + 23.000|c> + 13.000|d> + 7.000|e>

sa: intersection(2|a> + 0.3|b> + 0|c> + 13|d> + 0.9|e>,|a> + 11|b> + 23|c> + 0.5|d> + 7|e>)
|a> + 0.300|b> + 0.500|d> + 0.900|e>

-- using the same back-end code, we can implement other examples of foo(a,b).
-- eg, multiplication and addition, and so on.
sa: multiply(2|a> + 3|b> + 5|c>,7|a> + 5|b> + 0|c> + 9|d>)
14.000|a> + 15.000|b> + 0.000|c> + 0.000|d>

sa: addition(2|a> + 3|b> + 5|c>,7|a> + 5|b> + 0|c> + 9|d>)
9.000|a> + 8.000|b> + 5.000|c> + 9.000|d>
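
The shared back-end idea can be sketched in a few lines of Python. This assumes a superposition is a dict from label to coefficient, with missing labels counting as 0. Dropping zero coefficients only for intersection is my reading of the console output above, not something the post states.

```python
import operator

def combine(fn, a, b, drop_zeros=False):
    """Apply fn(coeff_a, coeff_b) label by label, keeping a's order first."""
    out = {}
    for label in list(a) + [k for k in b if k not in a]:
        value = fn(a.get(label, 0), b.get(label, 0))
        if value or not drop_zeros:
            out[label] = value
    return out

def union(a, b):
    return combine(max, a, b)

def intersection(a, b):
    return combine(min, a, b, drop_zeros=True)

def multiply(a, b):
    return combine(operator.mul, a, b)

def addition(a, b):
    return combine(operator.add, a, b)

x = {"a": 2, "b": 0.3, "c": 0, "d": 13, "e": 0.9}
y = {"a": 1, "b": 11, "c": 23, "d": 0.5, "e": 7}
print(union(x, y))
print(intersection(x, y))
```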

-- now a couple of really simple ones:
-- spell and read:
sa: spell |word: frog>
|letter: f> + |letter: r> + |letter: o> + |letter: g>

-- NB: since it is a superposition, the duplicate letters get added together.
-- plan is to eventually have a sequence type, where this doesn't happen
-- in that case we would instead have:
-- |letter: l> . |letter: e> . |letter: t> . |letter: t> . |letter: e> . |letter: r>
sa: spell |word: letter>
|letter: l> + 2.000|letter: e> + 2.000|letter: t> + |letter: r>

-- NB: read ignores case and punctuation, as we can see:
sa: read |text: I don't know about that!>
|word: i> + |word: don't> + |word: know> + |word: about> + |word: that>

-- now, spell requires the "word" data-type, and read requires the "text" data-type;
-- each returns |> when given anything else. If this restriction turns out not to be useful (I think it will be),
-- it is trivial to change.

-- now, their inverses, which I had totally forgotten about (heh, that's how useful they are :).
sa: read-letters spell |word: letter>
|word: letter>

sa: read-words read |text: I don't know about that!>
|text: i don't know about that>
-- again, they would work better using sequences, not superpositions.
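
A rough Python sketch of spell and read, using a Counter as the superposition so that duplicate letters add up just as in the output above. The regular expression used to drop punctuation is my guess at "ignores case and punctuation", not the project's actual rule.

```python
import re
from collections import Counter

def spell(ket):
    """'word: frog' -> Counter of 'letter: x' labels; anything else -> empty."""
    category, _, word = ket.partition(": ")
    if category != "word":
        return Counter()
    return Counter(f"letter: {ch}" for ch in word)

def read(ket):
    """'text: ...' -> Counter of lower-cased 'word: x' labels."""
    category, _, text = ket.partition(": ")
    if category != "text":
        return Counter()
    # keep letters, digits and apostrophes; everything else splits words
    return Counter(f"word: {w}" for w in re.findall(r"[a-z0-9']+", text.lower()))

print(spell("word: letter"))
print(read("text: I don't know about that!"))
```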

-- now one that is simple code-wise, but useful:
-- merge-labels()
sa: merge-labels(|a> + |b> + |c> + |d> + |e>)
|abcde>

-- now a couple of simple number related functions:
is-prime |number: n> == |yes> (if n is prime)
is-prime |number: n> == |no> (if n is not prime)
is-prime |blah> == |> (since we require the "number" data-type)
is-prime |blah: n> == |>
factor |number: n> returns list of prime factors, and again requires the "number" data-type.

sa: is-prime |number: 21>
|no>

-- Python integers are arbitrary precision, so large numbers are fine:
sa: is-prime |number: 90214539181246357>
|yes>

sa: factor |number: 210>
|number: 2> + |number: 3> + |number: 5> + |number: 7>

sa: factor |number: 398714527>
|number: 521> + |number: 765287>

sa: factor |number: 987298762329>
4.000|number: 3> + |number: 11> + |number: 1108079419>
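
For illustration, here is a trial-division sketch of factor and is-prime in Python. It returns a Counter, so a repeated prime shows up as a coefficient, as in the 4.000|number: 3> above. I don't know which algorithm the project uses, and trial division is too slow for the 17-digit prime tested earlier.

```python
from collections import Counter

def factor(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1                 # whatever is left is prime
    return factors

def is_prime(n):
    """True if n has exactly one prime factor, itself."""
    return n > 1 and factor(n) == Counter({n: 1})

print(factor(210))
print(factor(987298762329))
print(is_prime(21))
```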

-- convert numbers into the word equivalent
-- (and eventually we would want the inverse too)
-- currently unimplemented!
-- though it would look something like this:
number-to-words |number: 7> => |text: seven>
number-to-words |number: 35> => |text: thirty five>
number-to-words |number: 137> => |text: one hundred and thirty seven>
number-to-words |number: 8,921> => |text: eight thousand, nine hundred and twenty one>
number-to-words |number: 54,329> => |text: fifty four thousand, three hundred and twenty nine>
number-to-words |number: 673,421> => |text: six hundred and seventy three thousand, four hundred and twenty one>
number-to-words |number: 3,896,520> => |text: three million, eight hundred and ninety six thousand, five hundred and twenty>
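
Since this one is unimplemented, here is one possible Python sketch that reproduces the wording of the examples above ("and" after hundreds, commas between groups). It takes a plain integer, so the commas in a label like "8,921" would need stripping first. Cases the examples don't cover, such as 1005, come out as "one thousand, five", which may not be what we want.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = ["", " thousand", " million", " billion"]

def below_thousand(n):
    """Words for 1..999, eg 137 -> 'one hundred and thirty seven'."""
    hundreds, rest = divmod(n, 100)
    parts = []
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        parts.append((TENS[tens] + " " + ONES[ones]).strip())
    elif rest:
        parts.append(ONES[rest])
    return " and ".join(parts)

def number_to_words(n):
    if not 0 <= n < 10**12:
        raise ValueError("this sketch only covers 0 up to (but not including) 10**12")
    if n == 0:
        return "zero"
    groups = []
    scale = 0
    while n:
        n, chunk = divmod(n, 1000)      # peel off three digits at a time
        if chunk:
            groups.append(below_thousand(chunk) + SCALES[scale])
        scale += 1
    return ", ".join(reversed(groups))

print(number_to_words(3896520))
```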

-- convert decimal number to another base:
sa: to-base(|350024>,|2>)
0.000|1> + 0.000|2> + 0.000|4> + |8> + 0.000|16> + 0.000|32> + |64> + 0.000|128> + |256> + |512> + |1024> + 0.000|2048> + |4096> + 0.000|8192> + |16384> + 0.000|32768> + |65536> + 0.000|131072> + |262144>

sa: to-base(|350024>,|8>)
0.000|1> + |8> + 5.000|64> + 3.000|512> + 5.000|4096> + 2.000|32768> + |262144>

sa: to-base(|350024>,|10>)
4.000|1> + 2.000|10> + 0.000|100> + 0.000|1000> + 5.000|10000> + 3.000|100000>
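
The output format above is digit times place value, least significant first. A short Python sketch of that, returning (digit, place value) pairs rather than a superposition (the pair representation is my simplification):

```python
def to_base(n, base):
    """Return (digit, place value) pairs for n in the given base, lowest place first."""
    if n < 0 or base < 2:
        raise ValueError("need n >= 0 and base >= 2")
    pairs = []
    place = 1
    while n:
        n, digit = divmod(n, base)
        pairs.append((digit, place))
        place *= base
    return pairs

print(to_base(350024, 8))
```

Summing digit * place over the pairs recovers the original number, which is a handy sanity check.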

-- now a couple of functions to swap between temperature and distance units
-- proof of concept really, in practice we would want more (for other unit types),
-- and a cleaner way to implement them
-- F operator maps Celsius and Kelvin to Fahrenheit:
sa: F |C: 0>
|F: 32.00>

sa: F |C: 100>
|F: 212.00>

sa: F |K: 0>
|F: -459.67>

-- C maps Fahrenheit and Kelvin to Celsius:
sa: C |K: 0>
|C: -273.15>

sa: C |F: 0>
|C: -17.78>

sa: C |F: 100>
|C: 37.78>

-- K maps Fahrenheit and Celsius to Kelvin:
sa: K |C: 18>
|K: 291.15>

sa: K |C: 0>
|K: 273.15>

sa: K |F: 100>
|K: 310.93>

-- now similar, but for distances:
-- to-km maps meters or miles to km:
sa: to-km |miles: 1>
|km: 1.609>

-- to-meter maps km or miles to meters
sa: to-meter |miles: 7>
|m: 11265.408>

sa: to-meter |km: 5.213>
|m: 5213.000>

-- to-miles maps km or m to miles
sa: to-miles |km: 42>
|miles: 26.098>

sa: to-miles |m: 800>
|miles: 0.497>

-- now a fun one! This should be useful in a bunch of places:
-- the list-to-words function:
list-to-words |x> == |x>
list-to-words (|x> + |y>) == |x and y>
list-to-words (|x> + |y> + |z>) == |x, y and z>
list-to-words (|x> + |y> + |z> + |u> + |v>) == |x, y, z, u and v>
and so on.

-- a practical example:
-- learn Eric's list of friends:
sa: friends |person: Eric> => |person: Fred> + |person: Sam> + |person: Harry> + |person: Mary> + |person: liz>

-- output Eric's list of friends:
sa: list-to-words extract-value friends |person: Eric>
|Fred, Sam, Harry, Mary and liz>
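
In Python the list-to-words rule is close to a one-liner. A sketch, taking a list of label strings in place of a superposition:

```python
def list_to_words(items):
    """['x', 'y', 'z'] -> 'x, y and z'."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]

friends = ["person: Fred", "person: Sam", "person: Harry", "person: Mary", "person: liz"]
# the split(": ")[-1] step plays the role of extract-value
print(list_to_words(label.split(": ")[-1] for label in friends))
```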

-- the "common" function (a type of intersection)
-- (though intersection is currently limited to 2 or 3 parameters, common can handle any number)
common[op] (|x> + |y> + |z>)
-- expands to:
intersection(op|x>, op|y>, op|z>)

-- some common usages:
common[friends] (|Fred> + |Sam>)
common[actors] (|movie-1> + |movie-2>)
-- or indirectly:
|list> => |Fred> + |Sam> + |Charles> + |Liz>
common[friends] "" |list>

-- next, we have an if statement in BKO.
-- this really does require its own post to explain how best to use it. Perhaps later.
-- raw details are just:
if(|x>,|a>,|b>) returns |a> if |x> == |True>, |b> otherwise

-- and its more useful brother (since we try to avoid just living in the {0,1} world):
-- the weighted-if function:
wif(|x>,|a>,|b>)
eg:
wif(0.7|x>,|a>,|b>)
if |x> == |True>, returns 0.7|a> + 0.3|b>
if |x> != |True>, returns 0.3|a> + 0.7|b>
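
Here is how I read those two rules, as a Python sketch. The condition is a (coeff, label) tuple and the result is a list of (coeff, label) pairs. The names bko_if and wif, and the tuple representation, are assumptions for illustration only.

```python
def bko_if(x, a, b):
    """Plain if: a when the condition label is 'True', b otherwise."""
    return a if x == "True" else b

def wif(condition, a, b):
    """Weighted if: split weight p between a and b, swapped when not 'True'."""
    coeff, label = condition
    p = coeff if label == "True" else 1 - coeff
    return [(p, a), (1 - p, b)]

print(wif((0.7, "True"), "a", "b"))
print(wif((0.7, "False"), "a", "b"))
```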

-- next, the map function, again this one is very useful!
-- we need this since we don't have multi-line for loops, so we use this to map operators to a list of kets.
map[op] (|x> + |y> + |z>)
runs:
op |x> => op |_self>
op |y> => op |_self>
op |z> => op |_self>

map[fn,result] (|a> + |b> + |c> + |d>)
runs:
result |a> => fn |_self>
result |b> => fn |_self>
result |c> => fn |_self>
result |d> => fn |_self>

-- most common usage is:
fn |*> #=> ... some details here
map[fn,result] "" |some list>

-- the exp function:
exp[op,n] |x>
maps to:
(1 + op + op^2 + ... + op^n) |x>

-- the exp-max function:
exp-max[op] |x>
maps to:
(1 + op + op^2 + ... + op^n) |x>
for an n such that exp[op,n] |x> == exp[op,n+1] |x>
-- ie, we have found every "child node" of |x>
-- with a warning that we have no idea how big the result is going to be, or how many steps deep.
-- a common usage is to find 6 degrees of separation:
exp[friends,6] |Fred>
exp-max[friends] |Fred>
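
If we ignore coefficients and only track which kets appear, exp and exp-max amount to a breadth-first search. The Python sketch below makes that simplification explicit: the operator is a dict from label to a list of labels, and the result is a set. The real exp keeps coefficients, which this does not.

```python
def exp(op, start, n=None):
    """Labels reachable from start in at most n applications of op.

    With n=None, keep going until nothing new turns up (the exp-max case).
    """
    seen = {start}
    frontier = {start}
    steps = 0
    while frontier and (n is None or steps < n):
        # apply op once to the newest kets, keeping only unseen results
        frontier = {y for x in frontier for y in op.get(x, ())} - seen
        seen |= frontier
        steps += 1
    return seen

friends = {"Fred": ["Sam"], "Sam": ["Fred", "Liz"], "Liz": ["Rob"], "Rob": []}
print(exp(friends, "Fred", 2))
print(exp(friends, "Fred"))
```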

-- the apply() function:
-- again, this is one of those very useful ones!
eg: apply(|op: age> + |op: friends> + |op: father>,|Fred>)
maps to:
age |Fred> + friends |Fred> + father |Fred>

-- a common usage is to define a list of operators separately:
|op list> => |op: mother> + |op: father> + |op: dob> + |op: age>
-- then apply them:
apply("" |op list>,|Fred>)

-- eg, maybe use like this:
|basic info op list> => |op: mother> + |op: father> + |op: height> + |op: age> + |op: eye-colour>
basic-info |*> #=> apply("" |basic info op list>,|_self>)
basic-info |Fred>
basic-info |Sam>

-- here is a toy function, maps dates to day of the week:
sa: day-of-the-week |date: 2015/01/24>
|day: Saturday>

-- here is one that saves typing, the split operator:
sa: split |a b c d e>
|a> + |b> + |c> + |d> + |e>

sa: split |word1 word2 word3 word4>
|word1> + |word2> + |word3> + |word4>
-- currently only splits on space chars, but it may be useful to be able to specify the split char(s).

-- the clone ket function (not yet sure of a use case):
-- clone(|x>,|y>) copies rules from |x> and applies them to |y>
-- hence the name, clone().
-- say we have:
-- age |x> => |27>
-- mother |x> => |Jane>
-- after clone(|x>,|y>) we have:
-- age |y> == |27>
-- mother |y> == |Jane>
-- eg, if |x> and |y> are twin sisters.
--
-- thought of a use case:
-- say we have just learnt "elm" is a type of tree.
-- well, load that up with some default values we know about all trees:
-- (cf. inheriting from a parent class in OO programming)
-- clone(|plant: tree>,|plant: tree: elm>)
-- then fill in more specific data as we learn more.

-- the relevant-kets[op] function
-- returns a list of all the kets in the current context that have "op" defined.
-- relevant-kets[op] is frequently useful for generating lists we can apply the map function to.
-- eg, learn some data:
sa: friends |Fred> => |Sam> + |Liz>
sa: friends |Rob> => |Jack> + |Tom>
sa: age |Fred> => |22>

-- now find which kets support which operators:
sa: relevant-kets[friends] |>
|Fred> + |Rob>

sa: relevant-kets[age] |>
|Fred>

-- there is a variant on this.
-- returns: intersection(relevant-kets[op],SP)
intn-relevant-kets[op] SP

-- eg, we can chain them and find all kets that support both friends, and age:
-- (NB: one has "intn" prefix, and one doesn't!)
intn-relevant-kets[age] relevant-kets[friends] |>

-- the pretty print rules as a matrix function.
-- first define some rules:
sa: op |a> => |a> + 2.000|b> + 3.000|c>
sa: op |b> => 0.500|b> + 9.000|c> + 5.000|e>
sa: op |c> => 7.000|e> + 2.000|b>

-- now take a look:

sa: matrix[op]
[ a ] = [  1.00  0     0     ] [ a ]
[ b ]   [  2.00  0.50  2.00  ] [ b ]
[ c ]   [  3.00  9.00  0     ] [ c ]
[ e ]   [  0     5.00  7.00  ]
|matrix>
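
The matrix is just the rules laid out with input kets as columns and output kets as rows. A Python sketch of building it, assuming the rules are a dict of dicts and that rows are ordered by first appearance (which matches the output above, though I have not checked this against the project's code):

```python
def matrix(rules):
    """Return (row labels, column labels, rows of coefficients)."""
    columns = list(rules)               # kets that op is defined on
    rows = []
    for superposition in rules.values():
        for label in superposition:
            if label not in rows:
                rows.append(label)      # kets that op can produce
    values = [[rules[c].get(r, 0) for c in columns] for r in rows]
    return rows, columns, values

op = {
    "a": {"a": 1, "b": 2, "c": 3},
    "b": {"b": 0.5, "c": 9, "e": 5},
    "c": {"e": 7, "b": 2},
}
rows, columns, values = matrix(op)
for label, row in zip(rows, values):
    print(f"[ {label} ]", row)
```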

-- and we finish with a slightly more interesting function, the train-of-thought function.
-- this code makes heavy use of supported-ops, pick-elt and apply().
-- and will work much better with a big knowledge base, but even a small one gives hints of what a large example will be like.
sa: load early-us-presidents.sw -- load up some knowledge
sa: create inverse -- needed, else we run into dead ends.
sa: train-of-thought[13] |Madison> -- take 13 steps, starting with |Madison>

context: sw console
one: |Madison>
n: 13
|X>: |Madison>

|early US Presidents: _list>
|Adams>
|year: 1797>
|Washington>
|early US Presidents: _list>
|Adams>
|number: 2>
|Adams>
|year: 1801>
|Jefferson>
|early US Presidents: _list>
|Adams>
|year: 1799>

Anyway, I guess the summary of this post is that we have some proof of concept functions trying to map our BKO scheme towards a more general purpose knowledge engine. Don't take the above functions as finished, take them as hints on where we could take this project.

Saturday, 24 January 2015