Wednesday, 25 March 2015

new function: find-unique[op]

The idea is simple enough: given a collection of superpositions, find the kets that are unique to each one, i.e. the kets that appear in that superposition and in none of the others. This was motivated by the thoughts in my last post.
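To make the idea concrete, here is a minimal Python sketch of the same operation. It is not the actual implementation behind find-unique[op]. It assumes each superposition can be treated as a plain list of ket labels (coefficients are ignored), and the names find_unique, friends and unique_friends are made up for illustration.

```python
from collections import Counter

def find_unique(rules):
    """Given {label: [kets]}, return {label: [kets found in no other list]}.

    Assumption: superpositions are modelled as plain lists of strings,
    so any coefficients on the kets are ignored.
    """
    # Count how many superpositions each ket appears in.
    # set() ensures a ket repeated inside one list is only counted once.
    counts = Counter()
    for kets in rules.values():
        counts.update(set(kets))
    # Keep only the kets that appear in exactly one superposition,
    # preserving their original order.
    return {
        label: [ket for ket in kets if counts[ket] == 1]
        for label, kets in rules.items()
    }

# The same data as the fred-sam-friends example in this post.
friends = {
    "Fred": ["Jack", "Harry", "Ed", "Mary", "Rob", "Patrick", "Emma", "Charlie"],
    "Sam": ["Charlie", "George", "Emma", "Jack", "Rober", "Frank", "Julie"],
}

unique_friends = find_unique(friends)
for person, kets in unique_friends.items():
    print(person, "=>", " + ".join("|" + ket + ">" for ket in kets))
```

Counting first and filtering second means each list is only walked twice, however many superpositions there are, rather than comparing every pair of lists.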

Here is a short example in the console:
-- load some data:
sa: load fred-sam-friends.sw

-- see what we have:
sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
----------------------------------------

-- find the unique kets:
sa: find-unique[friends]

-- see what we now have:
sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>
unique-friends |Fred> => |Harry> + |Ed> + |Mary> + |Rob> + |Patrick>

friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
unique-friends |Sam> => |George> + |Rober> + |Frank> + |Julie>
----------------------------------------
Here is a bigger example:
sa: load names.sw
sa: find-unique[names]
sa: save full-unique-names.sw

-- let's see how big our data set is now:
sa: how-many names |female name>
|number: 4275>
sa: how-many unique-names |female name>
|number: 2914>

sa: how-many names |male name>
|number: 1219>
sa: how-many unique-names |male name>
|number: 161>

sa: how-many names |last name>
|number: 88799>
sa: how-many unique-names |last name>
|number: 86747>
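So, for example, only 161 of the 1219 male names appear in neither the female-name nor the last-name list. Hopefully that is all clear enough. I will put it to interesting use in the next post.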

Also, BTW, this code seems to be really quite fast!

Update: I should also mention that find-unique[op] belongs to the category of features that learn new knowledge from our existing knowledge with minimal work from a human. This is always a good thing! A couple of other examples are "create inverse" and "map".

Update: a couple of other uses for find-unique[op].

1) say we have data on cultural factors:
cultural-factors |Australia> => ...
cultural-factors |UK> => ...
cultural-factors |US> => ...
then:
find-unique[cultural-factors]

2) say we have dictionaries for different English dialects:
words |Australian English> => ...
words |UK English> => ...
words |US English> => ...
then:
find-unique[words]

Anyway, simple enough, but still useful!
