Thursday 25 June 2015

introducing function matrices

Just some simple notation that merges the idea of functions and matrices into the same structure.

Let's start with simplest possible example:
[ d ]   [ fn1[x1] ] [ a ]
[ e ] = [ fn2[x2] ] [ b ]
[ f ]   [ fn3[x3] ] [ c ]
which expands to:
d = fn1[a]
e = fn2[b]
f = fn3[c]
Now, a slightly more interesting example:
[ d ]   [ bah1[x3]       ] [ a ]
[ e ] = [ bah2[x2,x1]    ] [ b ]
[ f ]   [ bah3[x1,x2,x3] ] [ c ]
which expands to:
d = bah1[c]
e = bah2[b,a]
f = bah3[a,b,c]
And we note a couple of things:
1) x1 corresponds to a, x2 corresponds to b, x3 corresponds to c. And we can use them in any row of the function section of the function-matrix. eg, x1 is used in row 2 and 3. x3 is used in row1 and row 3. The point is, we want to name the elements in the applied vector, so we chose x_i. And we use x as the name for the entire vector.
2) the functions in the function-matrix can have more than one parameter, eg, bah2[] and bah3[].

Now, another property is that function-matrices can have stored data. Yeah, they have their own memory I suppose you could say. Here is a simple example:
[ d ]   [ foo[L1,x] ] [ a ]
[ e ] = [ foo[L2,x] ] [ b ]
[ f ]   [ foo[L3,x] ] [ c ]
which expands to:
d = foo[L1,(a,b,c)]
e = foo[L2,(a,b,c)]
f = foo[L3,(a,b,c)]
And noting that x (without subscripts) corresponds to the name of the entire incoming vector.

Now, for example, define foo[] as:
foo[u,v] = dot-product[u,v]
And for example, define L_i as:
L1 = (m1,m2,m3)
L2 = (m4,m5,m6)
L3 = (m7,m8,m9)
Then:
[ d ]   [ foo[L1,x] ] [ a ]
[ e ] = [ foo[L2,x] ] [ b ]
[ f ]   [ foo[L3,x] ] [ c ]
expands to the standard matrix:
[ d ]   [ m1 m2 m3 ] [ a ]
[ e ] = [ m4 m5 m6 ] [ b ]
[ f ]   [ m7 m8 m9 ] [ c ]
ie:
d = m1*a + m2*b + m3*c
e = m4*a + m5*b + m6*c
f = m7*a + m8*b + m9*c
The next thing to note is we can have many layers of these things. eg:
[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[x1]  ] [ m1 m2 m3 ] [ a ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[x3]  ] [ m4 m5 m6 ] [ b ]
[ f ]   [ foo2[x1] ]               [ fn3[x2]  ] [ m7 m8 m9 ] [ c ]
[ g ]   [ foo2[x2] ]               [ bah1[x1] ]
                                   [ bah2[x3] ]
It is ugly, but we can expand this beast out:
[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[x1]  ] [ m1*a + m2*b + m3*c ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[x3]  ] [ m4*a + m5*b + m6*c ] 
[ f ]   [ foo2[x1] ]               [ fn3[x2]  ] [ m7*a + m8*b + m9*c ]
[ g ]   [ foo2[x2] ]               [ bah1[x1] ]
                                   [ bah2[x3] ]

[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[m1*a + m2*b + m3*c]  ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[m7*a + m8*b + m9*c]  ] 
[ f ]   [ foo2[x1] ]               [ fn3[m4*a + m5*b + m6*c]  ]
[ g ]   [ foo2[x2] ]               [ bah1[m1*a + m2*b + m3*c] ]
                                   [ bah2[m7*a + m8*b + m9*c] ]

[ d ]   [ foo1[x1] ] [ 5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c] ]
[ e ] = [ foo1[x2] ] [ 8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c] ] 
[ f ]   [ foo2[x1] ]               
[ g ]   [ foo2[x2] ]               
                                   
d = foo1[5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c]]
e = foo1[8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c]]
f = foo2[5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c]]
g = foo2[8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c]]
Now, we need to note that in general these things are not invertible (unlike standard matrices). A simple proof is consider the function layer to be a secure hash. Then:
[ y1 ] = [ secure-hash[x1] ] [ m1 m2 ] [ a ]
[ y2 ]   [ secure-hash[x2] ] [ m3 m4 ] [ b ]

y1 = secure-hash[m1*a + m2*b]
y2 = secure-hash[m3*a + m4*b]
ie, given y1 and y2 it is impossible, except by brute force, to find the input vector (a,b).

I guess that is about it. Now to put them to use in the next few posts.

Update: recall above I gave this function matrix:
[ d ]   [ foo[L1,x] ] [ a ]
[ e ] = [ foo[L2,x] ] [ b ]
[ f ]   [ foo[L3,x] ] [ c ]
And if the Li's are vectors of the same length as x (which in this case is just [a,b,c]), and foo[u,v] := dot-product[u,v], then this function matrix is identical to a standard matrix. OK. That is simple enough. But the comment I want to make is, if foo[u,v] := simm[u,v], then this function matrix is essentially a simple if-then machine. The question then becomes, what properties do multiple layers of these guys have? And what is the best approach for finding all the Li's, if we use this system as a replacement for standard ANN's?

No comments:

Post a Comment