First, note that in my travels I have seen this given as a similarity metric:

similarity-metric(f,g,s) = exp(-||f - g||^2/2s)So, if f == g, it returns 1, if ||f - g||^2 is big, it approaches 0.

Now, for the derivation of my metric.

First, define a multiplication operator that will simplify the equations a little:

a*b = \Sum_k abs(a[k] . b[k]) -- discrete version a*b = \Int dt abs(a(t) . b(t)) -- continuous version where . is the standard multiplication operator.Both of these have the property:

0 <= w*[f - g] <= w*f + w*g where 0 if f == g, and w*f + w*g if f and g are perfectly disjoint.which is really just another way of stating a couple of standard properties of metrics:

2. d(x,y) = 0 if and only if x = y 4. d(x,z) <= d(x,y) + d(y,z)And it is illustrative to note that:

w*[f - g] = \Sum_k |w[k].f[k] - w[k].g[k]| -- discrete version w*[f - g] = \Int dt |w(t).f(t) - w(t).g(t)| -- continuous versionNow, just divide through by the right hand side, casting it into the range [0,1]

0 <= w*[f - g]/(w*f + w*g) <= 1Now, this is a good start, but for some examples it doesn't work as well as expected (maybe one day I will dig up an example to show my point!).

But we can fix this by adding a term: R.abs(w*f - w*g), where R >= 0, and works best with R = 1

And hence simm:

0 <= (w*[f - g] + R.abs(w*f - w*g))/(w*f + w*g + R.abs(w*f - w*g)) <= 1So, OK. Let's set R = 1, and note that a + b + abs(a - b) = 2.max(a,b), we can now compact it down to:

0 <= (w*[f - g] + abs(w*f - w*g))/2.max(w*f,w*g) <= 1Cool. And now observe, as in the physics tradition, this equation has some symmetries:

given the scalar k (not equal to 0), we have: 1) symmetry under w => k.w 2) symmetry under f => k.f and g => k.g 3) symmetry under w => w.exp(i*t)/R(t), f => R(t).exp(-i*t).f, g => R(t).exp(-i*t).gNow, I can't think of a use case for the third symmetry there, but I put it in because it reminds me of gauge transformations in physics, and is a good decider on what terms are valid to use in our metric. eg: w*f, w*g, w*[f - g], abs(w*f - w*g).

We are about done. Final thing to note is the above metric returns 0 for exact match, and 1 for perfectly disjoint. If we want to swap this just map the metric m to 1 - m (presuming m "attains" its upper and lower bounds)

ie:

(w*[f - g] + R.abs(w*f - w*g))/(w*f + w*g + R.abs(w*f - w*g))becomes:

1 - (w*[f - g] + R.abs(w*f - w*g))/(w*f + w*g + R.abs(w*f - w*g)) = (w*f + w*g + R.abs(w*f - w*g) - w*[f - g] - R.abs(w*f - w*g))/(w*f + w*g + R.abs(w*f - w*g)) = (w*f + w*g - w*[f - g])/(w*f + w*g + R.abs(w*f - w*g)) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g) -- assuming R = 1So there we have it:

simm(w,f,g) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g) NB: if w is not given, ie we are using simm(f,g), then assume w[k] = 1 for all k, or w(t) = 1 for all t.Now a couple of observations:

1) the above derivation of my simm was based on the Manhattan metric. A similar derivation could be done using the standard Euclidean metric. Or presumably any metric. Though I think Manhattan is better than Euclidean because it is in some sense linear, and doesn't require any sqrt()s.

2) the fact that the denominator collapses down only when R = 1 is a hint that that is the right choice!

3) from experience playing with this, simm(w,f,g) usually, though not always, gives best results when f and g are rescaled so that w*f == w*g (excluding the case where they are length 1 vectors, in which case you should not rescale!)

4) if f[k] and g[k] are >= 0 for all terms, then the equation can be compressed using a + b - abs(a - b) = 2.min(a,b), and becomes:

simm(w,f,g) = \Sum_k w[k] min(f[k],g[k]) / max(w*f,w*g)

And that's it! An implementation in python, and some examples coming up next.

Update: A quick expansion on observation (1). So, using the Manhattan metric as the base, we have:

w*[f - g] = \Sum_k |w[k].f[k] - w[k].g[k]|

If we used the Eucldiean metric instead, we would have:

w*[f - g] = sqrt(\Sum_k |w[k].f[k] - w[k].g[k]|^2)

and hence my comment "in some sense linear".

And an expansion on observation (4):

Consider just the numerator:

w*f + w*g - w*[f - g]

Now, expand that out using our definition of a*b:

w*f + w*g - w*[f - g] = \Sum_k |w[k].f[k]| + \Sum_k |w[k].g[k]| - \Sum_k |w[k].f[k] - w[k].g[k]| = \Sum_k |w[k].f[k]| + |w[k].g[k]| - |w[k].f[k] - w[k].g[k]| = \Sum_k |w[k]| (|f[k]| + |g[k]| - |f[k] - g[k]|)Now, make use of "f[k] and g[k] are >= 0 for all terms":

= \Sum_k |w[k]| (f[k] + g[k] - |f[k] - g[k]|)Now, make use of a + b - abs(a - b) = 2.min(a,b):

= \Sum_k |w[k]| 2.min(f[k],g[k])And hence (assuming w[k] >= 0 for all k):

simm(w,f,g) = \Sum_k w[k] min(f[k],g[k]) / max(w*f,w*g)as promised! And we take this pretty compression as a strong hint that Manhattan was a better choice than Euclidean! BTW, also note the factor of 2 from in front of min, and max cancelled out.

Update: OK. I don't know a use case for these, but we can generalize to more than pairs {f,g}.

simm(w,f1,f2) = \Sum_k w[k] min(f1[k],f2[k])/max(w*f1,w*f2) simm(w,f1,f2,f3) = \Sum_k w[k] min(f1[k],f2[k],f3[k])/max(w*f1,w*f2,w*f3) simm(w,f1,f2,f3,f4) = \Sum_k w[k] min(f1[k],f2[k],f3[k],f4[k])/max(w*f1,w*f2,w*f3,w*f4) simm(w,f1,f2,f3,f4,f5) = \Sum_k w[k] min(f1[k],f2[k],f3[k],f4[k],f5[k])/max(w*f1,w*f2,w*f3,w*f4,w*f5)and so on.

## No comments:

## Post a Comment