Thursday, 12 March 2015

fixed supervised learning of iris classes

Last time I gave an example of supervised learning of iris classes. Funnily enough, my code was buggy, and yet it still gave good results! This post gives the results with the fixed code.

The problem was my definition of the superpositions. I used:
r = ket("sepal-length: " +  sepal_len) + ket("sepal-width: " + sepal_width) + ket("petal-length: " + petal_len) + ket("petal-width: " + petal_width)
This means the similarity metric has no access to the float values: each value is baked into the ket label, so two kets are either identical or they are not. I'm surprised it worked at all! This is the fix:
r = ket("sepal-length",sepal_len) + ket("sepal-width",sepal_width) + ket("petal-length",petal_len) + ket("petal-width",petal_width)
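To make the difference concrete, here is a minimal sketch in plain Python. It does not use the project code: superpositions are modelled as {label: coefficient} dicts, and simm is an assumed similarity of the form "sum of the per-label minimums over the larger total weight", so the real similar[...] operator may differ in detail. The function names and the two measurement rows are illustrative only.

```python
def simm(f, g):
    """Assumed similarity of two superpositions stored as {label: coeff} dicts."""
    labels = set(f) | set(g)
    overlap = sum(min(f.get(k, 0.0), g.get(k, 0.0)) for k in labels)
    norm = max(sum(f.values()), sum(g.values()))
    return overlap / norm if norm > 0 else 0.0


def buggy_pattern(sepal_len, sepal_width, petal_len, petal_width):
    # The value is part of the label and every coefficient is 1,
    # so only exact ties between two flowers can ever match.
    return {
        f"sepal-length: {sepal_len}": 1.0,
        f"sepal-width: {sepal_width}": 1.0,
        f"petal-length: {petal_len}": 1.0,
        f"petal-width: {petal_width}": 1.0,
    }


def fixed_pattern(sepal_len, sepal_width, petal_len, petal_width):
    # The label names the feature and the coefficient carries the value,
    # so nearby values give a high but not perfect similarity.
    return {
        "sepal-length": sepal_len,
        "sepal-width": sepal_width,
        "petal-length": petal_len,
        "petal-width": petal_width,
    }


# Two illustrative rows that happen to share both petal measurements.
a = (5.1, 3.5, 1.4, 0.2)
b = (4.9, 3.0, 1.4, 0.2)

buggy_score = simm(buggy_pattern(*a), buggy_pattern(*b))  # only the two exact ties count
fixed_score = simm(fixed_pattern(*a), fixed_pattern(*b))  # every feature contributes
print(buggy_score, fixed_score)
```

In this sketch the buggy version only scores above zero because two of the four values are exact ties, which is probably why the original code still gave usable results.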
And now, let's rerun the data:
sa: load improved-iris-pattern-recognition.sw
sa: h2 |*> #=> coeff-sort M similar[input-pattern,pattern] |_self>
sa: discrimination2 |*> #=> round[3] push-float discrim h2 |_self>
sa: table[input,h2,discrimination2] split |node-41 node-42 node-43 node-44 node-45 node-46 node-47 node-48 node-49 node-50>
+---------+----------------------------------------------------------------+-----------------+
| input   | h2                                                             | discrimination2 |
+---------+----------------------------------------------------------------+-----------------+
| node-41 | 38.87 Iris-setosa, 30.68 Iris-versicolor, 28.65 Iris-virginica | 8.198           |
| node-42 | 37.31 Iris-setosa, 31.96 Iris-versicolor, 29.93 Iris-virginica | 5.349           |
| node-43 | 38.95 Iris-setosa, 30.93 Iris-versicolor, 28.90 Iris-virginica | 8.021           |
| node-44 | 38.20 Iris-setosa, 32.56 Iris-versicolor, 30.54 Iris-virginica | 5.632           |
| node-45 | 38.12 Iris-setosa, 32.55 Iris-versicolor, 30.53 Iris-virginica | 5.563           |
| node-46 | 38.77 Iris-setosa, 31.50 Iris-versicolor, 29.47 Iris-virginica | 7.271           |
| node-47 | 38.81 Iris-setosa, 31.07 Iris-versicolor, 29.04 Iris-virginica | 7.745           |
| node-48 | 39.09 Iris-setosa, 31.15 Iris-versicolor, 29.12 Iris-virginica | 7.941           |
| node-49 | 39.07 Iris-setosa, 30.69 Iris-versicolor, 28.67 Iris-virginica | 8.375           |
| node-50 | 39.07 Iris-setosa, 30.80 Iris-versicolor, 28.78 Iris-virginica | 8.264           |
+---------+----------------------------------------------------------------+-----------------+

sa: table[input,h2,discrimination2] split |node-91 node-92 node-93 node-94 node-95 node-96 node-97 node-98 node-99 node-100>
+----------+----------------------------------------------------------------+-----------------+
| input    | h2                                                             | discrimination2 |
+----------+----------------------------------------------------------------+-----------------+
| node-91  | 38.71 Iris-versicolor, 38.51 Iris-virginica, 30.31 Iris-setosa | 0.199           |
| node-92  | 39.00 Iris-versicolor, 38.19 Iris-virginica, 30.76 Iris-setosa | 0.813           |
| node-93  | 39.07 Iris-versicolor, 37.60 Iris-virginica, 31.36 Iris-setosa | 1.468           |
| node-94  | 38.91 Iris-versicolor, 37.14 Iris-virginica, 31.83 Iris-setosa | 1.769           |
| node-95  | 39.03 Iris-versicolor, 38.24 Iris-virginica, 30.72 Iris-setosa | 0.789           |
| node-96  | 38.80 Iris-versicolor, 37.62 Iris-virginica, 31.34 Iris-setosa | 1.181           |
| node-97  | 38.96 Iris-versicolor, 37.90 Iris-virginica, 31.06 Iris-setosa | 1.064           |
| node-98  | 39.07 Iris-versicolor, 37.55 Iris-virginica, 31.42 Iris-setosa | 1.525           |
| node-99  | 38.14 Iris-versicolor, 36.32 Iris-virginica, 32.64 Iris-setosa | 1.821           |
| node-100 | 39.04 Iris-versicolor, 37.84 Iris-virginica, 31.12 Iris-setosa | 1.194           |
+----------+----------------------------------------------------------------+-----------------+

sa: table[input,h2,discrimination2] split |node-141 node-142 node-143 node-144 node-145 node-146 node-147 node-148 node-149 node-150>
+----------+----------------------------------------------------------------+-----------------+
| input    | h2                                                             | discrimination2 |
+----------+----------------------------------------------------------------+-----------------+
| node-141 | 38.82 Iris-virginica, 37.60 Iris-versicolor, 28.68 Iris-setosa | 1.217           |
| node-142 | 38.47 Iris-virginica, 38.17 Iris-versicolor, 29.65 Iris-setosa | 0.299           |
| node-143 | 39.02 Iris-virginica, 37.54 Iris-versicolor, 28.59 Iris-setosa | 1.478           |
| node-144 | 38.97 Iris-virginica, 37.57 Iris-versicolor, 28.64 Iris-setosa | 1.402           |
| node-145 | 38.62 Iris-virginica, 37.50 Iris-versicolor, 28.64 Iris-setosa | 1.12            |
| node-146 | 38.68 Iris-virginica, 38.00 Iris-versicolor, 29.22 Iris-setosa | 0.679           |
| node-147 | 38.85 Iris-virginica, 37.95 Iris-versicolor, 29.08 Iris-setosa | 0.905           |
| node-148 | 38.97 Iris-virginica, 38.25 Iris-versicolor, 29.41 Iris-setosa | 0.727           |
| node-149 | 38.30 Iris-virginica, 37.50 Iris-versicolor, 28.86 Iris-setosa | 0.797           |
| node-150 | 38.89 Iris-virginica, 37.99 Iris-versicolor, 29.19 Iris-setosa | 0.9             |
+----------+----------------------------------------------------------------+-----------------+
And there we have it! 100% success rate on the 30 nodes shown above. I would like the discrimination to be higher, but otherwise the results are good. I guess a bigger example is in my near future. The good news is we can largely reuse this code for the next example.
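For reference, the discrimination column appears to be the gap between the best and second-best class scores. A minimal sketch in plain Python, assuming that reading of discrim (it is inferred from the tables, not taken from the project code):

```python
def discrimination(scores):
    """Gap between the largest and second-largest score in a {label: score} dict.

    Returning 0.0 when there are fewer than two scores is an arbitrary
    choice made for this sketch.
    """
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        return 0.0
    return ranked[0] - ranked[1]


# Rounded h2 values copied from the node-41 and node-91 rows above.
node_41 = {"Iris-setosa": 38.87, "Iris-versicolor": 30.68, "Iris-virginica": 28.65}
node_91 = {"Iris-versicolor": 38.71, "Iris-virginica": 38.51, "Iris-setosa": 30.31}

gap_41 = discrimination(node_41)  # wide gap: setosa is easy to separate
gap_91 = discrimination(node_91)  # narrow gap: versicolor and virginica are close
print(round(gap_41, 3), round(gap_91, 3))
```

The small differences from the table (8.198 and 0.199) are presumably because the table was computed from unrounded coefficients.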

Also, we may have luck using an h with a drop-below in it, e.g.:
h |*> #=> coeff-sort M drop-below[t] similar[input-pattern,pattern] |_self>
The question is how to find the best t. I guess discrimination is part of that answer. There are also other alternatives, such as using sigmoids. For now I don't super care.
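As a rough illustration of what the threshold does, here is a sketch in plain Python. It assumes drop-below[t] discards every ket whose coefficient is less than t and that coeff-sort orders kets by descending coefficient. The similarity values and the node-to-class comments are made up for the example and are not taken from the iris run.

```python
def drop_below(sp, t):
    """Keep only the entries of a {label: coeff} dict with coeff >= t."""
    return {label: c for label, c in sp.items() if c >= t}


def coeff_sort(sp):
    """Return the entries ordered by descending coefficient."""
    return dict(sorted(sp.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical similarity of one input pattern to four stored patterns.
sims = {"node-7": 0.97, "node-88": 0.91, "node-12": 0.95, "node-130": 0.88}

# Scan a few thresholds to see which stored patterns survive each one.
for t in (0.85, 0.90, 0.96):
    kept = coeff_sort(drop_below(sims, t))
    print(t, list(kept))
```

A higher t leaves fewer, closer matches to vote on the class, which should widen the discrimination, but if t is too high then nothing survives at all.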
