Here is the BKO:

    -- define the list of average websites:
    |ave list> => |average abc> + |average adelaidenow> + |average slashdot> + |average smh> + |average wikipedia> + |average youtube>

    -- we want average hash to be distinct from the other hashes:
    |null> => map[hash-4B,average-hash-4B] "" |ave list>

    -- now, let's see how well these patterns recognize the pages we left out of our average:
    result |abc 11> => 100 similar[hash-4B,average-hash-4B] |abc 11>
    result |adelaidenow 11> => 100 similar[hash-4B,average-hash-4B] |adelaidenow 11>
    result |slashdot 11> => 100 similar[hash-4B,average-hash-4B] |slashdot 11>
    result |smh 11> => 100 similar[hash-4B,average-hash-4B] |smh 11>
    result |wikipedia 11> => 100 similar[hash-4B,average-hash-4B] |wikipedia 11>
    result |youtube 11> => 100 similar[hash-4B,average-hash-4B] |youtube 11>

    -- tidy results:
    tidy-result |abc 11> => drop-below[40] result |_self>
    tidy-result |adelaidenow 11> => drop-below[40] result |_self>
    tidy-result |slashdot 11> => drop-below[40] result |_self>
    tidy-result |smh 11> => drop-below[40] result |_self>
    tidy-result |wikipedia 11> => drop-below[40] result |_self>
    tidy-result |youtube 11> => drop-below[40] result |_self>

And now, drum-roll, the results!

    sa: load improved-fragment-webpages.sw
    sa: load create-average-website-fragments.sw
    sa: load create-website-pattern-recognition-matrix.sw

    sa: matrix[result]
    [ average abc         ] = [  91.70  28.73  25.76  37.77  29.45  24.33  ] [ abc 11         ]
    [ average adelaidenow ]   [  28.77  78.11  26.71  29.85  25.25  28.18  ] [ adelaidenow 11 ]
    [ average slashdot    ]   [  25.76  26.88  79.05  28.27  26.86  23.20  ] [ slashdot 11    ]
    [ average smh         ]   [  37.80  29.75  28.16  85.55  32.06  24.95  ] [ smh 11         ]
    [ average wikipedia   ]   [  29.71  25.25  26.91  31.86  85.19  22.09  ] [ wikipedia 11   ]
    [ average youtube     ]   [  24.32  28.18  23.47  24.92  21.94  82.12  ] [ youtube 11     ]

    sa: matrix[tidy-result]
    [ average abc         ] = [  91.70  0      0      0      0      0      ] [ abc 11         ]
    [ average adelaidenow ]   [  0      78.11  0      0      0      0      ] [ adelaidenow 11 ]
    [ average slashdot    ]   [  0      0      79.05  0      0      0      ] [ slashdot 11    ]
    [ average smh         ]   [  0      0      0      85.55  0      0      ] [ smh 11         ]
    [ average wikipedia   ]   [  0      0      0      0      85.19  0      ] [ wikipedia 11   ]
    [ average youtube     ]   [  0      0      0      0      0      82.12  ] [ youtube 11     ]

Finally, let's look at the discrimination, i.e. the difference between the highest matching result and the second highest:

    sa: discrimination |*> #=> discrim result |_self>
    sa: table[page,discrimination] rel-kets[result] |>
    +----------------+----------------+
    | page           | discrimination |
    +----------------+----------------+
    | abc 11         | 53.90          |
    | adelaidenow 11 | 48.36          |
    | slashdot 11    | 50.89          |
    | smh 11         | 47.78          |
    | wikipedia 11   | 53.14          |
    | youtube 11     | 53.94          |
    +----------------+----------------+

There we have it. Discrimination on the order of 50%! That is good.
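For anyone who wants to check the numbers outside the console, here is a rough Python sketch of what `drop-below[40]` and `discrim` appear to be doing, using the scores copied from the `matrix[result]` output above. The helper names (`column`, `drop_below`, `discrim`) are mine and purely illustrative, not the project's code, and I am assuming `drop-below[40]` keeps values greater than or equal to 40. Each page's scores are read down a column of the matrix, one entry per average pattern.

```python
# Similarity scores (in percent) copied from the matrix[result] output.
# Rows are the average-site patterns, columns are the held-out "11" pages.
PATTERNS = ["average abc", "average adelaidenow", "average slashdot",
            "average smh", "average wikipedia", "average youtube"]
PAGES = ["abc 11", "adelaidenow 11", "slashdot 11",
         "smh 11", "wikipedia 11", "youtube 11"]
RESULT = [
    [91.70, 28.73, 25.76, 37.77, 29.45, 24.33],
    [28.77, 78.11, 26.71, 29.85, 25.25, 28.18],
    [25.76, 26.88, 79.05, 28.27, 26.86, 23.20],
    [37.80, 29.75, 28.16, 85.55, 32.06, 24.95],
    [29.71, 25.25, 26.91, 31.86, 85.19, 22.09],
    [24.32, 28.18, 23.47, 24.92, 21.94, 82.12],
]


def column(page):
    """Return {pattern: score} for one page, i.e. one column of RESULT."""
    j = PAGES.index(page)
    return {PATTERNS[i]: RESULT[i][j] for i in range(len(PATTERNS))}


def drop_below(scores, threshold):
    """Keep only entries scoring at least `threshold` (assumed semantics)."""
    return {name: value for name, value in scores.items() if value >= threshold}


def discrim(scores):
    """Difference between the highest and second-highest score."""
    ranked = sorted(scores.values(), reverse=True)
    if not ranked:
        return 0.0
    if len(ranked) == 1:
        return ranked[0]
    return ranked[0] - ranked[1]


for page in PAGES:
    scores = column(page)
    print(f"{page:<15} {discrim(scores):6.2f}  {drop_below(scores, 40)}")
```

Because the inputs here are already rounded to two decimal places, the wikipedia value comes out as 53.13 rather than the 53.14 in the table; the console presumably works from the unrounded scores.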

Heaps more to come!
