Monday, 23 November 2015

revisiting the letter rambler

After changing the back-end meaning of add_learn, from something that was really append-learn to a literal add_learn (I hope I didn't break anything in the process!), I decided to redo the letter rambler example. Now our ngrams have frequency information too. This has a big effect on the letter rambler, as I will shortly show.
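
To make the difference concrete, here is a minimal Python sketch of the two learning behaviours. It is not the actual back-end code: the function names, the dict-of-dicts representation, and the 3-character context / 2-character continuation split are all assumptions, inferred from the two rule lines shown in this post.

```python
def append_learn(rules, context, label):
    """Assumed old behaviour: record each label once, coefficient stays at 1."""
    superposition = rules.setdefault(context, {})
    superposition.setdefault(label, 1.0)


def add_learn(rules, context, label):
    """Assumed new behaviour: add 1 to the label's coefficient on every sighting."""
    superposition = rules.setdefault(context, {})
    superposition[label] = superposition.get(label, 0.0) + 1.0


def learn_letter_pairs(text, learn):
    """Map every 3-character context to the 2 characters that follow it."""
    rules = {}
    for i in range(len(text) - 4):
        context = text[i:i + 3]
        next_two = text[i + 3:i + 5]
        learn(rules, context, next_two)
    return rules


# Hypothetical toy corpus, just to show the coefficients diverge.
toy_text = "abcdeabcde"
print(learn_letter_pairs(toy_text, append_learn)["abc"])
print(learn_letter_pairs(toy_text, add_learn)["abc"])
```

With append-style learning, a continuation seen twice still has coefficient 1. With add-style learning, its coefficient is the number of times it was seen.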

First, a comparison of the resulting sw files:
append-learn version
add-learn (frequency) version
Note that both were generated by the same code.

A couple of lines to visually show the difference (note the coefficients in the second case):
next-2-letters | by> => |  > + | t> + | s> + | a> + | h> + | w> + | n> + | d> + | o> + | q> + | M> + |. > + | e> + | m> + | p> + | b> + | r> + | y> + | H> + | C> + | i> + | c> + |?"> + | f> + | S> + | l> + | F> + | A> + |, > + | u> + | k> + | U>
next-2-letters | by> => |  > + 138.0| t> + 19.0| s> + 42.0| a> + 25.0| h> + 12.0| w> + 7.0| n> + 4.0| d> + 8.0| o> + 2.0| q> + 9.0| M> + |. > + 6.0| e> + 16.0| m> + 7.0| p> + 3.0| b> + 4.0| r> + 6.0| y> + | H> + | C> + 3.0| i> + 7.0| c> + |?"> + 2.0| f> + | S> + | l> + | F> + 3.0| A> + |, > + | u> + | k> + | U>
And now the new letter rambler:
sa: load ngram-letter-pairs--sherlock-holmes--add-learn.sw
sa: letter-ramble |*> #=> merge-labels(|_self> + weighted-pick-elt next-2-letters extract-3-tail-chars |_self>)
sa: letter-ramble^200 |The>
|They wered. Now did not acces of a humable, as shall beeched to belong meet as en bars in part the seriend's ask Mr. As when you. I could no had steps bell, and some forty, Mrs. Readings as so weathe that his is long colour minute mone out this, of the streat axe-Consibly after, whom I claim above use of my my doctorter inquiries. For and pisoda angry, as fulling heave dog-wheer ther obviouse when, >
And we can compare with the old version that had no frequency information by dropping back from weighted-pick-elt to just pick-elt (weighted-pick-elt takes the coefficients into account, while pick-elt does not):
sa: old-letter-ramble |*> #=> merge-labels(|_self> + pick-elt next-2-letters extract-3-tail-chars |_self>)
sa: old-letter-ramble^200 |The>
|The ont by, dea wed upset book, anxious him. Totts vanies tangibly ignotum tea cartyrdom Hosmoppine-margins," reporary, curson. Young criminas, apables Mr. Just you:  "Holbornine Aventrally, wore Sand. Royal brary. Warshy mergesty. John Ha! The dug an End on samps onlike a wound--1000 pound ord Suburly or 'G' wish us. Yes?" a "P," Holmask.  "Star' had. I owe, Winchisels whipping-schan legs. Having, I>
So the result is similar, just a bit less English-like.
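
For readers who want to see the two pick operators side by side, here is a rough Python sketch of the idea. It is not the project's implementation: the function names mirror the operators above, but the dict representation, the `rng` parameter, and stopping when a context has no rule are my assumptions.

```python
import random


def pick_elt(superposition, rng=random):
    """Choose a label uniformly, ignoring the coefficients."""
    return rng.choice(list(superposition))


def weighted_pick_elt(superposition, rng=random):
    """Choose a label with probability proportional to its coefficient."""
    labels = list(superposition)
    weights = [superposition[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def letter_ramble(rules, seed, steps, pick, rng=random):
    """Repeatedly append a continuation chosen for the last 3 characters."""
    text = seed
    for _ in range(steps):
        options = rules.get(text[-3:])
        if not options:
            # Assumption: stop when the current context was never learned.
            break
        text += pick(options, rng)
    return text


# Hypothetical rule table; the real one is loaded from the sw file.
toy_rules = {"The": {"y ": 9.0, "re": 1.0}}
print(letter_ramble(toy_rules, "The", 5, weighted_pick_elt))
```

With the frequency-aware pick, common continuations dominate the output. With the uniform pick, a continuation seen once is as likely as one seen 138 times, which is why the old ramble drifts further from English.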

That's all I wanted to show for this post.
