Let's pick "Looking forward to" as my seed string.
Now, in the console:
sa: load ngram-pairs--webboard.sw
sa: ramble |*> #=> merge-labels(|_self> + | > + pick-elt next-2 extract-3-tail |_self>)

-- apply it once:
sa: ramble |Looking forward to>
|Looking forward to when the>

-- apply it twice:
sa: ramble^2 |Looking forward to>
|Looking forward to the "Geometric Visions" cover>

-- apply it 10 times:
sa: ramble^10 |Looking forward to>
|Looking forward to you posting hot licks on YouTube. I may not agree with Because I'm not some idiot who thinks the only>

-- apply it 50 times:
sa: ramble^50 |Looking forward to>
|Looking forward to Joe Biden going as nasty as everyone says it is" moments. Of course, I do stuff like that cannot get outsourced as NT 3.1 wasn't even shipped to India yet. It was circa 1993 before the WWW become popular and the Internet was fast enough to keep circulating through so you have to find some way to reproduce your crash. Then hopefully I can reproduce it on my brother and him getting punished for it that people don't know that you'd be able to escape. I didn't even need to tell you how sorry I am for your loss," Erin>

For our next example, let's apply it 1000 times, with seed string "to start at":

ramble^1000 |to start at>

This is too big to post here, so I've uploaded it to here.
Go read, it is fun!
The output is a giant wall of text, so I wrote some code to tidy that up by creating fake paragraphs:
#!/usr/bin/env python3
import sys
import random

filename = sys.argv[1]

# number of sentences per fake paragraph, weighted towards 3
paragraph_lengths = [1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,5]

dot_found = False   # True if the previous character was a "."
dot_count = 0       # sentences seen since the last paragraph break

with open(filename,'r') as f:
    for line in f:
        for c in line:
            if c == ".":
                dot_found = True
                print(c,end='')
            elif c == " " and dot_found:
                # ". " marks the end of a sentence
                dot_found = False
                dot_count += 1
                if dot_count == random.choice(paragraph_lengths) or dot_count == max(paragraph_lengths):
                    # start a new paragraph, dropping the space
                    print("\n")
                    dot_count = 0
                else:
                    print(c,end='')
            else:
                dot_found = False
                print(c,end='')

Same example as above, but this time with fake/random paragraphs. See here. Take a look! Those fake paragraphs are a really big improvement.
Now, one more example. Let's pick "fade out again" as the seed string, and generate 2000 words (i.e., ramble^1000, since each step appends two words). Result here.
Anyway, I think it is really rather remarkable how good the grammar is from this thing. And I have to wonder if other p/q choices (here we use 3/5) will give better or worse results. And of course, the big question is, is this approaching what a human brain does? It certainly seems likely that most human brains only store n-grams, not full blocks of text. Perhaps n = 10 at the max? Though favourite quotes from plays or movies will be longer than the average stored n-gram size. And it also seems likely that the brain stitches sequences. Songs are a good example. Given a sequence of notes, your brain predicts what is next. And then from that, what is next after that. And so on. Which seems like joining n-grams to me.
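For anyone who wants to play with other p/q choices outside the console, here is a rough Python sketch of the same stitching idea. It is only my reading of what the ramble operator does, not the console's implementation: look up the last p words, pick one stored continuation at random, append it, repeat. The names build_table and ramble, the toy corpus, and the choice to stop when no n-gram matches are all assumptions made for illustration. Repeated continuations are kept in the list, so more frequent ones are picked more often, which may or may not match how pick-elt behaves.

```python
import random
from collections import defaultdict

def build_table(words, p=3, q=5):
    """Map each p-word prefix to the (q - p)-word continuations seen after it."""
    table = defaultdict(list)
    for i in range(len(words) - q + 1):
        prefix = tuple(words[i:i + p])
        table[prefix].append(tuple(words[i + p:i + q]))
    return table

def ramble(seed, table, steps, p=3, rng=random):
    """Apply one stitching step `steps` times, starting from the seed string."""
    out = seed.split()
    for _ in range(steps):
        continuations = table.get(tuple(out[-p:]))
        if not continuations:
            break  # assumption: stop when no stored n-gram continues this tail
        out.extend(rng.choice(continuations))
    return " ".join(out)

# Toy corpus (hypothetical); a real run would use a large body of text.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
table = build_table(corpus, p=3, q=5)
print(ramble("the cat sat", table, steps=3))
```

On this toy corpus every lookup happens to have only one distinct continuation, so the run always prints "the cat sat on the mat and the cat". Trying a different p/q is just a matter of passing other values to build_table and the matching p to ramble.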