Wednesday, 10 December 2014

introducing file_recall()

So, with small files it is fine to use the standard:
context.recall(op,label)
But once you start working with large sw files, it is prohibitively slow to load the entire sw file into memory, work on it, and then spit out your result. I'm talking days here for the improved-imdb.sw file.
The fix is this function, which scans the file line by line and stops at the first matching rule:
# filename is sw data/source file
# op is the operator label, a string
# label is the ket label, a string or a ket
#
# returns a superposition
def file_recall(filename,op,label):
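  # accept either a plain string label or a ket (label plus coefficient)
  if isinstance(label, ket):
    coeff = label.value
    ket_label = label.label
  else:
    coeff = 1
    ket_label = label

  # a learn rule in an sw file looks like: op |label> => superposition
  pattern = op + " |" + ket_label + "> => "
  n = len(pattern)

  # scan line by line, so the whole file is never held in memory
  with open(filename,'r') as f:
    for line in f:
      if line.startswith(pattern):
        # parse the right-hand side and scale it by the ket's coefficient
        return extract_literal_superposition(line[n:])[0].multiply(coeff)
  # no matching rule found: return the empty ket
  return ket("",0)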

And we can tweak this if all superpositions in your sw file are "clean superpositions". Yeah, a term I just made up. What I mean is: if all coeffs in a superposition are 1, and implicit, we call it a "clean superposition".
So |a> + |b> + |c> is a clean superposition, and 3|a> + |b> + 21|d> is not.
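As a rough illustration of that definition, here is a minimal sketch of a check for clean superpositions. The helper name is_clean_superposition is made up for this example and is not part of the project code. It assumes terms are joined by " + " and that ket labels contain no "|", ">" or " + " themselves.

```python
import re

# Hypothetical helper, not from the project code.
# Assumption: a clean term is a bare ket such as |a>, with no leading coefficient.
CLEAN_TERM = re.compile(r"\|[^|>]*>")

def is_clean_superposition(text):
  """Return True if every term in the superposition is a bare ket."""
  terms = text.strip().split(" + ")
  return all(CLEAN_TERM.fullmatch(term) is not None for term in terms)

print(is_clean_superposition("|a> + |b> + |c>"))     # True
print(is_clean_superposition("3|a> + |b> + 21|d>"))  # False
```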

And if op and label are also always strings, then we can simplify the file_recall() code down to this (which BTW returns a list of ket labels and not a superposition):
def file_recall(filename,op,label):
  pattern = op + " |" + label + "> => "
  n = len(pattern)
  with open(filename,'r') as f:
    for line in f:
      if line.startswith(pattern):
        # strip the trailing newline first, so [1:-1] removes the outer "|" and ">"
        line = line[n:].strip()
        return line[1:-1].split("> + |")
  return []
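To see the simplified lookup in action, here is a small self-contained sketch. It repeats the function under the made-up name file_recall_clean so it can run on its own, and it strips the trailing newline before slicing. The sample rules and the temp file are invented for the demo.

```python
import os
import tempfile

# Illustrative copy of the simplified lookup, under a hypothetical name.
def file_recall_clean(filename, op, label):
  pattern = op + " |" + label + "> => "
  n = len(pattern)
  with open(filename, 'r') as f:
    for line in f:
      if line.startswith(pattern):
        rhs = line[n:].strip()            # drop the trailing newline
        return rhs[1:-1].split("> + |")   # drop outer "|" and ">", split on the joins
  return []

# Made-up sample data in OP KET => SUPERPOSITION form.
sample = (
  "friends |Fred> => |Sam> + |Mary> + |Liz>\n"
  "age |Fred> => |32>\n"
)

with tempfile.NamedTemporaryFile('w', suffix='.sw', delete=False) as tmp:
  tmp.write(sample)
  path = tmp.name

friends = file_recall_clean(path, "friends", "Fred")
age = file_recall_clean(path, "age", "Fred")
missing = file_recall_clean(path, "friends", "Nobody")
os.remove(path)

print(friends)  # ['Sam', 'Mary', 'Liz']
print(age)      # ['32']
print(missing)  # []
```

Note that a rule that is not in the file gives back an empty list, so the caller does not need a special case for "not found".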

Anyway, we make use of file_recall() in the next couple of posts.
And a couple of notes:
1) often you can drastically speed up processing of large sw files by first grepping down to the rule-types of interest, e.g.:
grep "^op " example-file.sw
2) maybe one day make the in-memory/in-file distinction transparent to the working code? I don't yet know if that is a good idea or not.
3) note how simple the second file_recall() code is! This is another win from our simple sw notation:
OP KET => SUPERPOSITION
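If grep is not at hand, the filter from note 1 can be sketched in Python too. The generator name rules_for_op is made up for this example, and the sample rules are invented. It relies only on the fact that every rule line starts with the operator followed by a space.

```python
import os
import tempfile

# Hypothetical helper: a rough Python equivalent of  grep "^op " example-file.sw
def rules_for_op(filename, op):
  """Yield only the lines whose rule starts with the given operator."""
  prefix = op + " "
  with open(filename, 'r') as f:
    for line in f:
      if line.startswith(prefix):
        yield line.rstrip("\n")

# Made-up sample data for the demo.
sample = (
  "age |Fred> => |32>\n"
  "friends |Fred> => |Sam>\n"
  "age |Sam> => |29>\n"
)

with tempfile.NamedTemporaryFile('w', suffix='.sw', delete=False) as tmp:
  tmp.write(sample)
  path = tmp.name

matched = list(rules_for_op(path, "age"))
os.remove(path)

print(matched)  # ['age |Fred> => |32>', 'age |Sam> => |29>']
```

Because it is a generator, it reads one line at a time, so it keeps the same low memory footprint as file_recall().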
