Faster knit extraction

Mon Feb 19 14:06:19 GMT 2007

John, I think you were wondering about a way to apply several patches in
one pass, to avoid doing insertion in lists of lines.

For whatever reason, it just came into my head.

Start with a series of patches, each pointing at the previous until you
hit a fulltext:

class Patch(object):

    def __init__(self, prev_patch, hunk_iterator,
                 revision_id, num_lines):
        self.prev_patch = prev_patch
        self.hunk_iterator = hunk_iterator
        self.revision_id = revision_id
        self.num_lines = num_lines
        self.hunk = self.hunk_iterator.next()
        self.hunk_start = 0

    def _get_line(self, line_no):
        # Hunk format is from memory:
        # lines: lines to insert
        # i: position of beginning of matching lines in old
        # j: position of beginning of matching lines in new
        # n: number of matching lines
        # Note: line_no may never be less than a previous value.
        while True:
            line_no_off = line_no - self.hunk_start.
            lines, i, j, n = self.hunk
            if line_no_off < len(lines):
                return revision_id, line[line_no_off]
            elif line < j + n:
                return prev_patch.get_line(line_no-j)
            else:
                self.hunk_start = self.hunk_start = j + n
                self.hunk = self.hunk_iterator.next()

    def iter_annotated_version(self):
        return (self.get_line(x) for x in xrange(self.num_lines))

This should make minimal use of memory, run efficiently, avoid inserting
in lists, and still be useful for extracting multiple versions at once.

When extracting multiple versions at once, you just extract the oldest
version you want, turn it into a FullText, and use that FullText when
generating the next version you want.

Aaron