[BUG] patch.py not portable to windows

Martin Pool mbp at sourcefrog.net
Mon Jul 4 12:52:37 BST 2005


On  1 Jul 2005, John A Meinel <john at arbash-meinel.com> wrote:

> >It's unfortunate that we can't just bundle subprocess as we do
> >ElementTree, but the need to compile a C module on Windows kills that.
> >Canonical's using a library called 'gnarly' for process spawning that's
> >all-Python, and supposed to be subprocess-compatible, so that might work.
> >
> >http://ddaa.net/arch/2004/gnarly/gnarly--devel/gnarly--devel--0/
> 
> I might check into that. But I think you can bundle subprocess, and just
> state that for windows you must install subprocess manually for 2.3, or
> use 2.4. I honestly don't think it is a big dependency. It is one
> package that needs to be installed in 1 platform for 1 version. It's not
> something that most people have to install.

I'm not sure, but I think many Windows users would find it easier to
just install Python 2.4 than to compile a module themselves.

As Fredrik says, Windows users are less likely to have a good diff and
patch installed, so it would be nice to do that internally.

> >>In the long term, it would be nice to switch to perhaps a python
> >>implementation of patch, something that could keep everything in memory,
> >>rather than having to write out a bunch of temporary files. Right now,
> >>you have to decompress the store data, write that to a file, then spawn
> >>patch and pipe in the changes and create a new file, then read that back
> >>in, and delete the other files.

> >It's rather shocking that Python can produce unified diffs, but can't
> >apply them, eh?  (Actually, it's not producing them properly, either.)
> 
> Isn't it just that it doesn't handle a missing EOL the same way (and
> that it changes the number for /dev/null).

I think what Aaron means is that there's nothing that takes a diff in
unified format and turns it back into a series of difflib instructions.

Separately from that, we need to special-case the absence of a trailing
newline so that it works the same way as difflib.
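
As a rough illustration of the missing piece (applying a unified diff
entirely in memory, with no temporary files and no external patch), here
is a minimal sketch. The name apply_unified_diff is hypothetical, not
something in bzr or difflib. It assumes a single-file diff in the form
difflib.unified_diff emits, applies hunks strictly (no fuzz), and does
not handle the "\ No newline at end of file" marker that GNU diff writes.

```python
import difflib
import re

# Hunk header, e.g. "@@ -12,3 +12,4 @@"; an omitted count means 1.
_HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(old_lines, diff_lines):
    """Apply a unified diff to old_lines and return the new list of lines.

    Hypothetical helper: strict matching only, raises ValueError on mismatch.
    """
    new_lines = []
    pos = 0                  # index of the next unconsumed line in old_lines
    old_left = new_left = 0  # lines still expected in the current hunk

    def expect(text):
        # The diff's view of the old file must match the real old file.
        if pos >= len(old_lines) or old_lines[pos] != text:
            raise ValueError("hunk does not apply at line %d" % (pos + 1))

    for line in diff_lines:
        if old_left == 0 and new_left == 0:
            m = _HUNK_RE.match(line)
            if m is None:
                continue  # "---"/"+++" file headers or text between hunks
            start = int(m.group(1))
            old_left = 1 if m.group(2) is None else int(m.group(2))
            new_left = 1 if m.group(4) is None else int(m.group(4))
            # With zero old lines, the header names the line *before* the
            # insertion point; otherwise it names the first old line.
            hunk_start = start if old_left == 0 else start - 1
            new_lines.extend(old_lines[pos:hunk_start])
            pos = hunk_start
            continue
        tag, text = line[:1], line[1:]
        if tag == " ":       # context line: present in both files
            expect(text)
            new_lines.append(text)
            pos += 1
            old_left -= 1
            new_left -= 1
        elif tag == "-":     # line removed from the old file
            expect(text)
            pos += 1
            old_left -= 1
        elif tag == "+":     # line added in the new file
            new_lines.append(text)
            new_left -= 1
        else:
            raise ValueError("unexpected line in hunk: %r" % line)

    new_lines.extend(old_lines[pos:])  # untouched tail of the old file
    return new_lines
```

A round trip through difflib.unified_diff is an easy sanity check:
diffing old against new and applying the result to old should give back
new.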

> I think you would have a very strong possibility of slightly different but
> equally correct results. Diff and patch in fuzzy mode is obviously not an exact
> science. (One of my personal peeves with diff is the case where you
> insert an if just before another one. It will frequently latch on to
> either an empty line or a line with just {, and then show it as a delete
> + modify + add, rather than just an add).

We might be able to improve that by using the SequenceMatcher junk
option.
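
For reference, the junk option is the first argument (isjunk) to
difflib.SequenceMatcher. The sketch below treats blank lines and lone
braces as junk so that the matcher prefers not to anchor on them. The
is_junk predicate and the changed_regions helper are hypothetical names
for illustration; whether this actually gives nicer diffs for a given
input would need to be checked case by case.

```python
import difflib


def is_junk(line):
    """Treat blank lines and lone braces as poor synchronisation points."""
    return line.strip() in ("", "{", "}")


def changed_regions(old, new, isjunk=None):
    """Return the non-equal opcodes from SequenceMatcher for two line lists."""
    matcher = difflib.SequenceMatcher(isjunk, old, new)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


old = [
    "if (a) {\n",
    "    foo();\n",
    "}\n",
]
new = [
    "if (b) {\n",
    "    bar();\n",
    "}\n",
    "\n",
    "if (a) {\n",
    "    foo();\n",
    "}\n",
]

# Compare the regions reported with and without the junk heuristic.
for label, junk in (("no junk", None), ("junk", is_junk)):
    print(label)
    for tag, i1, i2, j1, j2 in changed_regions(old, new, junk):
        print("  %-7s old[%d:%d] -> new[%d:%d]" % (tag, i1, i2, j1, j2))
```

Either way the opcodes still describe a valid transformation from old to
new; the junk predicate only influences which lines are used as anchors.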

I might also mention that Wiggle is very nice for dealing with patch rejects.

http://cgi.cse.unsw.edu.au/~neilb/source/wiggle/ANNOUNCE

> I know diff3 is pretty important for bzr, since it is the real merge
> workhorse.

I think it should be fairly easy to do diff3 internally on top of
difflib, by just comparing the two diff opcode streams and looking for
overlapping changes.  (Did I miss something?)
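
A minimal sketch of that idea, assuming three lists of lines (base and
two descendants). The names changes, overlaps and merge3 are
hypothetical. It is deliberately conservative: changes whose base ranges
intersect or even touch are reported as conflicts, identical changes
made on both sides are not recognised as agreeing, and no conflict
markers are produced.

```python
import difflib


def changes(base, other):
    """Return [(i1, i2, new_lines)] for each non-equal opcode base -> other."""
    matcher = difflib.SequenceMatcher(None, base, other)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def overlaps(c1, c2):
    """True if two base ranges intersect or touch (touching counts too)."""
    return c1[0] <= c2[1] and c2[0] <= c1[1]


def merge3(base, a, b):
    """Return (merged_lines, conflicts).

    merged_lines is None when any change in a overlaps a change in b;
    conflicts then lists the pairs of clashing base ranges.
    """
    changes_a = changes(base, a)
    changes_b = changes(base, b)
    conflicts = [(x[:2], y[:2])
                 for x in changes_a for y in changes_b if overlaps(x, y)]
    if conflicts:
        return None, conflicts

    # No overlaps, so the changes are disjoint and can be applied to the
    # base in order of position.
    merged, pos = [], 0
    for i1, i2, new_lines in sorted(changes_a + changes_b, key=lambda c: c[:2]):
        merged.extend(base[pos:i1])
        merged.extend(new_lines)
        pos = i2
    merged.extend(base[pos:])
    return merged, []
```

The part this leaves out is what to do with a conflict once it is found,
which is where a real diff3 emits both alternatives.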

> The specific warning from the python documentation is:
> 
> The only way to retrieve the return codes for the child processes is by
> using the poll() or wait() methods on the Popen3 and Popen4 classes;
> these are only available on Unix.

Yes, this is the reason I went back to requiring subprocess for the
tests. The tests obviously need to check both the return code and the
output of subprocesses, and that just can't be done well in Python 2.3.
We could perhaps just skip those particular tests.
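
For comparison, this is roughly what such a test needs from subprocess:
one call that yields both the exit status and the captured output. The
helper name run_and_capture is hypothetical; the child here is just the
running Python interpreter, so the sketch does not depend on any
external tool being installed.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv and return (returncode, combined stdout/stderr as bytes)."""
    proc = subprocess.Popen(argv,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()  # waits for exit and drains the pipe
    return proc.returncode, output


if __name__ == "__main__":
    code, out = run_and_capture(
        [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"])
    print(code, out)
```

Popen.communicate() and Popen.returncode behave the same on Windows and
Unix, which is the portability that the popen2 classes lack.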

-- 
Martin



