[BUG] patch.py not portable to windows

Mon Jul 4 15:55:50 BST 2005

Martin Pool wrote:

>On  1 Jul 2005, John A Meinel <john at arbash-meinel.com> wrote:
>
>  
>
>>>It's unfortunate that we can't just bundle subprocess as we do
>>>ElementTree, but the need to compile a C module on Windows kills that.
>>>Canonical's using a library called 'gnarly' for process spawning that's
>>>all-Python, and supposed to be subprocess-compatible, so that might work.
>>>
>>>http://ddaa.net/arch/2004/gnarly/gnarly--devel/gnarly--devel--0/
>>>      
>>>
>>I might check into that. But I think you can bundle subprocess, and just
>>state that for windows you must install subprocess manually for 2.3, or
>>use 2.4. I honestly don't think it is a big dependency. It is one
>>package that needs to be installed on 1 platform for 1 version. It's not
>>something that most people have to install.
>>    
>>
>
>I'm not sure, but I think many windows users would find it easier to
>just install python2.4 than to compile a module themselves.
>
>As Fredrik says it's less likely they'll have a good diff and patch so
>it would be nice to do that internally.
>  
>
You could install the binary module; you don't have to build it from
source on Windows. http://effbot.org/downloads/#subprocess has binaries
for both Python 2.2 and 2.3.
Also, as Fredrik pointed out, the subprocess.py script has support for
using the win32all library, and the Python download page recommends it.
(Unfortunately that code is "if 0"ed out by default, but we could
re-order it a little bit for a bundled version.)
My point is that there are several precompiled forms they can install,
so nobody has to compile it themselves.
I would just say "If you are using python2.3 on windows, you must
install the subprocess library available here."

>>>>In the long term, it would be nice to switch to perhaps a python
>>>>implementation of patch, something that could keep everything in memory,
>>>>rather than having to write out a bunch of temporary files. Right now,
>>>>you have to decompress the store data, write that to a file, then spawn
>>>>patch and pipe in the changes and create a new file, then read that back
>>>>in, and delete the other files.
>>>>        
>>>>
>
>  
>
>>>It's rather shocking that Python can produce unified diffs, but can't
>>>apply them, eh?  (Actually, it's not producing them properly, either.)
>>>      
>>>
>>Isn't it just that it doesn't handle a missing EOL the same way (and
>>that it changes the number for /dev/null)?
>>    
>>
>
>I think what aaron means is that there's nothing which takes a unified
>diff format and turns it back into a series of difflib instructions.
>
>Separately from that we need to special-case the absence of a trailing
>newline to work the same way as difflib.
>  
>
I was commenting on the "Actually, it's not producing them properly,
either." remark. I understand that there is no 'patchlib'.
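
For what it's worth, here is a rough sketch of what a minimal in-memory
'patchlib' might look like. It is only an illustration: the names
apply_unified_diff and HUNK_RE are made up, it handles a single file per
diff, and it does strict matching only (no fuzz, no offset search), so it
is nowhere near a replacement for patch.

```python
import difflib
import re

# Hunk header, e.g. "@@ -3,4 +3,5 @@"; an omitted count means 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(old_lines, diff_lines):
    """Apply a single-file unified diff to old_lines, entirely in memory.

    Both arguments are lists of lines that keep their line endings.
    Strict matching only: any mismatch raises ValueError instead of
    producing a reject.
    """
    result = []
    pos = 0          # index of the next unconsumed line in old_lines
    in_hunk = False
    last_tag = None
    for line in diff_lines:
        header = HUNK_RE.match(line)
        if header:
            start = int(header.group(1))
            count = 1 if header.group(2) is None else int(header.group(2))
            # An empty old range names the line just before the insertion.
            hunk_pos = start if count == 0 else start - 1
            if hunk_pos < pos:
                raise ValueError("hunks overlap or are out of order")
            result.extend(old_lines[pos:hunk_pos])
            pos = hunk_pos
            in_hunk = True
            continue
        if not in_hunk:
            continue  # "---" / "+++" file headers before the first hunk
        tag, text = line[:1], line[1:]
        if tag == "\\":
            # "\ No newline at end of file" refers to the previous diff line.
            if last_tag == "+":
                result[-1] = result[-1].rstrip("\r\n")
            continue
        if tag == "+":
            result.append(text)
        elif tag in (" ", "-"):
            if pos >= len(old_lines) or (
                    old_lines[pos].rstrip("\r\n") != text.rstrip("\r\n")):
                raise ValueError("hunk does not apply at line %d" % (pos + 1))
            if tag == " ":
                result.append(old_lines[pos])
            pos += 1
        else:
            raise ValueError("unexpected diff line: %r" % line)
        last_tag = tag
    result.extend(old_lines[pos:])
    return result


# Round trip through difflib; note the new text has no trailing newline.
old = ["a\n", "b\n", "c\n"]
new = ["a\n", "B\n", "c\n", "d"]
diff = list(difflib.unified_diff(old, new, "old", "new"))
assert apply_unified_diff(old, diff) == new
```

Everything stays in lists of lines, so nothing has to be written to a
temporary file. Anything involving fuzz would need a search step on top
of this.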

>>I think you would have a very strong possibility of slightly different but
>>equally correct results. Diff and patch in fuzzy mode is obviously not an exact
>>science. (One of my personal peeves with diff is the case where you
>>insert an if just before another one. It will frequently latch on to
>>either an empty line or a line with just {, and then show it as a delete
>>+ modify + add, rather than just an add).
>>    
>>
>
>We might be able to improve that by using the SequenceMatcher junk
>option.
>
>I might also mention that Wiggle is very nice for dealing with patch rejects.
>
>http://cgi.cse.unsw.edu.au/~neilb/source/wiggle/ANNOUNCE
>
>  
>
I have heard of Wiggle's improved ability to merge bits, but haven't
tried it.
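
On the SequenceMatcher junk option mentioned above, a small sketch of how
it could be tried. The junk rule (blank lines and lone braces) and the
helper names are just assumptions for illustration; whether it actually
gives a nicer diff depends on the input, so this only prints both sets of
opcodes for comparison.

```python
from difflib import SequenceMatcher


def is_junk_line(line):
    """Treat blank lines and lone braces as junk (an illustrative choice)."""
    return line.strip() in ("", "{", "}")


def line_opcodes(old, new, isjunk=None):
    """Return the difflib opcodes that turn old into new."""
    matcher = SequenceMatcher(isjunk, old, new, autojunk=False)
    return matcher.get_opcodes()


def replay(old, new, opcodes):
    """Rebuild new from old by following the opcodes (a sanity check)."""
    result = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            result.extend(old[i1:i2])
        elif tag in ("replace", "insert"):
            result.extend(new[j1:j2])
        # "delete" contributes nothing to the result
    return result


# The "insert an if just before another one" case.
OLD = ["if (a)\n", "{\n", "    foo();\n", "}\n"]
NEW = ["if (b)\n", "{\n", "    bar();\n", "}\n", "\n"] + OLD

for label, junk in (("no junk", None), ("with junk", is_junk_line)):
    print(label)
    for opcode in line_opcodes(OLD, NEW, junk):
        print("   ", opcode)
```

Junk lines cannot anchor a match on their own, but a match found on real
lines is still extended across neighbouring junk, so the braces are not
lost from the equal blocks.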

>>I know diff3 is pretty important for bzr, since it is the real merge
>>workhorse.
>>    
>>
>
>I think it should be fairly easy to do diff3 internally on top of
>difflib, by just comparing the two diff opcode streams and looking for
>overlapping changes.  (Did I miss something?)
>
>  
>
I've never gotten into the guts of diff3, so I don't really know what
algorithms it uses.
I would guess you are right about the basic idea, but does it get fancy
in corner cases?
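
To make the basic idea concrete, here is a sketch of a three-way merge
built on two difflib opcode streams. It is not how diff3 itself works
(I don't know that); the function names are invented, and it is
deliberately conservative: changes from the two sides that overlap or
even just touch are reported as a conflict unless both sides produce the
same text.

```python
from difflib import SequenceMatcher


def changes(base, other):
    """Return (i1, i2, new_lines) for each region of base that other changed."""
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def render(base, lo, hi, side_changes):
    """Return base[lo:hi] with one side's changes applied."""
    out, pos = [], lo
    for i1, i2, lines in side_changes:
        out.extend(base[pos:i1])
        out.extend(lines)
        pos = i2
    out.extend(base[pos:hi])
    return out


def merge3(base, a, b):
    """Merge a and b against base; return (merged_lines, conflict_count)."""
    pending = sorted(
        [(i1, i2, lines, "a") for i1, i2, lines in changes(base, a)]
        + [(i1, i2, lines, "b") for i1, i2, lines in changes(base, b)],
        key=lambda change: (change[0], change[1]))
    merged, conflicts, pos, index = [], 0, 0, 0
    while index < len(pending):
        # Collect every change that overlaps or touches the current group.
        lo, hi = pending[index][0], pending[index][1]
        group = [pending[index]]
        index += 1
        while index < len(pending) and pending[index][0] <= hi:
            group.append(pending[index])
            hi = max(hi, pending[index][1])
            index += 1
        merged.extend(base[pos:lo])
        from_a = [c[:3] for c in group if c[3] == "a"]
        from_b = [c[:3] for c in group if c[3] == "b"]
        text_a = render(base, lo, hi, from_a)
        text_b = render(base, lo, hi, from_b)
        if not from_b or text_a == text_b:
            merged.extend(text_a)      # only a changed it, or both agree
        elif not from_a:
            merged.extend(text_b)      # only b changed it
        else:
            conflicts += 1
            merged.extend(["<<<<<<< a\n"] + text_a + ["=======\n"]
                          + text_b + [">>>>>>> b\n"])
        pos = hi
    merged.extend(base[pos:])
    return merged, conflicts


base = ["a\n", "b\n", "c\n", "d\n", "e\n"]
side_a = ["a\n", "B\n", "c\n", "d\n", "e\n"]   # changes line 2
side_b = ["a\n", "b\n", "c\n", "d\n", "E\n"]   # changes line 5
print(merge3(base, side_a, side_b))
```

The corner cases are exactly where this is naive: adjacent edits, two
inserts at the same spot, and anything where the two diffs happen to
align the same text differently.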

>>The specific warning from the python documentation is:
>>
>>The only way to retrieve the return codes for the child processes is by
>>using the poll() or wait() methods on the Popen3 and Popen4 classes;
>>these are only available on Unix.
>>    
>>
>
>Yes, this is the reason I went back to requiring subprocess for test.
>Test obviously needs to check both the return code and the output
>of subprocesses and that just can't be done well in python2.3.  We could
>perhaps just skip those particular tests.
>  
>
But then you would have to write something that uses subprocess when
available, and otherwise uses your workaround, and knows not to do
anything for certain tests. *Way* too much hacking just to support an
edge configuration, IMHO.
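
For comparison, this is roughly all the tests need when subprocess is
available. A sketch only: run_and_capture is a made-up helper name, not
anything that exists in bzr.

```python
import subprocess
import sys


def run_and_capture(argv, input_bytes=None):
    """Run argv; return (return code, stdout bytes, stderr bytes)."""
    proc = subprocess.Popen(argv,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # communicate() feeds stdin, drains both pipes and waits for exit,
    # which avoids the deadlocks you get from reading one pipe at a time.
    out, err = proc.communicate(input_bytes)
    return proc.returncode, out, err


# A child that writes some output and exits with a non-zero status.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write('hi'); sys.exit(3)"]
code, out, err = run_and_capture(child)
print(code, out)
```

Both the exit status and the output come back from one call, which is
the part the popen2 classes cannot give you on Windows.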

John
=:->