Python2 demotion (moving from main to universe) in progress

Fri Dec 15 02:38:20 UTC 2017

Colin Watson wrote on 09-12-2017 13:51:
> On Sat, Dec 09, 2017 at 12:47:28PM +0100, Xen wrote:
>> Colin Watson wrote on 09-12-2017 0:24:
>> > there are good reasons behind many of the changes in Python 3
>> 
>> You know, an appeal to "good reasons" is really a blanket statement that
>> betrays the absence of any good reasons.
> 
> No, it betrays limited time to write.

Yes, that's what everyone says.

This phrase is used time and time again by people who have no real 
arguments for why a specific thing was done in a certain way.

> To take just one example, Python 2's willingness to mix Unicode and
> binary types provided that the binary strings were limited to 7-bit ASCII

You mean the implicit encoding to ASCII in order to subsequently decode 
to e.g. UTF-8.

Yes that sounds or sounded like a bad thing.

But instead of only removing that they removed all encodings from byte 
strings.

Or all decodings, whatever.
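
A minimal sketch, for illustration only and with made-up variable names, of what this looks like in Python 3: the implicit ASCII step is gone, bytes objects have no `encode` and text strings have no `decode`, so every crossing between the two types has to be written out.

```python
# Python 3: str (text) and bytes never combine implicitly.
text = "caf\u00e9"   # Unicode text
raw = b"abc"         # plain bytes

try:
    text + raw       # Python 2 would implicitly decode raw as ASCII here
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Only one direction is available on each type.
bytes_can_encode = hasattr(raw, "encode")   # bytes only offers .decode()
str_can_decode = hasattr(text, "decode")    # str only offers .encode()

# The conversion has to be spelled out, with an explicit encoding.
combined = text + raw.decode("ascii")
as_bytes = combined.encode("utf-8")
```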

> Even as somebody generally very sympathetic to the needs of
> localisation, I've got this wrong because Python 2 had just too many
> ways to make mistakes in this area.

So you are basically saying that you still don't agree but you have 
somehow accepted your own fallibility in that something you like is 
wrong and something you dislike is right.

You are basically saying to me that you dislike the Python 3 solution 
but you have accepted that it is "superior".

And you are now forcing yourself into that even as you still dislike it, 
in a post that is meant to convince me that there is no forcing going on 
in Python.

I mean Mr. Watson, I am happy that you are so honest,

but this clearly demonstrates what I am saying. You are giving the 
perfect example.

You are directly telling me that you are forcing yourself to like 
something that you don't like.

> In Python 3 you have to confront
> the distinction earlier and have a much better chance of getting it
> right.

Except when you use low-level code, apparently -- I cannot comment 
enough on what you are saying, but what they have done is make the 
higher level object the default, and make it very hard for people to use 
the lower level objects, because conversion is not supported for byte 
types, only conversion from unicode strings into bytes.

You should know and everyone knows that if you limit yourself to a 
higher level construct, you then get in trouble when you find out it is 
not enough.

I once created something that I designed for LVM -- just a script -- but 
it could basically work on any device, but my interface only allowed 
specification of LVM volumes.

Then I found out that I also needed it for Crypt volumes but my code 
couldn't handle that designation.

I found myself grabbing pieces from my script and using it directly in 
the shell in order to get around my own.... limited vision.

So I found the need to downgrade "LVM only" code to "DM code" and then 
later I would probably have found that I needed to downgrade it further 
to "any device".

This whole Python bytes/string thing is the same thing.

You cannot use a high level construct as your foundation and then try to 
build the lower level construct out of that, which is what they have 
done.

The foundation has to be the low level. They have tried to reverse that.

> You can certainly begin to make the distinction in Python 2
> code, but adding the extra type-safety required a breaking change.

I cannot comment but the point you are trying to make is that "any 
solution" would be a "good solution", or that just because it was solved 
in some way, it is by definition good, or better than not solving it.

I think this is wrong.

And this guy makes that point.

http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/

And I don't want to get into politics but some politicians also make 
that statement that "any solution is a good solution" and this usually 
results in mayhem....

Here are his words:

"Let me be clear upfront: Python 2's way of dealing with Unicode is 
error prone and I am all in favour of improving it. My point though is 
that the one in Python 3 is a step backwards and brought so many more 
issues that I absolutely hate working with it."

Another example is:

"I've hit this limitation a couple more times, and none of the proposed 
workarounds are adequate. Working with protocols and file formats that 
use human-readable markup is significantly clumsier than it was with 
Python 2 (using either the % operator, which also lost its support for 
byte strings in Python 3, or .format())."

I happen to know the % operator ;-).

It is ridiculous that you cannot use it for binary-only text, or what I 
would call just.... byte strings.

So you can use the ASCII encoder. But what if there is something else in 
there, that you want to leave unharmed?

This low level feature was adequately expressed by what existed 
before.

https://bugs.python.org/issue3982

However my intent was not to discuss low level Python semantics...

> I'm not going to continue to spend time explaining the underlying
> reasons in response to vague insinuations, though.  If there are some
> particular changes in Python 3 that you think weren't well-founded, name
> them.  This is a list for developers; we can be specific.

You just named one.

Arguably probably the most important one, which is why you named it.

Except that you named it as an improvement, and I see it as a detriment.

And I have already discussed it now.

So yes, I did just read that guy's entire blog post.

And delved into that bug report that spoke of the need for byte string 
conversions.

Basically, printf on something that is just simple bytes.

That can hold binary representation of any type of data including text.

I still don't understand it, it is very confusing.

Convoluted. But that is precisely the problem.

The entire Python 3 model is deeply confusing. The Python 2 model was 
not.

In Python 2, there are just 2 different types, and you need to convert 
them.

The whole idea that a "bytes" object is something _special_ is wrong.

The very word "bytes" or "binary" confuses me because that is the normal 
mode of operation.

If you want unicode, you treat your text. You use the higher level 
function explicitly, or perhaps automatically if you can live in that 
"subsystem".

But not as the lower level fundamental mechanic.

We *have* to keep dealing with just bytes.

Whatever's in them is up for debate, but text is not even a special 
category, all kinds of formats exist.

So yes they made the conversion code unicode specific but then found 
that it was too specific and couldn't go back so easily.

This example really says it all:

Python 2: b'Content-length: {}\r\n'.format(length)

Python 3: b''.join([b'Content-length: ', (bytes if bytes is str else str)(length).encode('ascii'), b'\r\n'])

Or well, that is 2<>3 compatible code.
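
For comparison, a hedged sketch of the same header on newer interpreters. As far as I know, the bug report linked above ended in PEP 461, which brought %-formatting back for bytes in Python 3.5, while `.format()` on bytes stayed out. The variable names are placeholders, and this needs Python 3.5 or later.

```python
length = 42

# %-formatting on bytes (Python 3.5+); no encode/decode round trip needed.
header = b"Content-length: %d\r\n" % length

# %b splices in other bytes untouched, including non-ASCII ones.
payload = b"\xff\x00\xfe"
framed = b"%b%b" % (header, payload)

# bytes still has no .format() method.
bytes_has_format = hasattr(b"", "format")
```

The high bytes in `payload` pass through unharmed, which is the case the ASCII encoder route cannot handle.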

I grow a bit tired of reading that bug discussion.

Guido van Rossum wants the feature. 60% of naysayers try to oppose it 
and throw up irrelevant barriers all the time; after he chimes in they 
turn around a bit.

Mercurial chimes in and wants opaque ways to treat user data, but can't 
do it in Python 3.

The whole low level aspect is completely broken (at least there in 3.3) 
and people try to suggest that there are ways around it and consistently 
miss the point or don't think of something.

The problem of course is that binary data (in the high bit range) gets 
corrupted by UTF-8 conversion.

They need to be able to use text functions, but they can't first convert 
from some known format, because they don't know the format.

So they need functions to operate on raw text (including perhaps binary 
data) and it can't be done. Or only very inelegantly.
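
A small sketch of that corruption, assuming a made-up byte string that contains two bytes which are not valid UTF-8. It also shows the `surrogateescape` error handler, one of the workarounds usually suggested; whether that counts as elegant is exactly what is being argued about.

```python
raw = b"name=\xff\xfe;ok"   # hypothetical data of unknown encoding

# Strict decoding refuses the input outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding "works" but silently replaces the unknown bytes.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")

# surrogateescape smuggles the unknown bytes through str and back.
escaped = raw.decode("utf-8", errors="surrogateescape")
upper = escaped.upper()   # a text function applied to the opaque data
round_trip = upper.encode("utf-8", errors="surrogateescape")
```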

>> So you go on to detail the similarities with C but with C there never was
>> one breaking point, just incremental changes.
> 
> There have been many breaking changes in C over the years, though since
> it's an essentially smaller language most of those changes (not all!)
> have been at the runtime or library levels.  Small consolation if that's
> what breaks your code, of course.

The point is indeed that this is partial stuff at every moment.

> But you seem to be under the impression that moving from Python 2 to
> Python 3 requires non-incremental changes: i.e. a flag day where your
> code stops working on Python 2 and starts working on Python 3.  This
> isn't so.

Well that is what everyone is doing.

First, they have to get it working on Python 3 to begin with.

You say you can first transform 2 code to 3 by making it bilingual, but 
there are many libraries where this is excessively difficult (the low 
level protocol stuff for instance).

That basically means transforming it from within to make it ready for 
d-day, merging it completely into a bilingual beast, and then, when it 
runs on 3, you can begin dropping python 2 support (gradually, or all at 
once when you can).

But it is precisely this "combined" code that has been described as "not 
fun" and "terrible to write".
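
To give an idea of what such combined code tends to look like, here is a rough sketch. The helper names (`PY2`, `text_type`, `to_bytes`, `content_length_header`) are invented for illustration and not taken from any particular library.

```python
import sys

PY2 = sys.version_info[0] == 2

# The text type has a different name on each interpreter; the name
# `unicode` is only looked up when running on Python 2.
text_type = unicode if PY2 else str  # noqa: F821


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it only if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


def content_length_header(length):
    """Build the header in a way that works on both 2 and 3."""
    return b"Content-length: " + str(length).encode("ascii") + b"\r\n"
```

Every boundary between text and bytes needs a helper or a version check like this, which is where the "not fun" part comes from.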

> There's a large subset that works fine in both: nearly all
> the code I write these days is "bilingual" in this way, and the obvious
> porting strategy for most code involves making it be this way, so at the
> end all you have to do is flip to the newer interpreter after your tests
> pass.

> I would rather that Python 3 had taken the Perl 6 approach of having the
> newer interpreter be able to execute older code in a special
> compatibility mode

This seems a much more sensible approach yes. Then people are free to go 
with 3 whenever they like, instead of in a forced way.

That would take the whole forcing part out of it, and guarantee that 
older code keeps running, while everyone moves to a 3 interpreter and 
can take their leisurely time to play with it and/or move to it 
completely.

> , so that we could mix-and-match more freely and the
> transition would have been easier.

> (And, much though I like Perl, Perl 6 has taken even longer to get into
> typical developers' hands than Python 3 has, so ...)

Don't know much about Perl myself.

>> The work of "God" (unfortunate events) should not be willfully perpetrated
>> by humans on one another on purpose.
> 
> What is your *concrete* proposal here?

When I wrote my message my concern was only that I experience a deep 
lack of concern and a presentation of the deprecation that seems to be 
without concern.

I mean that it is presented as a good thing, as nothing to worry about, 
and as something that won't cause trouble.

As something that (in this case) Ubuntu likes and supports.

In that sense that there is not any part that says "Unfortunately".

Anything in this area that could be possible would begin, in the general 
sense, first with a desire to have it.

Now of course I understand that Python has worked so hard to discard the 
old that there is not much you can do about it yourself but this "mixing 
and matching" would have made everything a lot easier of course and put 
away the whole problem.

From my point of view:

- you come up with good suggestions
- you are forcing yourself to like it (python 3)
- you consider it unfortunate but inevitable.

However in the broader sense I think (without being expert on this) that 
not only is the transition hard, but the end result is also bad and 
python 3 is a worse language than python 2.

So even if everything works out, you have now ended up with something you 
will never like.

And now you're stuck and of course you can't go back, or only with great 
difficulty.

I think that maybe we should start looking at a fork of Python 2.

At least there is one person who was working on a Python 2.8 and 
apparently renamed it to Tauthon.

But even though it would be easy to add a package, you get an issue with 
the interpreter name.

I also don't know if it has a future, and it begins to sound like a 
FreePascal.

A fringe thing that few people use.

Particularly as a result of the name change, obviously.

More words:

"Python 3 unicode design is totally braindead. We now have several 
examples of GOOD unicode-enabled design, from both parts of the 
spectrum:

1) Perl 6 dove deep and encodes strings in the NFG, thus making string 
operations intuitively correct. It won't split words between combining 
characters, for example. Python 3, of course, is completely clueless 
about that.

2) Go treats strings as arrays of octets. Always. It also has handy 
functions to manipulate arrays of octets that happen to be UTF-8 encoded 
texts.

Sorry. But Py3 is not well-designed, it's a great example of a "second 
system effect". The fact that companies are migrating from Py27 to 
Golang should give Py3 developers some clue."

https://en.wikipedia.org/wiki/Second-system_effect
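
On the combining-character point in the quote: a short sketch, using only the standard `unicodedata` module, of how Python 3 strings index by code point rather than by what a reader sees as one character. The variable names are placeholders.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9

# Length and slicing count code points, so the accent can be split off.
decomposed_len = len(decomposed)
composed_len = len(composed)
first_slice = decomposed[:1]

# The two spellings of the same character do not compare equal as is.
same_without_normalizing = decomposed == composed
```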

(Systemd is also an example of the second system effect, but that aside.

You can also call it a pendulum between "too little" and "too much").

But anyway.

(On 2.8: https://lwn.net/Articles/711061/)

> Bear in mind that we don't have
> the resources to maintain Python 2 indefinitely; if that's your
> proposal, it's going to run up against real-world constraints.

The issue is that a lot of people will want a 2.7 compatible 
interpreter.

There is no room in this world for "two" pythons (I mean 2 with a 
different name).

If python3 is going to be called "python" you will get issues with 
running both systems at once as well.

So I don't know what your solution would be to that, update-alternatives 
can't fix that.

On the other hand you can point python 2.7 to tauthon if need be.

(I am sure tauthon derives from the greek letter tau).

So if this person does keep maintaining this project and it gets some 
traction, you could have a 'supported' python 2 interpreter being 
callable by "python2" or "python2.7" or even "python2.8" for some time 
to come.

If more people rally around that you could even have an unofficial 
'official' Python 2.8 specification ;-).

Of which Tauthon would then be one interpreter ;-).

You could then move Tauthon (package "tauthon") to universe around the 
time that you would move python 2.7 out of it.

The only thing tauthon cannot support is the bytes/str(unicode) 
mechanics.

But who knows what he might come up with, but anyway.

In lieu of anything else, maybe that could be something to go after....

Or just consider.

Or just choose.

> But
> there may be other things we can do to make it easier for people.

Well I guess this would be one way.

Or one thing.

> Do
> you have some specific projects that are troublesome?

No unfortunately I am a _beginning_ python developer that doesn't much 
like having to program in 3 ;-).

I mean I made the decision to drop having anything to do with GTK2 so I 
don't mind that choice so much.

I don't want to end up in a world with lesser choices available to me.