case conversion in sed substitution does not work

Dennis Decker Jensen dennisdjensen at tiscali.dk
Thu Nov 4 17:53:10 UTC 2004


On Thu, 2004-11-04 at 10:02 +0100, Erik Bågfors wrote:
> In the first case you are looking for the characters a-z and turn them
> to upper case, your search does not include a space, hence it stops at
> the first space. The correct regex is
> 
> 
> echo dennis decker jensen | sed -re 's/[a-z ]+/\U&/'
> DENNIS DECKER JENSEN
> 
> Note the space after the z.

Yes, I should have been clearer. The first case is OK.

> In the second case if you only want to convert the first D use
> 
> echo dennis decker jensen | sed -re 's/[a-z]/\U&/' 
> Dennis decker jensen

Yes, \U works, but \u is still buggy, which is what I was trying to
show in my own vague and twisted way. It should uppercase exactly one
character, namely the next one. Instead it eats that character (the d
it was supposed to upcase)!

The documentation (texinfo) for sed says this in `sed programs' -> `the
"s" command':

----
Finally, as a GNU `sed' extension, you can include a special sequence
made of a backslash and one of the letters `L', `l', `U', `u', or `E'.
The meaning is as follows:

`\L'
     Turn the replacement to lowercase until a `\U' or `\E' is found,

`\l'
     Turn the next character to lowercase,

`\U'
     Turn the replacement to uppercase until a `\L' or `\E' is found,

`\u'
     Turn the next character to uppercase,

`\E'
     Stop case conversion started by `\L' or `\U'.
----
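
Read literally, the text above implies results along the following
lines. This is only a sketch of the documented behaviour, not of what
sed 4.1.2 actually prints; the function names are made up for
illustration and are not part of sed.

```shell
# Hypothetical helpers; each one wraps a single GNU sed case-conversion escape.
upcase_first()      { sed 's/.*/\u&/'; }   # \u: next character only
downcase_first()    { sed 's/.*/\l&/'; }   # \l: next character only
upcase_all()        { sed 's/.*/\U&/'; }   # \U: until \L or \E
upcase_first_word() { sed 's/\([a-z]*\) \(.*\)/\U\1\E \2/'; }  # \E stops \U

# Output expected from a sed that follows the documentation:
echo 'dennis decker jensen' | upcase_first       # Dennis decker jensen
echo 'DENNIS DECKER JENSEN' | downcase_first     # dENNIS DECKER JENSEN
echo 'dennis decker jensen' | upcase_all         # DENNIS DECKER JENSEN
echo 'dennis decker jensen' | upcase_first_word  # DENNIS decker jensen
```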

To try it a little differently:

echo dennis decker jensen | sed 's/.*/\ufish &/'
ish dennis decker jensen

See? I expected `Fish dennis decker jensen', not for the f to be eaten!

\l behaves the same:

echo DENNIS DECKER JENSEN | sed 's/.*/\lFISH &/'
ISH DENNIS DECKER JENSEN

My poor little lowercase f got eaten!
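
A possible way around it, sketched under the assumption that \U and \E
behave as documented (only \U with & is confirmed to work in this
thread): restrict what gets converted to a single character and use \U
instead of \u. The helper names are hypothetical.

```shell
# Capitalize the first character of a line without \u:
# match only that character and upcase the whole match.
capitalize_line() { sed 's/^./\U&/'; }

# Prepend "Fish " without \u: upcase the literal f, then stop with \E.
prepend_fish() { sed 's/.*/\Uf\Eish &/'; }

# Output expected if \U and \E follow the documentation:
echo 'dennis decker jensen' | capitalize_line   # Dennis decker jensen
echo 'dennis decker jensen' | prepend_fish      # Fish dennis decker jensen
```

The first helper is the same idea as the `s/[a-z]/\U&/' suggestion
quoted earlier, just anchored to the start of the line.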

> I don't think there is anything wrong with sed, just how you use the regex.
> 

Not according to the documentation, but maybe the documentation is
wrong or I have misunderstood it?

Please note, I'm not in any desperate need of a bugfix, I just
stumbled across this by accident.

/Dennis

> On Wed, 03 Nov 2004 00:29:12 +0000, Dennis Decker Jensen
> <dennisdjensen at tiscali.dk> wrote:
> > Package: sed
> > Version: 4.1.2-1
> > Severity: normal
> > 
> > case 1:
> > 
> > echo dennis decker jensen | sed -re 's/[a-z]+/\U&/'
> > DENNIS decker jensen
> > 
> > case 2:
> > 
> > echo dennis decker jensen | sed -re 's/[a-z]+/\u&/'
> > ennis decker jensen
> > 
> > I expected this in case 2:
> > Dennis decker jensen
> > 
> > It eats the character! The same thing happens when using \l (\L).
> > 
> > -- System Information:
> > Debian Release: testing/unstable
> > Architecture: i386 (i686)
> > Kernel: Linux 2.6.8.1-3-386
> > Locale: LANG=en_DK, LC_CTYPE=en_DK
> > 
> > Versions of packages sed depends on:
> > ii  libc6              2.3.2.ds1-13ubuntu2.2 GNU C Library: Shared libraries an
> > 
> > -- no debconf information
> > 




