Reading a variable line by line with while loop

James Michael Fultz croooow at gmail.com
Wed Dec 2 04:03:55 UTC 2009


* Ray Parrish <crp at cmc.net> [2009-12-01 17:41 -0800]:
> James Michael Fultz wrote:
> > * Ray Parrish <crp at cmc.net> [2009-12-01 16:45 -0800]:
> >   
> >> I have found an if statement that reduces the duplicates to only two 
> >> pairs for the entire file. That's pretty good considering the number of 
> >> actual duplicates there were in the file to begin with. Here is the code 
> >> for that one -
> >>
> >> function CompressHistory {
> >>      BashHistory=`cat ~/.bash_history`
> >>      while read ThisCommand; do
> >>            ThisCommand=${ThisCommand// /__};
> >>            if  [[ "$History" =~ "$ThisCommand" ]]
> >>                then
> >>                    echo "nothing" >/dev/null
> >>                else
> >>                    History="$History $ThisCommand";
> >>            fi
> >>      done <<< "$BashHistory"
> >>      echo "$History"
> >> }
> >>
> >> This next bit of code has me frazzled trying to figure out the proper 
> >> regular expression to say "if this string exists within the larger 
> >> string". Nothing I have tried so far has worked, so here is the code, 
> >> and maybe someone can correct my regular expression so it works to weed 
> >> out duplicates as well.
> >>
> >>      BashHistory=`cat ~/.bash_history`
> >>      while read ThisCommand; do
> >>            ThisCommand=${ThisCommand// /__};
> >>            if  [[ "$History" == [.]*$ThisCommand[.]* ]]
> >>     
> >
> > I think you may want this.
> >
> > if [[ "$History" =~ .*"$ThisCommand".* ]]
> >
> > Also, placing '.' inside of brackets treats it as a literal character
> > when used in a regex.
> >
> >   
> >>                then
> >>                    echo "nothing" >/dev/null
> >>                else
> >>                    History="$History $ThisCommand";
> >>            fi
> >>      done <<< "$BashHistory"
> >>
> >>  From what I have read the dot is supposed to match any character, and 
> >> then the * specifies that any number of characters can appear in that 
> >> position. I'm just realizing that would specify perhaps one character 
> >> only, repeated to the ends of the large string before it would match.
> >>     
> >
> > Your description of the dot and asterisk regex metacharacters is
> > correct, but I think that you are confusing regular expressions with
> > glob patterns.
> >
> > Bash uses regular expressions in a [[ ... =~ ... ]] expression,
> > whereas glob patterns are used in a [[ ... == ... ]] expression.
> >
> >   
> >> I do not know how to specify that the whole front, and the whole back of 
> >> the variable should be the same with the matching string in the middle 
> >> of them.
> >>     
> OK, thanks for the tips. I did find a way to make the glob expansion 
> work. [not that I know a glob expansion from anything else, just 
> identifying it by the == you mentioned]

$ man 7 glob

May help to explain glob patterns.

$ man 7 regex

Likewise for regular expressions.
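
To make the difference concrete, here is a minimal sketch of the same
substring test written both ways. The variable and function names and
the sample data are made up for illustration; they are not from your
script.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the names below are for illustration only.
history_words="ls__-l cd__/tmp"

# Glob pattern (==): '*' matches any string, '?' matches exactly one character.
glob_match() {
    [[ "$history_words" == *cd__/t?p* ]]
}

# Regular expression (=~): '.' matches one character, '*' repeats the previous item.
# A regex is unanchored by default, so no leading or trailing '.*' is needed.
regex_match() {
    [[ "$history_words" =~ cd__/t.p ]]
}

glob_match && echo "glob matched"
regex_match && echo "regex matched"
```

Both tests succeed here. The point is only that '*' and '.' mean
different things depending on which operator you use.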

> Here is working code that now only misses one pair of duplicates in
> the entire file. I do not know what it is about that particular line
> that makes it not match, but it's kinda frustrating.
> 
> Note the curly braces around the ThisCommand variable in the if 
> statement. Without them this expression would not catch any matches at all.
> 
> function CompressBashHistory {
>      BashHistory=`cat ~/.bash_history`
>      while read ThisCommand; do
>            ThisCommand=${ThisCommand// /__};
>            if  [[ "$History" == *${ThisCommand}* ]]

Try quoting the variable, which is the fixed part of your pattern.
Inside [[ ... == ... ]], a quoted part of the pattern is matched
literally.

if [[ "$History" == *"${ThisCommand}"* ]]
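
As a rough sketch of that idea, the test could be wrapped in a small
helper. The name contains_literal and the sample strings are my own
invention, not anything from your script.

```shell
#!/usr/bin/env bash
# contains_literal HAYSTACK NEEDLE (hypothetical helper name)
# Succeeds when NEEDLE occurs anywhere in HAYSTACK, character for character.
contains_literal() {
    local haystack=$1 needle=$2
    # The unquoted '*' on each side are glob wildcards; the quoted
    # expansion in the middle is matched literally.
    [[ "$haystack" == *"$needle"* ]]
}

contains_literal "ls__-l rm__*.bak" "rm__*.bak" && echo "found"
contains_literal "ls__-l rm__old.bak" "rm__*.bak" || echo "not found"
```

The second call reports "not found" because the asterisk in the needle
is compared as a plain character. Without the quotes it would act as a
wildcard and the test would succeed.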

>                then
>                    echo "nothing" >/dev/null
>                else
>                    History="$History $ThisCommand";
>            fi
>      done <<< "$BashHistory"
>      Command=`Xdialog --stdout --title "Ray's Bash Manager - Select 
> Command to Copy" --cancel-label "Exit" --combobox "Select a Command to 
> Copy to the Clipboard." 0 60 $History`
> echo "$Command"
> }
> 
> Here is the line that always gets repeated just once in the results -
> 
> SearchTerm="<div><a__href="index.html">Ray's__Links__Home</a>{A-Za-z0-9/<>]*</div>"
> 
> Is there something in that line that prevents it from matching itself 
> when tested?

I think the regex embedded in that line is being expanded and
interpreted as a glob pattern, and there may be issues with the embedded
quote characters as well.  I think that quoting the variable embedded in
the pattern will help.
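
Here is a small sketch of what I suspect is happening. It uses a
shortened, made-up stand-in for your line, and it assumes the character
class in the real line opens with '['. The variable names are
hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical, shortened stand-in for the problem line.
entry='</a>[A-Za-z0-9]*</div>'
seen="ls__-l $entry"

# Unquoted: '[A-Za-z0-9]' is a bracket expression that must match one
# letter or digit, but the text at that position is a literal '['.
if [[ "$seen" == *${entry}* ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted: every character of $entry is compared literally.
if [[ "$seen" == *"${entry}"* ]]; then quoted="match"; else quoted="no match"; fi

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

This prints "unquoted: no match" and "quoted: match", which would
explain why that one line is never recognized as already present.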

You may already peruse Usenet; I do not know.  Anyway, if you are
inclined, the group comp.unix.shell is a well of knowledge for shell
scripting.




More information about the ubuntu-users mailing list