Reading a variable line by line with while loop

James Michael Fultz croooow at gmail.com
Wed Dec 2 01:19:56 UTC 2009


* Ray Parrish <crp at cmc.net> [2009-12-01 16:45 -0800]:
> I have found an if statement that reduces the duplicates to only two 
> pairs for the entire file. That's pretty good considering the number of 
> actual duplicates there were in the file to begin with. Here is the code 
> for that one -
> 
> function CompressHistory {
>      BashHistory=`cat ~/.bash_history`
>      while read ThisCommand; do
>            ThisCommand=${ThisCommand// /__};
>            if  [[ "$History" =~ "$ThisCommand" ]]
>                then
>                    echo "nothing" >/dev/null
>                else
>                    History="$History $ThisCommand";
>            fi
>      done <<< "$BashHistory"
>      echo "$History"
> }
> 
> This next bit of code has me frazzled trying to figure out the proper 
> regular expression to say "if this string exists within the larger 
> string". Nothing I have tried so far has worked, so here is the code, 
> and maybe someone can correct my regular expression so it works to weed 
> out duplicates as well.
> 
>      BashHistory=`cat ~/.bash_history`
>      while read ThisCommand; do
>            ThisCommand=${ThisCommand// /__};
>            if  [[ "$History" == [.]*$ThisCommand[.]* ]]

I think you may want this.

if [[ "$History" =~ .*"$ThisCommand".* ]]

Also, placing '.' inside brackets makes it a literal character in a
regex: '[.]' matches only an actual dot, not any character.
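
A minimal sketch of both points, assuming bash 3.2 or later (where a
quoted right-hand side of =~ is matched literally). The helper name
contains_literal and the sample strings are made up for illustration;
they are not from your script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $2 occurs anywhere in $1 as a literal string.
contains_literal() {
    local haystack=$1 needle=$2
    # Quoting "$needle" makes bash match it literally, even if it contains
    # regex metacharacters. The match is unanchored, so it can succeed
    # anywhere inside the haystack.
    [[ $haystack =~ "$needle" ]]
}

# The command is present in the accumulated string.
contains_literal "ls__-l cd__/tmp" "cd__/tmp" && echo "found"

# The dot in the needle is literal here, so it does not stand in for "d".
contains_literal "ls__-l cd__/tmp" "c.__/tmp" || echo "not found"

# A bracketed dot in an unquoted regex matches only a real dot.
[[ "a.b" =~ a[.]b ]] && echo "a.b matches a[.]b"
[[ "axb" =~ a[.]b ]] || echo "axb does not match a[.]b"
```

Because =~ is unanchored, the leading and trailing .* in the line above
should not change the result; they only spell out the intent.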

>                then
>                    echo "nothing" >/dev/null
>                else
>                    History="$History $ThisCommand";
>            fi
>      done <<< "$BashHistory"
> 
> From what I have read the dot is supposed to match any character, and 
> then the * specifies that any number of characters can appear in that 
> position. I'm just realizing that would specify perhaps one character 
> only, repeated to the ends of the large string before it would match.

Your description of the dot and asterisk regex metacharacters is
correct, but I think you are confusing regular expressions with glob
patterns.

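Bash uses regular expressions on the right-hand side of a
[[ ... =~ ... ]] expression, whereas the right-hand side of a
[[ ... == ... ]] expression is matched as a glob pattern. In a glob, '*'
on its own matches any string and '.' has no special meaning.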
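
To make the difference concrete, here is a small sketch that runs the
same "does the big string contain this command" test three ways. The
sample values of History and ThisCommand are invented for illustration,
and the *_result variable names are only there to report the outcome.

```shell
#!/usr/bin/env bash
# Sample values, made up for illustration.
History="ls__-l cd__/tmp"
ThisCommand="cd__/tmp"

# Glob pattern: a bare * matches any run of characters (including none),
# and the quoted expansion is matched literally.
if [[ $History == *"$ThisCommand"* ]]; then
    glob_result="match"
else
    glob_result="no match"
fi

# Regex: the quoted expansion is literal and the match is unanchored.
if [[ $History =~ "$ThisCommand" ]]; then
    regex_result="match"
else
    regex_result="no match"
fi

# The pattern from the question, read as a glob: [.] is a literal dot and
# * is "anything", so it asks for dot, anything, command, dot, anything.
if [[ $History == [.]*$ThisCommand[.]* ]]; then
    question_result="match"
else
    question_result="no match"
fi

echo "glob: $glob_result, regex: $regex_result, question's pattern: $question_result"
```

Either of the first two forms should work for the duplicate check; the
glob form avoids regex quoting rules entirely.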

> I do not know how to specify that the whole front, and the whole back of 
> the variable should be the same with the matching string in the middle 
> of them.



