[CoLoCo] bash question

David Overcash funnylookinhat at gmail.com
Fri Aug 19 22:56:32 UTC 2011


Might I add that - while you all are Python fanatics - my PHP code probably
would have "just worked" by now... :)

On Fri, Aug 19, 2011 at 4:21 PM, Neal McBurnett <neal at bcn.boulder.co.us> wrote:

> This is actually a good example of why a modern language is important
> to have in your toolbelt.  It just isn't possible in general to decide
> what to do with a non-ASCII character unless you know which character
> encoding is being used.  Python gives you lots of tools to help with
> that, which bash and awk don't.
>
> I'm guessing that on the line that says
>    data = open('big2.log')
>
> you want to declare the encoding of the file.  What kind of file is
> it, from where?
>
> Here is one guess, if that was an "a-circumflex" character, using the
> most popular Western European encoding:
>
>  data = open('big2.log', encoding='latin-1')
>
> See the Unicode HOWTO in the Python documentation:
> http://docs.python.org/howto/unicode.html
>
> Neal McBurnett                 http://neal.mcburnett.org/
>
> On Fri, Aug 19, 2011 at 04:07:52PM -0600, Jim Hutchinson wrote:
> > Joey,
> >
> > Tried this in Python using files.txt in place of big2.log (shouldn't matter
> > what I call it, right?) and got this error
> >
> > Non-ASCII character '\xc2' in file test.sh on line 6, but no encoding declared;
> >
> > I copied and pasted your script as written and saved to test.sh and ran it. I
> > used full path in the files.txt file.
> >
> > Any ideas?
> >
> > Thanks,
> > Jim
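For what it's worth, that message is the one Python 2 prints when the script's
own source contains a non-ASCII byte and no coding line is declared (PEP 263),
so the stray byte is most likely in test.sh itself rather than in the data
file. A common cause after copy and paste is a non-breaking space, which is
the bytes 0xC2 0xA0 in UTF-8, though that is only a guess here; line 6 is the
first indented line of the script as posted, which would fit. A minimal sketch
for locating such bytes follows. The find_non_ascii helper and the two-line
sample script are invented for the example.

```python
import os
import tempfile

def find_non_ascii(path):
    """Return (line_number, column, byte_value) for each non-ASCII byte."""
    hits = []
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, 1):
            # bytearray yields integers on both Python 2 and Python 3.
            for column, byte in enumerate(bytearray(raw), 1):
                if byte > 127:
                    hits.append((line_number, column, byte))
    return hits

# Hypothetical script whose second line starts with a UTF-8 non-breaking space.
script_path = os.path.join(tempfile.mkdtemp(), "test.sh")
with open(script_path, "wb") as handle:
    handle.write(b"for line in data:\n\xc2\xa0  line = line.strip()\n")

for line_number, column, byte in find_non_ascii(script_path):
    print("line %d, column %d: byte 0x%02x" % (line_number, column, byte))
```

Retyping the indentation on any reported line with plain spaces should be
enough if the non-breaking-space guess is right.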
> >
> > On Fri, Aug 19, 2011 at 3:08 PM, Joey Stanford <joey at canonical.com> wrote:
> >
> >     I think the easier way is going to be with awk ... but here's a python
> >     program that's roughly equivalent...just not looking for the 3rd field
> >
> >     #! /usr/bin/env python
> >
> >     data = open('big2.log')
> >     totals = {}  # page id -> summed count
> >     for line in data:
> >         fields = line.split()
> >         if fields:  # skip blank lines
> >             pagecount = int(fields[0])
> >             pageid = fields[1]
> >             if pageid in totals:
> >                 totals[pageid] += pagecount
> >             else:
> >                 totals[pageid] = pagecount
> >     data.close()
> >
> >     for key in totals:
> >         # Formatted so it prints the same on Python 2 and 3.
> >         print("%s %s" % (totals[key], key))
> >
> >
> >     On Fri, Aug 19, 2011 at 14:45, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:
> >     > Wondering if any of you script gurus can help with a small problem. I have
> >     > several text files containing 3 columns. I want to count the number of
> >     > occurrences of the text in column 2 (or just count the lines) and sum
> >     > column 3, which is a number. I know how to do the latter with something like
> >     >
> >     > #!/bin/bash
> >     >
> >     > file="/home/test/file1.txt"
> >     > cat ${file} | \
> >     > while read name article count
> >     > do
> >     > sum=$(($sum + $count ))
> >     > echo "$sum"
> >     > done
> >     >
> >     > Although that prints each sum as it goes rather than just the final sum.
> >     > I'm not sure how to count text (basically counting the lines that contain
> >     > the numbers would work the same). Also, because each file has a header row
> >     > it's giving errors, so I need to tell it to skip row 1.
> >     > Finally, I want to automate the input of each file: read the list of text
> >     > files from somewhere, process each file, append the output to a new file,
> >     > and then repeat with the next one until all files are done.
> >     > Any ideas?
> >     > Thanks.
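One way to sketch the whole job in Python, under some assumptions not stated
above: columns are separated by whitespace, the third column is always an
integer, every file has exactly one header row, and the list of input files
is itself a text file with one path per line. The names summarize,
process_list, list.txt and totals.txt, and the sample data, are all made up
for the example.

```python
import os
import tempfile

def summarize(path):
    """Return (row_count, column3_total) for one file, skipping the header row."""
    rows = 0
    total = 0
    with open(path) as handle:
        next(handle, None)              # skip the header row
        for line in handle:
            fields = line.split()
            if len(fields) < 3:
                continue                # ignore blank or short lines
            rows += 1
            total += int(fields[2])
    return rows, total

def process_list(list_path, out_path):
    """Append one 'path rows total' line to out_path per file named in list_path."""
    with open(list_path) as listing, open(out_path, "a") as out:
        for name in listing:
            name = name.strip()
            if name:
                rows, total = summarize(name)
                out.write("%s %d %d\n" % (name, rows, total))

# Small self-contained demonstration with made-up data.
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "file1.txt")
with open(data_path, "w") as handle:
    handle.write("name article count\nen Main_Page 5\nen Ubuntu 7\n")
list_path = os.path.join(workdir, "list.txt")
with open(list_path, "w") as handle:
    handle.write(data_path + "\n")
out_path = os.path.join(workdir, "totals.txt")
process_list(list_path, out_path)
with open(out_path) as handle:
    print(handle.read())
```

Opening the output in append mode means rerunning the script adds new lines
rather than overwriting earlier results, which seems to match the request,
but delete the output file first if a fresh summary is wanted.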
> >     > --
> >     > Jim (Ubuntu geek extraordinaire)
> >     > ----
> >     > Please avoid sending me Word or PowerPoint attachments.
> >     > See http://www.gnu.org/philosophy/no-word-attachments.html
> >     >
> >     > --
> >     > Ubuntu-us-co mailing list
> >     > Ubuntu-us-co at lists.ubuntu.com
> >     > Modify settings or unsubscribe at:
> >     > https://lists.ubuntu.com/mailman/listinfo/ubuntu-us-co
> >     >
> >     >
> >
>
>
>