[CoLoCo] bash question

David Overcash funnylookinhat at gmail.com
Sat Aug 20 13:19:35 UTC 2011


Sounds like one of your files is corrupted...

Right after "total = 0" add this:
   print "%s" % filename

That should print every file that works and then the final one that doesn't
( if I'm counting lines correctly... it's still a bit early...  ;)  )
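
Either way, it may help to look for the offending bytes directly. Note that
the error message names averages.py itself, and if the script was saved
exactly as posted, line 14 is the first indented line ("total = 0"), so the
stray character may have come along when the code was copied out of the
email. Below is a minimal sketch that lists non-ASCII bytes in any file,
script or data. It assumes Python 3, and the name find_non_ascii is made up
for illustration:

```python
import sys


def find_non_ascii(path):
    """Return (line_number, column, byte_value) for each non-ASCII byte in path."""
    hits = []
    # Binary mode: nothing is decoded, so an odd encoding cannot raise here.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            # Iterating over a bytes object yields integers in Python 3.
            for column, byte in enumerate(raw, start=1):
                if byte > 0x7F:
                    hits.append((line_number, column, byte))
    return hits


if __name__ == "__main__":
    # Check every file named on the command line.
    for path in sys.argv[1:]:
        for line_number, column, byte in find_non_ascii(path):
            print("%s: line %d, column %d: byte 0x%02x"
                  % (path, line_number, column, byte))
```

Saved under a hypothetical name such as find_non_ascii.py, it could be run as
"python3 find_non_ascii.py averages.py *.txt". A 0xc2 byte followed by 0xa0 is
the UTF-8 encoding of a non-breaking space; if that is what shows up in the
indentation, retyping those lines with plain spaces would be the likely fix.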

On Fri, Aug 19, 2011 at 11:14 PM, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:

> Thanks Neal,
>
> Gave that a try and got
>
> "SyntaxError: Non-ASCII character '\xc2' in file averages.py on line 14,
> but no encoding declared;"
>
> I just copied your code to a file and called it "averages.py" and ran it
> like you said:
>
> python averages.py *.txt > averages.out
>
> Seemed like it was thinking then gave the error. It created the output file
> but it's empty.
>
> Thanks,
> Jim
>
>
> On Fri, Aug 19, 2011 at 10:20 PM, Neal McBurnett <neal at bcn.boulder.co.us> wrote:
>
>> The way you restated the problem is very different and much
>> easier.  Here is a program to do that.  Just put all 1000 files in one
>> directory, which I'll assume all end in ".txt", and say
>>
>>  python jim-averages.py *.txt > averages.out
>>
>> and it will put the results in "averages.out" (which does not end in
>> .txt....).
>> They are tab separated for easy loading into a spreadsheet.
>>
>> By the way, I showed how to run the one I sent before in that message.
>> I.e., just list all the files you want it to process on the command
>> line.
>>
>> >     Run:
>> >      python count_sum.py /tmp/qf /tmp/qy
>>
>> If you don't list any files, it doesn't have anything to do....
>>
>> Neal McBurnett                 http://neal.mcburnett.org/
>>
>> --- the program jim-averages.py ---
>> #! /usr/bin/env python
>> """
>> Print the total of column 3, and the average value of column 3.
>> Skip the first line (a header)
>> """
>>
>> import sys
>>
>> FILES = sys.argv[1:]
>>
>> print "file\ttotal\taverage"
>>
>> for filename in FILES:
>>    total = 0
>>
>>    for n, line in enumerate(open(filename)):
>>        if n == 0:
>>            continue
>>
>>        total += int(line.split()[2])
>>
>>    print "%s\t%d\t%f" % (filename, total, total * 1.0 / n)
>> ---
>>
>> On Fri, Aug 19, 2011 at 10:05:17PM -0600, Jim Hutchinson wrote:
>> > Neal,
>> >
>> > I tried the script you attached. I ran it from a terminal by typing
>> >
>> > python count_sum.py
>> >
>> > It ran but gave no output and if it created a file I can't find it.
>> > I suspect I have to have a file that it reads in first but not sure
>> > where to put that in the script, the path to use, the location of the
>> > .py file, etc.
>> >
>> > Any suggestions?
>> >
>> > Thanks,
>> > Jim
>> >
>> > On Fri, Aug 19, 2011 at 7:37 PM, Neal McBurnett <neal at bcn.boulder.co.us> wrote:
>> >
>> >     I've attached a program and two sample files that I think does the
>> >     rest of the stuff you asked for, and is a bit more idiomatic.
>> >
>> >     One of the test files has a unicode character in it, and the other
>> >     has a latin-1 character in it, but neither gives an error like what
>> >     you saw.  I'm wondering if your input file has an
>> >     internally-inconsistent encoding problem.
>> >
>> >     I actually included the test files (and another copy of the program)
>> >     in a zip file so the characters get thru with their varied encodings.
>> >
>> >     Run:
>> >      python count_sum.py /tmp/qf /tmp/qy
>> >
>> >     and it produces this:
>> >
>> >     Writing totals to /tmp/qf-out
>> >     Writing totals to /tmp/qy-out
>> >
>> >     and for example, /tmp/qf-out contains:
>> >
>> >     6 blue
>> >     5 red
>> >
>> >     If that's not what you wanted, say what you want.
>> >
>> >     Neal McBurnett                 http://neal.mcburnett.org/
>> >
>> >     On Fri, Aug 19, 2011 at 04:07:52PM -0600, Jim Hutchinson wrote:
>> >     > Joey,
>> >     >
>> >     > Tried this in Python using files.txt in place of big2.log
>> >     > (shouldn't matter what I call it, right?) and got this error
>> >     >
>> >     > Non-ASCII character '\xc2' in file test.sh on line 6, but no
>> >     > encoding declared;
>> >     >
>> >     > I copied and pasted your script as written and saved to test.sh
>> >     > and ran it. I used full path in the files.txt file.
>> >     >
>> >     > Any ideas?
>> >     >
>> >     > Thanks,
>> >     > Jim
>> >     >
>> >     > On Fri, Aug 19, 2011 at 3:08 PM, Joey Stanford <joey at canonical.com> wrote:
>> >     >
>> >     >     I think the easier way is going to be with awk ... but here's a
>> >     >     python program that's roughly equivalent...just not looking for
>> >     >     the 3rd field
>> >     >
>> >     >     #! /usr/bin/env python
>> >     >
>> >     >     data = open('big2.log')
>> >     >     totals = {}
>> >     >     for line in data:
>> >     >        line = line.strip()
>> >     >        if line:
>> >     >            pageid = line.split()[1]
>> >     >            pagecount = int(line.split()[0])
>> >     >            if pageid in totals:
>> >     >               totals[pageid] += pagecount
>> >     >            else:
>> >     >               totals[pageid] = pagecount
>> >     >
>> >     >     for key in totals:
>> >     >            print totals[key], key
>> >     >
>> >     >
>> >     >     On Fri, Aug 19, 2011 at 14:45, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:
>> >     >     > Wondering if any of you script gurus can help with a small
>> >     >     > problem. I have several text files containing 3 columns. I
>> >     >     > want to count the number of occurrences of the text in column
>> >     >     > 2 (or just count the lines) and sum column 3 which is a
>> >     >     > number. I know how to do the latter with something like
>> >     >     >
>> >     >     > #!/bin/bash
>> >     >     >
>> >     >     > file="/home/test/file1.txt"
>> >     >     > cat ${file} | \
>> >     >     > while read name article count
>> >     >     > do
>> >     >     > sum=$(($sum + $count ))
>> >     >     > echo "$sum"
>> >     >     > done
>> >     >     >
>> >     >     > Although that prints each sum as it goes rather than just the
>> >     >     > final sum. I'm not sure how to count text (basically counting
>> >     >     > the lines that contain the numbers would work the same). Also,
>> >     >     > because each file has a header row it's giving errors so I
>> >     >     > need to tell it to skip row 1.
>> >     >     > Finally, I want to automate the input of each file so having
>> >     >     > it read the list of text files from somewhere, process the
>> >     >     > file, output to a new file amending each time, and then repeat
>> >     >     > with the next one until all files are done.
>> >     >     > Any ideas?
>> >     >     > Thanks.
>> >
>> >     --
>> >     Ubuntu-us-co mailing list
>> >     Ubuntu-us-co at lists.ubuntu.com
>> >     Modify settings or unsubscribe at:
>> >     https://lists.ubuntu.com/mailman/listinfo/ubuntu-us-co
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Jim (Ubuntu geek extraordinaire)
>> > ----
>> > Please avoid sending me Word or PowerPoint attachments.
>> > See http://www.gnu.org/philosophy/no-word-attachments.html