[CoLoCo] bash question

David Overcash funnylookinhat at gmail.com
Sat Aug 20 14:08:25 UTC 2011


The error is coming from the part of the script where it reads a file, and
because it runs for a while, my guess is that it loops through a few files
and then hits the error on a specific one.  It definitely should be reading
only .txt files with those params.
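
To narrow down which input trips things up, something along these lines
might help. It is only a sketch, assuming Python 3 and the same layout as
the script quoted further down (header line first, number in column 3).
The name average_column3 is made up for illustration.

```python
import sys


def average_column3(path):
    """Return (total, average) of column 3, skipping the header line.

    Hypothetical helper: assumes whitespace-separated columns.
    """
    total = 0
    count = 0
    with open(path) as handle:
        for n, line in enumerate(handle):
            if n == 0 or not line.strip():
                continue  # skip the header and any blank lines
            total += int(line.split()[2])
            count += 1
    if count == 0:
        raise ValueError("no data rows")
    return total, total / count


def main(paths):
    print("file\ttotal\taverage")
    for path in paths:
        try:
            total, average = average_column3(path)
        except (OSError, ValueError, IndexError) as err:
            # Name the offending file on stderr and carry on with the rest.
            print("problem in %s: %s" % (path, err), file=sys.stderr)
            continue
        print("%s\t%d\t%f" % (path, total, average))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because a bad file is reported on stderr rather than stopping the run,
redirecting stdout to averages.out should still leave the results for the
good files in place.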

Go ahead and try running the program with just one text file to verify it
works correctly.

i.e.
python averages.py AED.txt
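
Since the message quoted below names averages.py itself ("Non-ASCII
character '\xc2' in file averages.py on line 14"), it may also be worth
checking the script for stray bytes picked up when pasting from mail, such
as a non-breaking space. A rough sketch, assuming Python 3; find_non_ascii
is just an illustrative name:

```python
import sys


def find_non_ascii(path):
    """Return a list of (line_number, column, byte_value) for bytes >= 0x80."""
    hits = []
    with open(path, "rb") as handle:
        # Reading in binary mode avoids any decoding errors while scanning.
        for line_number, raw in enumerate(handle, start=1):
            for column, byte in enumerate(raw, start=1):
                if byte >= 0x80:
                    hits.append((line_number, column, byte))
    return hits


if __name__ == "__main__":
    for path in sys.argv[1:]:
        for line_number, column, byte in find_non_ascii(path):
            print("%s:%d:%d: byte 0x%02x" % (path, line_number, column, byte))
```

If this reports anything for the script, retyping the indentation on those
lines with plain spaces would be the first thing to try.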

On Sat, Aug 20, 2011 at 8:02 AM, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:

> David,
>
> It errors out on the .py script itself. Shouldn't the *.txt tell it to skip
> any non .txt file? Guess I need to point it to a dir with just the files.
>
> Thanks.
>
> Kevin,
>
> As soon as I figure out if I have ruby on my laptop I'll give that a try.
> Thanks for the help and lesson. Much easier to understand what it's doing
> that way.
>
> Jim
>
>
> On Sat, Aug 20, 2011 at 7:19 AM, David Overcash <funnylookinhat at gmail.com> wrote:
>
>> Sounds like one of your files is corrupted...
>>
>> Right after "total=0" add this:
>>    print "%s" % filename
>>
>> That should print every file that works and then the final one that
>> doesn't ( if I'm counting lines correctly... it's still a bit early...  ;)
>>  )
>>
>> On Fri, Aug 19, 2011 at 11:14 PM, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:
>>
>>> Thanks Neal,
>>>
>>> Gave that a try and got
>>>
>>> "SyntaxError: Non-ASCII character '\xc2' in file averages.py on line 14,
>>> but no encoding declared;"
>>>
>>> I just copied your code to a file and called it "averages.py" and ran it
>>> like you said:
>>>
>>> python averages.py *.txt > averages.out
>>>
>>> Seemed like it was thinking then gave the error. It created the output
>>> file but it's empty.
>>>
>>> Thanks,
>>> Jim
>>>
>>>
>>> On Fri, Aug 19, 2011 at 10:20 PM, Neal McBurnett <neal at bcn.boulder.co.us> wrote:
>>>
>>>> The way you restated the problem is very different and much
>>>> easier.  Here is a program to do that.  Just put all 1000 files in one
>>>> directory, which I'll assume all end in ".txt", and say
>>>>
>>>>  python jim-averages.py *.txt > averages.out
>>>>
>>>> and it will put the results in "averages.out" (which does not end in
>>>> .txt....).
>>>> They are tab separated for easy loading into a spreadsheet.
>>>>
>>>> By the way, I showed how to run the one I sent before in that message.
>>>> I.e., just list all the files you want it to process on the command
>>>> line.
>>>>
>>>> >     Run:
>>>> >      python count_sum.py /tmp/qf /tmp/qy
>>>>
>>>> If you don't list any files, it doesn't have anything to do....
>>>>
>>>> Neal McBurnett                 http://neal.mcburnett.org/
>>>>
>>>> --- the program jim-averages.py ---
>>>> #! /usr/bin/env python
>>>> """
>>>> Print the total of column 3, and the average value of column 3.
>>>> Skip the first line (a header)
>>>> """
>>>>
>>>> import sys
>>>>
>>>> FILES = sys.argv[1:]
>>>>
>>>> print "file\ttotal\taverage"
>>>>
>>>> for filename in FILES:
>>>>    total = 0
>>>>
>>>>    for n, line in enumerate(open(filename)):
>>>>        if n == 0:
>>>>            continue
>>>>
>>>>        total += int(line.split()[2])
>>>>
>>>>    print "%s\t%d\t%f" % (filename, total, total * 1.0 / n)
>>>> ---
>>>>
>>>> On Fri, Aug 19, 2011 at 10:05:17PM -0600, Jim Hutchinson wrote:
>>>> > Neal,
>>>> >
>>>> > I tried the script you attached. I ran it from a terminal by typing
>>>> >
>>>> > python count_sum.py
>>>> >
>>>> > It ran but gave no output and if it created a file I can't find it.
>>>> > I suspect I have to have a file that it reads in first but not sure
>>>> > where to put that in the script, the path to use, the location of
>>>> > the .py file, etc.
>>>> >
>>>> > Any suggestions?
>>>> >
>>>> > Thanks,
>>>> > Jim
>>>> >
>>>> > On Fri, Aug 19, 2011 at 7:37 PM, Neal McBurnett <neal at bcn.boulder.co.us> wrote:
>>>> >
>>>> >     I've attached a program and two sample files that I think does the
>>>> >     rest of the stuff you asked for, and is a bit more idiomatic.
>>>> >
>>>> >     One of the test files has a unicode character in it, and the other
>>>> >     has a latin-1 character in it, but neither gives an error like what
>>>> >     you saw.  I'm wondering if your input file has an
>>>> >     internally-inconsistent encoding problem.
>>>> >
>>>> >     I actually included the test files (and another copy of the
>>>> >     program) in a zip file so the characters get thru with their varied
>>>> >     encodings.
>>>> >
>>>> >     Run:
>>>> >      python count_sum.py /tmp/qf /tmp/qy
>>>> >
>>>> >     and it produces this:
>>>> >
>>>> >     Writing totals to /tmp/qf-out
>>>> >     Writing totals to /tmp/qy-out
>>>> >
>>>> >     and for example, /tmp/qf-out contains:
>>>> >
>>>> >     6 blue
>>>> >     5 red
>>>> >
>>>> >     If that's not what you wanted, say what you want.
>>>> >
>>>> >     Neal McBurnett                 http://neal.mcburnett.org/
>>>> >
>>>> >     On Fri, Aug 19, 2011 at 04:07:52PM -0600, Jim Hutchinson wrote:
>>>> >     > Joey,
>>>> >     >
>>>> >     > Tried this in Python using files.txt in place of big2.log
>>>> >     > (shouldn't matter what I call it, right?) and got this error
>>>> >     >
>>>> >     > Non-ASCII character '\xc2' in file test.sh on line 6, but no
>>>> >     > encoding declared;
>>>> >     >
>>>> >     > I copied and pasted your script as written and saved to test.sh
>>>> >     > and ran it. I used full path in the files.txt file.
>>>> >     >
>>>> >     > Any ideas?
>>>> >     >
>>>> >     > Thanks,
>>>> >     > Jim
>>>> >     >
>>>> >     > On Fri, Aug 19, 2011 at 3:08 PM, Joey Stanford <joey at canonical.com> wrote:
>>>> >     >
>>>> >     >     I think the easier way is going to be with awk ... but here's a
>>>> >     >     python program that's roughly equivalent...just not looking for
>>>> >     >     the 3rd field
>>>> >     >
>>>> >     >     #! /usr/bin/env python
>>>> >     >
>>>> >     >     data = open('big2.log')
>>>> >     >     totals = {}
>>>> >     >     for line in data:
>>>> >     >        line = line.strip()
>>>> >     >        if line:
>>>> >     >            pageid = line.split()[1]
>>>> >     >            pagecount = int(line.split()[0])
>>>> >     >            if pageid in totals:
>>>> >     >               totals[pageid] += pagecount
>>>> >     >            else:
>>>> >     >               totals[pageid] = pagecount
>>>> >     >
>>>> >     >     for key in totals:
>>>> >     >            print totals[key], key
>>>> >     >
>>>> >     >
>>>> >     >     On Fri, Aug 19, 2011 at 14:45, Jim Hutchinson <jim at ubuntu-rocks.org> wrote:
>>>> >     >     > Wondering if any of you script gurus can help with a small
>>>> >     >     > problem. I have several text files containing 3 columns. I
>>>> >     >     > want to count the number of occurrences of the text in column
>>>> >     >     > 2 (or just count the lines) and sum column 3, which is a
>>>> >     >     > number. I know how to do the latter with something like
>>>> >     >     >
>>>> >     >     > #!/bin/bash
>>>> >     >     >
>>>> >     >     > file="/home/test/file1.txt"
>>>> >     >     > cat ${file} | \
>>>> >     >     > while read name article count
>>>> >     >     > do
>>>> >     >     > sum=$(($sum + $count ))
>>>> >     >     > echo "$sum"
>>>> >     >     > done
>>>> >     >     >
>>>> >     >     > Although that prints each sum as it goes rather than just the
>>>> >     >     > final sum. I'm not sure how to count text (basically counting
>>>> >     >     > the lines that contain the numbers would work the same). Also,
>>>> >     >     > because each file has a header row it's giving errors so I
>>>> >     >     > need to tell it to skip row 1.
>>>> >     >     > Finally, I want to automate the input of each file so having
>>>> >     >     > it read the list of text files from somewhere, process the
>>>> >     >     > file, output to a new file amending each time, and then
>>>> >     >     > repeat with the next one until all files are done.
>>>> >     >     > Any ideas?
>>>> >     >     > Thanks.
>>>> >
>>>> >     --
>>>> >     Ubuntu-us-co mailing list
>>>> >     Ubuntu-us-co at lists.ubuntu.com
>>>> >     Modify settings or unsubscribe at:
>>>> >     https://lists.ubuntu.com/mailman/listinfo/ubuntu-us-co
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Jim (Ubuntu geek extraordinaire)
>>>> > ----
>>>> > Please avoid sending me Word or PowerPoint attachments.
>>>> > See http://www.gnu.org/philosophy/no-word-attachments.html