I did not intend to start a discussion on the merits of various languages, but there it is. I'm not a programmer. I just know how to use a bit of bash here and there and need to solve a problem. With a bit of guidance I can do it in bash. With a lot of guidance I could do it in python, ruby, C, PHP, or whatever.<div> <br></div><div>Attached is a sample file. I have 1000 of them. I need to sum the counts in column 3 and average them. I'd like to output all the sums and averages to one file to make my life simple, but not being much of a programmer I'll settle for screen output and paste it into a spreadsheet from there.</div> <div><br></div><div>Running awk one command at a time, I can get the results I need:</div><div><br></div><div>awk '{sum=sum+$3} END {print sum}' file.txt</div><div><br></div><div>and </div><div><br></div><div>awk '{sum=sum+$3} END {print sum/NR}' file.txt</div> <div><br></div><div>Both seem to work, but doing this 1000 times is not something I'm looking forward to. Reading in each file automatically and appending the results to an output file would be preferable.</div><div><br></div> <div>If anyone can point me in the right direction (in any language) I'd appreciate it. I've been googling and got this far, but the scripts I'm finding are not easy to decipher and my attempts to modify them have been unsuccessful.</div> <div><br></div><div>Thanks,</div><div>Jim<br><br><div class="gmail_quote">On Fri, Aug 19, 2011 at 7:37 PM, Neal McBurnett <span dir="ltr"><<a href="mailto:neal@bcn.boulder.co.us">neal@bcn.boulder.co.us</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">I've attached a program and two sample files that I think does the<br> rest of the stuff you asked for, and is a bit more idiomatic.<br> <br> One of the test files has a unicode character in it, and the other has<br> a latin-1 character in it, but neither gives an error like what you<br> saw. 
I'm wondering if your input file has an internally-inconsistent<br> encoding problem.<br> <br> I actually included the test files (and another copy of the program)<br> in a zip file so the characters get thru with their varied encodings.<br> <br> Run:<br> python count_sum.py /tmp/qf /tmp/qy<br> <br> and it produces this:<br> <br> Writing totals to /tmp/qf-out<br> Writing totals to /tmp/qy-out<br> <br> and for example, /tmp/qf-out contains:<br> <br> 6 blue<br> 5 red<br> <br> If that's not what you wanted, say what you want.<br> <div class="im"><br> Neal McBurnett <a href="http://neal.mcburnett.org/" target="_blank">http://neal.mcburnett.org/</a><br> <br> On Fri, Aug 19, 2011 at 04:07:52PM -0600, Jim Hutchinson wrote:<br> </div><div><div></div><div class="h5">> Joey,<br> ><br> > Tried this in Python using files.txt in place of big2.log (shouldn't matter<br> > what I call it, right?) and got this error<br> ><br> > Non-ASCII character '\xc2' in file test.sh on line 6, but no encoding declared;<br> ><br> > I copied and pasted your script as written and saved to test.sh and ran it. I<br> > used full path in the files.txt file.<br> ><br> > Any ideas?<br> ><br> > Thanks,<br> > Jim<br> ><br> > On Fri, Aug 19, 2011 at 3:08 PM, Joey Stanford <<a href="mailto:joey@canonical.com">joey@canonical.com</a>> wrote:<br> ><br> > I think the easier way is going to be with awk ... but here's a python<br> > program that's roughly equivalent...just not looking for the 3rd field<br> ><br> > #! 
/usr/bin/env python<br> ><br> > data = open('big2.log')<br> > totals = {}<br> > for line in data:<br> >     line = line.strip()<br> >     if line:<br> >         pageid = line.split()[1]<br> >         pagecount = int(line.split()[0])<br> >         if pageid in totals:<br> >             totals[pageid] += pagecount<br> >         else:<br> >             totals[pageid] = pagecount<br> ><br> > for key in totals:<br> >     print totals[key], key<br> ><br> ><br> > On Fri, Aug 19, 2011 at 14:45, Jim Hutchinson <<a href="mailto:jim@ubuntu-rocks.org">jim@ubuntu-rocks.org</a>> wrote:<br> > > Wondering if any of you script gurus can help with a small problem. I have<br> > > several text files containing 3 columns. I want to count the number of<br> > > occurrences of the text in column 2 (or just count the lines) and sum column<br> > > 3, which is a number. I know how to do the latter with something like<br> > ><br> > > #!/bin/bash<br> > ><br> > > file="/home/test/file1.txt"<br> > > cat ${file} | \<br> > > while read name article count<br> > > do<br> > > sum=$(($sum + $count ))<br> > > echo "$sum"<br> > > done<br> > ><br> > > Although that prints each sum as it goes rather than just the final sum.<br> > > I'm not sure how to count text (basically counting the lines that contain<br> > > the numbers would work the same). 
Also, because each file has a header row<br> > > it's giving errors so I need to tell it to skip row 1.<br> > > Finally, I want to automate the input of each file so having it read the<br> > > list of text files from somewhere, process the file, output to a new file<br> > > amending each time, and then repeat with the next one until all files are<br> > > done.<br> > > Any ideas?<br> > > Thanks.<br> </div></div><br>--<br> Ubuntu-us-co mailing list<br> <a href="mailto:Ubuntu-us-co@lists.ubuntu.com">Ubuntu-us-co@lists.ubuntu.com</a><br> Modify settings or unsubscribe at: <a href="https://lists.ubuntu.com/mailman/listinfo/ubuntu-us-co" target="_blank">https://lists.ubuntu.com/mailman/listinfo/ubuntu-us-co</a><br> <br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Jim (Ubuntu geek extraordinaire)<br>----<br>Please avoid sending me Word or PowerPoint attachments.<br>See <a href="http://www.gnu.org/philosophy/no-word-attachments.html">http://www.gnu.org/philosophy/no-word-attachments.html</a><br> </div>
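<div>For the batch task described at the top of this message (sum and average column 3 for many files, all results in one output file), one possible approach is sketched below in Python 3. It is a minimal sketch, not something run against the real files. It assumes the columns are separated by whitespace and that the first row of each file is a header, as mentioned in the original question. The names column_stats, summarize, sum_columns.py and summary.tsv are made up for illustration.</div>

```python
#!/usr/bin/env python3
"""Sum and average one column of many text files, one output row per file."""
import sys


def column_stats(path, column=3, skip_header=True):
    """Return (total, mean, row_count) for one whitespace-separated file.

    Assumption: the first line is a header unless skip_header is False.
    """
    total = 0.0
    rows = 0
    # errors="replace" keeps a stray non-UTF-8 byte from aborting the run.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if skip_header and line_number == 1:
                continue
            fields = line.split()
            if len(fields) < column:
                continue  # blank or short line: nothing to add
            total += float(fields[column - 1])
            rows += 1
    mean = total / rows if rows else 0.0
    return total, mean, rows


def summarize(paths, out_path):
    """Write one tab-separated line (file, sum, average) per input file."""
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("file\tsum\taverage\n")
        for path in sorted(paths):
            total, mean, _rows = column_stats(path)
            out.write(f"{path}\t{total}\t{mean}\n")


if __name__ == "__main__" and len(sys.argv) > 2:
    # First argument: output file. Remaining arguments: input files.
    summarize(sys.argv[2:], sys.argv[1])
```

<div>It could be run as something like "python3 sum_columns.py summary.tsv /path/to/files/*.txt", and the tab-separated output should open directly in a spreadsheet. Note that the average here divides by the number of data rows only, whereas sum/NR in the awk one-liner divides by every line in the file, including any header row.</div>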