Scripting Question

Chris Mohler cr33dog at gmail.com
Sat Feb 14 08:46:04 UTC 2009


On Sat, Feb 14, 2009 at 9:52 PM, Patton Echols <p.echols at comcast.net> wrote:
> On 02/13/2009 06:41 PM, Chris Mohler wrote:
>> On Sat, Feb 14, 2009 at 8:18 PM, Patton Echols <p.echols at comcast.net> wrote:
>>
>>> I have a fairly massive flat file, comma delimited, that I want to
>>> extract info from.  Specifically, I want to extract the first and last
>>> name and email addresses for those who have them to a new file with just
>>> that info. (The windows database program that this comes from simply
>>> will not do it)  I can grep the file for the @ symbol to at least
>>> exclude the lines without an email address (or the @ symbol in the notes
>>> field)  But if I can figure this out, I can also adapt what I learn for
>>> the next time.  Can anyone point me in the right direction for my "light
>>> reading?"
>>>
>>
>> Maybe this will help (a good start anyway):
>> #===========================
>> #!/usr/bin/env python
>>
>> import csv
>>
>> # Open the comma-delimited file
>> csv_file = open("your filename here", 'r')
>> reader = csv.reader(csv_file)
>>
>> for row in reader:
>>     do something....
>> #=======================
>>
>> if you replace "do something" with "print row[0]", it will print the
>> first column, "print row[1]" the second column - you get the idea ;)
>>
>> If you get an error about csv - the csv module is part of Python's
>> standard library, so check that Python itself is installed properly...
>>
>> Chris
>>
>>
>  Is there a place where I can find the syntax for such a thing?
>
> I like the idea of having "do something"  be:  pass one, print column 3,
> column 4, column 12, column 13 and then on pass two, print the rows
> where col 3 and 4 of the result have email addresses.

OK something like:
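#===========================
#!/usr/bin/env python

import csv

# Open the comma-delimited export (replace with the real filename)
csv_file = open("your filename here", 'r')
# Named 'reader' so the csv module itself is not shadowed
reader = csv.reader(csv_file)

for i, row in enumerate(reader):
    if i == 0:
        # First row: columns 3, 4, 12 and 13 (indexes start at 0)
        print(" ".join([row[2], row[3], row[11], row[12]]))
    else:
        # Remaining rows: columns 3 and 4 only
        print(" ".join([row[2], row[3]]))

csv_file.close()
#=======================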
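
Since the goal was a new file rather than screen output, here is a rough sketch of the same idea using csv.writer. The column positions (3, 4, 12, 13) come from the question above; the function name extract_columns and the choice to skip short rows are my own assumptions, and this version is written for Python 3:

```python
import csv

def extract_columns(in_path, out_path, columns=(2, 3, 11, 12)):
    """Copy only the given zero-based columns of each row to a new CSV file.

    Returns the number of rows written. (Hypothetical helper, for illustration.)
    """
    written = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Assumption: rows too short to hold every column are skipped
            # instead of raising IndexError
            if len(row) <= max(columns):
                continue
            writer.writerow([row[c] for c in columns])
            written += 1
    return written
```

Calling extract_columns("your filename here", "extracted.csv") would leave a four-column file that a second pass can then filter.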

If you want to match email addresses only, 'import re' and then use a
regex (eg: "if re.match") on the column(s).  Python is pretty
user-friendly - and there are a lot of tutorials out there...
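
For the email check, something along these lines might do. The pattern is a deliberately loose assumption (some text, an @, then a domain with a dot in it), not a real address validator, and the helper names are made up. It uses re.search so the address can sit anywhere in the field, and it checks columns 3 and 4 of the already-extracted rows, as described in the question:

```python
import re

# Loose, assumed pattern: text without spaces or @, an @, a domain with a dot
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def has_email(field):
    """Return True if the field appears to contain an email address."""
    return EMAIL_RE.search(field) is not None

def rows_with_email(rows, email_columns=(2, 3)):
    """Yield rows where any of the given zero-based columns holds an email."""
    for row in rows:
        # Guard against short rows before indexing into them
        if any(c < len(row) and has_email(row[c]) for c in email_columns):
            yield row
```

A stray @ in a notes field (like "@ the office") does not match, because the pattern wants text on both sides of the @.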

Of course, I'm biased ;)
http://xkcd.com/353/

Chris



