scripting fun

Steve Lamb grey at dmiyu.org
Fri Jun 6 01:06:48 UTC 2008


Bart Silverstrim wrote:
> Okay...closer...combining what others have given (thank you everyone!), 

    Not wanting to start a holy war but shell scripting and "fun" aren't
normally words I join without some form of negative modifier in there.  As
you've run across, issues with parameter passing and globbing just make the
entire exercise a thousand times harder than it needs to be.

    Python, Ruby and Perl are all better suited to this task simply because
you don't have to worry about the shell doing something wicked with your data
as it streams hither and yon.
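
    As a small illustration of that point, here is a minimal sketch that
hands odd-looking arguments to a child process without any shell in between.
It assumes the standard subprocess module; the helper name echo_args and the
sample arguments are made up for the demonstration.

```python
import subprocess
import sys


def echo_args(args):
    """Run a child Python that prints its arguments, one per line."""
    # A list of arguments is handed to the child as-is: no shell is
    # involved, so nothing gets globbed, split on spaces, or expanded.
    cmd = [sys.executable, '-c',
           'import sys; print("\\n".join(sys.argv[1:]))'] + list(args)
    return subprocess.check_output(cmd).decode().splitlines()


# Hypothetical arguments a shell would normally mangle.
received = echo_args(['*.txt', 'two words', '$HOME'])
print(received)
```

    The child sees exactly the three strings it was given, which is the
property that makes this kind of script less fragile than a shell pipeline.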

    Just as an example, here's a sloppy, off-the-cuff version of your script
in Python:

import os

log = '/var/log/apache-perl/access.log'
ipbin = '/sbin/iptables'
iptables = '/tmp/iptables'
newips = {}
oldips = {}

# Collect the client IP (first field) of every log line mentioning slurp.
lines = open(log, 'r')
for line in lines:
    if line.count('slurp') > 0:
        newips[line.split()[0]] = True
lines.close()

# Dump the current INPUT rules to a temp file and read them back.
os.system('%s -L INPUT -v -n > %s' % (ipbin, iptables))
lines = open(iptables)
os.unlink(iptables)
for line in lines:
    if line.count('source') > 0:
        continue    # skip the column header line
    try:
        oldips[line.split()[7]] = True    # 8th column is the source address
    except IndexError:
        pass
lines.close()

# Add a DROP rule for every new IP that is not already listed.
for ip in newips:
    if ip not in oldips:
        os.system('%s -A INPUT -s %s -j DROP' % (ipbin, ip))


    Yes, I admit it is a tad longer (and probably does have a lurking bug or
two since I did not run it) but what prompted me to write this message is
threefold:

1:
for line in lines:
    if line.count('slurp') > 0:
        newips[line.split()[0]] = True

vs.

grep -i slurp /var/log/apache-perl/access.log |awk '{print$1}' > ~/temp/tmp.txt
sort ~/temp/tmp.txt > ~/temp/tmp2.txt
uniq ~/temp/tmp2.txt > ~/temp/slurps.txt
-----
    This uses a dict as a uniq filter.  No need to worry about uniq vs sort -u
vs sort -un.  The key is the IP, and setting the same IP to True 80 times
still results in exactly 1 entry.
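
    To make the dict-as-uniq trick concrete, here is a minimal sketch.  The
function name unique_ips and the sample log lines are made up for
illustration; only the technique comes from the script above.

```python
def unique_ips(lines, needle='slurp'):
    """Return a dict keyed by the first field of each line containing needle."""
    seen = {}
    for line in lines:
        if needle in line:
            # Assigning the same key again just overwrites it, so
            # duplicates collapse without any sort or uniq step.
            seen[line.split()[0]] = True
    return seen


# Hypothetical log lines, not taken from a real access.log.
sample = [
    '10.0.0.1 - - "GET / HTTP/1.0" 200 slurp',
    '10.0.0.2 - - "GET / HTTP/1.0" 200 firefox',
    '10.0.0.1 - - "GET /a HTTP/1.0" 200 slurp',
    '10.0.0.3 - - "GET /b HTTP/1.0" 200 slurp',
]
ips = unique_ips(sample)
print(sorted(ips))
```

    The first address shows up twice in the sample but only once in the
result, which is all the three-step sort and uniq dance was doing.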


2:
for ip in newips:
    if ip not in oldips:

vs.

join -v2 <(sort <~/temp/blocked.txt | uniq ) <(sort <~/temp/slurps.txt | uniq ) > ~/temp/newaddresses.txt
-----

    Look, I've been poking around at Linux and other Unix(-like) systems for
almost 2 decades now.  Even so, I look at that shell line and want to just
bang my head against the desk.  As you found out, join seems to have some
issues with sort order (it expects both inputs to be sorted on the join
field).  Besides, shouldn't they already be sorted and uniqed from previous
runs?

    Regardless, look at the Python lines.  Dicts are hashes.  That means they
are inherently unsorted.  It's obvious I don't care about sort order.  Also,
since I used the same trick of generating each key once, I just iterate over
the newips (from the log file) and check to see if they are in the oldips
(from iptables).  Probably not the fastest way to do it in Python but, hey,
it's rare I need to code for speed on utility scripts, whereas readability
and quick first-pass runs count for a lot.  ;)
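
    For what it's worth, the same new-versus-old check can also be written as
a set difference.  This is a sketch of an alternative, not what the script
above does; the function name new_addresses and the addresses are
placeholders.

```python
def new_addresses(newips, oldips):
    """Return the IPs present in newips but missing from oldips."""
    # Iterating a dict yields its keys, so dicts and sets both work here.
    return set(newips) - set(oldips)


# Placeholder data standing in for the log file and iptables results.
newips = {'10.0.0.1': True, '10.0.0.3': True}
oldips = {'10.0.0.3': True, '192.168.0.9': True}
to_block = new_addresses(newips, oldips)
print(sorted(to_block))
```

    Neither input needs to be sorted first, which is exactly the requirement
that made the join version painful.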

    The third reason is simple.  I saw your message and thought, "Wow, neat,
that'd be fun to write in Python."  Seriously.  I enjoy the mental exercise of
doodling with little tools like this in Python.  It's fun scripting versus, as
I read it, the ironic scripting 'fun'.  ;)

-- 
         Steve C. Lamb         | But who decides what they dream?
       PGP Key: 1FC01004       |   And dream I do...
-------------------------------+---------------------------------------------



More information about the ubuntu-users mailing list