AWK experts - how would I code around this in awk...
Chan Chung Hang Christopher
christopher.chan at bradbury.edu.hk
Fri Feb 19 13:13:05 UTC 2010
Steve Flynn wrote:
> On Fri, Feb 19, 2010 at 12:52 PM, Chan Chung Hang Christopher
> <christopher.chan at bradbury.edu.hk> wrote:
>> For performance, it is C, awk and then perl/python...hmm...not sure
>> where php sits. At least, that was how it was back in 2002-2006 when I
>> had to parse a few dozen hourly mail log files that were each over 1GB
>> in size.
> Ran the perl version (Thanks Karl!) through some test data this morning.
> Whilst the data itself is no good for this test, I wanted to see the
> effects of various values of n (the required record length).
> Results are as follows:
> File with 1,375,031 lines in it.
> N = 10:    1 minute 18 seconds
> N = 100:   12 seconds
> N = 1000:  6 seconds
> N = 32768: 5 seconds
> 32768 will be the limit, as this is a system limit on the MVS
> mainframe creating the file.
> I'm still waiting to get hold of a suitably large file to give it a
> proper workout but so far, it seems promising.
I did say 'parse', not just strip newlines and carriage returns and
watch a counter. :-D
Which is why it is good to prototype the algorithm in perl/python and,
where needed, port it to awk or C.
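For what it's worth, a record-length test like the one being timed above is
a one-liner in awk. This is only a sketch of one plausible reading of the
task (Karl's perl script is not shown in the thread): strip a trailing
carriage return, then count lines of at least n characters. The variable
name n, the >= rule, and the sample file are all assumptions.

```shell
# Hypothetical sketch, not Karl's script: count lines of at least n
# characters after stripping a trailing carriage return.
printf 'short\r\nthis line is long enough\r\n' > /tmp/sample.txt
awk -v n=10 '{ sub(/\r$/, ""); if (length($0) >= n) count++ }
             END { print count+0 }' /tmp/sample.txt
```

Run against a real mainframe extract, n would be set to the required
record length (up to the 32768 MVS limit mentioned above).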