AWK experts - how would I code around this in awk...

Karl Auer kauer at biplane.com.au
Mon Feb 22 22:10:21 UTC 2010


On Mon, 2010-02-22 at 15:14 +0000, Dave Howorth wrote:
> >>  - it reads the entire contents of stdin into memory
> > 
> > This might be an issue. This AIX box does have 128 gig of RAM but
> 
> You need to qualify the type of memory. It reads stdin into *virtual*
> memory. It's the amount of disk you have that limits the size, not the
> amount of RAM. Just add swap space. You may prefer to use a 64 bit machine.

We have departed from the direct needs of the OP now, but in general a
single read plus a single write will have better performance than
swapping. That is, sequentially read the data you want off disk, process
it and write it out. Swapping tends to mean reading and writing the same
data several times.
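
To make the "single read plus single write" idea concrete, here is a
minimal sketch in Python (my choice of language for illustration, not
anything from the thread). The function name stream_records and the
upper-casing transform are hypothetical stand-ins for whatever the real
per-record processing is. Only one line is held in memory at a time, so
the size of the input should not matter.

```python
import io

def stream_records(src, dst, transform):
    """Read src one line at a time, transform each line, write it to dst.

    Returns the number of lines processed. Only the current line is
    held in memory, regardless of how large the input is.
    """
    count = 0
    for line in src:  # text file objects yield one line per iteration
        dst.write(transform(line))
        count += 1
    return count

# Small in-memory demonstration; real use would pass sys.stdin and
# sys.stdout (or open files) instead of StringIO objects.
demo_in = io.StringIO("alpha\nbeta\ngamma\n")
demo_out = io.StringIO()
n = stream_records(demo_in, demo_out, str.upper)
print(n, repr(demo_out.getvalue()))
```

This only works when each record can be processed on its own; if the
processing needs to see the whole input at once, the trade-off
discussed above still applies.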

> >>  - there may be problems with arbitrarily large variables
> 
> Are you aware of any limits?

No - that's why I said "may be". I don't know how big a Perl variable
can be, but the OP mentioned multi-gigabyte input files. Conventional
wisdom says "no limits" - my experience is that everything has limits.

I just wrote a little test program that started swapping at 1 gigabyte
(64-bit laptop with 4 gig, but only about 1 gig available at the time).

A real test would be to see how big Steve Flynn's variables can get :-)
Also, the swapping that happened was WAY slower than just writing a
gigabyte would have been - probably because allocating "RAM" is not as
straightforward as writing a particular gig to disk.
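
For anyone who wants to try something similar, a test along these lines
could be sketched as below. This is not the program I ran, just a
hedged illustration in Python; grow_buffer and its parameters are
hypothetical names. It grows one variable in fixed-size steps and
records how long each step takes, so a sudden jump in step time would
suggest that swapping has started. The demonstration limit is kept
deliberately small.

```python
import time

def grow_buffer(chunk_bytes, limit_bytes):
    """Grow a single bytearray in chunk_bytes steps until it reaches
    limit_bytes, timing each step.

    Returns a list of (total_size_in_bytes, seconds_for_this_step).
    """
    buf = bytearray()
    timings = []
    while len(buf) < limit_bytes:
        start = time.perf_counter()
        buf.extend(b"\0" * chunk_bytes)  # copies the chunk into the buffer
        timings.append((len(buf), time.perf_counter() - start))
    return timings

# Deliberately tiny run: 8 steps of 1 MiB. Raise the limit with care.
timings = grow_buffer(1024 * 1024, 8 * 1024 * 1024)
for size, seconds in timings:
    print(f"{size // (1024 * 1024)} MiB: {seconds:.6f} s")
```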

> >>  - a partial last record will not be terminated with a newline
> 
> I didn't see that in the req spec :-P  You can complicate the program
> with a print "\n" or somesuch if you must.

Just mentioning. It's the sort of thing that gets forgotten - an edge
case. The likelihood is that a partial record is not wanted anyway.
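
As an illustration of handling that edge case, here is a minimal sketch
in Python. The function copy_complete_records and its keep_partial flag
are hypothetical, and it assumes records are newline-terminated lines.
A trailing record with no newline is either dropped (and returned so
the caller can report it) or written out with a newline added.

```python
import io

def copy_complete_records(src, dst, keep_partial=False):
    """Copy newline-terminated records from src to dst.

    A final record without a trailing newline is treated as partial:
    it is written with a newline appended if keep_partial is true,
    otherwise it is dropped. Returns the dropped text ("" if none).
    """
    dropped = ""
    for line in src:
        if line.endswith("\n"):
            dst.write(line)
        elif keep_partial:
            dst.write(line + "\n")
        else:
            dropped = line  # only the last line can lack a newline
    return dropped

# Demonstration with an input whose last record is cut short.
out = io.StringIO()
dropped = copy_complete_records(io.StringIO("one\ntwo\nthr"), out)
print(repr(out.getvalue()), repr(dropped))
```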

> PS  I sent the prog for fun, don't take it too seriously.

Fun?!? :-)

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer at biplane.com.au)                   +61-2-64957160 (h)
http://www.biplane.com.au/~kauer/                  +61-428-957160 (mob)

GPG fingerprint: B386 7819 B227 2961 8301 C5A9 2EBC 754B CD97 0156
Old fingerprint: 07F3 1DF9 9D45 8BCD 7DD5 00CE 4A44 6A03 F43A 7DEF