AWK experts - how would I code around this in awk...

Alex Janssen alex at ourwoods.org
Fri Feb 19 00:40:32 UTC 2010


Steve Flynn wrote:
> I have a text file with lines of differing length.
>
> I want to parse the entire file and make each line (for example) 20 bytes long.
>
> If a record is too short as in the following example: (rec1,2 and 3
> are just so I can refer to them easily - the example file should be
> just the numeric portion).
>
> rec1 123456789012345
> rec2 67890
> rec3 12345678901234567890
>
> ... then I need to append rec2 to rec1.
>
> Obviously after appending rec2 to rec1, the next line to be read
> should be rec3. After completion, the entire file would consist of two
> records in this example case, both 20 bytes long.
>
>
>
> I should point out that the complete file may well be in the hundreds
> of millions of records so holding the entire thing in memory is
> probably not a good idea.
>
> Any idea on how I would go about this in awk?
>
> If you believe awk to not be a good candidate for this, I'm open to
> suggestions on alternatives.
>
>
> (As a side note, this is for some data which I need to parse which has
> embedded CR/LFs in it, thus splitting what should be one record into
> perhaps multiple rows... I need a quick (and easy) way of stitching
> it back together.)
>
>
>   
Nice bunch of solutions!
I love a good data processing problem!
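
For the archive, here is a minimal sketch of one streaming way to do the
stitching in awk. It is not necessarily any of the solutions posted in this
thread. It assumes the fragments of a broken record always add up to exactly
the target width, so it never cuts an over-long buffer, and it assumes
single-byte data so that length() counts bytes. The names stitch, len and buf
are hypothetical, chosen only for illustration.

```sh
# stitch WIDTH: read lines on stdin and join short ones until each output
# record is at least WIDTH characters long (default 20).
# Only the record being assembled is held in memory, never the whole file.
stitch() {
    awk -v len="${1:-20}" '
        # Drop a trailing CR (left over from CR/LF) and append the line.
        { sub(/\r$/, ""); buf = buf $0 }

        # Once the buffer has reached the target width, emit it and reset.
        length(buf) >= len { print buf; buf = "" }

        # Emit a trailing partial record, if any, rather than losing it.
        END { if (buf != "") print buf }
    '
}
```

Usage would be something like "stitch 20 < input.txt > output.txt". With the
three example lines from the question this prints two 20-character records.
If the fragments can overshoot the width, the print rule would need to split
the buffer instead of printing it whole.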

-- 
Ourwoods.org
 Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. - Albert Einstein (275)




