Text Manipulation/Replacement
NoOp
glgxg at sbcglobal.net
Mon Sep 22 23:25:47 UTC 2008
On 09/22/2008 03:53 PM, Ubence Quevedo wrote:
>
> ----- Original Message ----
>> From: Chris Mohler <cr33dog at gmail.com>
>> To: "Ubuntu user technical support, not for general discussions" <ubuntu-users at lists.ubuntu.com>
>> Sent: Monday, September 22, 2008 3:22:43 PM
>> Subject: Re: Text Manipulation/Replacement
>>
>> On Mon, Sep 22, 2008 at 4:57 PM, Ubence Quevedo wrote:
>> > Hello All,
>> >
>> > I've used pdftotext to convert a pdf document to text and then used a
>> > combination of grep and awk to single out data and replace formatting
>> > that I didn't need.
>> >
>> > The output data eventually looks like this:
>> > 12,123456789
>> > ,0987654321
>> >
>> > But I want it to look like this:
>> > 12,123456789,0987654321
>> >
>> > I've tried many different things with awk, but I can't get it to replace \r, with
>> > just a ,
>>
>> Hmm - I've always had headaches dealing with newlines in sed and awk
>> (to a lesser extent - I'm more familiar with sed).
>>
>> How about perl?
>>
>> perl -pe 's/\n//g' foo.txt
>>
>
> Hi Chris,
>
> This worked...kinda...but it ate all of the newlines, so I now have one continuous line. I need to find only the instances of "\n," and replace them with ",", so that the match is specific to the broken records. I have very little perl knowledge, and my feeble attempt at modifying the perl command above failed miserably.
>
> Any other ideas?
>
> -Ubence
>
Perhaps a silly question... can you not open the pdf in Adobe Reader 8,
then copy & paste the text to OpenOffice Writer & accomplish what you want?