Text Manipulation/Replacement
Karl Larsen
k5di at zianet.com
Tue Sep 23 11:34:28 UTC 2008
Ubence Quevedo wrote:
> On Sep 22, 2008, at 04:25 PM, NoOp wrote:
>
>
>> On 09/22/2008 03:53 PM, Ubence Quevedo wrote:
>>
>>> ----- Original Message ----
>>>
>>>> From: Chris Mohler <cr33dog at gmail.com>
>>>> To: "Ubuntu user technical support, not for general discussions" <ubuntu-users at lists.ubuntu.com>
>>>>
>>>> Sent: Monday, September 22, 2008 3:22:43 PM
>>>> Subject: Re: Text Manipulation/Replacement
>>>>
>>>> On Mon, Sep 22, 2008 at 4:57 PM, Ubence Quevedo wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> I've used pdftotext to convert a pdf document to text and then used
>>>>> a combination of grep and awk to single out data and replace
>>>>> formatting that I didn't need.
>>>>>
>>>>> The output data eventually looks like this:
>>>>> 12,123456789
>>>>> ,0987654321
>>>>>
>>>>> But I want it to look like this:
>>>>> 12,123456789,0987654321
>>>>>
>>>>> I've tried many different things with awk, but I can't get it
>>>>> to replace \r, with just a ,
>>>>>
>>>>
>>>> Hmm - I've always had headaches dealing with newlines in sed and awk
>>>> (to a lesser extent - I'm more familiar with sed).
>>>>
>>>> How about perl?
>>>>
>>>> cat foo.txt | perl -pi -e 's/\n//g'
>>>>
>>>>
>>> Hi Chris,
>>>
>>> This worked...kinda...but it ate all of the new lines, so I have
>>> one continuous line. I need to find all instances of "\n," and
>>> replace them with ",". That way it is very specific in what is
>>> found and replaced. I have very little perl knowledge, and my
>>> feeble attempt at modifying the perl command above failed miserably.
>>>
>>> Any other ideas?
>>>
>>> -Ubence
>>>
>>>
>> Perhaps a silly question... can you not open the pdf in Adobe Reader 8,
>> then copy & paste the text to OpenOffice Writer & accomplish what
>> you want?
>>
>>
>> --
>> ubuntu-users mailing list
>> ubuntu-users at lists.ubuntu.com
>> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users
>>
>
> If that were an option, then yes. However, I'd prefer to keep this to
> the command line as much as possible. I could take the output file
> and transfer it to my Mac and use TextWrangler to do what I want, but
> I'd rather not [since anyone else that might be doing this procedure
> in the future wouldn't have access to a Mac].
>
> -Ubence
>
>
I think using OpenOffice Writer is a viable way to do the job, and
anyone on Linux or Windows either already has this software or can
install it. I had to install OpenOffice on my wife's Windows machine:
we have Word from Microsoft Office 97, which worked fine until last
week, when it failed to print. With the new Office at $450.00 in the
college book store, we downloaded OpenOffice and it prints fine :-)
Karl
--
Karl F. Larsen, AKA K5DI
Linux User
#450462 http://counter.li.org.
PGP 4208 4D6E 595F 22B9 FF1C ECB6 4A3C 2C54 FE23 53A7