Text Manipulation/Replacement

Karl Larsen k5di at zianet.com
Tue Sep 23 11:34:28 UTC 2008


Ubence Quevedo wrote:
> On Sep 22, 2008, at 04:25 PM, NoOp wrote:
>
>   
>> On 09/22/2008 03:53 PM, Ubence Quevedo wrote:
>>     
>>> ----- Original Message ----
>>>       
>>>> From: Chris Mohler <cr33dog at gmail.com>
>>>> To: "Ubuntu user technical support, not for general discussions" <ubuntu-users at lists.ubuntu.com>
>>>> Sent: Monday, September 22, 2008 3:22:43 PM
>>>> Subject: Re: Text Manipulation/Replacement
>>>>
>>>> On Mon, Sep 22, 2008 at 4:57 PM, Ubence Quevedo wrote:
>>>>         
>>>>> Hello All,
>>>>>
>>>>> I've used pdftotext to convert a PDF document to text and then used a
>>>>> combination of grep and awk to single out data and replace formatting
>>>>> that I didn't need.
>>>>>
>>>>> The output data eventually looks like this:
>>>>> 12,123456789
>>>>> ,0987654321
>>>>>
>>>>> But I want it to look like this:
>>>>> 12,123456789,0987654321
>>>>>
>>>>> I've tried many different things with awk, but I can't get it to
>>>>> replace \r, with just a ,
>>>>
>>>> Hmm - I've always had headaches dealing with newlines in sed and awk
>>>> (to a lesser extent - I'm more familiar with sed).
>>>>
>>>> How about perl?
>>>>
>>>> cat foo.txt | perl -p -e 's/\n//g'
>>>>
>>>>         
>>> Hi Chris,
>>>
>>> This worked, kind of, but it ate all of the newlines, so I have one
>>> continuous line.  I need to find every instance of "\n," and replace
>>> it with ",".  That way the match is very specific about what is found
>>> and replaced.  I have very little Perl knowledge, and my feeble attempt
>>> at modifying the perl command above failed miserably.
>>>
>>> Any other ideas?
>>>
>>> -Ubence
>>>
>>>       
>> Perhaps a silly question... can you not open the PDF in Adobe Reader 8,
>> then copy & paste the text into OpenOffice Writer & accomplish what you
>> want?
>>
>>
>
> If that were an option, then yes.  However, I'd prefer to keep this to  
> the command line as much as possible.  I could take the output file  
> and transfer it to my Mac and use TextWrangler to do what I want, but  
> I'd rather not [since anyone else that might be doing this procedure  
> in the future wouldn't have access to a Mac].
>
> -Ubence
>
>   
    I think using OpenOffice Writer is a viable way to do the job, and
anyone on Linux or Windows either already has this software or can install
it. I had to install OpenOffice on my wife's Windows machine: we have Word
from Microsoft Office 97, which worked fine until last week, when it
failed to print. With the new Office at $450.00 in the college book store,
we downloaded OpenOffice and it prints fine :-)
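
    For anyone who would rather stay on the command line, as Ubence
asked, here is a minimal sketch of the specific replacement he described:
drop a line break only when the next line starts with a comma. It is in
Python, which is my assumption (the thread itself used awk and perl), and
the function name join_continuation_lines is made up for illustration.

```python
import re


def join_continuation_lines(text: str) -> str:
    """Join any line that starts with a comma onto the line before it."""
    # Remove the line break (\r\n, a bare \r, or \n) only when the very
    # next character is a comma; all other line breaks are left alone.
    return re.sub(r"(?:\r\n|\r|\n),", ",", text)


# Sample data in the shape shown earlier in the thread, plus one
# ordinary line (invented here) to show that it is not touched.
sample = "12,123456789\n,0987654321\n34,555\n"
print(join_continuation_lines(sample), end="")
```

The important point is that the whole text is handled as one string. A
tool that reads one line at a time never sees the line break and the
comma that follows it together, which is why matching "\n," line by line
does not work.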


Karl


-- 

	Karl F. Larsen, AKA K5DI
	Linux User
	#450462   http://counter.li.org.
   PGP 4208 4D6E 595F 22B9 FF1C  ECB6 4A3C 2C54 FE23 53A7




