Off-Topic: Parse an html file and transfer the text found
John Toliver
john.toliver at gmail.com
Wed Aug 6 15:18:29 UTC 2008
I want to send a pastebin because I think it's HTML with JavaScript
embedded, but I'm not sure.
On Wed, Aug 6, 2008 at 10:37, Derek Broughton <news at pointerstop.ca> wrote:
> John Toliver wrote:
>
>> So my question to start is which language should I use to pull the
>> data out of an html file? Is perl better for this application, or is
>> python better or some other language?
>
> Yes :-) Any language that has tools for parsing HTML that you're
> comfortable with would be good. If the files are guaranteed valid XHTML,
> you probably have even more choices, but certainly Perl or Python
> should be fine, and I'd use Python.
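
For illustration, here is a minimal sketch of the "use an HTML parsing tool" route in Python. It assumes Python 3 and the standard library's html.parser module. The names TextExtractor and extract_text are made up for this example, and skipping script and style content is an assumption about what counts as "the text" in these files.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}  # assumption: embedded JavaScript/CSS is not wanted

    def __init__(self):
        super().__init__()
        self.chunks = []       # text pieces in document order
        self._skip_depth = 0   # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the non-empty text pieces found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Calling extract_text() on the contents of a file gives a list of strings that can then be written wherever the text needs to go. No regular expressions are involved, and the parser does not require the input to be valid XHTML.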
>>
>> I'm probably going to need to brush up on my regular expressions for
>> this but that's ok too.
>
> That's why it's easier if they're XHTML: the files should then parse
> with an XML parser, which makes it really easy to extract the meaningful
> data.
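
If the files do turn out to be well-formed XHTML, the XML-parser route could look roughly like the sketch below. It assumes Python 3's xml.etree.ElementTree, documents that declare the standard XHTML namespace, and that the wanted text sits in p elements. The function name paragraph_texts is hypothetical, and the choice of p is only a guess at where the data lives.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def paragraph_texts(xhtml):
    """Return the text of every <p> element, including text in nested tags."""
    root = ET.fromstring(xhtml)  # raises ParseError if the input is not well-formed
    return [
        "".join(p.itertext()).strip()
        for p in root.iter(XHTML_NS + "p")
    ]
```

One caveat: ElementTree only knows the predefined XML entities, so a file that uses named HTML entities such as &nbsp; will fail to parse this way, and an HTML parser is the safer fallback.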
> --
> derek
>
>
>
--
I've discovered the key to success is to never give up. You either
learn the right way, or you run out of ways to do it wrong. A win/win
situation!