[CoLoCo] [Off topic] Need help with writing script

Kevin Fries kfries6 at gmail.com
Mon Jul 18 21:52:45 UTC 2011


On 07/16/2011 12:28 AM, Jim Hutchinson wrote:
> Hey all,
>
> I'm looking for someone who would be able to help me with a little 
> problem that I think a simple bash, python, perl, etc. script might be 
> able to solve but my own skills are far too weak for this. I'm in a 
> bit of a rush to get this done and would be willing to pay someone but 
> I'm hoping it's an easy task and therefore not too expensive :)
>
> Basically, I have a bunch of tab separated text files relating to 
> wikipedia articles and user edits that I need to combine and filter in 
> a couple different ways. If this sounds at all like something you can 
> do, read on for a bit more explanation.
>
> Thanks for reading on... I have text files that are extracted from an 
> SQL query. The purpose of the query was to find the top (more than 10 
> edits) editors of about 150 wikipedia articles (so 150 text files - 
> one for each article - with a list of editors and edit counts) and 
> then catalog the history of edits for all those editors (about 1000 
> unique editors). What I need to do is combine the edit history for the 
> set of editors in each article and search for any duplication of 
> effort (meaning 2 or more editors worked on the same article) 
> excluding all other information and export that to a new text file. I 
> need to do this for each of the 150 articles so the output would be up 
> to 150 new files (although some may have no duplications). There are a 
> couple other things that need to be done but that should give you the 
> gist of the problem. If it sounds doable please let me know. I need to 
> have this done in a week or so and am willing to pay for a rush job. 
> Attached is a zip file with one example article, the corresponding 
> user files and two manually filtered files that are an example of the 
> goal.

Hey Jim,

Did David get this for you?  I looked at the files too, and could not 
figure out exactly what you were seeking either, so I was hoping that 
you and David got together, but I was just checking.  From what I see, 
this could be banged out in a very short time, depending on exactly the 
results you were seeking.  Parsing those files is simple; it's just a 
matter of how to loop through them to get the answers you seek.
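
As a rough illustration of that kind of loop, here is a minimal Python 
sketch.  It is not based on the actual attachment, so everything about 
the layout is an assumption: one tab-separated history file per editor, 
no header row, and the article title in the first column.  The function 
names (read_history, shared_articles, write_overlap) are made up for 
this example.

```python
import csv
from collections import defaultdict


def read_history(path, title_column=0):
    """Return the set of article titles found in one editor's history file.

    Assumes a tab-separated file with no header row and the article
    title in column `title_column` (a guess at the real layout).
    """
    titles = set()
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank lines and rows too short to hold a title.
            if len(row) > title_column and row[title_column].strip():
                titles.add(row[title_column].strip())
    return titles


def shared_articles(histories, min_editors=2):
    """Map each title to the sorted list of editors who worked on it,
    keeping only titles touched by at least `min_editors` editors.

    `histories` maps an editor name to the set of titles they edited.
    """
    editors_by_title = defaultdict(set)
    for editor, titles in histories.items():
        for title in titles:
            editors_by_title[title].add(editor)
    return {
        title: sorted(editors)
        for title, editors in editors_by_title.items()
        if len(editors) >= min_editors
    }


def write_overlap(overlap, out_path):
    """Write one line per shared title: title, editor count, editor names."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for title in sorted(overlap):
            editors = overlap[title]
            writer.writerow([title, len(editors), ", ".join(editors)])
```

To use it for one article, build the histories dictionary by calling 
read_history on each top editor's file, pass it to shared_articles, and 
hand the result to write_overlap.  Repeating that for each of the 150 
articles would give up to 150 output files; an empty result means no 
overlap for that article.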

Let me know if you still need me to look at this.

Kevin


