scripting / data problem.

Justin Gruenberg justin.gruenberg at gmail.com
Mon Nov 9 08:08:45 UTC 2009


On Sat, Nov 7, 2009 at 7:31 PM, Patton Echols <p.echols at comcast.net> wrote:
> The problem is that the various lists don't all have the same
> information so I can't just cat them together and sort with a "unique"
> operator.  That's a vague statement.  Here is what I mean by file and field:
>

I assume you're going to need to get this data back out and into the
original applications, eventually.

The basic strategy I'd take is to import everything into separate
tables in MySQL.  Create additional tables that you will export out
of.  Massage the data from the import tables into your output tables,
merging and correcting as you go (this may be really easy or really
hard depending on how clean your data is).  Depending on the amount
of data, chances are you're going to need some cheap labor to clean
it up (got interns?).
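
A minimal sketch of that import-then-merge flow, in Python.  It uses
the standard-library sqlite3 module as a stand-in for MySQL so it runs
without a server.  The table names, the name/email/phone fields, and
matching records on a lowercased, trimmed name are all assumptions for
illustration, not details of the actual lists.

```python
import sqlite3


def merge_contacts(list_a, list_b):
    """Load two lists into separate import tables, then merge them
    into a single output table keyed on a normalized name."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # One import table per source list; each holds only the fields
    # that list actually has (hypothetical fields).
    cur.execute("CREATE TABLE import_a (name TEXT, email TEXT)")
    cur.execute("CREATE TABLE import_b (name TEXT, phone TEXT)")
    cur.executemany("INSERT INTO import_a VALUES (?, ?)", list_a)
    cur.executemany("INSERT INTO import_b VALUES (?, ?)", list_b)

    # The output table carries the union of all fields.
    cur.execute(
        "CREATE TABLE output (name TEXT PRIMARY KEY, email TEXT, phone TEXT)"
    )

    # Collect every distinct normalized name, then pull each field
    # from whichever import table has it (NULL if none does).
    cur.execute("""
        INSERT INTO output (name, email, phone)
        SELECT n.name,
               (SELECT a.email FROM import_a a
                 WHERE lower(trim(a.name)) = n.name LIMIT 1),
               (SELECT b.phone FROM import_b b
                 WHERE lower(trim(b.name)) = n.name LIMIT 1)
        FROM (SELECT lower(trim(name)) AS name FROM import_a
              UNION
              SELECT lower(trim(name)) AS name FROM import_b) AS n
    """)

    rows = cur.execute(
        "SELECT name, email, phone FROM output ORDER BY name"
    ).fetchall()
    conn.close()
    return rows
```

Matching on a normalized name is the crudest possible rule.  Real data
will likely need several passes like this, plus hand correction of
whatever the queries can't match.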

I'd also suggest adding a unique ID to each record so that if you have
to do this merge again, it will be a bit easier to handle.
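
A rough sketch of the unique-ID idea, in Python.  The "record_id"
field name and the use of random UUIDs are assumptions, as is the
premise that the original applications can store an extra field and
hand it back on export.

```python
import uuid


def add_record_ids(records):
    """Return copies of the records, giving each one a 'record_id'
    unless it already carries one (hypothetical field name)."""
    tagged = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's data is not modified
        if "record_id" not in rec:
            rec["record_id"] = str(uuid.uuid4())
        tagged.append(rec)
    return tagged


def merge_by_id(existing, incoming):
    """Merge two lists of tagged records: same ID means same record,
    with fields from 'incoming' overriding those in 'existing'."""
    merged = {rec["record_id"]: dict(rec) for rec in existing}
    for rec in incoming:
        merged.setdefault(rec["record_id"], {}).update(rec)
    return list(merged.values())
```

With IDs in place, a later merge becomes a lookup by ID instead of
another round of fuzzy matching on names and addresses.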




More information about the ubuntu-users mailing list