Problems sorting a CSV with sort

Johnny Rosenberg gurus.knugum at gmail.com
Thu Aug 8 20:40:42 UTC 2013


The problem is that some of the fields contain commas, but they are
inside double quotes.

Example:
sort -t, -k1,1 -k3,3 -k2,2 SomeFile.csv > OutputFile.csv

A line could look something like this:
This is the first field,"This is, well, the second field",The third field could look like this

That line has three fields:
1: This is the first field
2: "This is, well, the second field"
3: The third field could look like this

But sort considers it to have five fields, because -t, splits on every comma and knows nothing about quotes:
1: This is the first field
2: "This is
3:  well
4:  the second field
5: The third field could look like this
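
The naive split can be reproduced outside sort. This is only a small
illustration, assuming awk is available; count_naive_fields is a
hypothetical helper name, and awk -F, stands in for what sort -t, does
(both treat every comma as a separator, quoted or not):

```shell
# The example line from above, stored verbatim (single quotes keep the
# double quotes intact).
line='This is the first field,"This is, well, the second field",The third field could look like this'

# Hypothetical helper: count the fields awk sees when it splits on every
# comma, ignoring quotes entirely.
count_naive_fields() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

count_naive_fields "$line"
```

The line contains four commas in total, two of them inside the quotes,
so a quote-unaware split sees five pieces rather than three.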

How would you solve this?

One idea is that when I create SomeFile.csv in the first place, I
create a TAB-separated file instead. Then I sort it and finally I
replace all TABs with commas…
But that means an extra step when I create SomeFile.csv, which is not
optimal (unless I automate the creation of the file).
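
A rough sketch of that TAB idea, under a few assumptions: the file name
SomeFile.tsv and the helper name sort_tsv_as_csv are made up for
illustration, no field contains a literal TAB, and fields that contain
commas are still wrapped in double quotes so the final file stays valid:

```shell
# Hypothetical helper: sort a TAB-separated file on field 1, then 3,
# then 2 (the same keys as the sort command above), and print the result
# with every TAB replaced by a comma.
# Assumption: no field contains a literal TAB.
sort_tsv_as_csv() {
    tab=$(printf '\t')    # a literal TAB character, portable across shells
    sort -t "$tab" -k1,1 -k3,3 -k2,2 "$1" | tr '\t' ','
}

# Usage (SomeFile.tsv is an assumed name for the TAB-separated file):
# sort_tsv_as_csv SomeFile.tsv > OutputFile.csv
```

Since sort only ever sees TABs as separators here, the commas inside
the quoted fields no longer matter while sorting.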

Other ideas?




Johnny Rosenberg




More information about the ubuntu-users mailing list