There seems to be some disagreement, at Apple Computer, about exactly what the definition of the word “ignore” is. From the “sort” man page:
-d Sort in `phone directory’ order: ignore all characters except letters, digits and blanks when sorting.
What does that suggest to you? Well, let’s compare it to the GNU “sort” man page:
-d, --dictionary-order
consider only blanks and alphanumeric characters
So you’d THINK that sorting with these two options would be equivalent, right?
Nope!
Here’s a simple list:
- 192.168.2.4 foo
- 192.168.2.42 foo
How should these things be sorted when the -d option is in effect? You’ve got a conundrum: is a space sorted BEFORE a number or AFTER a number?
Curse you, alphabet! You’re never around when I need you!
And, of course, BSD and GNU answer that question differently. On GNU the answer is AFTER; on BSD the answer is BEFORE! Oh goody.
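If you want to see which camp your own machine falls into, here is a minimal sketch. The function name `demo_dsort` is made up for illustration, and the order you get back depends on your sort implementation (and possibly your locale), so no expected output is shown.

```shell
#!/bin/sh
# Feed the two sample lines from the list above through the local "sort -d".
# demo_dsort is a hypothetical helper name, not part of any tool.
demo_dsort() {
    printf '%s\n' '192.168.2.4 foo' '192.168.2.42 foo' | sort -d
}

# Prints both lines in whichever order this platform's sort prefers.
demo_dsort
```

Run it on a BSD box and a GNU box and compare the two results side by side.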
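Here’s a better way if you need the sorting results to be the same on both BSD and GNU: replace all spaces with some other non-alphanumeric character that isn’t used in the file (such as an underscore, or an ellipsis, or an em-dash). Since -d ignores that character entirely, there is no longer a space for the two implementations to disagree about. Then sort with -ds (the -s makes the sort stable and turns off the last-resort comparison, so no last-minute saving throws!), then replace the underscore (or whatever) with a space again.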
And if you need it to be consistent on OS X platforms too, make it a -dfs sort (so that capitals and lower-case are considered the same).
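The whole dance can be sketched as one small pipeline. This is a rough sketch, not a drop-in tool: the function name `portable_dsort` is invented here, it assumes an underscore never appears in your input, and pinning `LC_ALL=C` is my own addition (not part of the recipe above) to keep the locale from sneaking in a different collation order.

```shell
#!/bin/sh
# portable_dsort: hypothetical helper that reads lines on stdin and writes
# them sorted on stdout.
# Assumption: '_' does not occur in the input; pick another placeholder if it does.
# Assumption: LC_ALL=C is added here to pin the collation order; it is not
# part of the original recipe.
portable_dsort() {
    # 1. tr ' ' '_'  : hide every space behind a character that -d ignores
    # 2. sort -dfs   : dictionary order (-d), case-folded (-f), stable (-s)
    # 3. tr '_' ' '  : put the spaces back
    tr ' ' '_' | LC_ALL=C sort -dfs | tr '_' ' '
}

# Usage: the two sample lines from earlier in the post.
printf '%s\n' '192.168.2.4 foo' '192.168.2.42 foo' | portable_dsort
```

With the space out of the picture, the comparison at the point where the two lines differ is "f" against "2", and digits sort before letters, so the .42 line should come out first whichever sort you are using.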
Comments (1)
Brilliant.
Never thought of replacing all spaces with a more manageable character.
Thanks
Posted by Mr. Jaggs | January 9, 2009 9:39 AM
Posted on January 9, 2009 09:39