9base

revived minimalist port of Plan 9 userland to Unix
git clone git://git.suckless.org/9base
sort.1 (4727B)


.TH SORT 1
.SH NAME
sort \- sort and/or merge files
.SH SYNOPSIS
.B sort
[
.BI -cmuMbdf\&inrwt x
]
[
.BI + pos1
[
.BI - pos2
] ...
] ...
[
.B -k
.I pos1
[
.I ,pos2
]
] ...
.br
\h'0.5in'
[
.B -o
.I output
]
[
.B -T
.I dir
\&...
]
[
.I option
\&...
]
[
.I file
\&...
]
.SH DESCRIPTION
.I Sort\^
sorts
lines of all the
.I files
together and writes the result on
the standard output.
If no input files are named, the standard input is sorted.
.PP
The default sort key is an entire line.
Default ordering is
lexicographic by runes.
The ordering is affected globally by the following options,
one or more of which may appear.
.TP
.B -M
Compare as months.
The first three
non-white space characters
of the field
are folded
to upper case
and compared
so that
.L JAN
precedes
.LR FEB ,
etc.
Invalid fields
compare low to
.LR JAN .
.TP
.B -b
Ignore leading white space (spaces and tabs) in field comparisons.
.TP
.B -d
`Phone directory' order:
only letters,
accented letters,
digits and white space
are significant in comparisons.
.TP
.B -f
Fold lower case
letters onto upper case.
Accented characters are folded to their
non-accented upper case form.
.TP
.B -i
Ignore characters outside the
.SM ASCII
range 040-0176
in non-numeric comparisons.
.TP
.B -w
Like
.BR -i ,
but ignore only tabs and spaces.
.TP
.B -n
An initial numeric string,
consisting of optional white space,
optional plus or minus sign,
and zero or more digits with optional decimal point,
is sorted by arithmetic value.
.TP
.B -g
Numbers, like
.B -n
but with optional
.BR e -style
exponents, are sorted by value.
.TP
.B -r
Reverse the sense of comparisons.
.TP
.BI -t x\^
`Tab character' separating fields is
.IR x .
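.PP
As a rough illustration of how the global ordering options change the result,
the sketch below runs the same input through the default ordering and through
.BR -n ,
.B -nr
and
.BR -f .
It assumes a POSIX-compatible
.I sort
on the path and sets
.B LC_ALL=C
so that the default comparison is plain byte order;
the variable names are illustrative only.

```shell
# Assumption: a POSIX-compatible sort is on PATH; LC_ALL=C pins byte-order comparison.
export LC_ALL=C

# Default ordering compares characters, so "10" and "100" come before "9".
lexical=$(printf '10\n9\n100\n' | sort)

# -n compares the initial numeric string by arithmetic value.
numeric=$(printf '10\n9\n100\n' | sort -n)

# -r reverses the sense of the comparison; here it is combined with -n.
reversed=$(printf '10\n9\n100\n' | sort -nr)

# -f folds lower case onto upper case; lines that then compare equal
# are ordered with all bytes significant, so "A" precedes "a".
folded=$(printf 'b\nA\na\nB\n' | sort -f)

printf '%s\n' "$lexical" "$numeric" "$reversed" "$folded"
```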
.PP
The notation
.BI + "pos1\| " - pos2\^
restricts a sort key to a field beginning at
.I pos1\^
and ending just before
.IR pos2 .
.I Pos1\^
and
.I pos2\^
each have the form
.IB m . n\f1,
optionally followed by one or more of the flags
.BR Mbdfginr ,
where
.I m\^
tells a number of fields to skip from the beginning of the line and
.I n\^
tells a number of characters to skip further.
If any flags are present they override all the global
ordering options for this key.
A missing
.BI \&. n\^
means
.BR \&.0 ;
a missing
.BI - pos2\^
means the end of the line.
Under the
.BI -t x\^
option, fields are strings separated by
.IR x ;
otherwise fields are
non-empty strings separated by white space.
White space before a field
is part of the field, except under option
.BR -b .
A
.B b
flag may be attached independently to
.I pos1
and
.IR pos2 .
.PP
The notation
.B -k
.IR pos1 [, pos2 ]
is how POSIX
.I sort
defines fields:
.I pos1
and
.I pos2
have the same format but different meanings.
The value of
.I m\^
is origin 1 instead of origin 0
and a missing
.BI \&. n\^
in
.I pos2
is the end of the field.
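.PP
To make the origin-1 field numbering concrete, the following sketch keys on
single fields with
.BR -k .
By the rules above,
.B "-k 2,2"
names the same key as the older
.BR "+1 -2" ,
and
.B "-k 3"
the same key as
.BR +2 ;
only the
.B -k
forms are run here.
The sample records and variable names are made up for illustration, and a
POSIX-compatible
.I sort
is assumed.

```shell
# Assumption: a POSIX-compatible sort is on PATH; LC_ALL=C pins byte-order comparison.
export LC_ALL=C

# Hypothetical colon-separated records: id:name:group
records='3:glenda:adm
1:adm:sys
2:bootes:sys'

# -k 2,2 restricts the key to the second field only (old style: +1 -2).
by_name=$(printf '%s\n' "$records" | sort -t: -k 2,2)

# -k 3 starts the key at the third field and runs to end of line (old style: +2).
# The two "sys" records tie on the key and fall back to whole-line order.
by_group=$(printf '%s\n' "$records" | sort -t: -k 3)

printf '%s\n' "$by_name" "$by_group"
```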
.PP
When there are multiple sort keys, later keys
are compared only after all earlier keys
compare equal.
Lines that otherwise compare equal are ordered
with all bytes significant.
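.PP
A small sketch of multiple keys, assuming a POSIX-compatible
.I sort
and made-up input:
the first key orders lines by their first field, and the second key is
consulted only for lines whose first fields compare equal.
The per-key flags
.B n
and
.B nr
apply to the second key alone.

```shell
# Assumption: a POSIX-compatible sort is on PATH; LC_ALL=C pins byte-order comparison.
export LC_ALL=C

# Hypothetical input: a name and a count, separated by white space.
data='b 2
a 10
a 9
b 1'

# Key 1: first field, default ordering. Key 2: second field, numeric.
ascending=$(printf '%s\n' "$data" | sort -k 1,1 -k 2,2n)

# Same first key, but the second key is numeric and reversed.
descending=$(printf '%s\n' "$data" | sort -k 1,1 -k 2,2nr)

printf '%s\n' "$ascending" "$descending"
```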
.PP
These option arguments are also understood:
.TP \w'\fL-z\fIrecsize\fLXX'u
.B -c
Check that the single input file is sorted according to the ordering rules;
give no output unless the file is out of sort.
.TP
.B -m
Merge; assume the input files are already sorted.
.TP
.B -u
Suppress all but one in each
set of equal lines.
Ignored bytes
and bytes outside keys
do not participate in
this comparison.
.TP
.B -o
The next argument is the name of an output file
to use instead of the standard output.
This file may be the same as one of the inputs.
.TP
.BI -T dir
Put temporary files in
.I dir
rather than in
.BR /var/tmp .
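.PP
The sketch below exercises
.BR -u ,
.BR -c ,
.B -m
and
.B -o
on small made-up inputs.
It assumes a POSIX-compatible
.I sort
and a
.I mktemp
that supports
.BR -d ;
the file and variable names are illustrative only.

```shell
# Assumptions: a POSIX-compatible sort and "mktemp -d" are available.
export LC_ALL=C

# -u keeps one line from each set of equal lines.
unique=$(printf 'b\na\nb\n' | sort -u)

# -c only checks: the exit status reports whether the input is in order.
if printf 'b\na\n' | sort -c 2>/dev/null; then in_order=yes; else in_order=no; fi

tmp=$(mktemp -d)

# -m merges inputs that are already sorted.
printf 'a\nc\n' > "$tmp/x"
printf 'b\nd\n' > "$tmp/y"
merged=$(sort -m "$tmp/x" "$tmp/y")

# -o names the output file, which may be the same as an input.
printf 'c\na\nb\n' > "$tmp/z"
sort -o "$tmp/z" "$tmp/z"
in_place=$(cat "$tmp/z")

rm -r "$tmp"
printf '%s\n' "$unique" "$in_order" "$merged" "$in_place"
```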
.ne 4
.SH EXAMPLES
.TP
.L sort -u +0f +0 list
Print in alphabetical order all the unique spellings
in a list of words
where capitalized words differ from uncapitalized.
.TP
.L sort -t: +1 /adm/users
Print the users file
sorted by user name
(the second colon-separated field).
.TP
.L sort -umM dates
Print the first instance of each month in an already sorted file.
Options
.B -um
with just one input file make the choice of a
unique representative from a set of equal lines predictable.
.TP
.L
grep -n '^' input | sort -t: +1f +0n | sed 's/[0-9]*://'
A stable sort: input lines that compare equal will
come out in their original order.
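.PP
The stable-sort pipeline can also be written with
.B -k
keys, which may be more convenient where the
.BI + pos
notation is not accepted.
This is a sketch under that assumption, with made-up input fed through
.I printf
rather than read from a file:
each line is tagged with its line number, sorted on the folded text with the
line number as a tie-breaker, and then untagged.

```shell
# Assumption: POSIX-compatible grep, sort and sed are on PATH.
export LC_ALL=C

# Tag each line with "N:", sort on the text (field 2 onward, folded),
# break ties by the numeric line number (field 1), then strip the tag.
stable=$(printf 'b\na\nB\nA\n' |
	grep -n '^' |
	sort -t: -k 2f -k 1,1n |
	sed 's/[0-9]*://')

# For comparison: plain -f orders equal-folding lines by their bytes instead.
plain=$(printf 'b\na\nB\nA\n' | sort -f)

printf '%s\n' "$stable" "$plain"
```

With the tag in place, lines that fold to the same text keep their input order;
without it, they are ordered with all bytes significant.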
.SH FILES
.BI /var/tmp/sort. <pid>.<ordinal>
.SH SOURCE
.B \*9/src/cmd/sort.c
.SH SEE ALSO
.IR uniq (1),
.IR look (1)
.SH DIAGNOSTICS
.I Sort
comments and exits with non-null status for various trouble
conditions and for disorder discovered under option
.BR -c .
.SH BUGS
An external null character can be confused
with an internally generated end-of-field character.
The result can make a sub-field not sort
less than a longer field.
.PP
Some of the options, e.g.
.B -i
and
.BR -M ,
are hopelessly provincial.