miscellaneous.tex (10583B)
1 \chapter{Miscellaneous} 2 \label{chap:Miscellaneous} 3 4 In this chapter, we will learn some miscellaneous 5 functions. It might seem counterintuitive to start 6 with miscellanea, but it is probably a good idea 7 to read this before arithmetics and more advanced 8 topics. You may read \secref{sec:Marshalling} 9 later. Before reading this chapter you should 10 have read \chapref{chap:Get started}. 11 12 13 \vspace{1cm} 14 \minitoc 15 16 17 \newpage 18 \section{Assignment} 19 \label{sec:Assignment} 20 21 To be able to do anything useful, we must assign 22 values to integers. There are three functions for 23 this: {\tt zseti}, {\tt zsetu}, and {\tt zsets}. 24 The last letter in the names of these function 25 describe the data type of the input, `i', `u', 26 and `s' stand for `integer', `unsigned integer', 27 and `string`, respectively. These resemble the 28 rules for the format strings in the family of 29 {\tt printf}-functions. `Integer' of course refer 30 to `signed integer'; for integer types in C, 31 part from {\tt char}, the keyword {\tt signed} 32 is implicit. 33 34 Consider {\tt zseti}, 35 36 \begin{alltt} 37 \textcolor{c}{z_t two;} 38 \textcolor{c}{zinit(two);} 39 zseti(two, 2); 40 \end{alltt} 41 42 \noindent 43 assignes {\tt two} the value 2. The data type of 44 the second parameter of {\tt zseti} is {\tt int64\_t}. 45 It will accept any integer value in the range 46 $[-2^{63},~2^{63} - 1] = [-9223372036854775808,~9223372036854775807]$, 47 independently of the machine.\footnote{{\tt int64\_t} 48 is defined to be a signed 64-bit integer using two's 49 complement representation.} If this range so not wide 50 enough, it may be possible to use {\tt zsetu}. Its 51 second parameter of the type {\tt uint64\_t}, and thus 52 its range is $[0,~2^{64} - 1] = [0,~18446744073709551615]$. 53 If a need negative value is desired, {\tt zsetu} can be 54 combined with {\tt zneg} \psecref{sec:Sign manipulation}. 55 56 For enormous constants or textual input, {\tt zsets} 57 can be used. {\tt zsets} will accept any numerical 58 value encoded in decimal ASCII, that only contain 59 digits, \emph{not} decimal points, whitespace, 60 apostrophes, et cetera. However, an optional plus 61 sign or, for negative numbers, an ASCII minus sign 62 may be used as the very first character. Note that 63 a proper UCS minus sign is not supported. 64 65 Using what we have learned so far, and {\tt zstr} 66 which we will learn about in \secref{sec:String output}, 67 we can construct a simple program that calculates the 68 sum of a set of numbers. 69 70 \begin{alltt} 71 \textcolor{c}{#include <stdio.h> 72 #include <stdlib.h> 73 #include <zahl.h>} 74 75 int 76 main(int argc, char *argv[]) \{ 77 z_t sum, temp; 78 \textcolor{c}{jmp_buf failenv; 79 char *sbuf, *argv0 = *argv; 80 if (setjmp(failenv)) \{ 81 zperror(argv0); 82 return 1; 83 \} 84 zsetup(failenv); 85 zinit(sum); 86 zinit(term);} 87 zsetu(sum, 0); 88 for (argv++; *argv; argv++) \{ 89 zsets(term, *argv); 90 zadd(sum, sum, term); 91 \} 92 \textcolor{c}{printf("\%s\textbackslash{}n", (sbuf = zstr(sum, NULL, 0))); 93 free(sbuf); 94 zfree(sum); 95 zfree(term); 96 zunsetup(); 97 return 0;} 98 \} 99 \end{alltt} 100 101 Another form of assignment available in libzahl is 102 copy-assignment. This done using {\tt zset}. As 103 easily observable, {\tt zset} is named like 104 {\tt zseti}, {\tt zsetu}, and {\tt zsets}, but 105 without the input-type suffix. The lack of a 106 input-type suffix means that the input type is 107 {\tt z\_t}. {\tt zset} copies value of second 108 parameter into the reference in the first. For 109 example, if {\tt v}, of the type {\tt z\_t}, has 110 value 10, then {\tt a} will too after the instruction 111 112 \begin{alltt} 113 zset(a, v); 114 \end{alltt} 115 116 {\tt zset} does not necessarily make an exact 117 copy of the input. If, in the example above, the 118 {\tt a->alloced} is greater than or equal to 119 {\tt v->used}, {\tt a->alloced} and {\tt a->chars} 120 are preserved, of course, the content of 121 {\tt a->chars} is overridden. If however, 122 {\tt a->alloced} is less then {\tt v->used}, 123 {\tt a->alloced} is assigned a minimal value at 124 least as great as {\tt v->used} that is a power 125 of 2, and {\tt a->chars} is updated accordingly 126 as described in \secref{sec:Integer structure}. 127 This of course does not apply if {\tt v} has the 128 value 0; in such cases {\tt a->sign} is simply 129 set to 0. 130 131 {\tt zset}, {\tt zseti}, {\tt zsetu}, and 132 {\tt zsets} require that the output-parameter 133 has been initialised with {\tt zinit} or an 134 equally acceptable method as described in 135 \secref{sec:Create an integer}. 136 137 {\tt zset} is often unnecessary, of course 138 there are cases where it is needed. In some case 139 {\tt zswap} is enough, and advantageous. 140 {\tt zswap} is defined as 141 142 \begin{alltt} 143 \textcolor{c}{static inline} void 144 zswap(z_t a, z_t b) 145 \{ 146 z_t t; 147 *t = *a; 148 *a = *b; 149 *b = *t; 150 \} 151 \end{alltt} 152 153 \noindent 154 however its implementation is optimised to be 155 around three times as fast. It just swaps the members 156 of the parameters, and thereby the values. There 157 is no rewriting of {\tt .chars} involved; thus 158 it runs in constant time. It also does not 159 require that any argument has been initialised. 160 After the call, {\tt a} will be initialised 161 if and only if {\tt b} was initialised, and 162 vice versa. 163 164 165 \newpage 166 \section{String output} 167 \label{sec:String output} 168 169 Few useful things can be done without creating 170 textual output of calculations. To convert a 171 {\tt z\_t} to ASCII string in decimal, we use the 172 function {\tt zstr}, declared as 173 174 \begin{alltt} 175 char *zstr(z_t a, char *buf, size_t n); 176 \end{alltt} 177 178 \noindent 179 {\tt zstr} will store the string it creates into 180 {\tt buf} and return {\tt buf}. However, if {\tt buf} 181 is {\tt NULL}, a new memory segment is allocated 182 and returned. {\tt n} should be at least the length 183 of the resulting string sans NUL termiantion, but 184 not larger than the allocation size of {\tt buf} 185 minus 1 byte for NUL termiantion. If {\tt buf} is 186 {\tt NULL}, {\tt n} may be 0. However if {\tt buf} 187 is not {\tt NULL}, it is unsafe to let {\tt n} be 188 0, unless {\tt buf} has been allocated by {\tt zstr} 189 for a value of {\tt a} at least as larger as the 190 value of {\tt a} in the new call to {\tt zstr}. 191 Combining non-\texttt{NULL} {\tt buf} with 0 {\tt n} 192 is unsafe because {\tt zstr} will use a very fast 193 formula for calculating a value that is at least 194 as large as the resulting output length, rather 195 than the exact length. 196 197 The length of the string output by {\tt zstr} can 198 be predicted by {\tt zstr\_length}, decleared as 199 200 \begin{alltt} 201 size_t zstr_length(z_t a, unsigned long long int radix); 202 \end{alltt} 203 204 \noindent 205 It will calculated the length of {\tt a} represented 206 in radix {\tt radix}, sans NUL termination. If 207 {\tt radix} is 10, the length for a decimal 208 representation is calculated. 209 210 Sometimes it is possible to never allocate a {\tt buf} 211 for {\tt zstr}. For example, in an implementation 212 of {\tt factor}, you can reuse the string of the 213 value to factorise, since all of its factors are 214 guaranteed to be no longer than the factored value. 215 216 \begin{alltt} 217 void 218 factor(char *value) 219 \{ 220 size_t n = strlen(value); 221 z_t product, factor; 222 zsets(product, value); 223 printf("\%s:", value); 224 while (next_factor(product, factor)) 225 printf(" \%s", zstr(factor, value, n)); 226 printf("\verb|\|n"); 227 \} 228 \end{alltt} 229 230 Other times it is possible to allocate just 231 once, for example of creating a sorted output. 232 In such cases, the allocation can be done almost 233 transparently. 234 235 \begin{alltt} 236 void 237 output_presorted_decending(z_t *list, size_t n) 238 \{ 239 char *buf = NULL; 240 while (n--) 241 printf("\%s\verb|\|n", (buf = zstr(*list++, buf, 0))); 242 \} 243 \end{alltt} 244 245 \noindent 246 Note, this example assumes that all values are 247 non-negative. 248 249 250 251 \newpage 252 \section{Comparison} 253 \label{sec:Comparison} 254 255 libzahl defines four functions for comparing 256 integers: {\tt zcmp}, {\tt zcmpi}, {\tt zcmpu}, 257 and {\tt zcmpmag}. These follow the same naming 258 convention as {\tt zset}, {\tt zseti}, and 259 {\tt zsetu}, as described in \secref{sec:Assignment}. 260 {\tt zcmpmag} compares the absolute value, the 261 magnitude, rather than the proper value. These 262 functions are declared as 263 264 \begin{alltt} 265 int zcmp(z_t a, z_t b); 266 int zcmpi(z_t a, int64_t b); 267 int zcmpu(z_t a, uint64_t b); 268 int zcmpmag(z_t a, z_t b); 269 \end{alltt} 270 271 \noindent 272 They behave similar to {\tt memcmp} and 273 {\tt strcmp}.\footnote{And {\tt wmemcmp} and 274 {\tt wcscmp} if you are into that mess.} 275 The return value is defined 276 277 \vspace{1em} 278 \( 279 \mbox{sgn}(a - b) = 280 \left \lbrace \begin{array}{rl} 281 -1 & \quad \textrm{if}~a < b \\ 282 0 & \quad \textrm{if}~a = b \\ 283 +1 & \quad \textrm{if}~a > b 284 \end{array} \right . 285 \) 286 \vspace{1em} 287 288 \noindent 289 for {\tt zcmp}, {\tt zcmpi}, and {\tt zcmpu}. 290 The return for {\tt zcmpmag} value is defined 291 292 \vspace{1em} 293 \( 294 \mbox{sgn}(\lvert a \rvert - \lvert b \rvert) = 295 \left \lbrace \begin{array}{rl} 296 -1 & \quad \textrm{if}~\lvert a \rvert < \lvert b \rvert \\ 297 0 & \quad \textrm{if}~\lvert a \rvert = \lvert b \rvert \\ 298 +1 & \quad \textrm{if}~\lvert a \rvert > \lvert b \rvert 299 \end{array} \right . 300 \) 301 \vspace{1em} 302 303 \noindent 304 It is discouraged, stylistically, to compare against 305 $-1$ and $+1$, rather, you should always compare 306 against $0$. Think of it as returning $a - b$, or 307 $\lvert a \rvert - \lvert b \rvert$ in the case of 308 {\tt zcmpmag}. 309 310 311 \newpage 312 \section{Marshalling} 313 \label{sec:Marshalling} 314 315 libzahl is designed to provide efficient communication 316 for multi-processes applications, including running on 317 multiple nodes on a cluster computer. However, these 318 facilities require that it is known that all processes 319 run the same version of libzahl, and run on compatible 320 microarchitectures, that is, the processors must have 321 endianness, and the intrinsic integer types in C must 322 have the same widths on all processors. When this is not 323 the case, string conversion can be used (see \secref{sec:Assignment} 324 and \secref{sec:String output}), but when it is the case 325 {\tt zsave} and {\tt zload} can be used. {\tt zsave} and 326 {\tt zload} are declared as 327 328 \begin{alltt} 329 size_t zsave(z_t a, char *buf); 330 size_t zload(z_t a, const char *buf); 331 \end{alltt} 332 333 \noindent 334 {\tt zsave} stores a version- and microarchitecture-dependent 335 binary representation of {\tt a} in {\tt buf}, and returns 336 the number of bytes written to {\tt buf}. If {\tt buf} is 337 {\tt NULL}, the numbers bytes that would have be written is 338 returned. {\tt zload} unmarshals an integers from {\tt buf}, 339 created with {\tt zsave}, into {\tt a}, and returns the number 340 of read bytes. {\tt zload} returns the value returned by 341 {\tt zsave}.