miscellaneous.tex - libzahl - big integer library

miscellaneous.tex (10583B)
      1 \chapter{Miscellaneous}
      2 \label{chap:Miscellaneous}
      3 
      4 In this chapter, we will learn some miscellaneous
      5 functions. It might seem counterintuitive to start
      6 with miscellanea, but it is probably a good idea
      7 to read this before arithmetics and more advanced
      8 topics. You may read \secref{sec:Marshalling}
      9 later. Before reading this chapter you should
     10 have read \chapref{chap:Get started}.
     11 
     12 
     13 \vspace{1cm}
     14 \minitoc
     15 
     16 
     17 \newpage
     18 \section{Assignment}
     19 \label{sec:Assignment}
     20 
     21 To be able to do anything useful, we must assign
     22 values to integers. There are three functions for
     23 this: {\tt zseti}, {\tt zsetu}, and {\tt zsets}.
     24 The last letter in the names of these function
     25 describe the data type of the input, `i', `u',
     26 and `s' stand for `integer', `unsigned integer',
     27 and `string`, respectively. These resemble the
     28 rules for the format strings in the family of
     29 {\tt printf}-functions. `Integer' of course refer
     30 to `signed integer'; for integer types in C,
     31 part from {\tt char}, the keyword {\tt signed}
     32 is implicit.
     33 
     34 Consider {\tt zseti},
     35 
     36 \begin{alltt}
     37    \textcolor{c}{z_t two;}
     38    \textcolor{c}{zinit(two);}
     39    zseti(two, 2);
     40 \end{alltt}
     41 
     42 \noindent
     43 assignes {\tt two} the value 2. The data type of
     44 the second parameter of {\tt zseti} is {\tt int64\_t}.
     45 It will accept any integer value in the range
     46 $[-2^{63},~2^{63} - 1] = [-9223372036854775808,~9223372036854775807]$,
     47 independently of the machine.\footnote{{\tt int64\_t}
     48 is defined to be a signed 64-bit integer using two's
     49 complement representation.} If this range so not wide
     50 enough, it may be possible to use {\tt zsetu}. Its
     51 second parameter of the type {\tt uint64\_t}, and thus
     52 its range is $[0,~2^{64} - 1] = [0,~18446744073709551615]$.
     53 If a need negative value is desired, {\tt zsetu} can be
     54 combined with {\tt zneg} \psecref{sec:Sign manipulation}.
     55 
     56 For enormous constants or textual input, {\tt zsets}
     57 can be used. {\tt zsets} will accept any numerical
     58 value encoded in decimal ASCII, that only contain
     59 digits, \emph{not} decimal points, whitespace,
     60 apostrophes, et cetera. However, an optional plus
     61 sign or, for negative numbers, an ASCII minus sign
     62 may be used as the very first character. Note that
     63 a proper UCS minus sign is not supported.
     64 
     65 Using what we have learned so far, and {\tt zstr}
     66 which we will learn about in \secref{sec:String output},
     67 we can construct a simple program that calculates the
     68 sum of a set of numbers.
     69 
     70 \begin{alltt}
     71    \textcolor{c}{#include <stdio.h>
     72    #include <stdlib.h>
     73    #include <zahl.h>}
     74 
     75    int
     76    main(int argc, char *argv[]) \{
     77        z_t sum, temp;
     78        \textcolor{c}{jmp_buf failenv;
     79        char *sbuf, *argv0 = *argv;
     80        if (setjmp(failenv)) \{
     81            zperror(argv0);
     82            return 1;
     83        \}
     84        zsetup(failenv);
     85        zinit(sum);
     86        zinit(term);}
     87        zsetu(sum, 0);
     88        for (argv++; *argv; argv++) \{
     89            zsets(term, *argv);
     90            zadd(sum, sum, term);
     91        \}
     92        \textcolor{c}{printf("\%s\textbackslash{}n", (sbuf = zstr(sum, NULL, 0)));
     93        free(sbuf);
     94        zfree(sum);
     95        zfree(term);
     96        zunsetup();
     97        return 0;}
     98    \}
     99 \end{alltt}
    100 
    101 Another form of assignment available in libzahl is
    102 copy-assignment. This done using {\tt zset}. As
    103 easily observable, {\tt zset} is named like
    104 {\tt zseti}, {\tt zsetu}, and {\tt zsets}, but
    105 without the input-type suffix. The lack of a
    106 input-type suffix means that the input type is
    107 {\tt z\_t}. {\tt zset} copies value of second
    108 parameter into the reference in the first. For
    109 example, if {\tt v}, of the type {\tt z\_t}, has
    110 value 10, then {\tt a} will too after the instruction
    111 
    112 \begin{alltt}
    113    zset(a, v);
    114 \end{alltt}
    115 
    116 {\tt zset} does not necessarily make an exact
    117 copy of the input. If, in the example above, the
    118 {\tt a->alloced} is greater than or equal to
    119 {\tt v->used}, {\tt a->alloced} and {\tt a->chars}
    120 are preserved, of course, the content of
    121 {\tt a->chars} is overridden. If however,
    122 {\tt a->alloced} is less then {\tt v->used},
    123 {\tt a->alloced} is assigned a minimal value at
    124 least as great as {\tt v->used} that is a power
    125 of 2, and {\tt a->chars} is updated accordingly
    126 as described in \secref{sec:Integer structure}.
    127 This of course does not apply if {\tt v} has the
    128 value 0; in such cases {\tt a->sign} is simply
    129 set to 0.
    130 
    131 {\tt zset}, {\tt zseti}, {\tt zsetu}, and
    132 {\tt zsets} require that the output-parameter
    133 has been initialised with {\tt zinit} or an
    134 equally acceptable method as described in
    135 \secref{sec:Create an integer}.
    136 
    137 {\tt zset} is often unnecessary, of course
    138 there are cases where it is needed. In some case
    139 {\tt zswap} is enough, and advantageous.
    140 {\tt zswap} is defined as
    141 
    142 \begin{alltt}
    143    \textcolor{c}{static inline} void
    144    zswap(z_t a, z_t b)
    145    \{
    146        z_t t;
    147        *t = *a;
    148        *a = *b;
    149        *b = *t;
    150    \}
    151 \end{alltt}
    152 
    153 \noindent
    154 however its implementation is optimised to be
    155 around three times as fast. It just swaps the members
    156 of the parameters, and thereby the values. There
    157 is no rewriting of {\tt .chars} involved; thus
    158 it runs in constant time. It also does not
    159 require that any argument has been initialised.
    160 After the call, {\tt a} will be initialised
    161 if and only if {\tt b} was initialised, and
    162 vice versa.
    163 
    164 
    165 \newpage
    166 \section{String output}
    167 \label{sec:String output}
    168 
    169 Few useful things can be done without creating
    170 textual output of calculations. To convert a
    171 {\tt z\_t} to ASCII string in decimal, we use the
    172 function {\tt zstr}, declared as
    173 
    174 \begin{alltt}
    175    char *zstr(z_t a, char *buf, size_t n);
    176 \end{alltt}
    177 
    178 \noindent
    179 {\tt zstr} will store the string it creates into
    180 {\tt buf} and return {\tt buf}. However, if {\tt buf}
    181 is {\tt NULL}, a new memory segment is allocated
    182 and returned. {\tt n} should be at least the length
    183 of the resulting string sans NUL termiantion, but
    184 not larger than the allocation size of {\tt buf}
    185 minus 1 byte for NUL termiantion. If {\tt buf} is
    186 {\tt NULL}, {\tt n} may be 0. However if {\tt buf}
    187 is not {\tt NULL}, it is unsafe to let {\tt n} be
    188 0, unless {\tt buf} has been allocated by {\tt zstr}
    189 for a value of {\tt a} at least as larger as the
    190 value of {\tt a} in the new call to {\tt zstr}.
    191 Combining non-\texttt{NULL} {\tt buf} with 0 {\tt n}
    192 is unsafe because {\tt zstr} will use a very fast
    193 formula for calculating a value that is at least
    194 as large as the resulting output length, rather
    195 than the exact length.
    196 
    197 The length of the string output by {\tt zstr} can
    198 be predicted by {\tt zstr\_length}, decleared as
    199 
    200 \begin{alltt}
    201    size_t zstr_length(z_t a, unsigned long long int radix);
    202 \end{alltt}
    203 
    204 \noindent
    205 It will calculated the length of {\tt a} represented
    206 in radix {\tt radix}, sans NUL termination. If
    207 {\tt radix} is 10, the length for a decimal
    208 representation is calculated.
    209 
    210 Sometimes it is possible to never allocate a {\tt buf}
    211 for {\tt zstr}. For example, in an implementation
    212 of {\tt factor}, you can reuse the string of the
    213 value to factorise, since all of its factors are
    214 guaranteed to be no longer than the factored value.
    215 
    216 \begin{alltt}
    217    void
    218    factor(char *value)
    219    \{
    220        size_t n = strlen(value);
    221        z_t product, factor;
    222        zsets(product, value);
    223        printf("\%s:", value);
    224        while (next_factor(product, factor))
    225            printf(" \%s", zstr(factor, value, n));
    226        printf("\verb|\|n");
    227    \}
    228 \end{alltt}
    229 
    230 Other times it is possible to allocate just
    231 once, for example of creating a sorted output.
    232 In such cases, the allocation can be done almost
    233 transparently.
    234 
    235 \begin{alltt}
    236    void
    237    output_presorted_decending(z_t *list, size_t n)
    238    \{
    239        char *buf = NULL;
    240        while (n--)
    241            printf("\%s\verb|\|n", (buf = zstr(*list++, buf, 0)));
    242    \}
    243 \end{alltt}
    244 
    245 \noindent
    246 Note, this example assumes that all values are
    247 non-negative.
    248 
    249 
    250 
    251 \newpage
    252 \section{Comparison}
    253 \label{sec:Comparison}
    254 
    255 libzahl defines four functions for comparing
    256 integers: {\tt zcmp}, {\tt zcmpi}, {\tt zcmpu},
    257 and {\tt zcmpmag}. These follow the same naming
    258 convention as {\tt zset}, {\tt zseti}, and
    259 {\tt zsetu}, as described in \secref{sec:Assignment}.
    260 {\tt zcmpmag} compares the absolute value, the
    261 magnitude, rather than the proper value. These
    262 functions are declared as
    263 
    264 \begin{alltt}
    265    int zcmp(z_t a, z_t b);
    266    int zcmpi(z_t a, int64_t b);
    267    int zcmpu(z_t a, uint64_t b);
    268    int zcmpmag(z_t a, z_t b);
    269 \end{alltt}
    270 
    271 \noindent
    272 They behave similar to {\tt memcmp} and
    273 {\tt strcmp}.\footnote{And {\tt wmemcmp} and
    274 {\tt wcscmp} if you are into that mess.}
    275 The return value is defined
    276 
    277 \vspace{1em}
    278 \(
    279     \mbox{sgn}(a - b) =
    280     \left \lbrace \begin{array}{rl}
    281         -1 & \quad \textrm{if}~a < b \\
    282          0 & \quad \textrm{if}~a = b \\
    283         +1 & \quad \textrm{if}~a > b
    284     \end{array} \right .
    285 \)
    286 \vspace{1em}
    287 
    288 \noindent
    289 for {\tt zcmp}, {\tt zcmpi}, and {\tt zcmpu}.
    290 The return for {\tt zcmpmag} value is defined
    291 
    292 \vspace{1em}
    293 \(
    294     \mbox{sgn}(\lvert a \rvert - \lvert b \rvert) =
    295     \left \lbrace \begin{array}{rl}
    296         -1 & \quad \textrm{if}~\lvert a \rvert < \lvert b \rvert \\
    297          0 & \quad \textrm{if}~\lvert a \rvert = \lvert b \rvert \\
    298         +1 & \quad \textrm{if}~\lvert a \rvert > \lvert b \rvert
    299     \end{array} \right .
    300 \)
    301 \vspace{1em}
    302 
    303 \noindent
    304 It is discouraged, stylistically, to compare against
    305 $-1$ and $+1$, rather, you should always compare
    306 against $0$. Think of it as returning $a - b$, or
    307 $\lvert a \rvert - \lvert b \rvert$ in the case of
    308 {\tt zcmpmag}.
    309 
    310 
    311 \newpage
    312 \section{Marshalling}
    313 \label{sec:Marshalling}
    314 
    315 libzahl is designed to provide efficient communication
    316 for multi-processes applications, including running on
    317 multiple nodes on a cluster computer. However, these
    318 facilities require that it is known that all processes
    319 run the same version of libzahl, and run on compatible
    320 microarchitectures, that is, the processors must have
    321 endianness, and the intrinsic integer types in C must
    322 have the same widths on all processors. When this is not
    323 the case, string conversion can be used (see \secref{sec:Assignment}
    324 and \secref{sec:String output}), but when it is the case
    325 {\tt zsave} and {\tt zload} can be used. {\tt zsave} and
    326 {\tt zload} are declared as
    327 
    328 \begin{alltt}
    329    size_t zsave(z_t a, char *buf);
    330    size_t zload(z_t a, const char *buf);
    331 \end{alltt}
    332 
    333 \noindent
    334 {\tt zsave} stores a version- and microarchitecture-dependent
    335 binary representation of {\tt a} in {\tt buf}, and returns
    336 the number of bytes written to {\tt buf}. If {\tt buf} is
    337 {\tt NULL}, the numbers bytes that would have be written is
    338 returned. {\tt zload} unmarshals an integers from {\tt buf},
    339 created with {\tt zsave}, into {\tt a}, and returns the number
    340 of read bytes. {\tt zload} returns the value returned by
    341 {\tt zsave}.
	libzahl big integer library
	git clone git://git.suckless.org/libzahl
	Log \| Files \| Refs \| README \| LICENSE