libzahls-design.tex - libzahl - big integer library

libzahls-design.tex (7474B)
      1 \chapter{libzahl's design}
      2 \label{chap:libzahl's design}
      3 
      4 In this chapter, the design of libzahl is discussed.
      5 
      6 \vspace{1cm}
      7 \minitoc
      8 
      9 
     10 \newpage
     11 \section{Memory pool}
     12 \label{sec:Memory pool}
     13 
     14 Allocating memory dynamically is an expensive operation.
     15 To improve performance, libzahl never deallocates memory
     16 before the library is uninitialised, instead it pools
     17 memory, that is no longer needed, for reuse.
     18 
     19 Because of the memory pooling, this is a pattern to the
     20 allocation sizes. In an allocation, a power of two
     21 elements, plus a few elements that are discussed in
     22 \secref{sec:Integer structure}, are allocated. That is,
     23 the number multiplied by the size of an element.
     24 Powers of two (growth factor 2) is not the most memory
     25 efficient way to do this, but it is the simplest and
     26 performance efficient. This power of two (sans the few
     27 extra elements) is used to calculate — getting the index
     28 of the only set bit — the index of the bucket in
     29 which the allocation is stored when pooled. The buckets
     30 are dynamic arrays with the growth factor 1.5. The
     31 growth factor 1.5 is often used for dynamic arrays, it
     32 is a good compromise between memory usage and performance.
     33 
     34 libzahl also avoids allocating memory by having a set
     35 of temporary variables predefined.
     36 
     37 
     38 \newpage
     39 \section{Error handling}
     40 \label{sec:Error handling}
     41 
     42 In C, it is traditional to return a sentiel value
     43 in case an error has occurred, and set the value
     44 of a global variable to describe the error that
     45 has occurred. The programmer can choose whether to
     46 check for errors, ignore errors where it does not
     47 matter, or simple ignore errors altogether and let
     48 the program eventually crash. This is a simple
     49 technique that gives the programmer a better
     50 understanding of what can happen. A great advantage
     51 C has over most programming languages.
     52 
     53 Another technique is to use long jumps on error.
     54 This technique is not too common, but is has one
     55 significant advantage. Error-checks need only be
     56 preformed where the error can first be detected.
     57 There is no need to check the return value at every
     58 function return. This leads to cleaner code, if
     59 there are many functions that can raise exceptional
     60 conditions, and greater performance under some
     61 conditions. This is why this technique is sometimes
     62 used in high-performance libraries. libzahl uses
     63 this technique.
     64 
     65 Rather than writing
     66 
     67 \begin{alltt}
     68    if (zadd(a, b, c))
     69        goto out;
     70 \end{alltt}
     71 
     72 \noindent
     73 or a bit cleaner, if there are a lot of calls,
     74 
     75 \begin{alltt}
     76    #define TRY(...) do if (__VA_ARGS__) goto out; while (0)
     77    \textcolor{c}{/* \textrm{\ldots} */}
     78    TRY(zadd(a, b, c));
     79 \end{alltt}
     80 
     81 \noindent
     82 we write
     83 
     84 \begin{alltt}
     85    jmp_buf env;
     86    if (setjmp(env))
     87        goto out;
     88    zsetup(env);
     89    \textcolor{c}{/* \textrm{\ldots} */}
     90    zadd(a, b, c);
     91 \end{alltt}
     92 
     93 You only need to call {\tt setjmp} and {\tt zsetup}
     94 once, but can update the return point by calling
     95 them once more.
     96 
     97 If you don't need to check for errors, you can
     98 disable error detection at compile-time. By defining
     99 the {\tt ZAHL\_UNSAFE} C preprocessor definition
    100 when compiling libzahl, and when compiling your
    101 software that uses libzahl.
    102 
    103 
    104 \newpage
    105 \section{Integer structure}
    106 \label{sec:Integer structure}
    107 
    108 The data type used to represent a big integer with
    109 libzahl is {\tt z\_t},\footnote{This name actually
    110 violates the naming convention; it should be {\tt Z},
    111 or {\tt Zahl} to avoid single-letter names. But this
    112 violation is common-place.} defined as
    113 
    114 \begin{alltt}
    115    typedef struct zahl z_t[1];
    116 \end{alltt}
    117 
    118 \noindent
    119 where {\tt struct zahl} is defined as
    120 
    121 \begin{alltt}
    122    struct zahl \{
    123        int sign;            \textcolor{c}{/* \textrm{\emph{not} short for `signum'} */}
    124        size_t used;
    125        size_t alloced;      \textcolor{c}{/* \textrm{short for `allocated'} */}
    126        zahl_char_t *chars;  \textcolor{c}{/* \textrm{short for `characters'} */}
    127    \};
    128 \end{alltt}
    129 
    130 \noindent
    131 where {\tt zahl\_char\_t} is defined as
    132 
    133 \begin{alltt}
    134    typedef uint64_t zahl_char_t;
    135 \end{alltt}
    136 
    137 \noindent
    138 As a user, try not to think about anything else than
    139 
    140 \begin{alltt}
    141    typedef \textcolor{c}{/* \textrm{ignore what is here} */} z_t[1];
    142 \end{alltt}
    143 
    144 \noindent
    145 details can change in future versions of libzahl.
    146 
    147 {\tt z\_t} is defined as a single-element array. This
    148 is often called a reference, or a call-by-reference.
    149 There are some flexibility issues with this, why
    150 {\tt struct zahl} has beed added, but for most uses
    151 with big integers, it makes things simpler. Particularly,
    152 you need not work prepend {\tt \&} to variable when making
    153 function calls, but the existence of {\tt struct zahl}
    154 allows you do so if you so choose.
    155 
    156 The {\tt .sign} member, is either $-1$, 0, or 1,
    157 when the integer is negative, zero, or positive,
    158 respectively. Whenever {\tt .sign} is 0, the value
    159 of {\tt .used} and {\tt .chars} are undefined.
    160 
    161 {\tt .used} holds to the number of elements used in
    162 {\tt .chars}, and {\tt .alloced} holds the allocation
    163 side of {\tt .chars} measured in elements minus a few
    164 extra elements that are always added to the allocation.
    165 {\tt .chars} is a little-endian array of 64-bit digits,
    166 these 64-bit digits are called `characters' in libzahl.
    167 {\tt .chars} holds the absolute value of the
    168 represented value.
    169 
    170 Unless {\tt .sign} is 0, {\tt .chars} always contains
    171 four extra elements, refered to as fluff. These are
    172 merely allocated so functions can assume that they can
    173 always manipulate groups of four characters, and need
    174 not care about cases where the number of characters is
    175 not a multiple of four. There are of course a few cases
    176 when the precise number of characters is important.
    177 
    178 
    179 \newpage
    180 \section{Parameters}
    181 \label{sec:Parameters}
    182 
    183 The general order of parameters in libzahl functions
    184 are: output integers, input integers, input data,
    185 output data, parametric values. For example, in
    186 addition, the out parameter is the first parameter.
    187 But for marshalling and unmarshalling the buffer
    188 is last. For random number generation the order is:
    189 output, device, distribution, distribution parameters.
    190 Whilst the distribution parameters are big integers,
    191 they are not considered input integers. The order
    192 of the input parameters are that of the order you
    193 would write them using mathematical notation, this
    194 also holds true if you include the output parameter
    195 (as long as there is exactly one output), for example
    196 
    197 \vspace{1em}
    198 $a \gets b^c \mod d$
    199 \vspace{1em}
    200 
    201 \noindent
    202 is written
    203 
    204 \begin{verbatim}
    205    zmodpow(a, b, c, d);
    206 \end{verbatim}
    207 
    208 \noindent
    209 or
    210 
    211 \begin{verbatim}
    212    zmodpowu(a, b, c, d);
    213 \end{verbatim}
    214 
    215 Like any self respecting bignum library, libzahl
    216 supports using the same big integer reference as
    217 for output as input, as long as all the output
    218 parameters are mutually unique. For example
    219 
    220 \begin{alltt}
    221    a += b;
    222 \end{alltt}
    223 
    224 \noindent
    225 or
    226 
    227 \begin{alltt}
    228    a = a + b;
    229 \end{alltt}
    230 
    231 \noindent
    232 is written, using libzahl, as
    233 
    234 \begin{alltt}
    235    zadd(a, a, b);
    236 \end{alltt}
    237 
    238 For commutative functions, like {\tt zadd}, the
    239 implementation is optimised to assume that this
    240 order is more likely to be used than the alternative.
    241 That is, we should, for example, write
    242 
    243 \begin{alltt}
    244    zadd(a, a, b);
    245 \end{alltt}
    246 
    247 \noindent
    248 rather than
    249 
    250 \begin{alltt}
    251    zadd(a, b, a);
    252 \end{alltt}
    253 
    254 \noindent
    255 This assumption is not made for non-commutative
    256 functions.
    257 
    258 When writting your own functions, be aware,
    259 input parameters are generally not declared {\tt const}
    260 in libzahl. Currently, some functions actually make
    261 modifications (that do not affect the value) to
    262 input parameters.
	libzahl big integer library
	git clone git://git.suckless.org/libzahl
	Log \| Files \| Refs \| README \| LICENSE