# libzahl

big integer library
git clone git://git.suckless.org/libzahl

libzahls-design.tex (7474B)

      1 \chapter{libzahl's design}
2 \label{chap:libzahl's design}
3
4 In this chapter, the design of libzahl is discussed.
5
6 \vspace{1cm}
7 \minitoc
8
9
10 \newpage
11 \section{Memory pool}
12 \label{sec:Memory pool}
13
14 Allocating memory dynamically is an expensive operation.
15 To improve performance, libzahl never deallocates memory
16 before the library is uninitialised, instead it pools
17 memory, that is no longer needed, for reuse.
18
19 Because of the memory pooling, this is a pattern to the
20 allocation sizes. In an allocation, a power of two
21 elements, plus a few elements that are discussed in
22 \secref{sec:Integer structure}, are allocated. That is,
23 the number multiplied by the size of an element.
24 Powers of two (growth factor 2) is not the most memory
25 efficient way to do this, but it is the simplest and
26 performance efficient. This power of two (sans the few
27 extra elements) is used to calculate — getting the index
28 of the only set bit — the index of the bucket in
29 which the allocation is stored when pooled. The buckets
30 are dynamic arrays with the growth factor 1.5. The
31 growth factor 1.5 is often used for dynamic arrays, it
32 is a good compromise between memory usage and performance.
33
34 libzahl also avoids allocating memory by having a set
35 of temporary variables predefined.
36
37
38 \newpage
39 \section{Error handling}
40 \label{sec:Error handling}
41
42 In C, it is traditional to return a sentiel value
43 in case an error has occurred, and set the value
44 of a global variable to describe the error that
45 has occurred. The programmer can choose whether to
46 check for errors, ignore errors where it does not
47 matter, or simple ignore errors altogether and let
48 the program eventually crash. This is a simple
49 technique that gives the programmer a better
50 understanding of what can happen. A great advantage
51 C has over most programming languages.
52
53 Another technique is to use long jumps on error.
54 This technique is not too common, but is has one
55 significant advantage. Error-checks need only be
56 preformed where the error can first be detected.
57 There is no need to check the return value at every
58 function return. This leads to cleaner code, if
59 there are many functions that can raise exceptional
60 conditions, and greater performance under some
61 conditions. This is why this technique is sometimes
62 used in high-performance libraries. libzahl uses
63 this technique.
64
65 Rather than writing
66
67 \begin{alltt}
69        goto out;
70 \end{alltt}
71
72 \noindent
73 or a bit cleaner, if there are a lot of calls,
74
75 \begin{alltt}
76    #define TRY(...) do if (__VA_ARGS__) goto out; while (0)
77    \textcolor{c}{/* \textrm{\ldots} */}
79 \end{alltt}
80
81 \noindent
82 we write
83
84 \begin{alltt}
85    jmp_buf env;
86    if (setjmp(env))
87        goto out;
88    zsetup(env);
89    \textcolor{c}{/* \textrm{\ldots} */}
91 \end{alltt}
92
93 You only need to call {\tt setjmp} and {\tt zsetup}
94 once, but can update the return point by calling
95 them once more.
96
97 If you don't need to check for errors, you can
98 disable error detection at compile-time. By defining
99 the {\tt ZAHL\_UNSAFE} C preprocessor definition
100 when compiling libzahl, and when compiling your
101 software that uses libzahl.
102
103
104 \newpage
105 \section{Integer structure}
106 \label{sec:Integer structure}
107
108 The data type used to represent a big integer with
109 libzahl is {\tt z\_t},\footnote{This name actually
110 violates the naming convention; it should be {\tt Z},
111 or {\tt Zahl} to avoid single-letter names. But this
112 violation is common-place.} defined as
113
114 \begin{alltt}
115    typedef struct zahl z_t[1];
116 \end{alltt}
117
118 \noindent
119 where {\tt struct zahl} is defined as
120
121 \begin{alltt}
122    struct zahl \{
123        int sign;            \textcolor{c}{/* \textrm{\emph{not} short for signum'} */}
124        size_t used;
125        size_t alloced;      \textcolor{c}{/* \textrm{short for allocated'} */}
126        zahl_char_t *chars;  \textcolor{c}{/* \textrm{short for characters'} */}
127    \};
128 \end{alltt}
129
130 \noindent
131 where {\tt zahl\_char\_t} is defined as
132
133 \begin{alltt}
134    typedef uint64_t zahl_char_t;
135 \end{alltt}
136
137 \noindent
138 As a user, try not to think about anything else than
139
140 \begin{alltt}
141    typedef \textcolor{c}{/* \textrm{ignore what is here} */} z_t[1];
142 \end{alltt}
143
144 \noindent
145 details can change in future versions of libzahl.
146
147 {\tt z\_t} is defined as a single-element array. This
148 is often called a reference, or a call-by-reference.
149 There are some flexibility issues with this, why
150 {\tt struct zahl} has beed added, but for most uses
151 with big integers, it makes things simpler. Particularly,
152 you need not work prepend {\tt \&} to variable when making
153 function calls, but the existence of {\tt struct zahl}
154 allows you do so if you so choose.
155
156 The {\tt .sign} member, is either $-1$, 0, or 1,
157 when the integer is negative, zero, or positive,
158 respectively. Whenever {\tt .sign} is 0, the value
159 of {\tt .used} and {\tt .chars} are undefined.
160
161 {\tt .used} holds to the number of elements used in
162 {\tt .chars}, and {\tt .alloced} holds the allocation
163 side of {\tt .chars} measured in elements minus a few
164 extra elements that are always added to the allocation.
165 {\tt .chars} is a little-endian array of 64-bit digits,
166 these 64-bit digits are called characters' in libzahl.
167 {\tt .chars} holds the absolute value of the
168 represented value.
169
170 Unless {\tt .sign} is 0, {\tt .chars} always contains
171 four extra elements, refered to as fluff. These are
172 merely allocated so functions can assume that they can
173 always manipulate groups of four characters, and need
174 not care about cases where the number of characters is
175 not a multiple of four. There are of course a few cases
176 when the precise number of characters is important.
177
178
179 \newpage
180 \section{Parameters}
181 \label{sec:Parameters}
182
183 The general order of parameters in libzahl functions
184 are: output integers, input integers, input data,
185 output data, parametric values. For example, in
186 addition, the out parameter is the first parameter.
187 But for marshalling and unmarshalling the buffer
188 is last. For random number generation the order is:
189 output, device, distribution, distribution parameters.
190 Whilst the distribution parameters are big integers,
191 they are not considered input integers. The order
192 of the input parameters are that of the order you
193 would write them using mathematical notation, this
194 also holds true if you include the output parameter
195 (as long as there is exactly one output), for example
196
197 \vspace{1em}
198 $a \gets b^c \mod d$
199 \vspace{1em}
200
201 \noindent
202 is written
203
204 \begin{verbatim}
205    zmodpow(a, b, c, d);
206 \end{verbatim}
207
208 \noindent
209 or
210
211 \begin{verbatim}
212    zmodpowu(a, b, c, d);
213 \end{verbatim}
214
215 Like any self respecting bignum library, libzahl
216 supports using the same big integer reference as
217 for output as input, as long as all the output
218 parameters are mutually unique. For example
219
220 \begin{alltt}
221    a += b;
222 \end{alltt}
223
224 \noindent
225 or
226
227 \begin{alltt}
228    a = a + b;
229 \end{alltt}
230
231 \noindent
232 is written, using libzahl, as
233
234 \begin{alltt}
236 \end{alltt}
237
238 For commutative functions, like {\tt zadd}, the
239 implementation is optimised to assume that this
240 order is more likely to be used than the alternative.
241 That is, we should, for example, write
242
243 \begin{alltt}
245 \end{alltt}
246
247 \noindent
248 rather than
249
250 \begin{alltt}