Update libgrapheme-page and add manuals - sites - public wiki contents of suckless.org

commit c0322961a34af28595d3f6e21f92d5af3313063e
parent 97acbace106b469625f5c6a9363f5ddbe49199d6
Author: Laslo Hunhold <dev@frign.de>
Date:   Thu,  6 Oct 2022 22:08:10 +0200

Update libgrapheme-page and add manuals

Signed-off-by: Laslo Hunhold <dev@frign.de>

Diffstat:
M libs.suckless.org/libgrapheme/index.md  | 135 ++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------
A libs.suckless.org/libgrapheme/man/grapheme_decode_utf8\(3\)/index.md  | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_encode_utf8\(3\)/index.md  | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_character_break\(3\)/index.md  | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_lowercase\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_lowercase_utf8\(3\)/index.md  | 38 ++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_titlecase\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_titlecase_utf8\(3\)/index.md  | 38 ++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_uppercase\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_is_uppercase_utf8\(3\)/index.md  | 38 ++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_character_break\(3\)/index.md  | 42 ++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_character_break_utf8\(3\)/index.md  | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_line_break\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_line_break_utf8\(3\)/index.md  | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break\(3\)/index.md  | 40 ++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break_utf8\(3\)/index.md  | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_word_break\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_next_word_break_utf8\(3\)/index.md  | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_lowercase\(3\)/index.md  | 40 ++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_lowercase_utf8\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_titlecase\(3\)/index.md  | 40 ++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_titlecase_utf8\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_uppercase\(3\)/index.md  | 40 ++++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/grapheme_to_uppercase_utf8\(3\)/index.md  | 39 +++++++++++++++++++++++++++++++++++++++
A libs.suckless.org/libgrapheme/man/libgrapheme\(7\)/index.md  | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

25 files changed, 1372 insertions(+), 53 deletions(-)
diff --git a/libs.suckless.org/libgrapheme/index.md b/libs.suckless.org/libgrapheme/index.md
@@ -1,60 +1,61 @@
 ![libgrapheme](libgrapheme.svg)
 
-libgrapheme is an extremely simple C99 library providing utilities for
-properly handling Unicode strings made up of user-perceived characters
-('grapheme clusters') according to the Unicode standard. While providing
-convenience functions to operate on UTF-8-encoded strings, you can also
-use libgrapheme for any other encoding as well.
-
-The necessary lookup-tables and test-data are automatically generated
-from the Unicode standard data, ensuring correctness and validation.
-A specialized 'Heisenstate' state-handling combined with
-O(log(n))-binary-search on the lookup-tables and data-recycling provides
-great processing-performance in the order of millions of codepoints per
-second.
+libgrapheme is an extremely simple freestanding C99 library providing
+utilities for properly handling strings according to the latest
+Unicode standard 15.0.0. It offers fully Unicode compliant
+
+* __grapheme cluster__ (i.e. user-perceived character) __segmentation__
+* __word segmentation__
+* __sentence segmentation__
+* detection of permissible __line break opportunities__
+* __case detection__ (lower-, upper- and title-case)
+* __case conversion__ (to lower-, upper- and title-case)
+
+on UTF-8 strings and codepoint arrays, which both can also be
+null-terminated.
+
+The necessary lookup-tables are automatically generated from the Unicode
+standard data (contained in the tarball) and heavily compressed. Over
+10,000 automatically generated conformance tests and over 150 unit tests
+ensure conformance and correctness.
 
 There is no complicated build-system involved and it's all done using
-one POSIX-compliant Makefile. All you need is a C99 compiler, because
-the data-generators are also written in C99.
+one POSIX-compliant Makefile. All you need is a C99 compiler, given
+the lookup-table-generators and compressors are also written in C99.
+The resulting library is freestanding and thus not even dependent on a
+standard library to be present at runtime.
 
-Motivation
-----------
-The goal of this project is to be a suckless and statically linkable
-alternative to the existing bloated, complicated and overscoped solutions
-for Unicode string handling (ICU, GNU's libunistring, etc.), motivating
-more hackers to properly handle Unicode strings in their projects and
-allowing this even in embedded applications.
+Development
+-----------
+You can [browse](//git.suckless.org/libgrapheme) the source code
+repository or get a copy with the following command:
 
-The problem can be easily seen when looking at the sizes of the respective
-libraries: The ICU library (libicudata.a, libicui18n.a, libicuio.a,
-libicutest.a, libicutu.a, libicuuc.a) is around 38MB and libunistring
-(libunistring.a) is around 2MB, which is unacceptable for static
-linking. Both take many minutes to compile even on a good computer and
-require a lot of dependencies, including Python for ICU. On
-the other hand libgrapheme (libgrapheme.a) only weighs in at around 40K
-and is compiled (including Unicode data parsing) in fractions of a
-second, requiring nothing but a C99 compiler and make(1).
+	git clone https://git.suckless.org/libgrapheme
 
-While ICU and libunistring offer a lot of functions and the weight mostly
-comes from locale-data provided by the Unicode standard, which is applied
-implementation-specifically (!) for some things, the same standard always
-defines a sane 'default' behaviour as an alternative in such cases that
-is satisfying in 99% of the cases and which you can rely on.
+Download
+--------
+libgrapheme follows the semantic versioning scheme.
 
-For some languages, for instance, it is necessary to have a dictionary
-on hand to always accurately determine when a word begins and ends. The
-defaults provided by the standard, though, already do a good job
-respecting the language's boundaries in the general case and are not too
-taxing in terms of performance.
+* [libgrapheme-1.0.0](//dl.suckless.org/libgrapheme/libgrapheme-1.tar.gz) (2021-12-22)
 
-Handling user-perceived characters is not locale-dependent, though, and
-does not require locale-data.
 
 Getting Started
 ---------------
-Installing libgrapheme will install the header grapheme.h and both the
-static library libgrapheme.a and the dynamic library libgrapheme.so in
-the respective folders. Access the manual under libgrapheme(7) by typing
+Installing libgrapheme via
+
+	make install
+
+will install the header grapheme.h and both the static library
+libgrapheme.a and the dynamic library libgrapheme.so (with symlinks) in
+the respective folders. The conformance and unit tests can be run with
+
+	make test
+
+and comparative benchmarks against libutf8proc can be run with
+
+	make benchmark
+
+You can access the manual via libgrapheme(7) by typing
 
 	man libgrapheme
 
@@ -109,16 +110,44 @@ and the output is
 	 6 bytes | நி
 	 1 bytes | !
 
-Development
------------
-You can [browse](//git.suckless.org/libgrapheme) the source code
-repository or get a copy with the following command:
 
-	git clone https://git.suckless.org/libgrapheme
+Motivation
+----------
+The goal of this project is to be a suckless and statically linkable
+alternative to the existing bloated, complicated, overscoped and/or
+incorrect solutions for Unicode string handling (ICU, GNU's
+libunistring, libutf8proc, etc.), motivating more hackers to properly
+handle Unicode strings in their projects and allowing this even in
+embedded applications.
 
-Download
---------
-* [libgrapheme-1](//dl.suckless.org/libgrapheme/libgrapheme-1.tar.gz) (2021-12-22)
+The problem can be easily seen when looking at the sizes of the respective
+libraries: The ICU library (libicudata.a, libicui18n.a, libicuio.a,
+libicutest.a, libicutu.a, libicuuc.a) is around 38MB and libunistring
+(libunistring.a) is around 2MB, which is unacceptable for static
+linking. Both take many minutes to compile even on a good computer and
+require a lot of dependencies, including Python for ICU. On
+the other hand libgrapheme (libgrapheme.a) only weighs in at around 300K
+and is compiled (including Unicode data parsing and compression) in
+under a second, requiring nothing but a C99 compiler and POSIX make(1).
+
+Some libraries, like libutf8proc and libunistring, are incorrect by
+basing their API on assumptions that haven't been true for years
+(e.g. offering stateless grapheme cluster segmentation even though the
+underlying algorithm is not stateless). As an additional factor,
+libutf8proc's UTF-8-decoder is unsafe, as it allows overlong encodings
+that can be easily used for exploits.
+
+While ICU and libunistring offer a lot of functions and the weight mostly
+comes from locale-data provided by the Unicode standard, which is applied
+implementation-specifically (!) for some things, the same standard always
+defines a sane 'default' behaviour as an alternative in such cases that
+is satisfying in 99% of the cases and which you can rely on.
+
+For some languages, for instance, it is necessary to have a dictionary
+on hand to always accurately determine when a word begins and ends. The
+defaults provided by the standard, though, already do a great job
+respecting the language's boundaries in the general case and are not too
+taxing in terms of performance.
 
 Author
 ------
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_decode_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_decode_utf8\(3\)/index.md
@@ -0,0 +1,80 @@
+	GRAPHEME_DECODE_UTF8(3)	   Library Functions Manual    GRAPHEME_DECODE_UTF8(3)
+	
+	NAME
+	     grapheme_decode_utf8 – decode first codepoint in UTF-8-encoded string
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_decode_utf8(const char *str, size_t len, uint_least32_t *cp);
+	
+	DESCRIPTION
+	     The grapheme_decode_utf8() function decodes the first codepoint in the
+	     UTF-8-encoded string str of length len.  If the UTF-8-sequence is invalid
+	     (overlong encoding, unexpected byte, string ends unexpectedly, empty
+	     string, etc.) the decoding is stopped at the last processed byte and the
+	     decoded codepoint set to GRAPHEME_INVALID_CODEPOINT.
+	
+	     If cp is not NULL the decoded codepoint is stored in the memory pointed
+	     to by cp.
+	
+	     Given NUL has a unique 1 byte representation, it is safe to operate on
+	     NUL-terminated strings by setting len to SIZE_MAX (stdint.h is already
+	     included by grapheme.h) and terminating when cp is 0 (see EXAMPLES for an
+	     example).
+	
+	RETURN VALUES
+	     The grapheme_decode_utf8() function returns the number of processed bytes
+	     and 0 if str is NULL or len is 0.	If the string ends unexpectedly in a
+	     multibyte sequence, the desired length (that is larger than len) is
+	     returned.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <inttypes.h>
+	     #include <stdio.h>
+	
+	     void
+	     print_cps(const char *str, size_t len)
+	     {
+		     size_t ret, off;
+		     uint_least32_t cp;
+	
+		     for (off = 0; off < len; off += ret) {
+			     if ((ret = grapheme_decode_utf8(str + off,
+							     len - off, &cp)) > (len - off)) {
+				     /*
+				      * string ended unexpectedly in the middle of a
+				      * multibyte sequence and we have the choice
+				      * here to possibly expand str by ret - len + off
+				      * bytes to get a full sequence, but we just
+				      * bail out in this case.
+				      */
+				     break;
+			     }
+			     printf("%"PRIxLEAST32"\n", cp);
+		     }
+	     }
+	
+	     void
+	     print_cps_nul_terminated(const char *str)
+	     {
+		     size_t ret, off;
+		     uint_least32_t cp;
+	
+		     for (off = 0; (ret = grapheme_decode_utf8(str + off,
+							       SIZE_MAX, &cp)) > 0 &&
+			  cp != 0; off += ret) {
+			     printf("%"PRIxLEAST32"\n", cp);
+		     }
+	     }
+	
+	SEE ALSO
+	     grapheme_encode_utf8(3), libgrapheme(7)
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
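
The return-value contract described in grapheme_decode_utf8(3) above can be illustrated with a small standalone decoder: it returns the bytes processed, 0 for NULL or empty input, and a value larger than len when the input ends inside a multibyte sequence, and it rejects overlong encodings. This is only a sketch under assumptions, not libgrapheme's implementation. The names `sketch_decode_utf8` and `SKETCH_INVALID_CODEPOINT` are hypothetical, and using U+FFFD as the invalid marker is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical stand-in for GRAPHEME_INVALID_CODEPOINT (value assumed) */
#define SKETCH_INVALID_CODEPOINT UINT32_C(0xFFFD)

/*
 * Decode the first codepoint of str (len bytes). Returns the number of
 * processed bytes, 0 for NULL/empty input, or the needed sequence
 * length (larger than len) if the input ends inside a sequence.
 */
size_t
sketch_decode_utf8(const char *str, size_t len, uint_least32_t *cp)
{
	const unsigned char *s = (const unsigned char *)str;
	uint_least32_t c = SKETCH_INVALID_CODEPOINT, min = 0;
	size_t need, ret, i;

	if (s == NULL || len == 0) {
		ret = 0; /* nothing to process */
	} else if (s[0] < 0x80) {
		c = s[0]; /* single-byte (ASCII) sequence */
		ret = 1;
	} else {
		/* the lead byte fixes the length and the smallest legal value */
		if ((s[0] & 0xE0) == 0xC0) {
			c = s[0] & 0x1F; need = 2; min = 0x80;
		} else if ((s[0] & 0xF0) == 0xE0) {
			c = s[0] & 0x0F; need = 3; min = 0x800;
		} else if ((s[0] & 0xF8) == 0xF0) {
			c = s[0] & 0x07; need = 4; min = 0x10000;
		} else {
			need = 1; /* stray continuation or invalid lead byte */
		}

		/* consume continuation bytes (10xxxxxx) while they last */
		for (i = 1; i < need && i < len && (s[i] & 0xC0) == 0x80; i++) {
			c = (c << 6) | (s[i] & 0x3F);
		}

		if (need == 1) {
			c = SKETCH_INVALID_CODEPOINT;
			ret = 1;
		} else if (i == need) {
			/* complete: reject overlong, out-of-range, surrogates */
			if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
				c = SKETCH_INVALID_CODEPOINT;
			}
			ret = need;
		} else if (i == len) {
			c = SKETCH_INVALID_CODEPOINT;
			ret = need; /* truncated: report the desired length */
		} else {
			c = SKETCH_INVALID_CODEPOINT;
			ret = i; /* unexpected byte: stop in front of it */
		}
	}

	if (cp != NULL) {
		*cp = c;
	}

	return ret;
}
```

The minimum-value check is what rejects overlong forms such as the two-byte sequence 0xC0 0x80, which would otherwise decode to 0 and smuggle a NUL past a byte-wise check.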
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_encode_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_encode_utf8\(3\)/index.md
@@ -0,0 +1,87 @@
+	GRAPHEME_ENCODE_UTF8(3)	   Library Functions Manual    GRAPHEME_ENCODE_UTF8(3)
+	
+	NAME
+	     grapheme_encode_utf8 – encode codepoint into UTF-8 string
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_encode_utf8(uint_least32_t cp, char *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_encode_utf8() function encodes the codepoint cp into a
+	     UTF-8-string.  If str is not NULL and len is large enough it writes the
+	     UTF-8-string to the memory pointed to by str.  Otherwise no data is
+	     written.
+	
+	RETURN VALUES
+	     The grapheme_encode_utf8() function returns the length (in bytes) of the
+	     UTF-8-string resulting from encoding cp, even if len is not large enough
+	     or str is NULL.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stddef.h>
+	     #include <stdlib.h>
+	
+	     size_t
+	     cps_to_utf8(const uint_least32_t *cp, size_t cplen, char *str, size_t len)
+	     {
+		     size_t i, off, ret;
+	
+		     for (i = 0, off = 0; i < cplen; i++, off += ret) {
+			     if ((ret = grapheme_encode_utf8(cp[i], str + off,
+							     len - off)) > (len - off)) {
+				     /* buffer too small */
+				     break;
+			     }
+		     }
+	
+		     return off;
+	     }
+	
+	     size_t
+	     cps_bytelen(const uint_least32_t *cp, size_t cplen)
+	     {
+		     size_t i, len;
+	
+		     for (i = 0, len = 0; i < cplen; i++) {
+			     len += grapheme_encode_utf8(cp[i], NULL, 0);
+		     }
+	
+		     return len;
+	     }
+	
+	     char *
+	     cps_to_utf8_alloc(const uint_least32_t *cp, size_t cplen)
+	     {
+		     char *str;
+		     size_t len, i, ret, off;
+	
+		     len = cps_bytelen(cp, cplen);
+	
+		     if (!(str = malloc(len + 1))) {
+			     return NULL;
+		     }
+	
+		     for (i = 0, off = 0; i < cplen; i++, off += ret) {
+			     if ((ret = grapheme_encode_utf8(cp[i], str + off,
+							     len - off)) > (len - off)) {
+				     /* buffer too small */
+				     break;
+			     }
+		     }
+		     str[off] = '\0';
+	
+		     return str;
+	     }
+	
+	SEE ALSO
+	     grapheme_decode_utf8(3), libgrapheme(7)
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
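
The "return the required length even when nothing is written" behaviour described in grapheme_encode_utf8(3) above can be sketched with a standalone encoder. This is an illustration under assumptions, not libgrapheme's implementation. The name `sketch_encode_utf8` is hypothetical, and replacing surrogates and out-of-range values with U+FFFD is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode cp as UTF-8. Always returns the number of bytes the encoding
 * needs; bytes are only written if str is not NULL and len suffices.
 */
size_t
sketch_encode_utf8(uint_least32_t cp, char *str, size_t len)
{
	/* lead-byte markers indexed by sequence length (0 and 1 unused) */
	static const unsigned char lead[] = { 0x00, 0x00, 0xC0, 0xE0, 0xF0 };
	unsigned char *out = (unsigned char *)str;
	size_t need, i;

	/* assumed policy: map unencodable values to U+FFFD */
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
		cp = 0xFFFD;
	}

	need = (cp < 0x80) ? 1 : (cp < 0x800) ? 2 : (cp < 0x10000) ? 3 : 4;

	if (out == NULL || len < need) {
		return need; /* report the length, write nothing */
	}

	if (need == 1) {
		out[0] = (unsigned char)cp;
		return 1;
	}

	/* lead byte carries the top bits, each further byte six more */
	out[0] = (unsigned char)(lead[need] | (cp >> (6 * (need - 1))));
	for (i = 1; i < need; i++) {
		out[i] = (unsigned char)(0x80 |
		         ((cp >> (6 * (need - 1 - i))) & 0x3F));
	}

	return need;
}
```

Calling it with a NULL buffer and len 0 gives the two-pass pattern used by cps_bytelen() in the manual: measure first, allocate, then encode.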
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_character_break\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_character_break\(3\)/index.md
@@ -0,0 +1,69 @@
+	GRAPHEME_IS_CHARACTER_BREAK(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_is_character_break – test for a grapheme cluster break between
+	     two codepoints
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_character_break(uint_least32_t cp1, uint_least32_t cp2,
+		 uint_least16_t *state);
+	
+	DESCRIPTION
+	     The grapheme_is_character_break() function determines if there is a
+	     grapheme cluster break (see libgrapheme(7)) between the two codepoints
+	     cp1 and cp2.  By specification this decision depends on a state that can
+	     at most be completely reset after detecting a break and must be reset
+	     every time one deviates from sequential processing.
+	
+	     If state is NULL grapheme_is_character_break() behaves as if it was
+	     called with a fully reset state.
+	
+	RETURN VALUES
+	     The grapheme_is_character_break() function returns true if there is a
+	     grapheme cluster break between the codepoints cp1 and cp2 and false if
+	     there is not.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stdint.h>
+	     #include <stdio.h>
+	     #include <string.h>
+	
+	     int
+	     main(void)
+	     {
+		     uint_least16_t state = 0;
+		     uint_least32_t s1[] = ..., s2[] = ...; /* two input arrays */
+		     size_t i;
+	
+		     for (i = 0; i + 1 < sizeof(s1) / sizeof(*s1); i++) {
+			     if (grapheme_is_character_break(s1[i], s1[i + 1], &state)) {
+				     printf("break in s1 at offset %zu\n", i);
+			     }
+		     }
+		     memset(&state, 0, sizeof(state)); /* reset state */
+		     for (i = 0; i + 1 < sizeof(s2) / sizeof(*s2); i++) {
+			     if (grapheme_is_character_break(s2[i], s2[i + 1], &state)) {
+				     printf("break in s2 at offset %zu\n", i);
+			     }
+		     }
+	
+		     return 0;
+	     }
+	
+	SEE ALSO
+	     grapheme_next_character_break(3), grapheme_next_character_break_utf8(3),
+	     libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_character_break() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_lowercase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_lowercase\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_IS_LOWERCASE(3)   Library Functions Manual   GRAPHEME_IS_LOWERCASE(3)
+	
+	NAME
+	     grapheme_is_lowercase – check if codepoint array is lowercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_lowercase(const uint_least32_t *str, size_t len,
+		 size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_lowercase() function checks if the codepoint array str is
+	     lowercase and writes the length of the matching lowercase-sequence to the
+	     integer pointed to by caselen, unless caselen is set to NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array str is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_is_lowercase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_is_lowercase() function returns true if the codepoint array
+	     str is lowercase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_lowercase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_lowercase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_lowercase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_lowercase_utf8\(3\)/index.md
@@ -0,0 +1,38 @@
+	GRAPHEME_IS_LOWERCASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_is_lowercase_utf8 – check if UTF-8-encoded string is lowercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_lowercase_utf8(const char *str, size_t len, size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_lowercase_utf8() function checks if the UTF-8-encoded
+	     string str is lowercase and writes the length of the matching lowercase-
+	     sequence to the integer pointed to by caselen, unless caselen is set to
+	     NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string str is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_is_lowercase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_is_lowercase_utf8() function returns true if the
+	     UTF-8-encoded string str is lowercase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_lowercase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_lowercase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_titlecase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_titlecase\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_IS_TITLECASE(3)   Library Functions Manual   GRAPHEME_IS_TITLECASE(3)
+	
+	NAME
+	     grapheme_is_titlecase – check if codepoint array is titlecase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_titlecase(const uint_least32_t *str, size_t len,
+		 size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_titlecase() function checks if the codepoint array str is
+	     titlecase and writes the length of the matching titlecase-sequence to the
+	     integer pointed to by caselen, unless caselen is set to NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array str is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_is_titlecase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_is_titlecase() function returns true if the codepoint array
+	     str is titlecase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_titlecase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_titlecase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_titlecase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_titlecase_utf8\(3\)/index.md
@@ -0,0 +1,38 @@
+	GRAPHEME_IS_TITLECASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_is_titlecase_utf8 – check if UTF-8-encoded string is titlecase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_titlecase_utf8(const char *str, size_t len, size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_titlecase_utf8() function checks if the UTF-8-encoded
+	     string str is titlecase and writes the length of the matching titlecase-
+	     sequence to the integer pointed to by caselen, unless caselen is set to
+	     NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string str is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_is_titlecase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_is_titlecase_utf8() function returns true if the
+	     UTF-8-encoded string str is titlecase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_titlecase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_titlecase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_uppercase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_uppercase\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_IS_UPPERCASE(3)   Library Functions Manual   GRAPHEME_IS_UPPERCASE(3)
+	
+	NAME
+	     grapheme_is_uppercase – check if codepoint array is uppercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_uppercase(const uint_least32_t *str, size_t len,
+		 size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_uppercase() function checks if the codepoint array str is
+	     uppercase and writes the length of the matching uppercase-sequence to the
+	     integer pointed to by caselen, unless caselen is set to NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array str is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_is_uppercase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_is_uppercase() function returns true if the codepoint array
+	     str is uppercase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_uppercase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_uppercase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_is_uppercase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_is_uppercase_utf8\(3\)/index.md
@@ -0,0 +1,38 @@
+	GRAPHEME_IS_UPPERCASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_is_uppercase_utf8 – check if UTF-8-encoded string is uppercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     bool
+	     grapheme_is_uppercase_utf8(const char *str, size_t len, size_t *caselen);
+	
+	DESCRIPTION
+	     The grapheme_is_uppercase_utf8() function checks if the UTF-8-encoded
+	     string str is uppercase and writes the length of the matching uppercase-
+	     sequence to the integer pointed to by caselen, unless caselen is set to
+	     NULL.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string str is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_is_uppercase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_is_uppercase_utf8() function returns true if the
+	     UTF-8-encoded string str is uppercase, otherwise false.
+	
+	SEE ALSO
+	     grapheme_is_uppercase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_is_uppercase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_character_break\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_character_break\(3\)/index.md
@@ -0,0 +1,42 @@
+	GRAPHEME_NEXT_CHARACTER_BREAK(3)		      Library Functions Manual
+	
+	NAME
+	     grapheme_next_character_break – determine codepoint-offset to next
+	     grapheme cluster break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_character_break(const uint_least32_t *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_character_break() function computes the offset (in
+	     codepoints) to the next grapheme cluster break (see libgrapheme(7)) in
+	     the codepoint array str of length len.  If a grapheme cluster begins at
+	     str this offset is equal to the length of said grapheme cluster.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_next_character_break_utf8(3) can be
+	     used instead.
+	
+	RETURN VALUES
+	     The grapheme_next_character_break() function returns the offset (in
+	     codepoints) to the next grapheme cluster break in str or 0 if str is
+	     NULL.
+	
+	SEE ALSO
+	     grapheme_is_character_break(3), grapheme_next_character_break_utf8(3),
+	     libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_character_break() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_character_break_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_character_break_utf8\(3\)/index.md
@@ -0,0 +1,77 @@
+	GRAPHEME_NEXT_CHARACTER_BREAK_UTF8(3)		      Library Functions Manual
+	
+	NAME
+	     grapheme_next_character_break_utf8 – determine byte-offset to next
+	     grapheme cluster break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_character_break_utf8(const char *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_character_break_utf8() function computes the offset (in
+	     bytes) to the next grapheme cluster break (see libgrapheme(7)) in the
+	     UTF-8-encoded string str of length len.  If a grapheme cluster begins at
+	     str this offset is equal to the length of said grapheme cluster.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_is_character_break(3) and
+	     grapheme_next_character_break(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_next_character_break_utf8() function returns the offset (in
+	     bytes) to the next grapheme cluster break in str or 0 if str is NULL.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stdint.h>
+	     #include <stdio.h>
+	
+	     int
+	     main(void)
+	     {
+		     /* UTF-8 encoded input */
+		     char *s = "T\xC3\xABst \xF0\x9F\x91\xA8\xE2\x80\x8D\xF0"
+			       "\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x91\xA6 \xF0"
+			       "\x9F\x87\xBA\xF0\x9F\x87\xB8 \xE0\xA4\xA8\xE0"
+			       "\xA5\x80 \xE0\xAE\xA8\xE0\xAE\xBF!";
+		     size_t ret, len, off;
+	
+		     printf("Input: \"%s\"\n", s);
+	
+		     /* print each grapheme cluster with byte-length */
+		     printf("grapheme clusters in NUL-delimited input:\n");
+		     for (off = 0; s[off] != '\0'; off += ret) {
+			     ret = grapheme_next_character_break_utf8(s + off, SIZE_MAX);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+		     printf("\n");
+	
+		     /* do the same, but this time string is length-delimited */
+		     len = 17;
+		     printf("grapheme clusters in input delimited to %zu bytes:\n", len);
+		     for (off = 0; off < len; off += ret) {
+			     ret = grapheme_next_character_break_utf8(s + off, len - off);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+	
+		     return 0;
+	     }
+	
+	SEE ALSO
+	     grapheme_next_character_break(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_character_break_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_line_break\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_line_break\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_NEXT_LINE_BREAK(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_next_line_break – determine codepoint-offset to next possible
+	     line break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_line_break(const uint_least32_t *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_line_break() function computes the offset (in
+	     codepoints) to the next possible line break (see libgrapheme(7)) in the
+	     codepoint array str of length len.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_next_line_break_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_next_line_break() function returns the offset (in
+	     codepoints) to the next possible line break in str or 0 if str is NULL.
+	
+	SEE ALSO
+	     grapheme_next_line_break_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_line_break() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_line_break_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_line_break_utf8\(3\)/index.md
@@ -0,0 +1,75 @@
+	GRAPHEME_NEXT_LINE_BREAK_UTF8(3)		      Library Functions Manual
+	
+	NAME
+	     grapheme_next_line_break_utf8 – determine byte-offset to next possible
+	     line break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_line_break_utf8(const char *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_line_break_utf8() function computes the offset (in
+	     bytes) to the next possible line break (see libgrapheme(7)) in the
+	     UTF-8-encoded string str of length len.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_next_line_break(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_next_line_break_utf8() function returns the offset (in
+	     bytes) to the next possible line break in str or 0 if str is NULL.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stdint.h>
+	     #include <stdio.h>
+	
+	     int
+	     main(void)
+	     {
+		     /* UTF-8 encoded input */
+		     char *s = "T\xC3\xABst \xF0\x9F\x91\xA8\xE2\x80\x8D\xF0"
+			       "\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x91\xA6 \xF0"
+			       "\x9F\x87\xBA\xF0\x9F\x87\xB8 \xE0\xA4\xA8\xE0"
+			       "\xA5\x80 \xE0\xAE\xA8\xE0\xAE\xBF!";
+		     size_t ret, len, off;
+	
+		     printf("Input: \"%s\"\n", s);
+	
+		     /* print each possible line with byte-length */
+		     printf("possible lines in NUL-delimited input:\n");
+		     for (off = 0; s[off] != '\0'; off += ret) {
+			     ret = grapheme_next_line_break_utf8(s + off, SIZE_MAX);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+		     printf("\n");
+	
+		     /* do the same, but this time string is length-delimited */
+		     len = 17;
+		     printf("possible lines in input delimited to %zu bytes:\n", len);
+		     for (off = 0; off < len; off += ret) {
+			     ret = grapheme_next_line_break_utf8(s + off, len - off);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+	
+		     return 0;
+	     }
+	
+	SEE ALSO
+	     grapheme_next_line_break(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_line_break_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break\(3\)/index.md
@@ -0,0 +1,40 @@
+	GRAPHEME_NEXT_SENTENCE_BREAK(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_next_sentence_break – determine codepoint-offset to next
+	     sentence break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_sentence_break(const uint_least32_t *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_sentence_break() function computes the offset (in
+	     codepoints) to the next sentence break (see libgrapheme(7)) in the
+	     codepoint array str of length len.  If a sentence begins at str this
+	     offset is equal to the length of said sentence.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_next_sentence_break_utf8(3) can be
+	     used instead.
+	
+	RETURN VALUES
+	     The grapheme_next_sentence_break() function returns the offset (in
+	     codepoints) to the next sentence break in str or 0 if str is NULL.
+	
+	SEE ALSO
+	     grapheme_next_sentence_break_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_sentence_break() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_sentence_break_utf8\(3\)/index.md
@@ -0,0 +1,77 @@
+	GRAPHEME_NEXT_SENTENCE_BREAK_UTF8(3)		      Library Functions Manual
+	
+	NAME
+	     grapheme_next_sentence_break_utf8 – determine byte-offset to next
+	     sentence break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_sentence_break_utf8(const char *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_sentence_break_utf8() function computes the offset (in
+	     bytes) to the next sentence break (see libgrapheme(7)) in the
+	     UTF-8-encoded string str of length len.  If a sentence begins at str this
+	     offset is equal to the length of said sentence.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_next_sentence_break(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_next_sentence_break_utf8() function returns the offset (in
+	     bytes) to the next sentence break in str or 0 if str is NULL.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stdint.h>
+	     #include <stdio.h>
+	
+	     int
+	     main(void)
+	     {
+		     /* UTF-8 encoded input */
+		     char *s = "T\xC3\xABst \xF0\x9F\x91\xA8\xE2\x80\x8D\xF0"
+			       "\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x91\xA6 \xF0"
+			       "\x9F\x87\xBA\xF0\x9F\x87\xB8 \xE0\xA4\xA8\xE0"
+			       "\xA5\x80 \xE0\xAE\xA8\xE0\xAE\xBF!";
+		     size_t ret, len, off;
+	
+		     printf("Input: \"%s\"\n", s);
+	
+		     /* print each sentence with byte-length */
+		     printf("sentences in NUL-delimited input:\n");
+		     for (off = 0; s[off] != '\0'; off += ret) {
+			     ret = grapheme_next_sentence_break_utf8(s + off, SIZE_MAX);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+		     printf("\n");
+	
+		     /* do the same, but this time string is length-delimited */
+		     len = 17;
+		     printf("sentences in input delimited to %zu bytes:\n", len);
+		     for (off = 0; off < len; off += ret) {
+			     ret = grapheme_next_sentence_break_utf8(s + off, len - off);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+	
+		     return 0;
+	     }
+	
+	SEE ALSO
+	     grapheme_next_sentence_break(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_sentence_break_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_word_break\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_word_break\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_NEXT_WORD_BREAK(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_next_word_break – determine codepoint-offset to next word break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_word_break(const uint_least32_t *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_word_break() function computes the offset (in
+	     codepoints) to the next word break (see libgrapheme(7)) in the codepoint
+	     array str of length len.  If a word begins at str this offset is equal to
+	     the length of said word.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_next_word_break_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_next_word_break() function returns the offset (in
+	     codepoints) to the next word break in str or 0 if str is NULL.
+	
+	SEE ALSO
+	     grapheme_next_word_break_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_word_break() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_next_word_break_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_next_word_break_utf8\(3\)/index.md
@@ -0,0 +1,75 @@
+	GRAPHEME_NEXT_WORD_BREAK_UTF8(3)		      Library Functions Manual
+	
+	NAME
+	     grapheme_next_word_break_utf8 – determine byte-offset to next word break
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_next_word_break_utf8(const char *str, size_t len);
+	
+	DESCRIPTION
+	     The grapheme_next_word_break_utf8() function computes the offset (in
+	     bytes) to the next word break (see libgrapheme(7)) in the UTF-8-encoded
+	     string str of length len.  If a word begins at str this offset is equal
+	     to the length of said word.
+	
+	     If len is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the string str is interpreted to be NUL-terminated and processing stops
+	     when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_next_word_break(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_next_word_break_utf8() function returns the offset (in
+	     bytes) to the next word break in str or 0 if str is NULL.
+	
+	EXAMPLES
+	     /* cc (-static) -o example example.c -lgrapheme */
+	     #include <grapheme.h>
+	     #include <stdint.h>
+	     #include <stdio.h>
+	
+	     int
+	     main(void)
+	     {
+		     /* UTF-8 encoded input */
+		     char *s = "T\xC3\xABst \xF0\x9F\x91\xA8\xE2\x80\x8D\xF0"
+			       "\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x91\xA6 \xF0"
+			       "\x9F\x87\xBA\xF0\x9F\x87\xB8 \xE0\xA4\xA8\xE0"
+			       "\xA5\x80 \xE0\xAE\xA8\xE0\xAE\xBF!";
+		     size_t ret, len, off;
+	
+		     printf("Input: \"%s\"\n", s);
+	
+		     /* print each word with byte-length */
+		     printf("words in NUL-delimited input:\n");
+		     for (off = 0; s[off] != '\0'; off += ret) {
+			     ret = grapheme_next_word_break_utf8(s + off, SIZE_MAX);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+		     printf("\n");
+	
+		     /* do the same, but this time string is length-delimited */
+		     len = 17;
+		     printf("words in input delimited to %zu bytes:\n", len);
+		     for (off = 0; off < len; off += ret) {
+			     ret = grapheme_next_word_break_utf8(s + off, len - off);
+			     printf("%2zu bytes | %.*s\n", ret, (int)ret, s + off);
+		     }
+	
+		     return 0;
+	     }
+	
+	SEE ALSO
+	     grapheme_next_word_break(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_next_word_break_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_lowercase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_lowercase\(3\)/index.md
@@ -0,0 +1,40 @@
+	GRAPHEME_TO_LOWERCASE(3)   Library Functions Manual   GRAPHEME_TO_LOWERCASE(3)
+	
+	NAME
+	     grapheme_to_lowercase – convert codepoint array to lowercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_lowercase(const uint_least32_t *src, size_t srclen,
+		 uint_least32_t *dest, size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_lowercase() function converts the codepoint array src to
+	     lowercase and writes the result to dest up to destlen, unless dest is set
+	     to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array src is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_to_lowercase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_to_lowercase() function returns the number of codepoints in
+	     the array resulting from converting src to lowercase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_lowercase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_lowercase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_lowercase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_lowercase_utf8\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_TO_LOWERCASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_to_lowercase_utf8 – convert UTF-8-encoded string to lowercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_lowercase_utf8(const char *src, size_t srclen, char *dest,
+		 size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_lowercase_utf8() function converts the UTF-8-encoded
+	     string src to lowercase and writes the result to dest up to destlen,
+	     unless dest is set to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string src is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_to_lowercase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_to_lowercase_utf8() function returns the number of bytes in
+	     the array resulting from converting src to lowercase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_lowercase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_lowercase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_titlecase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_titlecase\(3\)/index.md
@@ -0,0 +1,40 @@
+	GRAPHEME_TO_TITLECASE(3)   Library Functions Manual   GRAPHEME_TO_TITLECASE(3)
+	
+	NAME
+	     grapheme_to_titlecase – convert codepoint array to titlecase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_titlecase(const uint_least32_t *src, size_t srclen,
+		 uint_least32_t *dest, size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_titlecase() function converts the codepoint array src to
+	     titlecase and writes the result to dest up to destlen, unless dest is set
+	     to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array src is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_to_titlecase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_to_titlecase() function returns the number of codepoints in
+	     the array resulting from converting src to titlecase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_titlecase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_titlecase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_titlecase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_titlecase_utf8\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_TO_TITLECASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_to_titlecase_utf8 – convert UTF-8-encoded string to titlecase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_titlecase_utf8(const char *src, size_t srclen, char *dest,
+		 size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_titlecase_utf8() function converts the UTF-8-encoded
+	     string src to titlecase and writes the result to dest up to destlen,
+	     unless dest is set to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string src is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_to_titlecase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_to_titlecase_utf8() function returns the number of bytes in
+	     the array resulting from converting src to titlecase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_titlecase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_titlecase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_uppercase\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_uppercase\(3\)/index.md
@@ -0,0 +1,40 @@
+	GRAPHEME_TO_UPPERCASE(3)   Library Functions Manual   GRAPHEME_TO_UPPERCASE(3)
+	
+	NAME
+	     grapheme_to_uppercase – convert codepoint array to uppercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_uppercase(const uint_least32_t *src, size_t srclen,
+		 uint_least32_t *dest, size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_uppercase() function converts the codepoint array src to
+	     uppercase and writes the result to dest up to destlen, unless dest is set
+	     to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the codepoint array src is interpreted to be NUL-terminated and
+	     processing stops when a codepoint with the value 0 is encountered.
+	
+	     For UTF-8-encoded input data grapheme_to_uppercase_utf8(3) can be used
+	     instead.
+	
+	RETURN VALUES
+	     The grapheme_to_uppercase() function returns the number of codepoints in
+	     the array resulting from converting src to uppercase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_uppercase_utf8(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_uppercase() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/grapheme_to_uppercase_utf8\(3\)/index.md b/libs.suckless.org/libgrapheme/man/grapheme_to_uppercase_utf8\(3\)/index.md
@@ -0,0 +1,39 @@
+	GRAPHEME_TO_UPPERCASE_UTF8(3)			      Library Functions Manual
+	
+	NAME
+	     grapheme_to_uppercase_utf8 – convert UTF-8-encoded string to uppercase
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	     size_t
+	     grapheme_to_uppercase_utf8(const char *src, size_t srclen, char *dest,
+		 size_t destlen);
+	
+	DESCRIPTION
+	     The grapheme_to_uppercase_utf8() function converts the UTF-8-encoded
+	     string src to uppercase and writes the result to dest up to destlen,
+	     unless dest is set to NULL.
+	
+	     If srclen is set to SIZE_MAX (stdint.h is already included by grapheme.h)
+	     the UTF-8-encoded string src is interpreted to be NUL-terminated and
+	     processing stops when a NUL-byte is encountered.
+	
+	     For non-UTF-8 input data grapheme_to_uppercase(3) can be used instead.
+	
+	RETURN VALUES
+	     The grapheme_to_uppercase_utf8() function returns the number of bytes in
+	     the array resulting from converting src to uppercase, even if destlen is
+	     not large enough or dest is NULL.
+	
+	SEE ALSO
+	     grapheme_to_uppercase(3), libgrapheme(7)
+	
+	STANDARDS
+	     grapheme_to_uppercase_utf8() is compliant with the Unicode 15.0.0
+	     specification.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
diff --git a/libs.suckless.org/libgrapheme/man/libgrapheme\(7\)/index.md b/libs.suckless.org/libgrapheme/man/libgrapheme\(7\)/index.md
@@ -0,0 +1,122 @@
+	LIBGRAPHEME(7)	       Miscellaneous Information Manual		LIBGRAPHEME(7)
+	
+	NAME
+	     libgrapheme – Unicode string library
+	
+	SYNOPSIS
+	     #include <grapheme.h>
+	
+	DESCRIPTION
+	     The libgrapheme library provides functions to properly handle Unicode
+	     strings according to the Unicode specification in regard to character,
+	     word, sentence and line segmentation and case detection and conversion.
+	
+	     Unicode strings are made up of user-perceived characters (so-called
+	     “grapheme clusters”, see MOTIVATION) that are composed of one or more
+	     Unicode codepoints, which in turn are encoded in one or more bytes in an
+	     encoding like UTF-8.
+	
+	     There is a widespread misconception that it is enough to simply
+	     determine codepoints in a string and treat them as user-perceived
+	     characters to be Unicode compliant.  While this may work in some cases,
+	     this assumption quickly breaks, especially for non-Western languages and
+	     decomposed Unicode strings where user-perceived characters are usually
+	     represented using multiple codepoints.
+	
+	     Despite this complicated multilevel structure of Unicode strings,
+	     libgrapheme provides methods to work with them at the byte-level (i.e.
+	     UTF-8 ‘char’ arrays) while also offering codepoint-level methods.
+	     Additionally, it is a “freestanding” library (see ISO/IEC 9899:1999
+	     section 4.6) and thus does not depend on a standard library. This makes
+	     it easy to use in bare metal environments.
+	
+	     Every documented function's manual page provides a self-contained example
+	     illustrating the possible usage.
+	
+	SEE ALSO
+	     grapheme_decode_utf8(3), grapheme_encode_utf8(3),
+	     grapheme_is_character_break(3), grapheme_is_lowercase(3),
+	     grapheme_is_lowercase_utf8(3), grapheme_is_titlecase(3),
+	     grapheme_is_titlecase_utf8(3), grapheme_is_uppercase(3),
+	     grapheme_is_uppercase_utf8(3), grapheme_next_character_break(3),
+	     grapheme_next_character_break_utf8(3), grapheme_next_line_break(3),
+	     grapheme_next_line_break_utf8(3), grapheme_next_sentence_break(3),
+	     grapheme_next_sentence_break_utf8(3), grapheme_next_word_break(3),
+	     grapheme_next_word_break_utf8(3), grapheme_to_lowercase(3),
+	     grapheme_to_lowercase_utf8(3), grapheme_to_titlecase(3),
+	     grapheme_to_titlecase_utf8(3), grapheme_to_uppercase(3),
+	     grapheme_to_uppercase_utf8(3)
+	
+	STANDARDS
+	     libgrapheme is compliant with the Unicode 15.0.0 specification.
+	
+	MOTIVATION
+	     The idea behind every character encoding scheme like ASCII or Unicode is
+	     to express abstract characters (which can be thought of as shapes making
+	     up a written language). ASCII for instance, which comprises the range 0
+	     to 127, assigns the number 65 (0x41) to the abstract character ‘A’.  This
+	     number is called a “codepoint”, and all codepoints of an encoding make up
+	     its so-called “code space”.
+	
+	     Unicode's code space is much larger, ranging from 0 to 0x10FFFF, but its
+	     first 128 codepoints are identical to ASCII's.  The additional codepoints
+	     are needed as Unicode's goal is to express all writing systems of the
+	     world.  To give an example, the abstract character ‘Ä’ is not expressible
+	     in ASCII, given no ASCII codepoint has been assigned to it.  It can be
+	     expressed in Unicode, though, with the codepoint 196 (0xC4).
+	
+	     One may assume that this process is straightforward, but as more and more
+	     codepoints were assigned to abstract characters, the Unicode Consortium
+	     (which defines the Unicode standard) was facing a problem: Many (mostly
+	     non-European) languages have such a large number of abstract characters
+	     that it would exhaust the available Unicode code space if one tried to
+	     assign a codepoint to each abstract character.  The solution to that
+	     problem is best introduced with an example: Consider the abstract
+	     character ‘Ǟ’, which is ‘A’ with an umlaut and a macron added to it.  In
+	     this sense, one can consider ‘Ǟ’ as a two-fold modification (namely “add
+	     umlaut” and “add macron”) of the “base character” ‘A’.
+	
+	     The Unicode Consortium adopted this idea by assigning codepoints to
+	     modifications.  For example, the codepoint 0x308 represents adding an
+	     umlaut and 0x304 represents adding a macron, and thus, the codepoint
+	     sequence “0x41 0x308 0x304”, namely the base character ‘A’ followed by
+	     the umlaut and macron modifiers, represents the abstract character ‘Ǟ’.
+	     As a side-note, the single codepoint 0x1DE was also assigned to ‘Ǟ’,
+	     which is a good example of the fact that there can be multiple
+	     representations of a single abstract character in Unicode.
+	
+	     Expressing a single abstract character with multiple codepoints solved
+	     the code space exhaustion-problem, and the concept has been greatly
+	     expanded since its first introduction (emojis, joiners, etc.). A sequence
+	     (which can also have the length 1) of codepoints that belong together
+	     this way and represents an abstract character is called a “grapheme
+	     cluster”.
+	
+	     In many applications it is necessary to count the number of
+	     user-perceived characters, i.e. grapheme clusters, in a string.  A good
+	     example of this is a terminal text editor, which needs to properly align
+	     characters on a grid.  This is pretty simple with ASCII-strings, where
+	     you just count the number of bytes (as each byte is a codepoint and each
+	     codepoint is a grapheme cluster).  With Unicode-strings, it is a common
+	     mistake to simply adapt the ASCII-approach and count the number of
+	     codepoints.  This is wrong, as, for example, the sequence “0x41 0x308
+	     0x304”, while made up of 3 codepoints, is a single grapheme cluster and
+	     represents the user-perceived character ‘Ǟ’.
+	
+	     The proper way to segment a string into user-perceived characters is to
+	     segment it into its grapheme clusters by applying the Unicode grapheme
+	     cluster breaking algorithm (UAX #29).  It is based on a complex ruleset
+	     and lookup-tables and determines if a grapheme cluster ends or is
+	     continued between two codepoints.  Libraries like ICU and libunistring,
+	     which also offer this functionality, are often bloated, not correct,
+	     difficult to use or not reasonably statically linkable.
+	
+	     Analogously, the standard provides algorithms to separate strings by
+	     words, sentences and lines, convert cases and compare strings.  The
+	     motivation behind libgrapheme is to make Unicode handling suck less and
+	     abide by the UNIX philosophy.
+	
+	AUTHORS
+	     Laslo Hunhold <dev@frign.de>
+	
+	suckless.org			  2022-10-06			  suckless.org
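
The MOTIVATION section of libgrapheme(7) above distinguishes three ways of counting: bytes, codepoints and grapheme clusters. The following is a minimal sketch of the first two, assuming valid UTF-8 input. The helper count_codepoints() is hypothetical and not part of libgrapheme; it only illustrates why counting codepoints is not enough.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libgrapheme): count the codepoints
 * in the first len bytes of the UTF-8 string s by counting every byte
 * that is not a continuation byte (bit pattern 10xxxxxx).
 * Assumes s is valid UTF-8; no validation is performed.
 */
size_t
count_codepoints(const char *s, size_t len)
{
	size_t i, n = 0;

	assert(s != NULL);

	for (i = 0; i < len; i++) {
		/* lead bytes and ASCII bytes start a new codepoint */
		if (((unsigned char)s[i] & 0xC0) != 0x80) {
			n++;
		}
	}

	return n;
}
```

For the sequence "0x41 0x308 0x304" from the manual, encoded in UTF-8 as "A\xCC\x88\xCC\x84", the byte count is 5 and count_codepoints() yields 3, although the sequence is a single grapheme cluster. Finding that cluster boundary needs the segmentation rules, which is what grapheme_next_character_break_utf8(3) provides.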

