     RUNECOMP(2)                                           RUNECOMP(2)

     NAME
          runecomp, runedecomp, fullrunenorm, runegbreak, runewbreak,
          utfcomp, utfdecomp, fullutfnorm, utfgbreak, utfwbreak -
          multi-rune graphemes

     SYNOPSIS
          #include <u.h>
          #include <libc.h>

          int    runecomp(Rune *dst, Rune *src, int max)

          int    runedecomp(Rune *dst, Rune *src, int max)

          Rune*  fullrunenorm(Rune *s, int n)

          Rune*  runegbreak(Rune *s)

          Rune*  runewbreak(Rune *s)

          int    utfcomp(char *dst, char *src, int max)

          int    utfdecomp(char *dst, char *src, int max)

          char*  fullutfnorm(char *s, int n)

          char*  utfgbreak(char *s)

          char*  utfwbreak(char *s)

     DESCRIPTION
          These routines help in handling graphemes that may span
          multiple runes.  They are intended for font rendering and
          advanced text search; most programs do not need to perform
          normalization.

          Runecomp, runedecomp, utfcomp, and utfdecomp perform
          Unicode® normalization on src, storing the result in dst.
          No more than max elements (runes for the rune routines,
          bytes for the utf routines) will be written, and the
          resulting string will always be null-terminated. The return
          value is always the total number of elements required to
          store the transformation. If this value is larger than the
          supplied max, the caller can assume the result has been
          truncated.  Runecomp and utfcomp perform NFC normalization
          while runedecomp and utfdecomp perform NFD normalization.

          Fullrunenorm and fullutfnorm determine if enough elements
          are present in s to perform normalization. If enough are
          present, a pointer is returned to the first element that
          begins the next context. Otherwise s is returned. No more
          than n elements will be read. In order to find the boundary,
          the first element of the next context must be peeked.

          Runegbreak and utfgbreak search s for the next grapheme
          break opportunity.  If none is found before the end of the
          string, s is returned.

          Runewbreak and utfwbreak search s for the next word break
          opportunity.  If none is found before the end of the string,
          s is returned.

     SOURCE
          /sys/src/libc/port/mkrunetype.c
          /sys/src/libc/port/runenorm.c
          /sys/src/libc/port/runebreak.c

     SEE ALSO
          Unicode® Standard Annex #15
          Unicode® Standard Annex #29
          rune(2), utf(6), tcs(1)

     HISTORY
          This implementation was written for 9front (March, 2023).