Module unicode

Utilities for Unicode, including:

  • Conversion of scalars to surrogates and vice versa
  • UCS-2/UTF-16 encoding of scalars and decoding back to scalars
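
The scalar-to-surrogate conversion listed above follows the standard UTF-16 arithmetic. The following is a minimal sketch of that arithmetic, not this module's implementation: `split_scalar` is a hypothetical helper, and treating only values above U+10FFFF as invalid is an assumption.

```rust
/// Hypothetical helper (not part of this module): encodes one code point
/// as one or two UTF-16 code units.
fn split_scalar(cp: u32) -> Result<Vec<u16>, String> {
    match cp {
        // BMP values fit in a single code unit. Lone surrogate values
        // (U+D800..=U+DFFF) pass through unchanged in this sketch.
        0..=0xFFFF => Ok(vec![cp as u16]),
        // Supplementary planes: subtract 0x10000, then split the remaining
        // 20 bits into a high and a low surrogate of 10 bits each.
        0x1_0000..=0x10_FFFF => {
            let v = cp - 0x1_0000;
            let high = 0xD800 + (v >> 10) as u16;
            let low = 0xDC00 + (v & 0x3FF) as u16;
            Ok(vec![high, low])
        }
        // Assumption: anything above U+10FFFF is reported as an error.
        _ => Err(format!("invalid code point: {:#X}", cp)),
    }
}
```

For instance, U+1F600 becomes the pair `0xD83D, 0xDE00`, while U+0041 stays a single unit.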

Constants§

UNICODE_HISTORIC_MAX
UNICODE_MAX

Functions§

combine_surrogates 🔒
Combines surrogate pairs in an iterator of code units into Unicode scalar values. Unpaired surrogates are left as-is.
js_like_slice_utf16
scalars_to_string_lossy
scalars_to_unpaired_surrogates
Converts a slice of Unicode scalar values (u32) to unpaired surrogates. Codepoints above U+FFFF are split into surrogate pairs. Returns an error for invalid codepoints.
string_to_scalars
u16_vec_to_le_bytes
ucs2decode
Decodes a slice of UCS-2/UTF-16 code units (u16) into Unicode scalar values (u32). Surrogate pairs are combined into scalars; unpaired surrogates are left as-is.
ucs2encode
Encodes an array of Unicode scalar values (u32) to UTF-16/UCS-2 code units (Vec<u16>). Surrogate pairs are generated for codepoints above U+FFFF. Returns an error for invalid codepoints.
unpaired_surrogates_to_scalars
Decodes a slice of code units (u32), combining surrogate pairs into scalars. Unpaired surrogates are left as-is.
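
The decoding direction described above (combine surrogate pairs, leave unpaired surrogates as-is) can be sketched as follows. This is an illustration under assumptions rather than the module's actual code: `combine_pairs` is a hypothetical name, and the `&[u16]` input type mirrors the description of `ucs2decode` only.

```rust
/// Hypothetical helper (not part of this module): decodes UTF-16 code units
/// into code points, keeping unpaired surrogates unchanged.
fn combine_pairs(units: &[u16]) -> Vec<u32> {
    let mut out = Vec::with_capacity(units.len());
    let mut i = 0;
    while i < units.len() {
        let unit = units[i];
        // A high surrogate only forms a pair if a low surrogate follows it.
        if (0xD800..=0xDBFF).contains(&unit) {
            if let Some(&next) = units.get(i + 1) {
                if (0xDC00..=0xDFFF).contains(&next) {
                    let high = (unit as u32 - 0xD800) << 10;
                    let low = next as u32 - 0xDC00;
                    out.push(0x1_0000 + high + low);
                    i += 2;
                    continue;
                }
            }
        }
        // BMP values and unpaired surrogates are copied through as-is.
        out.push(unit as u32);
        i += 1;
    }
    out
}
```

With this sketch, the input `[0x48, 0xD83D, 0xDE00, 0xD800]` decodes to `[0x48, 0x1F600, 0xD800]`: the pair is combined and the trailing lone high surrogate is preserved.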