
textcode

Text encoding/decoding library. Supports: UTF-8, ISO6937, ISO8859, GB2312

8 releases

Uses new Rust 2024

0.3.1 Dec 15, 2025
0.3.0 Dec 15, 2025
0.2.2 May 2, 2022
0.2.1 Nov 3, 2020
0.1.0 Nov 18, 2019


MIT license



Intro

Textcode is a library for text encoding/decoding.

The library uses non-strict (lossy) conversion: invalid or unmappable characters are replaced with a question mark (?) rather than reported as errors.
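To make the non-strict behaviour concrete, here is a minimal sketch that uses plain ASCII as a stand-in charset, since it needs no mapping tables. The helper names encode_ascii_lossy and decode_ascii_lossy are hypothetical and are not part of textcode; the sketch only illustrates the replace-with-? idea, not how the crate implements it.

```rust
/// Hypothetical helper (not part of textcode): encode to ASCII,
/// substituting `?` for every character that has no ASCII mapping.
fn encode_ascii_lossy(text: &str) -> Vec<u8> {
    text.chars()
        .map(|c| if c.is_ascii() { c as u8 } else { b'?' })
        .collect()
}

/// Hypothetical helper (not part of textcode): decode ASCII bytes,
/// substituting `?` for every byte outside the 7-bit range.
fn decode_ascii_lossy(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| if b.is_ascii() { b as char } else { '?' })
        .collect()
}
```

With these helpers, encoding "Hi, мир!" yields the bytes of "Hi, ???!", and decoding the bytes 0x48 0x69 0xFF yields "Hi?". Neither direction can fail, which is the trade-off of a non-strict converter: the call always succeeds, but the original data cannot be recovered from the output.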

⚠️ Breaking change in v0.3.0

The library API has been completely redesigned:

Old API (v0.2.x): module-based functions

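use textcode::iso8859_5;

// Decode ISO-8859-5 bytes into a caller-provided String.
let mut text = String::new();
iso8859_5::decode(b"\xbf\xe0\xd8\xd2\xd5\xe2!", &mut text);

// Encode a UTF-8 string into a caller-provided byte buffer.
let mut bytes = Vec::new();
iso8859_5::encode("Привет!", &mut bytes);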

New API (v0.3.x): generic functions with codec types

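use textcode::{Iso8859_5, decode, encode};

// The codec is selected by a type parameter; the decoded text is returned directly.
let text = decode::<Iso8859_5>(b"\xbf\xe0\xd8\xd2\xd5\xe2!");

// Encoding works the same way and returns the encoded bytes.
let bytes = encode::<Iso8859_5>("Привет!");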
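For readers unfamiliar with the "generic functions with codec types" pattern, the sketch below shows one way such an API can be built. Everything in it is an assumption made for illustration: the Codec trait, its two methods, and the Latin1 marker type are hypothetical, and textcode's actual trait and internals may look quite different.

```rust
/// Hypothetical trait; textcode's real trait name and methods may differ.
trait Codec {
    fn decode_byte(byte: u8) -> char;
    fn encode_char(c: char) -> u8;
}

/// Zero-sized marker type that selects the codec at compile time.
struct Latin1;

impl Codec for Latin1 {
    // ISO-8859-1 bytes map one-to-one onto U+0000..=U+00FF.
    fn decode_byte(byte: u8) -> char {
        byte as char
    }

    // Anything above U+00FF has no mapping and becomes `?`.
    fn encode_char(c: char) -> u8 {
        let cp = c as u32;
        if cp <= 0xFF { cp as u8 } else { b'?' }
    }
}

/// Generic entry points: the caller names the codec with a turbofish.
fn decode<C: Codec>(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| C::decode_byte(b)).collect()
}

fn encode<C: Codec>(text: &str) -> Vec<u8> {
    text.chars().map(C::encode_char).collect()
}
```

In this sketch, decode::<Latin1>(b"caf\xe9") returns "café" and encode::<Latin1>("€") returns a single ? byte. The appeal of the design is that adding a charset means adding one type and one trait implementation, while the call sites keep a single pair of function names.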

Charsets

  • UTF-8
  • UTF-16 - Decoding BE and LE with BOM, encoding BE without BOM
  • iso-6937 - Latin superset of ISO/IEC 6937 with Euro and letters with diacritics
  • iso-8859-1 - Western European
  • iso-8859-2 - Central European
  • iso-8859-3 - South European
  • iso-8859-4 - North European
  • iso-8859-5 - Cyrillic
  • iso-8859-6 - Arabic
  • iso-8859-7 - Greek
  • iso-8859-8 - Hebrew
  • iso-8859-9 - Turkish
  • iso-8859-10 - Nordic
  • iso-8859-11 - Thai
  • iso-8859-13 - Baltic Rim
  • iso-8859-14 - Celtic
  • iso-8859-15 - Western European (Latin-9, adds the euro sign)
  • iso-8859-16 - South-Eastern European
  • gb2312 - Simplified Chinese
  • Geo - DVB single-byte Georgian character encoding (Magti TV)
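
The UTF-16 entry in the list above (decoding BE and LE with a BOM, encoding BE without one) can be illustrated with the standard library alone. This is a sketch under stated assumptions, not textcode's code: the function names are hypothetical, input without a BOM is assumed to be big-endian, a trailing odd byte is ignored, and unpaired surrogates become ? to mirror the non-strict policy.

```rust
/// Hypothetical helper (not part of textcode): decode UTF-16 bytes,
/// picking the byte order from a leading BOM.
fn decode_utf16_with_bom(bytes: &[u8]) -> String {
    // FF FE means little-endian, FE FF means big-endian.
    // Assumption: input without a BOM is treated as big-endian.
    let (little_endian, payload) = match bytes {
        [0xFF, 0xFE, rest @ ..] => (true, rest),
        [0xFE, 0xFF, rest @ ..] => (false, rest),
        _ => (false, bytes),
    };

    // Pair up bytes into 16-bit code units; a trailing odd byte is dropped.
    let units = payload.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if little_endian {
            u16::from_le_bytes(pair)
        } else {
            u16::from_be_bytes(pair)
        }
    });

    // Unpaired surrogates are replaced with `?` instead of failing.
    std::char::decode_utf16(units)
        .map(|unit| unit.unwrap_or('?'))
        .collect()
}

/// Hypothetical helper (not part of textcode): encode as big-endian
/// UTF-16 without writing a BOM.
fn encode_utf16_be(text: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(text.len() * 2);
    for unit in text.encode_utf16() {
        out.extend_from_slice(&unit.to_be_bytes());
    }
    out
}
```

For example, the bytes FF FE 1F 04 21 00 and FE FF 04 1F 00 21 both decode to "П!", and encoding "П!" produces 04 1F 00 21.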

Example

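use textcode::{Iso8859_5, decode, encode};

const UTF8: &str = "Привет!";
// The same text in ISO-8859-5: one byte per character.
const ISO8859_5: &[u8] = &[0xbf, 0xe0, 0xd8, 0xd2, 0xd5, 0xe2, 0x21];

// Decoding the ISO-8859-5 bytes yields the original text.
let text = decode::<Iso8859_5>(ISO8859_5);
assert_eq!(text, UTF8);

// Encoding the text yields the original bytes.
let bytes = encode::<Iso8859_5>(UTF8);
assert_eq!(bytes, ISO8859_5);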

No runtime deps