edlang/unicode_width/index.html
2024-05-05 09:43:20 +00:00

<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="Determine displayed width of `char` and `str` types according to Unicode Standard Annex #11, other portions of the Unicode standard, and common implementations of POSIX `wcwidth()`. See the Rules for determining width section for the exact rules."><title>unicode_width - Rust</title><script> if (window.location.protocol !== "file:") document.write(`<link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/SourceSerif4-Regular-46f98efaafac5295.ttf.woff2"><link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/FiraSans-Regular-018c141bf0843ffd.woff2"><link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/FiraSans-Medium-8f9a781e4970d388.woff2"><link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/SourceCodePro-Regular-562dcc5011b6de7d.ttf.woff2"><link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/SourceCodePro-Semibold-d899c5a5c4aeb14a.ttf.woff2">`)</script><link rel="stylesheet" href="../static.files/normalize-76eba96aa4d2e634.css"><link rel="stylesheet" href="../static.files/rustdoc-e935ef01ae1c1829.css"><meta name="rustdoc-vars" data-root-path="../" data-static-root-path="../static.files/" data-current-crate="unicode_width" data-themes="" data-resource-suffix="" data-rustdoc-version="1.78.0 (9b00956e5 2024-04-29)" data-channel="1.78.0" data-search-js="search-42d8da7a6b9792c2.js" data-settings-js="settings-4313503d2e1961c2.js" ><script src="../static.files/storage-4c98445ec4002617.js"></script><script defer src="../crates.js"></script><script defer src="../static.files/main-12cf3b4f4f9dc36d.js"></script><noscript><link rel="stylesheet" href="../static.files/noscript-04d5337699b92874.css"></noscript><link rel="icon" 
href="https://unicode-rs.github.io/unicode-rs_sm.png"></head><body class="rustdoc mod crate"><!--[if lte IE 11]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><nav class="mobile-topbar"><button class="sidebar-menu-toggle" title="show sidebar"></button><a class="logo-container" href="../unicode_width/index.html"><img src="https://unicode-rs.github.io/unicode-rs_sm.png" alt=""></a></nav><nav class="sidebar"><div class="sidebar-crate"><a class="logo-container" href="../unicode_width/index.html"><img src="https://unicode-rs.github.io/unicode-rs_sm.png" alt="logo"></a><h2><a href="../unicode_width/index.html">unicode_width</a><span class="version">0.1.12</span></h2></div><div class="sidebar-elems"><ul class="block">
<li><a id="all-types" href="all.html">All Items</a></li></ul><section><ul class="block"><li><a href="#constants">Constants</a></li><li><a href="#traits">Traits</a></li></ul></section></div></nav><div class="sidebar-resizer"></div>
<main><div class="width-limiter"><nav class="sub"><form class="search-form"><span></span><div id="sidebar-button" tabindex="-1"><a href="../unicode_width/all.html" title="show sidebar"></a></div><input class="search-input" name="search" aria-label="Run search in the documentation" autocomplete="off" spellcheck="false" placeholder="Click or press S to search, ? for more options…" type="search"><div id="help-button" tabindex="-1"><a href="../help.html" title="help">?</a></div><div id="settings-menu" tabindex="-1"><a href="../settings.html" title="settings"><img width="22" height="22" alt="Change settings" src="../static.files/wheel-7b819b6101059cd0.svg"></a></div></form></nav><section id="main-content" class="content"><div class="main-heading"><h1>Crate <a class="mod" href="#">unicode_width</a><button id="copy-path" title="Copy item path to clipboard"><img src="../static.files/clipboard-7571035ce49a181d.svg" width="19" height="18" alt="Copy item path"></button></h1><span class="out-of-band"><a class="src" href="../src/unicode_width/lib.rs.html#11-177">source</a> · <button id="toggle-all-docs" title="collapse all docs">[<span>&#x2212;</span>]</button></span></div><details class="toggle top-doc" open><summary class="hideme"><span>Expand description</span></summary><div class="docblock"><p>Determine displayed width of <code>char</code> and <code>str</code> types according to
<a href="http://www.unicode.org/reports/tr11/">Unicode Standard Annex #11</a>,
other portions of the Unicode standard, and common implementations of
POSIX <a href="https://pubs.opengroup.org/onlinepubs/9699919799/"><code>wcwidth()</code></a>.
See the <a href="#rules-for-determining-width">Rules for determining width</a> section
for the exact rules.</p>
<p>This crate is <code>#![no_std]</code>.</p>
<div class="example-wrap"><pre class="rust rust-example-rendered"><code><span class="kw">use </span>unicode_width::UnicodeWidthStr;
<span class="kw">let </span>teststr = <span class="string">"Ｈｅｌｌｏ, ｗｏｒｌｄ!"</span>;
<span class="kw">let </span>width = UnicodeWidthStr::width(teststr);
<span class="macro">println!</span>(<span class="string">"{}"</span>, teststr);
<span class="macro">println!</span>(<span class="string">"The above string is {} columns wide."</span>, width);
<span class="kw">let </span>width = teststr.width_cjk();
<span class="macro">println!</span>(<span class="string">"The above string is {} columns wide (CJK)."</span>, width);</code></pre></div>
<h2 id="rules-for-determining-width"><a class="doc-anchor" href="#rules-for-determining-width">§</a>Rules for determining width</h2>
<p>This crate currently uses the following rules to determine the width of a
character or string, in order of decreasing precedence. These rules may change in future versions.</p>
<ol>
<li><a href="https://unicode.org/reports/tr51/#def_emoji_presentation_sequence">Emoji presentation sequences</a>
have width 2. (The width of a string may therefore differ from the sum of the widths of its characters.)</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=00AD"><code>'\u{00AD}'</code> SOFT HYPHEN</a> has width 1.</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=115F"><code>'\u{115F}'</code> HANGUL CHOSEONG FILLER</a> has width 2.</li>
<li>The following have width 0:
<ul>
<li><a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BDefault_Ignorable_Code_Point%7D">Characters</a>
with the <a href="https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf#G40095"><code>Default_Ignorable_Code_Point</code></a> property.</li>
<li><a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BGrapheme_Extend%7D">Characters</a>
with the <a href="https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G52443"><code>Grapheme_Extend</code></a> property.</li>
<li>The following 8 characters, all of which have NFD decompositions consisting of two <a href="https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G52443"><code>Grapheme_Extend</code></a> characters:
<ul>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0CC0"><code>'\u{0CC0}'</code> KANNADA VOWEL SIGN II</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0CC7"><code>'\u{0CC7}'</code> KANNADA VOWEL SIGN EE</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0CC8"><code>'\u{0CC8}'</code> KANNADA VOWEL SIGN AI</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0CCA"><code>'\u{0CCA}'</code> KANNADA VOWEL SIGN O</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0CCB"><code>'\u{0CCB}'</code> KANNADA VOWEL SIGN OO</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=1B3B"><code>'\u{1B3B}'</code> BALINESE VOWEL SIGN RA REPA TEDUNG</a>,</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=1B3D"><code>'\u{1B3D}'</code> BALINESE VOWEL SIGN LA LENGA TEDUNG</a>, and</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=1B43"><code>'\u{1B43}'</code> BALINESE VOWEL SIGN PEPET TEDUNG</a>.</li>
</ul>
</li>
<li><a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BHangul_Syllable_Type%3DV%7D%5Cp%7BHangul_Syllable_Type%3DT%7D">Characters</a>
with a <a href="https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G45593"><code>Hangul_Syllable_Type</code></a>
of <code>Vowel_Jamo</code> (<code>V</code>) or <code>Trailing_Jamo</code> (<code>T</code>).</li>
<li><a href="https://util.unicode.org/UnicodeJsps/character.jsp?a=0000"><code>'\0'</code> NUL</a>.</li>
</ul>
</li>
<li>The <a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BCc%7D">control characters</a>
have no defined width, and are ignored when determining the width of a string.</li>
<li><a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DF%7D%5Cp%7BEast_Asian_Width%3DW%7D">Characters</a>
with an <a href="https://www.unicode.org/reports/tr11/#ED1"><code>East_Asian_Width</code></a> of <a href="https://www.unicode.org/reports/tr11/#ED2"><code>Fullwidth</code> (<code>F</code>)</a>
or <a href="https://www.unicode.org/reports/tr11/#ED4"><code>Wide</code> (<code>W</code>)</a> have width 2.</li>
<li><a href="https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DA%7D">Characters</a>
with an <a href="https://www.unicode.org/reports/tr11/#ED1"><code>East_Asian_Width</code></a> of <a href="https://www.unicode.org/reports/tr11/#ED6"><code>Ambiguous</code> (<code>A</code>)</a>
have width 2 in an East Asian context, and width 1 otherwise.</li>
<li>All other characters have width 1.</li>
</ol>
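<p>The precedence order above can be illustrated with a small, self-contained sketch. It is not this crate's implementation: the function names <code>sketch_char_width</code> and <code>sketch_str_width</code> are hypothetical, and the hard-coded code point ranges are tiny samples chosen for illustration, where the real crate consults full Unicode property tables.</p>

```rust
/// Toy approximation of the per-character rules, checked in order of
/// precedence. The ranges below are small illustrative samples (an
/// assumption of this sketch), NOT complete Unicode property tables.
fn sketch_char_width(c: char, cjk: bool) -> Option<usize> {
    match c {
        // Rule 2: SOFT HYPHEN has width 1, even though it is also
        // Default_Ignorable_Code_Point (this rule takes precedence).
        '\u{00AD}' => Some(1),
        // Rule 3: HANGUL CHOSEONG FILLER has width 2.
        '\u{115F}' => Some(2),
        // Rule 4: NUL has width 0.
        '\0' => Some(0),
        // Rule 4 (sample): combining diacritical marks (Grapheme_Extend),
        // ZERO WIDTH SPACE and VARIATION SELECTOR-16 (Default_Ignorable).
        '\u{0300}'..='\u{036F}' | '\u{200B}' | '\u{FE0F}' => Some(0),
        // Rule 4 (sample): Hangul Jamo vowels and trailing consonants.
        '\u{1160}'..='\u{11FF}' => Some(0),
        // Rule 5: remaining control characters have no defined width.
        c if c.is_control() => None,
        // Rule 6 (sample): CJK Unified Ideographs (Wide) and
        // fullwidth ASCII variants (Fullwidth).
        '\u{4E00}'..='\u{9FFF}' | '\u{FF01}'..='\u{FF60}' => Some(2),
        // Rule 7 (sample): a few Ambiguous characters.
        '\u{00A1}' | '\u{00B1}' | '\u{2460}'..='\u{24E9}' => {
            Some(if cjk { 2 } else { 1 })
        }
        // Rule 8: everything else has width 1.
        _ => Some(1),
    }
}

/// Toy string width: applies rule 1 to two hard-coded emoji bases, then
/// sums per-character widths, ignoring characters with no defined width.
fn sketch_str_width(s: &str, cjk: bool) -> usize {
    let mut total = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        // Rule 1 (sample): a base character followed by U+FE0F forms an
        // emoji presentation sequence of width 2. Only two bases are
        // recognised here, which is a simplification.
        let is_emoji_base = matches!(c, '\u{263A}' | '\u{2764}');
        if is_emoji_base && chars.peek() == Some(&'\u{FE0F}') {
            chars.next(); // consume the variation selector
            total += 2;
            continue;
        }
        // Rule 5: control characters contribute nothing to the total.
        total += sketch_char_width(c, cjk).unwrap_or(0);
    }
    total
}
```

<p>In this sketch, <code>sketch_str_width("\u{2764}\u{FE0F}", false)</code> yields 2 although the two characters individually have widths 1 and 0, which mirrors the note under rule 1 that a string's width may differ from the sum of its characters' widths.</p>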
<h3 id="canonical-equivalence"><a class="doc-anchor" href="#canonical-equivalence">§</a>Canonical equivalence</h3>
<p>The non-CJK width methods guarantee that canonically equivalent strings are assigned the same width.
However, this guarantee does not currently hold for the CJK width variants.</p>
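<p>As a rough illustration of why the zero-width rules matter for this guarantee, the sketch below compares precomposed and decomposed forms using a toy width function. The name <code>toy_width</code> is hypothetical, the character samples are hard-coded assumptions, and the decomposition of U+0CC0 into U+0CBF followed by U+0CD5 is taken from the Unicode Character Database rather than from this crate.</p>

```rust
/// Toy non-CJK width: sums per-character widths using a few hard-coded
/// samples. This is an illustrative assumption, not the crate's tables.
fn toy_width(s: &str) -> usize {
    s.chars()
        .map(|c| match c {
            // Sample Grapheme_Extend characters: width 0.
            '\u{0300}'..='\u{036F}' | '\u{0CBF}' | '\u{0CD5}' => 0,
            // KANNADA VOWEL SIGN II, one of the 8 listed characters whose
            // NFD form is two Grapheme_Extend characters: width 0.
            '\u{0CC0}' => 0,
            // Everything else in this sketch: width 1.
            _ => 1,
        })
        .sum()
}
```

<p>With this sketch, <code>toy_width("\u{00E9}")</code> and <code>toy_width("e\u{0301}")</code> both give 1, and <code>toy_width("\u{0CC0}")</code> matches <code>toy_width("\u{0CBF}\u{0CD5}")</code> at 0. If U+0CC0 were instead given width 1, the two canonically equivalent Kannada forms would disagree.</p>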
</div></details><h2 id="constants" class="section-header">Constants<a href="#constants" class="anchor">§</a></h2><ul class="item-table"><li><div class="item-name"><a class="constant" href="constant.UNICODE_VERSION.html" title="constant unicode_width::UNICODE_VERSION">UNICODE_VERSION</a></div><div class="desc docblock-short">The version of <a href="http://www.unicode.org/">Unicode</a>
that this version of unicode-width is based on.</div></li></ul><h2 id="traits" class="section-header">Traits<a href="#traits" class="anchor">§</a></h2><ul class="item-table"><li><div class="item-name"><a class="trait" href="trait.UnicodeWidthChar.html" title="trait unicode_width::UnicodeWidthChar">UnicodeWidthChar</a></div><div class="desc docblock-short">Methods for determining displayed width of Unicode characters.</div></li><li><div class="item-name"><a class="trait" href="trait.UnicodeWidthStr.html" title="trait unicode_width::UnicodeWidthStr">UnicodeWidthStr</a></div><div class="desc docblock-short">Methods for determining displayed width of Unicode strings.</div></li></ul></section></div></main></body></html>