Struct unicode_bidi::utf16::BidiInfo
pub struct BidiInfo<'text> {
pub text: &'text [u16],
pub original_classes: Vec<BidiClass>,
pub levels: Vec<Level>,
pub paragraphs: Vec<ParagraphInfo>,
}
Bidi information of the text (UTF-16 version).
The original_classes and levels vectors are indexed by code unit offsets into the text. If a character is multiple code units wide, then its class and level will appear multiple times in these vectors.
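The repetition described above can be illustrated with the standard library alone. The sketch below is not part of the crate: expand_per_code_unit is a hypothetical helper, and the per-character values are arbitrary stand-ins for a class or level.

```rust
/// Expands one value per character into one value per UTF-16 code unit.
/// A character encoded as a surrogate pair contributes its value twice.
/// (Hypothetical helper for illustration; not part of unicode_bidi.)
fn expand_per_code_unit<T: Clone>(text: &str, per_char: &[T]) -> Vec<T> {
    let mut out = Vec::new();
    for (c, value) in text.chars().zip(per_char) {
        // len_utf16() is 1 for BMP characters and 2 for supplementary ones.
        for _ in 0..c.len_utf16() {
            out.push(value.clone());
        }
    }
    out
}
```

For a string such as "a\u{10400}b", which encodes to four code units, three per-character values expand to four entries, with the middle value repeated.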
Fields
text: &'text [u16]
The text
original_classes: Vec<BidiClass>
The BidiClass of the character at each code unit in the text.
levels: Vec<Level>
The directional embedding level of each code unit in the text.
paragraphs: Vec<ParagraphInfo>
The boundaries and paragraph embedding level of each paragraph within the text.
TODO: Use SmallVec or similar to avoid overhead when there are only one or two paragraphs? Or just don’t include the first paragraph, which always starts at 0?
Implementations
impl<'text> BidiInfo<'text>
pub fn new(text: &[u16], default_para_level: Option<Level>) -> BidiInfo<'_>
Split the text into paragraphs and determine the bidi embedding levels for each paragraph.
The hardcoded-data Cargo feature (enabled by default) must be enabled to use this.
TODO: In early steps, check for special cases that allow later steps to be skipped, like text that is entirely LTR. See the nsBidi class from Gecko for comparison.
TODO: Support auto-RTL base direction
pub fn new_with_data_source<'a, D: BidiDataSource>(
    data_source: &D,
    text: &'a [u16],
    default_para_level: Option<Level>
) -> BidiInfo<'a>
Split the text into paragraphs and determine the bidi embedding levels for each paragraph, with a custom BidiDataSource for Bidi data. If you just wish to use the hardcoded Bidi data, please use BidiInfo::new() instead (enabled with the default hardcoded-data Cargo feature).
TODO: In early steps, check for special cases that allow later steps to be skipped, like text that is entirely LTR. See the nsBidi class from Gecko for comparison.
TODO: Support auto-RTL base direction
pub fn reordered_levels(
    &self,
    para: &ParagraphInfo,
    line: Range<usize>
) -> Vec<Level>
Produce the levels for this paragraph as needed for reordering, one level per code unit in the paragraph. The returned vector includes code units that are not included in the line, but will not adjust them.
This runs Rule L1; you can run Rule L2 by calling Self::reorder_visual(). If doing so, you may prefer to use Self::reordered_levels_per_char() instead, so that the resulting indices refer to whole characters rather than individual code units.
For an all-in-one reordering solution, consider using Self::reorder_visual().
pub fn reordered_levels_per_char(
    &self,
    para: &ParagraphInfo,
    line: Range<usize>
) -> Vec<Level>
Produce the levels for this paragraph as needed for reordering, one level per character in the paragraph. The returned vector includes characters that are not included in the line, but will not adjust them.
This runs Rule L1; you can run Rule L2 by calling Self::reorder_visual(). Because the levels are per character, the resulting index map refers to whole characters rather than individual code units.
For an all-in-one reordering solution, consider using Self::reorder_visual().
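To make the difference between per-code-unit and per-character levels concrete, here is a minimal standard-library sketch. It is not the crate's implementation: per_char_levels is a hypothetical helper, and plain u8 values stand in for Level.

```rust
/// Keeps one entry per character from a vector that holds one entry per
/// UTF-16 code unit. Plain u8 values stand in for unicode_bidi's Level.
/// (Hypothetical helper for illustration; not part of unicode_bidi.)
fn per_char_levels(text: &[u16], per_unit: &[u8]) -> Vec<u8> {
    assert_eq!(text.len(), per_unit.len());
    let mut out = Vec::new();
    let mut i = 0;
    for decoded in std::char::decode_utf16(text.iter().copied()) {
        // Take the level stored at the first code unit of the character.
        out.push(per_unit[i]);
        // An unpaired surrogate decodes to an error and occupies one code unit.
        i += match decoded {
            Ok(c) => c.len_utf16(),
            Err(_) => 1,
        };
    }
    out
}
```

With this shape, a surrogate pair contributes a single entry, so indices into the result count characters rather than code units.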
pub fn reorder_line(
    &self,
    para: &ParagraphInfo,
    line: Range<usize>
) -> Cow<'text, [u16]>
pub fn reorder_visual(levels: &[Level]) -> Vec<usize>
Reorders pre-calculated levels of a sequence of characters.
NOTE: This is a convenience method that does not use a Paragraph object. It is intended to be used when an application has determined the levels of the objects (character sequences) and just needs to have them reordered.
The index map will result in indexMap[visualIndex]==logicalIndex.
This only runs Rule L2 as it does not have information about the actual text.
Furthermore, if levels is an array that is aligned with code units, code units within a code point may be reversed. You may need to fix up the map to deal with this. Alternatively, only pass in arrays where each Level is for a single code point.
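One possible fix-up for a map built from code-unit-aligned levels is sketched below. It is an assumption about how an application might handle this, not something the crate provides: fix_surrogate_order is a hypothetical helper that looks for a low surrogate placed directly before its high surrogate in visual order and swaps the two back.

```rust
/// Restores high-then-low order for surrogate pairs that a code-unit-aligned
/// visual index map has reversed. `index_map[visual] == logical`.
/// (Hypothetical helper for illustration; not part of unicode_bidi.)
fn fix_surrogate_order(text: &[u16], index_map: &mut [usize]) {
    let is_high = |u: u16| (0xD800..=0xDBFF).contains(&u);
    let is_low = |u: u16| (0xDC00..=0xDFFF).contains(&u);
    let mut v = 0;
    while v + 1 < index_map.len() {
        let (a, b) = (index_map[v], index_map[v + 1]);
        // A reversed pair shows the low surrogate first, immediately followed
        // by the high surrogate that precedes it in logical order.
        if b + 1 == a && is_high(text[b]) && is_low(text[a]) {
            index_map.swap(v, v + 1);
            v += 2;
        } else {
            v += 1;
        }
    }
}
```

Passing one Level per code point, as suggested above, avoids the need for any such pass.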
Example
use unicode_bidi::BidiInfo;
use unicode_bidi::Level;
let l0 = Level::from(0);
let l1 = Level::from(1);
let l2 = Level::from(2);
let levels = vec![l0, l0, l0, l0];
let index_map = BidiInfo::reorder_visual(&levels);
assert_eq!(levels.len(), index_map.len());
assert_eq!(index_map, [0, 1, 2, 3]);
let levels: Vec<Level> = vec![l0, l0, l0, l1, l1, l1, l2, l2];
let index_map = BidiInfo::reorder_visual(&levels);
assert_eq!(levels.len(), index_map.len());
assert_eq!(index_map, [0, 1, 2, 6, 7, 5, 4, 3]);
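For readers curious about how Rule L2 produces such a map, the following is a minimal standalone sketch of the rule as stated in UAX #9: from the highest level down to the lowest odd level, reverse every contiguous sequence at that level or higher. It is not the crate's implementation; reorder_visual_sketch is a hypothetical name and plain u8 values stand in for Level.

```rust
/// Sketch of Rule L2 on plain integer levels.
/// Returns a map with `map[visual_index] == logical_index`.
/// (Hypothetical function for illustration; not part of unicode_bidi.)
fn reorder_visual_sketch(levels: &[u8]) -> Vec<usize> {
    let mut map: Vec<usize> = (0..levels.len()).collect();
    let max = levels.iter().copied().max().unwrap_or(0);
    // With no odd (right-to-left) level present, nothing needs reversing.
    let min_odd = match levels.iter().copied().filter(|&l| l % 2 == 1).min() {
        Some(l) => l,
        None => return map,
    };
    for level in (min_odd..=max).rev() {
        let mut start = 0;
        while start < levels.len() {
            if levels[start] < level {
                start += 1;
                continue;
            }
            // Find the end of the sequence at `level` or higher, then reverse
            // it. Earlier passes only permute entries inside such sequences,
            // so scanning the original positions stays valid.
            let mut end = start;
            while end < levels.len() && levels[end] >= level {
                end += 1;
            }
            map[start..end].reverse();
            start = end;
        }
    }
    map
}
```

On the two level vectors used in the example above, this sketch yields the same index maps.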
pub fn visual_runs(
    &self,
    para: &ParagraphInfo,
    line: Range<usize>
) -> (Vec<Level>, Vec<LevelRun>)
Find the level runs within a line and return them in visual order.
line is a range of code unit indices within levels.
The first return value is a vector of levels used by the reordering algorithm, i.e. the result of Rule L1. The second return value is a vector of level runs, the result of Rule L2, showing the visual order that each level run (a run of text with the same level) should be displayed. Within each run, the display order can be checked against the Level vector.
This does not handle Rule L3 (combining characters) or Rule L4 (mirroring), as that should be handled by the engine using this API.
Conceptually, this is the same as running Self::reordered_levels() followed by Self::reorder_visual(); however, it returns the result as a list of level runs instead of producing a level map, since one may wish to deal with the fact that this is operating on code unit rather than character indices.
http://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels
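The idea of level runs in visual order can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the crate's implementation: level_runs and runs_in_visual_order are hypothetical helpers, plain u8 values stand in for Level, and Range<usize> stands in for LevelRun.

```rust
use std::ops::Range;

/// Splits `line` into maximal runs of equal level, in logical order.
/// (Hypothetical helper for illustration; not part of unicode_bidi.)
fn level_runs(levels: &[u8], line: Range<usize>) -> Vec<Range<usize>> {
    let mut runs = Vec::new();
    let mut start = line.start;
    for i in line.start..line.end {
        let next = i + 1;
        // Close the current run at the end of the line or at a level change.
        if next == line.end || levels[next] != levels[i] {
            runs.push(start..next);
            start = next;
        }
    }
    runs
}

/// Applies Rule L2 at run granularity: from the highest level down to the
/// lowest odd level, reverse each sequence of runs at that level or higher.
fn runs_in_visual_order(levels: &[u8], mut runs: Vec<Range<usize>>) -> Vec<Range<usize>> {
    let max = runs.iter().map(|r| levels[r.start]).max().unwrap_or(0);
    let min_odd = match runs.iter().map(|r| levels[r.start]).filter(|&l| l % 2 == 1).min() {
        Some(l) => l,
        None => return runs,
    };
    for level in (min_odd..=max).rev() {
        let mut start = 0;
        while start < runs.len() {
            if levels[runs[start].start] < level {
                start += 1;
                continue;
            }
            let mut end = start;
            while end < runs.len() && levels[runs[end].start] >= level {
                end += 1;
            }
            runs[start..end].reverse();
            start = end;
        }
    }
    runs
}
```

Each returned range still covers logical offsets; a run whose level is odd would then be drawn right to left by the caller.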