Struct regex_syntax::hir::Hir

pub struct Hir { /* private fields */ }

A high-level intermediate representation (HIR) for a regular expression.

An HIR value is a combination of an HirKind and a set of Properties. An HirKind indicates what kind of regular expression it is (a literal, a repetition, a look-around assertion, etc.), whereas the Properties describe various facts about the regular expression, such as whether it matches only valid UTF-8 or whether it can match the empty string.

The HIR of a regular expression represents an intermediate step between its abstract syntax (a structured description of the concrete syntax) and an actual regex matcher. The purpose of HIR is to make regular expressions easier to analyze. In particular, the AST is much more complex than the HIR. For example, while an AST supports arbitrarily nested character classes, the HIR will flatten all nested classes into a single set. The HIR will also “compile away” every flag present in the concrete syntax. For example, users of HIR expressions never need to worry about case folding; it is handled automatically by the translator (e.g., by translating (?i:A) to [aA]).

The specific type of an HIR expression can be accessed via its kind or into_kind methods. This extra level of indirection exists for two reasons:

Construction of an HIR expression must use the constructor methods on this Hir type instead of building the HirKind values directly. This permits construction to enforce invariants like “concatenations always consist of two or more sub-expressions.”
Every HIR expression contains attributes that are defined inductively, and can be computed cheaply during the construction process. For example, one such attribute is whether the expression must match at the beginning of the haystack.

In particular, if you have an HirKind value, then there is intentionally no way to build an Hir value from it. You instead need to do case analysis on the HirKind value and build the Hir value using its smart constructors.

§UTF-8

If the HIR was produced by a translator with TranslatorBuilder::utf8 enabled, then the HIR is guaranteed to match UTF-8 exclusively for all non-empty matches.

Empty matches, however, can occur at any position. It is the responsibility of the regex engine to determine whether empty matches are permitted between the code units of a single codepoint.
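To make the empty-match caveat concrete, here is a small sketch that uses only the standard library. It shows one way an engine could restrict empty matches to codepoint boundaries; this is an illustrative assumption, not a description of how any particular regex engine implements the check, and `utf8_safe_empty_match_offsets` is a hypothetical name.

```rust
/// Returns every byte offset in `haystack` at which an empty match would
/// not fall between the code units of a single codepoint.
fn utf8_safe_empty_match_offsets(haystack: &str) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&offset| haystack.is_char_boundary(offset))
        .collect()
}
```

For the haystack `"a☃"`, which is one byte followed by a three-byte codepoint, offsets 2 and 3 fall inside the snowman and are filtered out.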

§Stack space

This type defines its own destructor that uses constant stack space and heap space proportional to the size of the HIR.
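The general technique behind such a destructor can be sketched with a toy tree type and the standard library alone. The `Expr` type and `nested` helper below are hypothetical stand-ins, not the actual Hir implementation: the idea is to move nested children onto a heap-allocated work list so that dropping never recurses deeply.

```rust
use std::mem;

/// A toy expression tree standing in for a deeply nested HIR.
enum Expr {
    Leaf,
    Group(Box<Expr>),
    Concat(Vec<Expr>),
}

impl Drop for Expr {
    fn drop(&mut self) {
        // Fast path: nothing is nested below this node's children, so the
        // default field-by-field drop cannot recurse deeply.
        match &*self {
            Expr::Leaf => return,
            Expr::Group(sub) if matches!(**sub, Expr::Leaf) => return,
            Expr::Concat(subs) if subs.is_empty() => return,
            _ => {}
        }
        // Move the tree onto a heap-allocated work list and detach the
        // children of one node at a time instead of recursing.
        let mut stack = vec![mem::replace(self, Expr::Leaf)];
        while let Some(mut expr) = stack.pop() {
            match &mut expr {
                Expr::Leaf => {}
                Expr::Group(sub) => stack.push(mem::replace(&mut **sub, Expr::Leaf)),
                Expr::Concat(subs) => stack.extend(subs.drain(..)),
            }
            // `expr` is dropped here with its children detached, so the
            // nested `drop` call takes the fast path above.
        }
    }
}

/// Builds a tree with `depth` nested groups inside one concatenation.
fn nested(depth: usize) -> Expr {
    let mut expr = Expr::Leaf;
    for _ in 0..depth {
        expr = Expr::Group(Box::new(expr));
    }
    Expr::Concat(vec![Expr::Leaf, expr])
}
```

With this destructor, dropping `nested(1_000_000)` uses a bounded amount of stack, at the cost of a temporary heap allocation for the work list.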

Also, an Hir’s fmt::Display implementation prints an HIR as a regular expression pattern string, and uses constant stack space and heap space proportional to the size of the Hir. The regex it prints is guaranteed to be semantically equivalent to the original concrete syntax, but it may look very different and may not be practically readable by a human.

An Hir’s fmt::Debug implementation currently does not use constant stack space. To make its output less noisy, the implementation also suppresses some details (such as the Properties inlined into every Hir value).

Implementations§

impl Hir

pub fn kind(&self) -> &HirKind

pub fn into_kind(self) -> HirKind

pub fn properties(&self) -> &Properties

impl Hir

pub fn empty() -> Hir

pub fn fail() -> Hir

pub fn literal<B: Into<Box<[u8]>>>(lit: B) -> Hir

§Example

§Example: building a literal from a char

pub fn class(class: Class) -> Hir

pub fn look(look: Look) -> Hir

pub fn repetition(rep: Repetition) -> Hir

pub fn capture(capture: Capture) -> Hir

pub fn concat(subs: Vec<Hir>) -> Hir

§Example

pub fn alternation(subs: Vec<Hir>) -> Hir

§Example

pub fn dot(dot: Dot) -> Hir

§Example

Trait Implementations§

impl Clone for Hir

fn clone(&self) -> Hir

fn clone_from(&mut self, source: &Self)

impl Debug for Hir

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Display for Hir

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Drop for Hir

fn drop(&mut self)

impl PartialEq for Hir

fn eq(&self, other: &Hir) -> bool

fn ne(&self, other: &Rhs) -> bool

impl Eq for Hir

impl StructuralPartialEq for Hir

Auto Trait Implementations§

impl RefUnwindSafe for Hir

impl Send for Hir

impl Sync for Hir

impl Unpin for Hir

impl UnwindSafe for Hir

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>

fn into(self) -> U

impl<T> ToOwned for T where T: Clone

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T> ToString for T where T: Display + ?Sized

default fn to_string(&self) -> String

impl<T, U> TryFrom<U> for T where U: Into<T>

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
