Deactivate many regex Unicode crate features
#1643 disabled many default features of the
`regex` crate but left the `unicode` meta feature enabled. With the
`unicode` feature enabled and `bindgen` as a build dependency,
`regex-syntax` (a direct dependency of the `regex` crate) takes 7
seconds to compile as a build dependency in my application.

The `unicode` feature includes support for many Unicode character class
lookups that bindgen is unlikely to use.

From https://docs.rs/regex/latest/regex/#unicode-features:

> - unicode-age - Provide the data for the Unicode Age property. This
>   makes it possible to use classes like `\p{Age:6.0}` to refer to all
>   codepoints first introduced in Unicode 6.0
> - unicode-bool - Provide the data for numerous Unicode boolean
>   properties. The full list is not included here, but contains
>   properties like `Alphabetic`, `Emoji`, `Lowercase`, `Math`,
>   `Uppercase` and `White_Space`.
> - unicode-case - Provide the data for case insensitive matching using
>   Unicode's "simple loose matches" specification.
> - unicode-gencat - Provide the data for Unicode general categories.
>   This includes, but is not limited to, `Decimal_Number`, `Letter`,
>   `Math_Symbol`, `Number` and `Punctuation`.
> - unicode-script - Provide the data for Unicode scripts and script
>   extensions. This includes, but is not limited to, `Arabic`, `Cyrillic`,
>   `Hebrew`, `Latin` and `Thai`.
> - unicode-segment - Provide the data necessary to provide the
>   properties used to implement the Unicode text segmentation
>   algorithms. This enables using classes like `\p{gcb=Extend}`,
>   `\p{wb=Katakana}` and `\p{sb=ATerm}`.

I have retained the `unicode-perl` feature, which gives support for
`\w`, `\s` and `\d`, because these character classes were required
to get tests to pass.

Removing support for the other Unicode character classes quoted above
removes the need to compile many data tables, which should significantly
reduce compile times.
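
The same approach can be sketched for any crate that pulls in `regex` as a build-time dependency. The manifest below is a hypothetical illustration, not part of this commit: the `[build-dependencies]` placement and the extra `unicode-case` entry are assumptions chosen to show how individual Unicode tables can be opted into one at a time.

```toml
# Hypothetical manifest for a crate that uses `regex` in its build script.
# Only the `regex` feature names come from the regex documentation quoted
# above; the section and the choice of features are illustrative.
[build-dependencies]
# Turn off the default feature set (which includes the `unicode` meta
# feature), then re-enable only what the patterns actually need:
#   std          - standard library support
#   unicode-perl - Unicode-aware `\w`, `\s` and `\d`
#   unicode-case - assumed here only for patterns using `(?i)` on non-ASCII text
regex = { version = "1.5", default-features = false, features = ["std", "unicode-perl", "unicode-case"] }
```

With a reduced feature set like this, a pattern that relies on a disabled table (for example `\p{Age:6.0}` without `unicode-age`) is expected to be rejected when the regex is constructed, so running the test suite is a reasonable way to find out which features are really needed.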
lopopolo authored and emilio committed Dec 27, 2023
1 parent 5ff913a commit d0c2b1e
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions bindgen/Cargo.toml
@@ -36,7 +36,7 @@ peeking_take_while = "0.1.2"
 prettyplease = { version = "0.2.7", optional = true, features = ["verbatim"] }
 proc-macro2 = { version = "1", default-features = false }
 quote = { version = "1", default-features = false }
-regex = { version = "1.5", default-features = false, features = ["std", "unicode"] }
+regex = { version = "1.5", default-features = false, features = ["std", "unicode-perl"] }
 rustc-hash = "1.0.1"
 shlex = "1"
 syn = { version = "2.0", features = ["full", "extra-traits", "visit-mut"] }
@@ -53,9 +53,9 @@ experimental = ["dep:annotate-snippets"]

 ## The following features are for internal use and they shouldn't be used if
 ## you're not hacking on bindgen
-# Features used by `bindgen-cli`
+# Features used by `bindgen-cli`
 __cli = []
-# Features used for CI testing
+# Features used for CI testing
 __testing_only_extra_assertions = []
 __testing_only_libclang_9 = []
 __testing_only_libclang_5 = []
