Ruby bindings for the rust/regex library.
Install Rust via rustup or by any other method.
Add as a dependency:
# In your Gemfile
gem "rust_regexp"
# Or without Bundler
gem install rust_regexp
Include in your code:
require "rust_regexp"
Regular expressions should be pre-compiled before use:
re = RustRegexp.new('p.t{2}ern*')
# => #<RustRegexp:...>
Tip: Pass the regular expression to rust/regex as a single-quoted string so that Ruby does not interpret the backslashes as escapes.
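The effect of the quoting style can be checked in plain Ruby without the gem. The sketch below only compares the two string literals; the variable names are illustrative.

```ruby
# Plain Ruby, no gem required: compare what each quoting style actually stores.
single_quoted = '\w+:\d+'   # backslashes are kept verbatim
double_quoted = "\w+:\d+"   # Ruby drops the backslash of the unknown escapes \w and \d

puts single_quoted          # prints \w+:\d+
puts double_quoted          # prints w+:d+
puts single_quoted.length   # prints 7
puts double_quoted.length   # prints 5
```

With double quotes the pattern that reaches the regex engine no longer contains `\w` or `\d`, so it would match the literal text `w+:d+` style input instead of word and digit characters.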
To find a single match in the haystack:
RustRegexp.new('\w+:\d+').match("ruby:123, rust:456")
# => ["ruby:123"]
RustRegexp.new('(\w+):(\d+)').match("ruby:123, rust:456")
# => ["ruby", "123"]
To find all matches in the haystack:
RustRegexp.new('\w+:\d+').scan("ruby:123, rust:456")
# => ["ruby:123", "rust:456"]
RustRegexp.new('(\w+):(\d+)').scan("ruby:123, rust:456")
# => [["ruby", "123"], ["rust", "456"]]
To check whether there is at least one match in the haystack:
RustRegexp.new('\w+:\d+').match?("ruby:123")
# => true
RustRegexp.new('\w+:\d+').match?("ruby")
# => false
Inspect the original pattern:
RustRegexp.new('\w+:\d+').pattern
# => "\\w+:\\d+"
Warning: The rust/regex regular expression syntax differs from that of Ruby's built-in Regexp. See the official syntax page for more details.
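Two differences worth knowing when porting patterns: Ruby's built-in `\w` is ASCII-only by default, and Ruby's Regexp supports backreferences, which rust/regex does not (it also has no lookaround). The sketch below uses only Ruby's built-in Regexp, not rust_regexp, to show the Ruby side of that comparison.

```ruby
# Ruby's built-in Regexp (not rust_regexp), shown for comparison.
ascii_word = /\w+/
puts ascii_word.match?("ruby")      # prints true
puts ascii_word.match?("ю٤夏")      # prints false: \w is ASCII-only here

# Unicode word characters need an explicit property class in Ruby.
unicode_word = /\p{Word}+/
puts unicode_word.match?("ю٤夏")    # prints true

# Backreferences work in Ruby's Regexp; rust/regex does not support them.
doubled_char = /(\w)\1/
puts doubled_char.match?("hello")   # prints true (matches "ll")
```

A pattern that relies on backreferences or lookaround therefore needs to be rewritten before it can be passed to RustRegexp.new.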
RustRegexp::Set represents a collection of
regular expressions that can be searched for simultaneously. Calling RustRegexp::Set#match will return an array containing the indices of all the patterns that matched.
set = RustRegexp::Set.new(["abc", "def", "ghi", "xyz"])
set.match("abcdefghi") # => [0, 1, 2]
set.match("ghidefabc") # => [0, 1, 2]
Note: Matches are returned in the order in which the constituent patterns were declared, not the order in which they appear in the haystack.
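To make the ordering rule concrete, here is a rough pure-Ruby approximation of the behavior built on Ruby's own Regexp. The helper `matching_indices` is hypothetical and is not part of rust_regexp; it only mimics the result shape for simple literal patterns and says nothing about how the gem works internally.

```ruby
# Hypothetical helper (not part of rust_regexp): returns the indices of all
# patterns that match the haystack, in declaration order.
def matching_indices(patterns, haystack)
  patterns.each_index.select do |index|
    Regexp.new(patterns[index]).match?(haystack)
  end
end

patterns = ["abc", "def", "ghi", "xyz"]

p matching_indices(patterns, "abcdefghi")  # prints [0, 1, 2]
p matching_indices(patterns, "ghidefabc")  # prints [0, 1, 2]
p matching_indices(patterns, "123")        # prints []
```

Because the patterns are walked in declaration order, the position of each match inside the haystack has no influence on the resulting indices.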
To check whether at least one pattern from the set matches the haystack:
RustRegexp::Set.new(["abc", "def"]).match?("abc")
# => true
RustRegexp::Set.new(["abc", "def"]).match?("123")
# => false
Inspect the original patterns:
RustRegexp::Set.new(["abc", "def"]).patterns
# => ["abc", "def"]
Currently, rust_regexp expects the haystack to be a UTF-8 string.
Strings containing invalid UTF-8 byte sequences are also supported by default. This works because rust_regexp uses regex::bytes instead of plain regex under the hood, so any byte sequence can be matched. The resulting match is returned as a UTF-8 encoded string.
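For context, the sketch below shows what a haystack with an invalid UTF-8 byte looks like on the Ruby side. It uses only core String methods and does not call rust_regexp; the sample string is made up for illustration.

```ruby
# A UTF-8 tagged string that contains a byte (0xFF) which is never valid in UTF-8.
haystack = "ruby:123 \xFF rust:456"

puts haystack.encoding          # prints UTF-8
puts haystack.valid_encoding?   # prints false

# String#scrub replaces invalid bytes, which is one way to sanitize input by hand.
cleaned = haystack.scrub("?")
puts cleaned                    # prints ruby:123 ? rust:456
puts cleaned.valid_encoding?    # prints true
```

Since the gem matches on bytes, a manual sanitizing step such as `scrub` should not be needed before searching such a haystack.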
To disable Unicode awareness of the matchers, pass the unicode: false option, which both RustRegexp and RustRegexp::Set support:
RustRegexp.new('\w+').match('ю٤夏')
# => ["ю٤夏"]
RustRegexp.new('\w+', unicode: false).match('ю٤夏')
# => []
RustRegexp::Set.new(['\w', '\d', '\s']).match("ю٤\u2000")
# => [0, 1, 2]
RustRegexp::Set.new(['\w', '\d', '\s'], unicode: false).match("ю٤\u2000")
# => []
bin/setup # install deps
bin/console # interactive prompt to play around
rake compile # (re)compile extension
rake spec # run tests
Bug reports and pull requests are welcome on GitHub at https://github.com/ocvit/rust_regexp.
The gem is available as open source under the terms of the MIT License.