Commit b76f8a7

Merge pull request #2 from kimili/master
Adding a Javascript option.
2 parents 7494e27 + efac2e8 commit b76f8a7

3 files changed: +44 -12 lines changed
README.md

Lines changed: 20 additions & 10 deletions
@@ -6,10 +6,10 @@
 # UrlRegex

 Provides the best known regex for validating and extracting URLs.
-It builds on amazing work done by [Diego Perini](https://gist.github.com/dperini/729294)
+It builds on amazing work done by [Diego Perini](https://gist.github.com/dperini/729294)
 and [Mathias Bynens](https://mathiasbynens.be/demo/url-regex).

-Why do we need a gem for this regex?
+Why do we need a gem for this regex?

 - You don't need to follow changes and improvements of original regex.
 - You can slightly customize the regex: a scheme can be optional, and you can get the regex for validation or parsing.
@@ -33,12 +33,12 @@ Or install it yourself as:
 Get the regex:

     UrlRegex.get(options)
-
+
 where options are:

 - `scheme_required` indicates that schema is required, defaults to `true`.

-- `mode` can gets either `:validation` or `:parsing`, defaults to `:validation`.
+- `mode` can gets either `:validation`, `:parsing` or `:javascript`, defaults to `:validation`.

 `:validation` asks to return the regex for validation, namely, with `\A` prefix, and with `\z` postfix.
 That means, it matches whole text:
@@ -47,17 +47,27 @@ That means, it matches whole text:
     # => false
     UrlRegex.get(mode: :validation).match('link: https://www.google.com').nil?
     # => true
-
+
 `:parsing` asks to return the regex for parsing:

     str = 'links: google.com https://google.com?t=1'
     str.scan(UrlRegex.get(mode: :parsing))
     # => ["https://google.com?t=1"]
-
+
     # schema is not required
     str.scan(UrlRegex.get(scheme_required: false, mode: :parsing))
     # => ["google.com", "https://google.com?t=1"]

+`:javascript` asks to return the regex formatted for use in Javascript files or as `pattern` attribute values on HTML inputs. For this purpose, you'd use the `source` method on the Regexp object instance in order to produce a string that Javascript will understand. These examples make use of the Rails `text_field` method to generate HTML input elements.
+
+    regex = UrlRegex.get(mode: :javascript)
+    text_field(:site, :url, pattern: regex.source)
+    # => <input type="text" id="site_url" name="site[url]" pattern="[javascript URL regex]" />
+
+    regex = UrlRegex.get(scheme_required: false, mode: :javascript)
+    text_field(:site, :url, pattern: regex.source)
+    # => <input type="text" id="site_url" name="site[url]" pattern="[javascript URL regex with optional scheme]" />
+
 `UrlRegex.get` returns regular Ruby's [Regex](http://ruby-doc.org/core-2.0.0/Regexp.html) object,
 so you can use it as usual.

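The added README text relies on `Regexp#source` to hand the pattern to HTML or Javascript. As a minimal sketch of what that call returns, the snippet below uses a simplified stand-in pattern (an assumption for illustration, not the gem's `JAVASCRIPT_BASE`) so that it runs without the gem installed.

```ruby
# Simplified stand-in for the gem's pattern (assumption for illustration only).
scheme = '(?:https?://)?'
regex  = /^#{scheme}example\.com$/i

# Regexp#source returns the pattern text without delimiters or flags.
pattern = regex.source

# The case-insensitive flag lives on the Regexp object, not in the source string.
ignore_case = regex.casefold?

puts pattern
```

Since the string carries no flags, whatever consumes it (for example a `pattern` attribute) sees only the pattern text; any case handling would have to be expressed in the pattern itself.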
@@ -66,18 +76,18 @@ All regexes are case-insensitive.
 ## FAQ

 Q: Hey, I want to parse HTML, but it doesn't work:
-
+
     str = '<a href="http://google.com?t=1">Link</a>'
     str.scan(UrlRegex.get(mode: :parsing))
     # => "http://google.com?t=1">Link</a>"
-
-A: Well, you probably know that parsing HTML with regex is
+
+A: Well, you probably know that parsing HTML with regex is
 [a bad idea](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags).
 It requires matching corresponding open and close brackets, that makes the regex even more complicated.

 Q: How can I speed up processing?

-A: Generated regex depends only on options, so you can get the regex only once and cache it.
+A: Generated regex depends only on options, so you can get the regex only once and cache it.

 ## Contributing

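The FAQ's caching advice in the README diff above could be sketched roughly as follows. `StandInUrlRegex` and `CachedUrlRegex` are hypothetical names introduced here; the stand-in uses a deliberately simplified pattern so the example runs without the gem, and in real use the call inside the cache would be `UrlRegex.get`.

```ruby
# Hypothetical stand-in for UrlRegex.get so the sketch runs without the gem;
# the pattern is deliberately simplified and is NOT the gem's regex.
module StandInUrlRegex
  def self.get(scheme_required: true, mode: :validation)
    scheme = scheme_required ? '(?:https?://)' : '(?:https?://)?'
    body   = '[a-z0-9.-]+\.[a-z]{2,}(?:[/?#]\S*)?'
    mode == :validation ? /\A#{scheme}#{body}\z/i : /#{scheme}#{body}/i
  end
end

# Hypothetical cache keyed by the options hash: each distinct option set
# is built once and the same Regexp object is reused afterwards.
module CachedUrlRegex
  CACHE = {}

  def self.get(**options)
    CACHE[options] ||= StandInUrlRegex.get(**options)
  end
end

first  = CachedUrlRegex.get(mode: :parsing)
second = CachedUrlRegex.get(mode: :parsing)
puts first.equal?(second)
```

Keying on the options hash works because the generated regex depends only on those options; this sketch does not address concurrent access to the cache.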
lib/url_regex.rb

Lines changed: 12 additions & 2 deletions
@@ -13,7 +13,15 @@ module UrlRegex
   def self.get(scheme_required: true, mode: :validation)
     raise ArgumentError, "wrong mode: #{mode}" if MODES.index(mode).nil?
     scheme = scheme_required ? PROTOCOL_IDENTIFIER : PROTOCOL_IDENTIFIER_OPTIONAL
-    mode == :validation ? /\A#{scheme} #{BASE}\z/xi : /#{scheme} #{BASE}/xi
+    case mode
+    when :validation
+      regex = /\A#{scheme} #{BASE}\z/xi
+    when :parsing
+      regex = /#{scheme} #{BASE}/xi
+    when :javascript
+      regex = /^#{scheme}#{JAVASCRIPT_BASE}$/
+    end
+    regex
   end

   BASE = '
@@ -52,9 +60,11 @@ def self.get(scheme_required: true, mode: :validation)
     (?:[/?#]\S*)?
   '.freeze

+  JAVASCRIPT_BASE = '(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?'.freeze
+
   PROTOCOL_IDENTIFIER = '(?:(?:https?|ftp)://)'.freeze
   PROTOCOL_IDENTIFIER_OPTIONAL = '(?:(?:https?|ftp)://)?'.freeze
-  MODES = [:validation, :parsing].freeze
+  MODES = [:validation, :parsing, :javascript].freeze

   private_constant :BASE, :PROTOCOL_IDENTIFIER, :PROTOCOL_IDENTIFIER_OPTIONAL, :MODES
 end
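
The `:javascript` branch above anchors with `^` and `$`, while `:validation` uses `\A` and `\z`. In Ruby, `^` and `$` match at line boundaries rather than string boundaries, which may matter if the same Regexp object is also used for server-side checks. A minimal sketch of the difference, using a simplified stand-in pattern (an assumption, not the gem's `JAVASCRIPT_BASE`):

```ruby
# Simplified stand-in pattern (assumption for illustration only).
body = 'https?://[a-z0-9.-]+\.[a-z]{2,}'

line_anchored   = /^#{body}$/     # same anchors as the :javascript branch
string_anchored = /\A#{body}\z/   # same anchors as the :validation branch

# Multi-line input whose second line is a URL.
input = "not a url\nhttp://example.com"

line_matches   = line_anchored.match?(input)    # ^ can match after the newline
string_matches = string_anchored.match?(input)  # \A only matches at the very start

puts "line anchors: #{line_matches}, string anchors: #{string_matches}"
```

With these inputs the line-anchored form accepts the text because one of its lines is a URL, whereas the string-anchored form rejects it.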

spec/url_regex_spec.rb

Lines changed: 12 additions & 0 deletions
@@ -56,6 +56,9 @@
       it "should match #{valid_url}" do
         expect(UrlRegex.get(scheme_required: true)).to match valid_url
       end
+      it "should match #{valid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: true, mode: :javascript)).to match valid_url
+      end
     end

     [
@@ -100,6 +103,9 @@
       it "should not match #{invalid_url}" do
         expect(UrlRegex.get(scheme_required: true)).to_not match invalid_url
       end
+      it "should not match #{invalid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: true, mode: :javascript)).to_not match invalid_url
+      end
     end
   end

@@ -149,6 +155,9 @@
       it "should match #{valid_url}" do
         expect(UrlRegex.get(scheme_required: false)).to match valid_url
       end
+      it "should match #{valid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: false, mode: :javascript)).to match valid_url
+      end
     end

     [
@@ -191,6 +200,9 @@
       it "should not match #{invalid_url}" do
         expect(UrlRegex.get(scheme_required: false)).to_not match invalid_url
       end
+      it "should not match #{invalid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: false, mode: :javascript)).to_not match invalid_url
+      end
     end
   end
