Merge pull request #2 from kimili/master

amogil · web-flow · commit b76f8a776b04 · 2017-01-22T23:05:54.000+03:00
Adding a Javascript option.
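
One consequence of the new mode's anchors is worth seeing in isolation: in Ruby, `^` and `$` match at line boundaries, while `\A` and `\z` match only at the ends of the whole string. The sketch below is a minimal illustration under stated assumptions, not the gem's code. `SCHEME` copies `PROTOCOL_IDENTIFIER` from this diff, `HOST` is a simplified, hypothetical stand-in for `JAVASCRIPT_BASE`, and the constant names are invented for the example so that it runs without the gem installed.

```ruby
# SCHEME mirrors PROTOCOL_IDENTIFIER from lib/url_regex.rb.
# HOST is a simplified, hypothetical stand-in for JAVASCRIPT_BASE.
SCHEME = '(?:(?:https?|ftp)://)'.freeze
HOST = '[a-z0-9-]+(?:\.[a-z0-9-]+)+'.freeze

# Anchored the way the :javascript branch is (line anchors, no flags).
JS_STYLE = /^#{SCHEME}#{HOST}$/
# Anchored the way the :validation branch is (string anchors).
VALIDATION_STYLE = /\A#{SCHEME}#{HOST}\z/

# A two-line input whose second line is a URL.
MULTILINE = "not a url\nhttp://example.com".freeze

puts JS_STYLE.match?(MULTILINE)          # true: ^ and $ match at line boundaries
puts VALIDATION_STYLE.match?(MULTILINE)  # false: \A and \z span the whole string

# Regexp#source returns the pattern text without delimiters or flags,
# which is the string handed to the HTML `pattern` attribute.
puts JS_STYLE.source
```

If this reading is right, server-side checks of untrusted input would probably still want `mode: :validation`, with `mode: :javascript` reserved for producing the `source` string.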
diff --git a/README.md b/README.md
@@ -6,10 +6,10 @@
 # UrlRegex
 
 Provides the best known regex for validating and extracting URLs.
-It builds on amazing work done by [Diego Perini](https://gist.github.com/dperini/729294) 
+It builds on amazing work done by [Diego Perini](https://gist.github.com/dperini/729294)
 and [Mathias Bynens](https://mathiasbynens.be/demo/url-regex).
 
-Why do we need a gem for this regex? 
+Why do we need a gem for this regex?
 
 - You don't need to follow changes and improvements of original regex.
 - You can slightly customize the regex: a scheme can be optional, and you can get the regex for validation or parsing.
@@ -33,12 +33,12 @@ Or install it yourself as:
 Get the regex:
 
     UrlRegex.get(options)
-    
+
 where options are:
 
 - `scheme_required` indicates that schema is required, defaults to `true`.
 
-- `mode` can gets either `:validation` or `:parsing`, defaults to `:validation`.
+- `mode` can be either `:validation`, `:parsing` or `:javascript`, defaults to `:validation`.
 
 `:validation` asks to return the regex for validation, namely, with `\A` prefix, and with `\z` postfix.
 That means, it matches whole text:
@@ -47,17 +47,27 @@ That means, it matches whole text:
     # => false
     UrlRegex.get(mode: :validation).match('link: https://www.google.com').nil?
     # => true
-    
+
 `:parsing` asks to return the regex for parsing:
 
     str = 'links: google.com https://google.com?t=1'
     str.scan(UrlRegex.get(mode: :parsing))
     # => ["https://google.com?t=1"]
-        
+
     # schema is not required
     str.scan(UrlRegex.get(scheme_required: false, mode: :parsing))
     # => ["google.com", "https://google.com?t=1"]
 
+`:javascript` asks to return the regex formatted for use in JavaScript files or as the `pattern` attribute value of HTML inputs. Call `source` on the returned Regexp object to get a string that JavaScript will understand. Unlike the other modes, this regex is built without the `i` flag, so it is case-sensitive. The examples below use the Rails `text_field` helper to generate HTML input elements.
+
+    regex = UrlRegex.get(mode: :javascript)
+    text_field(:site, :url, pattern: regex.source)
+    # => <input type="text" id="site_url" name="site[url]" pattern="[javascript URL regex]" />
+
+    regex = UrlRegex.get(scheme_required: false, mode: :javascript)
+    text_field(:site, :url, pattern: regex.source)
+    # => <input type="text" id="site_url" name="site[url]" pattern="[javascript URL regex with optional scheme]" />
+
 `UrlRegex.get` returns regular Ruby's [Regex](http://ruby-doc.org/core-2.0.0/Regexp.html) object,
 so you can use it as usual.
 
@@ -66,18 +76,18 @@ All regexes are case-insensitive.
 ## FAQ
 
 Q: Hey, I want to parse HTML, but it doesn't work:
-    
+
     str = '<a href="http://google.com?t=1">Link</a>'
     str.scan(UrlRegex.get(mode: :parsing))
     # => "http://google.com?t=1">Link</a>"
-    
-A: Well, you probably know that parsing HTML with regex is 
+
+A: Well, you probably know that parsing HTML with regex is
 [a bad idea](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags).
 It requires matching corresponding open and close brackets, that makes the regex even more complicated.
 
 Q: How can I speed up processing?
 
-A: Generated regex depends only on options, so you can get the regex only once and cache it. 
+A: Generated regex depends only on options, so you can get the regex only once and cache it.
 
 ## Contributing
 
diff --git a/lib/url_regex.rb b/lib/url_regex.rb
@@ -13,7 +13,15 @@ module UrlRegex
   def self.get(scheme_required: true, mode: :validation)
     raise ArgumentError, "wrong mode: #{mode}" if MODES.index(mode).nil?
     scheme = scheme_required ? PROTOCOL_IDENTIFIER : PROTOCOL_IDENTIFIER_OPTIONAL
-    mode == :validation ? /\A#{scheme} #{BASE}\z/xi : /#{scheme} #{BASE}/xi
+    case mode
+    when :validation
+      /\A#{scheme} #{BASE}\z/xi
+    when :parsing
+      /#{scheme} #{BASE}/xi
+    when :javascript
+      # JavaScript lacks \A and \z, so anchor with ^ and $ (per-line in Ruby).
+      /^#{scheme}#{JAVASCRIPT_BASE}$/
+    end
   end
 
   BASE = '
@@ -52,9 +60,11 @@ def self.get(scheme_required: true, mode: :validation)
     (?:[/?#]\S*)?
   '.freeze
 
+  JAVASCRIPT_BASE = '(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?'.freeze
+
   PROTOCOL_IDENTIFIER = '(?:(?:https?|ftp)://)'.freeze
   PROTOCOL_IDENTIFIER_OPTIONAL = '(?:(?:https?|ftp)://)?'.freeze
-  MODES = [:validation, :parsing].freeze
+  MODES = [:validation, :parsing, :javascript].freeze
 
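-  private_constant :BASE, :PROTOCOL_IDENTIFIER, :PROTOCOL_IDENTIFIER_OPTIONAL, :MODES
+  private_constant :BASE, :JAVASCRIPT_BASE, :PROTOCOL_IDENTIFIER, :PROTOCOL_IDENTIFIER_OPTIONAL, :MODES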
 end
diff --git a/spec/url_regex_spec.rb b/spec/url_regex_spec.rb
@@ -56,6 +56,9 @@
       it "should match #{valid_url}" do
         expect(UrlRegex.get(scheme_required: true)).to match valid_url
       end
+      it "should match #{valid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: true, mode: :javascript)).to match valid_url
+      end
     end
 
     [
@@ -100,6 +103,9 @@
       it "should not match #{invalid_url}" do
         expect(UrlRegex.get(scheme_required: true)).to_not match invalid_url
       end
+      it "should not match #{invalid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: true, mode: :javascript)).to_not match invalid_url
+      end
     end
   end
 
@@ -149,6 +155,9 @@
       it "should match #{valid_url}" do
         expect(UrlRegex.get(scheme_required: false)).to match valid_url
       end
+      it "should match #{valid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: false, mode: :javascript)).to match valid_url
+      end
     end
 
     [
@@ -191,6 +200,9 @@
       it "should not match #{invalid_url}" do
         expect(UrlRegex.get(scheme_required: false)).to_not match invalid_url
       end
+      it "should not match #{invalid_url} against Javascript regex" do
+        expect(UrlRegex.get(scheme_required: false, mode: :javascript)).to_not match invalid_url
+      end
     end
   end