[DOC] Common methods rdoc #49

Merged: 7 commits, Jan 4, 2023
Changes from 5 commits
265 changes: 173 additions & 92 deletions lib/uri/common.rb
@@ -68,16 +68,32 @@ module Schemes
end
private_constant :Schemes

# Registers the given +klass+ as the class to be instantiated
# when parsing a \URI with the given +scheme+:
#
# URI.register_scheme('MS_SEARCH', URI::Generic) # => URI::Generic
# URI.scheme_list['MS_SEARCH'] # => URI::Generic
#
# Register the given +klass+ to be instantiated when parsing URLs with the given +scheme+.
# Note that currently only schemes which after .upcase are valid constant names
# can be registered (no -/+/. allowed).
#
def self.register_scheme(scheme, klass)
Schemes.const_set(scheme.to_s.upcase, klass)
end
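
As a rough illustration of the registration described above, the sketch below registers a hypothetical scheme name, MYAPP, with an anonymous subclass of URI::Generic (both are assumptions made for this example and do not appear in the PR) and checks that URI.parse then instantiates that class. Registration changes process-wide state, so treat this as a sketch rather than a recommended pattern.

```ruby
require 'uri'

# Hypothetical class and scheme name, used only for illustration.
my_app_class = Class.new(URI::Generic)
URI.register_scheme('MYAPP', my_app_class)

# The keys of URI.scheme_list are upcased scheme names.
registered = URI.scheme_list['MYAPP']

# Parsing a URI whose scheme matches (case-insensitively) yields the registered class.
parsed = URI.parse('myapp://example.com/inbox')
```

Here `registered` is the class that was passed in, and `parsed` is an instance of it.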

# Returns a Hash of the defined schemes.
# Returns a hash of the defined schemes:
#
# URI.scheme_list
# # =>
# {"MAILTO"=>URI::MailTo,
# "LDAPS"=>URI::LDAPS,
# "WS"=>URI::WS,
# "HTTP"=>URI::HTTP,
# "HTTPS"=>URI::HTTPS,
# "LDAP"=>URI::LDAP,
# "FILE"=>URI::File,
# "FTP"=>URI::FTP}
#
# Related: URI.register_scheme.
def self.scheme_list
Schemes.constants.map { |name|
[name.to_s.upcase, Schemes.const_get(name)]
@@ -88,9 +104,21 @@ def self.scheme_list
private_constant :INITIAL_SCHEMES
Ractor.make_shareable(INITIAL_SCHEMES) if defined?(Ractor)

# Returns a new object constructed from the given +scheme+, +arguments+,
# and +default+:
#
# - The new object is an instance of <tt>URI.scheme_list[scheme.upcase]</tt>.
# - The object is initialized by calling the class initializer
# using +scheme+ and +arguments+.
# See URI::Generic.new.
#
# Construct a URI instance, using the scheme to detect the appropriate class
# from +URI.scheme_list+.
# Examples:
#
# values = ['john.doe', 'www.example.com', '123', nil, '/forum/questions/', nil, 'tag=networking&order=newest', 'top']
# URI.for('https', *values)
# # => #<URI::HTTPS https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top>
# URI.for('foo', *values, default: URI::HTTP)
# # => #<URI::HTTP foo://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top>
#
def self.for(scheme, *arguments, default: Generic)
const_name = scheme.to_s.upcase
@@ -121,73 +149,37 @@ class InvalidComponentError < Error; end
#
class BadURIError < Error; end

#
# == Synopsis
#
# URI::split(uri)
#
# == Args
#
# +uri+::
# String with URI.
#
# == Description
#
# Splits the string on following parts and returns array with result:
#
# * Scheme
# * Userinfo
# * Host
# * Port
# * Registry
# * Path
# * Opaque
# * Query
# * Fragment
#
# == Usage
#
# require 'uri'
#
# URI.split("http://www.ruby-lang.org/")
# # => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
# Returns a 9-element array representing the parts of the \URI
# formed from the string +uri+;
# each array element is a string or +nil+:
#
# names = %w[scheme userinfo host port registry path opaque query fragment]
# values = URI.split('https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top')
# names.zip(values)
# # =>
# [["scheme", "https"],
# ["userinfo", "john.doe"],
# ["host", "www.example.com"],
# ["port", "123"],
# ["registry", nil],
# ["path", "/forum/questions/"],
# ["opaque", nil],
# ["query", "tag=networking&order=newest"],
# ["fragment", "top"]]
#
def self.split(uri)
RFC3986_PARSER.split(uri)
end
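
One way to consume the 9-element array is plain destructuring. The sketch below reuses the sample URL from the comment above; the local variable names are this example's own choice. Every component comes back as a string or nil, so the port needs an explicit conversion if a number is wanted.

```ruby
require 'uri'

# Sample URL taken from the documentation comment.
url = 'https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top'

# Destructure in the documented order; registry and opaque are nil for this URL.
scheme, userinfo, host, port, registry, path, opaque, query, fragment = URI.split(url)

# Components are strings, so convert the port explicitly when an Integer is needed.
port_number = Integer(port)
```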

# Returns a new \URI object constructed from the given string +uri+:
#
# == Synopsis
#
# URI::parse(uri_str)
#
# == Args
#
# +uri_str+::
# String with URI.
#
# == Description
#
# Creates one of the URI's subclasses instance from the string.
#
# == Raises
#
# URI::InvalidURIError::
# Raised if URI given is not a correct one.
#
# == Usage
#
# require 'uri'
#
# uri = URI.parse("http://www.ruby-lang.org/")
# # => #<URI::HTTP http://www.ruby-lang.org/>
# uri.scheme
# # => "http"
# uri.host
# # => "www.ruby-lang.org"
# URI.parse('https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top')
# # => #<URI::HTTPS https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top>
# URI.parse('http://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top')
# # => #<URI::HTTP http://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top>
#
# It's recommended to first ::escape the provided +uri_str+ if there are any
# invalid URI characters.
# It's recommended to first ::escape string +uri+
# if it may contain invalid URI characters.
#
def self.parse(uri)
RFC3986_PARSER.parse(uri)
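
The older comment notes that URI::InvalidURIError is raised for a string that is not a valid URI. A minimal sketch of handling that case follows; `parse_or_nil` is a hypothetical helper written for this example and is not part of the library.

```ruby
require 'uri'

# Hypothetical helper: returns a URI object, or nil when the string cannot be parsed.
def parse_or_nil(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

good = parse_or_nil('https://www.example.com/forum/questions/?tag=networking')
# An unescaped space is not allowed in a URI, so this string is rejected.
bad = parse_or_nil('https://www.example.com/a path with spaces')
```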
@@ -314,17 +306,41 @@ def self.regexp(schemes = nil)
TBLDECWWWCOMP_['+'] = ' '
TBLDECWWWCOMP_.freeze

# Encodes given +str+ to URL-encoded form data.
# Returns a URL-encoded string derived from the given string +str+.
#
# This method doesn't convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP
# (ASCII space) to + and converts others to %XX.
# The returned string:
#
# If +enc+ is given, convert +str+ to the encoding before percent encoding.
# - Preserves:
#
# This is an implementation of
# https://www.w3.org/TR/2013/CR-html5-20130806/forms.html#url-encoded-form-data.
# - Characters <tt>'*'</tt>, <tt>'.'</tt>, <tt>'-'</tt>, and <tt>'_'</tt>.
# - Characters in ranges <tt>'a'..'z'</tt>, <tt>'A'..'Z'</tt>,
# and <tt>'0'..'9'</tt>.
#
# Example:
#
# URI.encode_www_form_component('*.-_azAZ09')
# # => "*.-_azAZ09"
#
# - Converts:
#
# - Character <tt>' '</tt> to character <tt>'+'</tt>.
# - Any other character to "percent notation";
# the percent notation for character <i>c</i> is <tt>'%%%X' % c.ord</tt>.
#
# Example:
#
# URI.encode_www_form_component('Here are some punctuation characters: ,;?:')
# # => "Here+are+some+punctuation+characters%3A+%2C%3B%3F%3A"
#
# Encoding:
#
# - If +str+ has encoding Encoding::ASCII_8BIT, argument +enc+ is ignored.
# - Otherwise +str+ is converted first to Encoding::UTF_8
# (with suitable character replacements),
# and then to encoding +enc+.
#
# In either case, the returned string has forced encoding Encoding::US_ASCII.
#
# See URI.decode_www_form_component, URI.encode_www_form.
def self.encode_www_form_component(str, enc=nil)
_encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_, str, enc)
end
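
The preserve/convert rules above can be checked with a short sketch. The input strings are this example's own. Note that a multibyte character is percent-encoded one byte at a time, so a single UTF-8 character may produce several %XX groups.

```ruby
require 'uri'

# Space becomes '+', '&' and '=' become %XX, and '*', '.', '-', '_' are preserved.
encoded = URI.encode_www_form_component('a b&c=d*.-_')

# U+00E9 is two bytes in UTF-8 (C3 A9), so it yields two percent groups.
encoded_utf8 = URI.encode_www_form_component("\u00E9")

# URI.decode_www_form_component reverses the transformation.
decoded = URI.decode_www_form_component(encoded)
```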
@@ -372,33 +388,98 @@ def self._decode_uri_component(regexp, str, enc)
end
private_class_method :_decode_uri_component

# Generates URL-encoded form data from given +enum+.
# Returns a URL-encoded string derived from the given
# {Enumerable}[https://docs.ruby-lang.org/en/master/Enumerable.html#module-Enumerable-label-Enumerable+in+Ruby+Classes]
# +enum+.
#
# This generates application/x-www-form-urlencoded data defined in HTML5
# from given an Enumerable object.
# The result is suitable for use as form data
# for an \HTTP request whose <tt>Content-Type</tt> is
# <tt>'application/x-www-form-urlencoded'</tt>.
#
# This internally uses URI.encode_www_form_component(str).
# The returned string consists of the elements of +enum+,
# each converted to one or more URL-encoded strings,
# and all joined with character <tt>'&'</tt>.
#
# This method doesn't convert the encoding of given items, so convert them
# before calling this method if you want to send data as other than original
# encoding or mixed encoding data. (Strings which are encoded in an HTML5
# ASCII incompatible encoding are converted to UTF-8.)
# Simple examples:
#
# This method doesn't handle files. When you send a file, use
# multipart/form-data.
# URI.encode_www_form([['foo', 0], ['bar', 1], ['baz', 2]])
# # => "foo=0&bar=1&baz=2"
# URI.encode_www_form({foo: 0, bar: 1, baz: 2})
# # => "foo=0&bar=1&baz=2"
#
# This refers https://url.spec.whatwg.org/#concept-urlencoded-serializer
# When +enum+ is Array-like, each element +ele+ is converted to a field:
#
# URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
# #=> "q=ruby&lang=en"
# URI.encode_www_form("q" => "ruby", "lang" => "en")
# #=> "q=ruby&lang=en"
# URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
# #=> "q=ruby&q=perl&lang=en"
# URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
# #=> "q=ruby&q=perl&lang=en"
# - If +ele+ is an array of two or more elements,
# the field is formed from its first two elements
# (and any additional elements are ignored):
#
# name = URI.encode_www_form_component(ele[0], enc)
# value = URI.encode_www_form_component(ele[1], enc)
# "#{name}=#{value}"
#
# Examples:
#
# URI.encode_www_form([%w[foo bar], %w[baz bat bah]])
# # => "foo=bar&baz=bat"
# URI.encode_www_form([['foo', 0], ['bar', :baz, 'bat']])
# # => "foo=0&bar=baz"
#
# - If +ele+ is an array of one element,
# the field is formed from <tt>ele[0]</tt>:
#
# URI.encode_www_form_component(ele[0])
#
# Example:
#
# URI.encode_www_form([['foo'], [:bar], [0]])
# # => "foo&bar&0"
#
# - Otherwise the field is formed from +ele+:
#
# URI.encode_www_form_component(ele)
#
# Example:
#
# URI.encode_www_form(['foo', :bar, 0])
# # => "foo&bar&0"
#
# The elements of an Array-like +enum+ may be a mixture:
#
# URI.encode_www_form([['foo', 0], ['bar', 1, 2], ['baz'], :bat])
# # => "foo=0&bar=1&baz&bat"
#
# When +enum+ is Hash-like,
# each +key+/+value+ pair is converted to one or more fields:
#
# - If +value+ is
# {Array-convertible}[https://docs.ruby-lang.org/en/master/implicit_conversion_rdoc.html#label-Array-Convertible+Objects],
# each element +ele+ in +value+ is paired with +key+ to form a field:
#
# name = URI.encode_www_form_component(key, enc)
# value = URI.encode_www_form_component(ele, enc)
# "#{name}=#{value}"
#
# Example:
#
# URI.encode_www_form({foo: [:bar, 1], baz: [:bat, :bam, 2]})
# # => "foo=bar&foo=1&baz=bat&baz=bam&baz=2"
#
# - Otherwise, +key+ and +value+ are paired to form a field:
#
# name = URI.encode_www_form_component(key, enc)
# value = URI.encode_www_form_component(value, enc)
# "#{name}=#{value}"
#
# Example:
#
# URI.encode_www_form({foo: 0, bar: 1, baz: 2})
# # => "foo=0&bar=1&baz=2"
#
# The elements of a Hash-like +enum+ may be a mixture:
#
# URI.encode_www_form({foo: [0, 1], bar: 2})
# # => "foo=0&foo=1&bar=2"
#
# See URI.encode_www_form_component, URI.decode_www_form.
def self.encode_www_form(enum, enc=nil)
enum.map do |k,v|
if v.nil?
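
To tie the Hash-like rules together, here is a small round-trip sketch through URI.decode_www_form, the counterpart named in the "See" line. The parameter names and values are made up for this example.

```ruby
require 'uri'

# An array value expands into one field per element; other values give one field each.
params = { 'q' => ['ruby', 'perl'], 'lang' => 'en', 'note' => 'a b&c' }
query = URI.encode_www_form(params)

# Decoding returns an array of [name, value] pairs in field order.
pairs = URI.decode_www_form(query)
```

The decoded result is a list of pairs rather than a hash, because a name such as `q` may legitimately appear more than once.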