-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace normalizeURL with urlx.Parse #1217
base: master
Are you sure you want to change the base?
Conversation
normalizeURL used to break URLs with path by rewriting their host part and it would also panic on "IP:Port" strings.
Do you have an example / test-case that would show the URLs that would break? This is adding 8K lines of code (as dependency) to parse an URL, which seems a lot for just this, so wondering if the problem could be solved in another way. |
func normalizeURL(addr string) (*url.URL, error) { | ||
addr = strings.TrimSpace(addr) | ||
|
||
u, err := url.Parse(addr) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
if u.Opaque != "" { | ||
u.Host = net.JoinHostPort(u.Scheme, u.Opaque) | ||
u.Opaque = "" | ||
} else if u.Path != "" && !strings.Contains(u.Path, ":") { | ||
u.Host = net.JoinHostPort(u.Path, "8888") | ||
u.Path = "" | ||
} else if u.Scheme == "" { | ||
u.Host = u.Path | ||
u.Path = "" | ||
} | ||
|
||
if u.Scheme != "https" { | ||
u.Scheme = "http" | ||
} | ||
|
||
_, port, err := net.SplitHostPort(u.Host) | ||
if err != nil { | ||
_, port, err = net.SplitHostPort(u.Host + ":8888") | ||
if err != nil { | ||
return nil, err | ||
} | ||
} | ||
|
||
if port != "" { | ||
_, err = strconv.Atoi(port) | ||
if err != nil { | ||
return nil, err | ||
} | ||
} | ||
return u, nil | ||
return urlx.Parse(addr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a quick glance, this would make the tests pass (without needing the new dependency);
func normalizeURL(addr string) (*url.URL, error) { | |
addr = strings.TrimSpace(addr) | |
u, err := url.Parse(addr) | |
if err != nil { | |
return nil, err | |
} | |
if u.Opaque != "" { | |
u.Host = net.JoinHostPort(u.Scheme, u.Opaque) | |
u.Opaque = "" | |
} else if u.Path != "" && !strings.Contains(u.Path, ":") { | |
u.Host = net.JoinHostPort(u.Path, "8888") | |
u.Path = "" | |
} else if u.Scheme == "" { | |
u.Host = u.Path | |
u.Path = "" | |
} | |
if u.Scheme != "https" { | |
u.Scheme = "http" | |
} | |
_, port, err := net.SplitHostPort(u.Host) | |
if err != nil { | |
_, port, err = net.SplitHostPort(u.Host + ":8888") | |
if err != nil { | |
return nil, err | |
} | |
} | |
if port != "" { | |
_, err = strconv.Atoi(port) | |
if err != nil { | |
return nil, err | |
} | |
} | |
return u, nil | |
return urlx.Parse(addr) | |
} | |
// normalizeURL trims spaces and appends "http" as default protocol if not defined in url | |
func normalizeURL(addr string) (*url.URL, error) { | |
addr = strings.TrimSpace(addr) | |
if strings.Index(addr, "://") == -1 { | |
// use http:// if no scheme is provided | |
addr = "http://" + addr | |
} | |
u, err := url.Parse(addr) | |
if err != nil { | |
return nil, err | |
} | |
_, port, err := net.SplitHostPort(u.Host) | |
if err != nil { | |
var err2 error | |
_, port, err2 = net.SplitHostPort(u.Host + ":8888") | |
if err2 != nil { | |
return nil, err | |
} | |
} | |
if port != "" { | |
v, err := strconv.Atoi(port) | |
if err != nil { | |
return nil, fmt.Errorf("invalid port: %w", err) | |
} | |
if v < 0 || v > 65535 { | |
return nil, fmt.Errorf("invalid port %q", port) | |
} | |
} | |
return u, nil | |
} |
Given this was some time ago I'm no longer sure but I strongly suppose I put them in the tests. |
Thanks! I think the reason The code I tried above avoids that situation by checking if a scheme is present and if not, to prepend the default scheme; if strings.Index(addr, "://") == -1 {
// use http:// if no scheme is provided
addr = "http://" + addr
} Of course it's possible that urlx handles more esoteric cases besides those, but (and I'm not a maintainer, just a passer-by! 😅) I wonder if that's enough to warrant the larger number of lines imported from the additional dependency. |
normalizeURL used to break URLs with path by rewriting their host part and it would also panic on "IP:Port" strings.
I've refactored that to utilize urlx.Parse and also added a couple of test cases.