Description
First of all, thanks for your work.
I am having a problem related to this project. I have thought about it, worked on it and researched it at length before opening an issue, and I have tried to explain it properly.
In my network, requests reach the public internet through a proxy.
In my project I have a dependency on grunt-contrib-imagemin that, through several levels of transitive dependencies, depends on the gifsicle, jpegtran-bin and optipng-bin packages.
I resolve those 3 dependencies successfully with npm install, but each of them has its own postinstall script calling node lib/install.js, which ultimately uses tunnel-agent (version "0.4.3") to download some binary files (the most recent version of tunnel-agent also has this problem).
When that script runs, all 3 of them finish as shown below:
‼ tunneling socket could not be established, statusCode=504
‼ gifsicle pre-build test failed
So the proxy is returning a 504 error (please note that the dependencies themselves had been downloaded successfully through the proxy).
I have debugged tunnel-agent and here are my conclusions.
As I am behind an HTTP proxy and the requested files are hosted on an HTTPS server, tunnel-agent uses the function httpsOverHttp. In that function a CONNECT request is created to ask the proxy to open a tunnel to the origin server, over which the TLS connection is established and the file is fetched.
I captured the request with Wireshark and it is as follows:
CONNECT <origin server domain>:<origin server port> HTTP/1.1
Host: <proxy IP>:<proxy port>
...
As shown above, I am getting a 504 error from the proxy saying "The requested URL couldn't be resolved". So I tested the same request with curl, using curl -v <origin server domain>:<origin server port>/file, and curl succeeded in creating the tunnel and getting the file. I captured that request with Wireshark and it was the following:
CONNECT <origin server domain>:<origin server port> HTTP/1.1
Host: <origin server domain>:<origin server port>
...
Note that curl also writes the origin server domain and port in the Host header.
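To make the difference concrete, here is a minimal sketch of the request head that curl sends, according to the capture above. The function name buildConnectRequestHead and the example host are only illustrative and are not part of tunnel-agent or curl.

```javascript
// Hypothetical helper: builds the head of a CONNECT request the way the curl
// capture shows it, with the origin server's host:port in both the request
// line and the Host header.
function buildConnectRequestHead(originHost, originPort) {
  const authority = `${originHost}:${originPort}`;
  return [
    `CONNECT ${authority} HTTP/1.1`,
    `Host: ${authority}`,
    '', // blank line that terminates the header section
    ''
  ].join('\r\n');
}

// Example with an assumed origin server:
// buildConnectRequestHead('example.com', 443)
```

In the failing tunnel-agent capture, only the Host line differs: it carries the proxy IP and port instead of the origin authority.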
To confirm my theory that tunnel-agent would be fixed by doing the same with the Host header, I modified the file /tunnel-agent/index.js, in the function TunnelingAgent.prototype.createSocket, adding the following just before the request is executed (that is, before the line var connectReq = self.request(connectOptions)):
connectOptions.headers = connectOptions.headers || {};
connectOptions.headers['Host'] = options.host + ':' + options.port;
And it worked. The tunnel is created and the file is downloaded.
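If it helps the discussion, the same change could be written as a standalone function that leaves a Host header already set by the caller untouched. This is only a sketch: withOriginHostHeader is a hypothetical name, and I am assuming connectOptions and options have the shape used in the two lines above (options.host and options.port being the origin server).

```javascript
// Hypothetical helper, not part of tunnel-agent: returns a copy of
// connectOptions whose Host header is the origin host:port, unless a Host
// header (in any letter case) is already present.
function withOriginHostHeader(connectOptions, options) {
  const headers = Object.assign({}, connectOptions.headers);
  const hasHost = Object.keys(headers).some(
    (name) => name.toLowerCase() === 'host'
  );
  if (!hasHost) {
    headers.Host = options.host + ':' + options.port;
  }
  return Object.assign({}, connectOptions, { headers: headers });
}
```

Returning a copy instead of mutating connectOptions is just a choice for the sketch; the in-place version above behaves the same for the CONNECT request.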
I have checked the HTTP RFC and also some documentation references such as MDN to find out whether the Host header should contain the origin host and port, as curl does.
The RFC does not clearly state what the value of the Host header should be, but all the examples in the RFC and in the MDN documentation show the Host header equal to the HTTP request target, that is, the origin server host and port. Maybe it is explained in some other part of the RFC that I have not found.
In addition, from the Wireshark capture I understood that the proxy host and port are only needed to open the TCP connection that precedes the HTTP exchange, so they are not needed in the HTTP request itself.
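As an illustration of that separation, this is roughly how I would expect the options for Node's http.request to look when opening the tunnel by hand. It is a sketch under my own assumptions: buildConnectOptions is a hypothetical helper, and the proxy and origin addresses in the usage comment are placeholders.

```javascript
// Hypothetical helper: options for http.request() to open a CONNECT tunnel.
// proxy  = { host, port } is used only to open the TCP connection.
// target = { host, port } is what the HTTP request itself names.
function buildConnectOptions(proxy, target) {
  const authority = `${target.host}:${target.port}`;
  return {
    host: proxy.host, // the TCP connection goes to the proxy
    port: proxy.port,
    method: 'CONNECT',
    path: authority, // request line: CONNECT <authority> HTTP/1.1
    headers: { Host: authority } // origin authority, as in the curl capture
  };
}

// Usage sketch (not executed here, it needs a reachable proxy):
// const http = require('http');
// const req = http.request(buildConnectOptions(
//   { host: '10.0.0.1', port: 3128 },
//   { host: 'example.com', port: 443 }
// ));
// req.on('connect', (res, socket, head) => { /* start TLS over socket */ });
// req.end();
```

The proxy address appears only in host and port, which decide where the socket connects, while the request line and the Host header both name the origin server.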
Do you think I am right?
Do you think I am wrong?
Do you have any official documentation that explains why what tunnel-agent is doing with the Host header is fine?
To me it is clear that it should behave as curl does.
If you think this makes sense, could you fix it?
I hope you can check it and agree, as this is a hard problem when working behind the proxy, and I always need to be behind the proxy because we also have dependencies that must be downloaded from an internal registry inside our network.
Thanks!