Description
Version
v20.0.0-nightly2023030448f99e5f6a
Platform
macOS, but will happen on any platform.
Subsystem
http, url
What steps will reproduce the bug?
// nodev20-http-break.example.js
const http = require('http')
const server = http.createServer((req, res) => {
console.log('SERVER: on request: %s %s %s', req.method, req.url, req.headers)
req.resume()
req.on('end', () => {
res.writeHead(200)
res.end('pong')
})
})
server.listen(0, '127.0.0.1', () => {
console.log('SERVER: listening on port', server.address().port)
const url = new URL(`http://127.0.0.1:${server.address().port}/ping?q=term`)
const reqOpts = {
method: 'GET',
protocol: url.protocol,
hostname: url.hostname[0] === '[' ? url.hostname.slice(1, -1) : url.hostname,
path: url.pathname + url.search,
port: Number(url.port),
href: url.href, // <--- oops
origin: url.origin, // <--- oops
headers: { // <--- this gets ignored
Foo: 'Bar'
},
}
console.log('CLIENT: make request with options:', reqOpts)
const clientReq = http.request(reqOpts)
clientReq.on('response', (clientRes) => {
console.log('CLIENT: on response: %s %s', clientRes.statusCode, clientRes.headers)
clientRes.resume()
})
clientReq.on('close', () => {
console.log('CLIENT: on close: close the server')
server.close()
})
clientReq.end()
})
How often does it reproduce? Is there a required condition?
Everytime a non-URL object is passed as first argument to http.request()
with the href
and origin
fields (accidentally), and with other non-URL-y valid input arguments.
What is the expected behavior?
That the "Foo" header is sent to the server.
What do you see instead?
When we run that, notice that the Foo
header is not received by the HTTP server:
% node --version
v20.0.0-nightly2023030448f99e5f6a
% node nodev20-http-break.example.js
SERVER: listening on port 64924
CLIENT: make request with options: {
method: 'GET',
protocol: 'http:',
hostname: '127.0.0.1',
path: '/ping?q=term',
port: 64924,
href: 'http://127.0.0.1:64924/ping?q=term',
origin: 'http://127.0.0.1:64924',
headers: { Foo: 'Bar' }
}
SERVER: on request: GET / { host: '127.0.0.1:64924', connection: 'keep-alive' }
CLIENT: on response: 200 {
date: 'Mon, 06 Mar 2023 21:11:32 GMT',
connection: 'keep-alive',
'keep-alive': 'timeout=5',
'transfer-encoding': 'chunked'
}
CLIENT: on close: close the server
With the preceding nightly node v20 build, the Foo
header is received:
% node --version
v20.0.0-nightly20230303a37b72da87
% node nodev20-http-break.example.js
SERVER: listening on port 64933
CLIENT: make request with options: {
method: 'GET',
protocol: 'http:',
hostname: '127.0.0.1',
path: '/ping?q=term',
port: 64933,
href: 'http://127.0.0.1:64933/ping?q=term',
origin: 'http://127.0.0.1:64933',
headers: { Foo: 'Bar' }
}
SERVER: on request: GET /ping?q=term { foo: 'Bar', host: '127.0.0.1:64933', connection: 'keep-alive' }
CLIENT: on response: 200 {
date: 'Mon, 06 Mar 2023 21:12:49 GMT',
connection: 'keep-alive',
'keep-alive': 'timeout=5',
'transfer-encoding': 'chunked'
}
CLIENT: on close: close the server
Additional information
#46904 introduced a change to the internal isURL
check that is used in http.request()
to determine if the first argument is a URL instance:
Line 136 in db81af6
It now considers the first argument to be a URL if it has the
href
and origin
fields.This resulted in a surprise for me.
Basically, the @elastic/elasticsearch
module is building an options object for http.request(options, ...)
that includes the href
and origin
fields. (There isn't a good reason for it to include those fields other than -- I'm guessing -- the original change just having used all the fields on a WHATWG URL object, rather than reading the input arguments specified in the node docs).
The result is that other non-URL-y fields in the options
object are ignored, like agent
and headers
(and perhaps method
).
I don't know if this would be considered a bug, but I'm opening it for discussion here. There isn't a good reason for the code above to be passing the href
and origin
fields to the http.request(...)
options. However, perhaps using just those two fields as the isURL
check for http.request
is too quick of a decision. Would it be too complex/slow to see if the given input
to ClientRequest
includes any of the other legitimate options? I don't object to closing this issue if it is expected that accidentally passing in an input object with href
and origin
will be rare.