Skip to content

new internal isURL check can result in http.request() option fields being ignored #46981

Closed
@trentm

Description

@trentm

Version

v20.0.0-nightly2023030448f99e5f6a

Platform

macOS, but will happen on any platform.

Subsystem

http, url

What steps will reproduce the bug?

// nodev20-http-break.example.js
const http = require('http')

const server = http.createServer((req, res) => {
  console.log('SERVER: on request: %s %s %s', req.method, req.url, req.headers)
  req.resume()
  req.on('end', () => {
    res.writeHead(200)
    res.end('pong')
  })
})

server.listen(0, '127.0.0.1', () => {
  console.log('SERVER: listening on port', server.address().port)

  const url = new URL(`http://127.0.0.1:${server.address().port}/ping?q=term`)
  const reqOpts = {
    method: 'GET',
    protocol: url.protocol,
    hostname: url.hostname[0] === '[' ? url.hostname.slice(1, -1) : url.hostname,
    path: url.pathname + url.search,
    port: Number(url.port),
    href: url.href,     // <--- oops
    origin: url.origin, // <--- oops
    headers: {          // <--- this gets ignored
      Foo: 'Bar'
    },
  }
  console.log('CLIENT: make request with options:', reqOpts)
  const clientReq = http.request(reqOpts)
  clientReq.on('response', (clientRes) => {
    console.log('CLIENT: on response: %s %s', clientRes.statusCode, clientRes.headers)
    clientRes.resume()
  })
  clientReq.on('close', () => {
    console.log('CLIENT: on close: close the server')
    server.close()
  })
  clientReq.end()
})

How often does it reproduce? Is there a required condition?

Everytime a non-URL object is passed as first argument to http.request() with the href and origin fields (accidentally), and with other non-URL-y valid input arguments.

What is the expected behavior?

That the "Foo" header is sent to the server.

What do you see instead?

When we run that, notice that the Foo header is not received by the HTTP server:

% node --version
v20.0.0-nightly2023030448f99e5f6a

% node nodev20-http-break.example.js
SERVER: listening on port 64924
CLIENT: make request with options: {
  method: 'GET',
  protocol: 'http:',
  hostname: '127.0.0.1',
  path: '/ping?q=term',
  port: 64924,
  href: 'http://127.0.0.1:64924/ping?q=term',
  origin: 'http://127.0.0.1:64924',
  headers: { Foo: 'Bar' }
}
SERVER: on request: GET / { host: '127.0.0.1:64924', connection: 'keep-alive' }
CLIENT: on response: 200 {
  date: 'Mon, 06 Mar 2023 21:11:32 GMT',
  connection: 'keep-alive',
  'keep-alive': 'timeout=5',
  'transfer-encoding': 'chunked'
}
CLIENT: on close: close the server

With the preceding nightly node v20 build, the Foo header is received:

% node --version
v20.0.0-nightly20230303a37b72da87

% node nodev20-http-break.example.js
SERVER: listening on port 64933
CLIENT: make request with options: {
  method: 'GET',
  protocol: 'http:',
  hostname: '127.0.0.1',
  path: '/ping?q=term',
  port: 64933,
  href: 'http://127.0.0.1:64933/ping?q=term',
  origin: 'http://127.0.0.1:64933',
  headers: { Foo: 'Bar' }
}
SERVER: on request: GET /ping?q=term { foo: 'Bar', host: '127.0.0.1:64933', connection: 'keep-alive' }
CLIENT: on response: 200 {
  date: 'Mon, 06 Mar 2023 21:12:49 GMT',
  connection: 'keep-alive',
  'keep-alive': 'timeout=5',
  'transfer-encoding': 'chunked'
}
CLIENT: on close: close the server

Additional information

#46904 introduced a change to the internal isURL check that is used in http.request() to determine if the first argument is a URL instance:

} else if (isURL(input)) {

It now considers the first argument to be a URL if it has the href and origin fields.
This resulted in a surprise for me.

Basically, the @elastic/elasticsearch module is building an options object for http.request(options, ...) that includes the href and origin fields. (There isn't a good reason for it to include those fields other than -- I'm guessing -- the original change just having used all the fields on a WHATWG URL object, rather than reading the input arguments specified in the node docs).

The result is that other non-URL-y fields in the options object are ignored, like agent and headers (and perhaps method).

I don't know if this would be considered a bug, but I'm opening it for discussion here. There isn't a good reason for the code above to be passing the href and origin fields to the http.request(...) options. However, perhaps using just those two fields as the isURL check for http.request is too quick of a decision. Would it be too complex/slow to see if the given input to ClientRequest includes any of the other legitimate options? I don't object to closing this issue if it is expected that accidentally passing in an input object with href and origin will be rare.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions