Allow plain HTTP console access (as a non-default option) #440

anjackson · 2021-09-30T12:47:06Z

We received a report that some people are having problems because the Heritrix3 web console can only be accessed over HTTPS. In some situations, e.g. corporate IT environments, it is not possible to accept self-signed certificates or to import or permit locally minted certificate authorities.

If there are no objections, I would like to propose a new command-line option that enables access over plain HTTP. If it is not set, then users should be directed to HTTPS (as per #318). But if this option is enabled, users should be able to access the console directly over HTTP.

A further question is whether HTTP Basic authentication should always be required. Certainly, if authentication is enabled, it seems like we must enforce the use of HTTPS. But could we allow users to switch off authentication when accessing Heritrix via plain HTTP?

ato · 2021-10-01T01:54:59Z

I can't imagine many scenarios where running Heritrix on a network without HTTPS or authentication would not be irresponsible, as it enables trivial remote code execution. It would also very likely violate the security policy of said corporate IT environments.

That said, I think there are some use cases that are reasonable:

- A desktop user wanting to run Heritrix on their single-user PC and access it from a local browser.
- Running Heritrix on a server behind an authentication proxy or tunnel that provides transport encryption and externally authenticates the user with their corporate credentials (e.g. oauth2-proxy, ssh port forwarding).

If we allow this configuration, but only when the UI is bound to localhost, it would cover these more reasonable use cases without encouraging the more irresponsible one. Of course, it could be circumvented by using an unprotected reverse proxy, but that circumvention can already be done, and it at least requires someone to think about what they're doing a bit more than "oh, I just used --disable-security temporarily to quickly get started while testing but oops it's in production now".
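
To make the localhost-only rule concrete, here is a rough sketch of what such a startup check could look like. It is not taken from the Heritrix code base: the class `PlainHttpGuard`, its method names, and the idea of passing in a `plainHttpRequested` flag are all hypothetical, and the only real API used is `java.net.InetAddress`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

// Hypothetical helper, not part of Heritrix: refuses plain HTTP unless
// every address the console binds to is a loopback address.
final class PlainHttpGuard {

    /** Returns true only if every bind host resolves solely to loopback addresses. */
    static boolean allLoopback(List<String> bindHosts) {
        if (bindHosts.isEmpty()) {
            return false; // nothing configured: do not assume it is safe
        }
        for (String host : bindHosts) {
            // InetAddress treats null/empty names as loopback, so reject them explicitly.
            if (host == null || host.trim().isEmpty()) {
                return false;
            }
            try {
                for (InetAddress address : InetAddress.getAllByName(host)) {
                    if (!address.isLoopbackAddress()) {
                        return false; // e.g. 0.0.0.0 or a LAN address
                    }
                }
            } catch (UnknownHostException e) {
                return false; // unresolvable host: fail closed
            }
        }
        return true;
    }

    /** Throws if plain HTTP was requested for a console that is not loopback-only. */
    static void checkPlainHttpAllowed(boolean plainHttpRequested, List<String> bindHosts) {
        if (plainHttpRequested && !allLoopback(bindHosts)) {
            throw new IllegalArgumentException(
                "Plain HTTP is only permitted when the console is bound to localhost; got "
                + bindHosts);
        }
    }

    public static void main(String[] args) {
        // Passes silently: plain HTTP on a loopback-only binding.
        checkPlainHttpAllowed(true, List.of("127.0.0.1"));
        try {
            // Rejected: the wildcard address is reachable from the network.
            checkPlainHttpAllowed(true, List.of("0.0.0.0"));
        } catch (IllegalArgumentException e) {
            System.out.println("Refused: " + e.getMessage());
        }
    }
}
```

The check fails closed: an empty list, a blank host name, or a host that cannot be resolved all count as "not localhost", so a misconfiguration cannot accidentally enable plain HTTP on a network-facing interface.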

ato added the feature request label Oct 1, 2021
