Description
A new ingest processor, hash, is near completion (see #31087). As is, it is an excellent addition to the set of processors, but I would like to continue the discussion started here on whether we should require key-based hashing.
The current implementation requires a key for the hash function, and that key must exist in the keystore. For the key to exist in the keystore, a human has to add it.
This step could prove to be a barrier to adoption, since the processor only works with configuration that lives outside the ingest pipeline configuration. A user wishing to use the new hash processor would need to coordinate with whoever is responsible for the Elasticsearch deployment (not always the same person or org) to ensure that the key exists. This extra step may also be cumbersome when used with Beats modules, and it could prove even more challenging on cloud (disclaimer: I am not familiar with how the keystore and cloud interop).
@jkakavas makes some excellent points about why a keyed hash is better, but I would like to consider removing the requirement to use keys for hashes.
This is a classic case of usability vs. security for the default values.
While I agree with @jkakavas that keyed hashes are better, I believe that non-keyed modern cryptographic hashes are sufficient for the default case. IMO, allowing non-keyed hashes would remove that barrier to adoption without introducing too much risk. Documentation can encourage the user to provide a key for the hash.
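To make the trade-off concrete, here is a minimal sketch of the two options in Python. It is illustrative only and is not the processor's implementation; the function names `unkeyed_hash` and `keyed_hash` are hypothetical, and SHA-256 and HMAC-SHA256 are assumed as stand-ins for whichever algorithms the processor ends up supporting.

```python
import hashlib
import hmac


def unkeyed_hash(value: str) -> str:
    """Plain SHA-256 digest: needs no configuration outside the pipeline."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256 digest: the output depends on a secret key as well as the value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    field = "user@example.com"
    # Anyone can recompute the unkeyed digest for a guessed input.
    print(unkeyed_hash(field))
    # Recomputing the keyed digest requires knowing the key.
    print(keyed_hash(field, b"hypothetical-secret-key"))
```

With the unkeyed variant, anyone who can guess a likely input (an email address, an IP) can recompute the digest and confirm the guess. The keyed variant prevents that as long as the key stays secret, which is the extra protection being weighed against the keystore setup step.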
Also in the original thread @jkakavas mentions:
Another alternative would be the use of generated keys for HMACs if the user doesn't set one in the settings ( similar to what Kibana does for its auth cookie encryption key )
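A rough sketch of that fallback in Python, under assumptions of my own: `resolve_key` and `hash_field` are hypothetical names, the 32-byte key length is arbitrary, and nothing here reflects how Elasticsearch or Kibana actually generate or store keys.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def resolve_key(configured_key: Optional[bytes]) -> bytes:
    """Return the user-supplied key if present, otherwise generate a random one."""
    if configured_key:
        return configured_key
    # Assumption: 32 random bytes from the OS CSPRNG as the generated key.
    return secrets.token_bytes(32)


def hash_field(value: str, key: bytes) -> str:
    """HMAC-SHA256 of the field value under the resolved key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = resolve_key(None)  # no key configured, so one is generated
    print(hash_field("user@example.com", key))
```

One caveat with this approach: a generated key would have to be persisted and shared by every node that runs the pipeline. Otherwise the same input would hash differently across nodes or after a restart.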
Thoughts?