-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Introduce a Hashing Processor #31087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
0e77233
Initial commit for HashProcessor
talevy c6f9ff6
Merge branch 'master' into hash-processor
talevy 80ca1ee
add rest test, and add two params for salt & ignore_missing
talevy 514664e
Merge branch 'master' into hash-processor
talevy 1676aa9
fix key resolution
talevy cad0dc5
make salt required
talevy a0b411c
Merge branch 'master' into hash-processor
talevy 38f9141
Fix merge master test
talevy 8e44e06
Merge branch 'master' into hash-processor
talevy 278f235
remove extra lines
talevy File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
200 changes: 200 additions & 0 deletions
200
.../plugin/security/src/main/java/org/elasticsearch/xpack/security/ingest/HashProcessor.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,200 @@ | ||
/* | ||
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
* or more contributor license agreements. Licensed under the Elastic License; | ||
* you may not use this file except in compliance with the Elastic License. | ||
*/ | ||
package org.elasticsearch.xpack.security.ingest; | ||
|
||
import org.elasticsearch.ElasticsearchException; | ||
import org.elasticsearch.common.Nullable; | ||
import org.elasticsearch.common.Strings; | ||
import org.elasticsearch.common.collect.Tuple; | ||
import org.elasticsearch.common.settings.SecureSetting; | ||
import org.elasticsearch.common.settings.SecureString; | ||
import org.elasticsearch.common.settings.Setting; | ||
import org.elasticsearch.common.settings.Settings; | ||
import org.elasticsearch.ingest.AbstractProcessor; | ||
import org.elasticsearch.ingest.ConfigurationUtils; | ||
import org.elasticsearch.ingest.IngestDocument; | ||
import org.elasticsearch.ingest.Processor; | ||
import org.elasticsearch.xpack.core.security.SecurityField; | ||
|
||
import javax.crypto.Mac; | ||
import javax.crypto.SecretKeyFactory; | ||
import javax.crypto.spec.PBEKeySpec; | ||
import javax.crypto.spec.SecretKeySpec; | ||
import java.nio.charset.StandardCharsets; | ||
import java.security.InvalidKeyException; | ||
import java.security.NoSuchAlgorithmException; | ||
import java.security.spec.InvalidKeySpecException; | ||
import java.util.Arrays; | ||
import java.util.Base64; | ||
import java.util.HashMap; | ||
import java.util.List; | ||
import java.util.Locale; | ||
import java.util.Map; | ||
import java.util.Objects; | ||
import java.util.stream.Collectors; | ||
|
||
import static org.elasticsearch.ingest.ConfigurationUtils.newConfigurationException; | ||
|
||
/** | ||
* A processor that hashes the contents of a field (or fields) using various hashing algorithms | ||
*/ | ||
public final class HashProcessor extends AbstractProcessor { | ||
public static final String TYPE = "hash"; | ||
public static final Setting.AffixSetting<SecureString> HMAC_KEY_SETTING = SecureSetting | ||
.affixKeySetting(SecurityField.setting("ingest." + TYPE) + ".", "key", | ||
(key) -> SecureSetting.secureString(key, null)); | ||
|
||
private final List<String> fields; | ||
private final String targetField; | ||
private final Method method; | ||
private final Mac mac; | ||
private final byte[] salt; | ||
private final boolean ignoreMissing; | ||
|
||
HashProcessor(String tag, List<String> fields, String targetField, byte[] salt, Method method, @Nullable Mac mac, | ||
boolean ignoreMissing) { | ||
super(tag); | ||
this.fields = fields; | ||
this.targetField = targetField; | ||
this.method = method; | ||
this.mac = mac; | ||
this.salt = salt; | ||
this.ignoreMissing = ignoreMissing; | ||
} | ||
|
||
List<String> getFields() { | ||
return fields; | ||
} | ||
|
||
String getTargetField() { | ||
return targetField; | ||
} | ||
|
||
byte[] getSalt() { | ||
return salt; | ||
} | ||
|
||
@Override | ||
public void execute(IngestDocument document) { | ||
Map<String, String> hashedFieldValues = fields.stream().map(f -> { | ||
String value = document.getFieldValue(f, String.class, ignoreMissing); | ||
if (value == null && ignoreMissing) { | ||
return new Tuple<String, String>(null, null); | ||
} | ||
try { | ||
return new Tuple<>(f, method.hash(mac, salt, value)); | ||
} catch (Exception e) { | ||
throw new IllegalArgumentException("field[" + f + "] could not be hashed", e); | ||
} | ||
}).filter(tuple -> Objects.nonNull(tuple.v1())).collect(Collectors.toMap(Tuple::v1, Tuple::v2)); | ||
if (fields.size() == 1) { | ||
document.setFieldValue(targetField, hashedFieldValues.values().iterator().next()); | ||
} else { | ||
document.setFieldValue(targetField, hashedFieldValues); | ||
} | ||
} | ||
|
||
@Override | ||
public String getType() { | ||
return TYPE; | ||
} | ||
|
||
public static final class Factory implements Processor.Factory { | ||
|
||
private final Settings settings; | ||
private final Map<String, SecureString> secureKeys; | ||
|
||
public Factory(Settings settings) { | ||
this.settings = settings; | ||
this.secureKeys = new HashMap<>(); | ||
HMAC_KEY_SETTING.getAllConcreteSettings(settings).forEach(k -> { | ||
secureKeys.put(k.getKey(), k.get(settings)); | ||
}); | ||
} | ||
|
||
private static Mac createMac(Method method, SecureString password, byte[] salt, int iterations) { | ||
try { | ||
SecretKeyFactory secretKeyFactory = SecretKeyFactory.getInstance("PBKDF2With" + method.getAlgorithm()); | ||
PBEKeySpec keySpec = new PBEKeySpec(password.getChars(), salt, iterations, 128); | ||
byte[] pbkdf2 = secretKeyFactory.generateSecret(keySpec).getEncoded(); | ||
Mac mac = Mac.getInstance(method.getAlgorithm()); | ||
mac.init(new SecretKeySpec(pbkdf2, method.getAlgorithm())); | ||
return mac; | ||
} catch (NoSuchAlgorithmException | InvalidKeySpecException | InvalidKeyException e) { | ||
throw new IllegalArgumentException("invalid settings", e); | ||
} | ||
} | ||
|
||
@Override | ||
public HashProcessor create(Map<String, Processor.Factory> registry, String processorTag, Map<String, Object> config) { | ||
boolean ignoreMissing = ConfigurationUtils.readBooleanProperty(TYPE, processorTag, config, "ignore_missing", false); | ||
List<String> fields = ConfigurationUtils.readList(TYPE, processorTag, config, "fields"); | ||
if (fields.isEmpty()) { | ||
throw ConfigurationUtils.newConfigurationException(TYPE, processorTag, "fields", "must specify at least one field"); | ||
} else if (fields.stream().anyMatch(Strings::isNullOrEmpty)) { | ||
throw ConfigurationUtils.newConfigurationException(TYPE, processorTag, "fields", | ||
"a field-name entry is either empty or null"); | ||
} | ||
String targetField = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "target_field"); | ||
String keySettingName = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "key_setting"); | ||
SecureString key = secureKeys.get(keySettingName); | ||
if (key == null) { | ||
throw ConfigurationUtils.newConfigurationException(TYPE, processorTag, "key_setting", | ||
"key [" + keySettingName + "] must match [xpack.security.ingest.hash.*.key]. It is not set"); | ||
} | ||
String saltString = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "salt"); | ||
byte[] salt = saltString.getBytes(StandardCharsets.UTF_8); | ||
String methodProperty = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "method", "SHA256"); | ||
Method method = Method.fromString(processorTag, "method", methodProperty); | ||
int iterations = ConfigurationUtils.readIntProperty(TYPE, processorTag, config, "iterations", 5); | ||
Mac mac = createMac(method, key, salt, iterations); | ||
return new HashProcessor(processorTag, fields, targetField, salt, method, mac, ignoreMissing); | ||
} | ||
} | ||
|
||
enum Method { | ||
SHA1("HmacSHA1"), | ||
SHA256("HmacSHA256"), | ||
SHA384("HmacSHA384"), | ||
SHA512("HmacSHA512"); | ||
|
||
private final String algorithm; | ||
|
||
Method(String algorithm) { | ||
this.algorithm = algorithm; | ||
} | ||
|
||
public String getAlgorithm() { | ||
return algorithm; | ||
} | ||
|
||
@Override | ||
public String toString() { | ||
return name().toLowerCase(Locale.ROOT); | ||
} | ||
|
||
public String hash(Mac mac, byte[] salt, String input) { | ||
try { | ||
byte[] encrypted = mac.doFinal(input.getBytes(StandardCharsets.UTF_8)); | ||
byte[] messageWithSalt = new byte[salt.length + encrypted.length]; | ||
System.arraycopy(salt, 0, messageWithSalt, 0, salt.length); | ||
System.arraycopy(encrypted, 0, messageWithSalt, salt.length, encrypted.length); | ||
return Base64.getEncoder().encodeToString(messageWithSalt); | ||
} catch (IllegalStateException e) { | ||
throw new ElasticsearchException("error hashing data", e); | ||
} | ||
} | ||
|
||
public static Method fromString(String processorTag, String propertyName, String type) { | ||
try { | ||
return Method.valueOf(type.toUpperCase(Locale.ROOT)); | ||
} catch(IllegalArgumentException e) { | ||
throw newConfigurationException(TYPE, processorTag, propertyName, "type [" + type + | ||
"] not supported, cannot convert field. Valid hash methods: " + Arrays.toString(Method.values())); | ||
} | ||
} | ||
} | ||
} |
136 changes: 136 additions & 0 deletions
136
...rity/src/test/java/org/elasticsearch/xpack/security/ingest/HashProcessorFactoryTests.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,136 @@ | ||
/* | ||
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
* or more contributor license agreements. Licensed under the Elastic License; | ||
* you may not use this file except in compliance with the Elastic License. | ||
*/ | ||
package org.elasticsearch.xpack.security.ingest; | ||
|
||
import org.elasticsearch.ElasticsearchException; | ||
import org.elasticsearch.common.settings.MockSecureSettings; | ||
import org.elasticsearch.common.settings.Settings; | ||
import org.elasticsearch.test.ESTestCase; | ||
|
||
import java.nio.charset.StandardCharsets; | ||
import java.util.Collections; | ||
import java.util.HashMap; | ||
import java.util.Map; | ||
|
||
import static org.hamcrest.Matchers.equalTo; | ||
|
||
public class HashProcessorFactoryTests extends ESTestCase { | ||
|
||
public void testProcessor() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList("_field")); | ||
config.put("target_field", "_target"); | ||
config.put("salt", "_salt"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
for (HashProcessor.Method method : HashProcessor.Method.values()) { | ||
config.put("method", method.toString()); | ||
HashProcessor processor = factory.create(null, "_tag", new HashMap<>(config)); | ||
assertThat(processor.getFields(), equalTo(Collections.singletonList("_field"))); | ||
assertThat(processor.getTargetField(), equalTo("_target")); | ||
assertArrayEquals(processor.getSalt(), "_salt".getBytes(StandardCharsets.UTF_8)); | ||
} | ||
} | ||
|
||
public void testProcessorNoFields() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("target_field", "_target"); | ||
config.put("salt", "_salt"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
config.put("method", HashProcessor.Method.SHA1.toString()); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(e.getMessage(), equalTo("[fields] required property is missing")); | ||
} | ||
|
||
public void testProcessorNoTargetField() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList("_field")); | ||
config.put("salt", "_salt"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
config.put("method", HashProcessor.Method.SHA1.toString()); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(e.getMessage(), equalTo("[target_field] required property is missing")); | ||
} | ||
|
||
public void testProcessorFieldsIsEmpty() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList(randomBoolean() ? "" : null)); | ||
config.put("salt", "_salt"); | ||
config.put("target_field", "_target"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
config.put("method", HashProcessor.Method.SHA1.toString()); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(e.getMessage(), equalTo("[fields] a field-name entry is either empty or null")); | ||
} | ||
|
||
public void testProcessorMissingSalt() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList("_field")); | ||
config.put("target_field", "_target"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(e.getMessage(), equalTo("[salt] required property is missing")); | ||
} | ||
|
||
public void testProcessorInvalidMethod() { | ||
MockSecureSettings mockSecureSettings = new MockSecureSettings(); | ||
mockSecureSettings.setString("xpack.security.ingest.hash.processor.key", "my_key"); | ||
Settings settings = Settings.builder().setSecureSettings(mockSecureSettings).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList("_field")); | ||
config.put("salt", "_salt"); | ||
config.put("target_field", "_target"); | ||
config.put("key_setting", "xpack.security.ingest.hash.processor.key"); | ||
config.put("method", "invalid"); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(e.getMessage(), equalTo("[method] type [invalid] not supported, cannot convert field. " + | ||
"Valid hash methods: [sha1, sha256, sha384, sha512]")); | ||
} | ||
|
||
public void testProcessorInvalidOrMissingKeySetting() { | ||
Settings settings = Settings.builder().setSecureSettings(new MockSecureSettings()).build(); | ||
HashProcessor.Factory factory = new HashProcessor.Factory(settings); | ||
Map<String, Object> config = new HashMap<>(); | ||
config.put("fields", Collections.singletonList("_field")); | ||
config.put("salt", "_salt"); | ||
config.put("target_field", "_target"); | ||
config.put("key_setting", "invalid"); | ||
config.put("method", HashProcessor.Method.SHA1.toString()); | ||
ElasticsearchException e = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", new HashMap<>(config))); | ||
assertThat(e.getMessage(), | ||
equalTo("[key_setting] key [invalid] must match [xpack.security.ingest.hash.*.key]. It is not set")); | ||
config.remove("key_setting"); | ||
ElasticsearchException ex = expectThrows(ElasticsearchException.class, | ||
() -> factory.create(null, "_tag", config)); | ||
assertThat(ex.getMessage(), equalTo("[key_setting] required property is missing")); | ||
} | ||
} |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to require a user provided key ? If I understand correctly, it really only protects against rainbow table attack when used in this context, but with the salt and potentially hard coded default key it seems we should be covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that this key is required to be private. Is there a way to set a default in a private way?
I am not aware of a way that users can retrieve this key on their own from the keystore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, i don't think i was very clear.
I meant the hashing algorithm here is an HMAC (a keyed hash). For the anonymization use case we could use a non keyed Hash to the same effect. The key doesn't provide much value unless you are verifying the hash's integrity.
To avoid swapping implementations based on a key's existence, a hard coded key can be used, which effectively (but not technically) changes from a keyed hash to non keyed hash.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel comfortable making that call, so I went with the more strict approach. @jkakavas, do you know if this approach would also be sufficient for some use cases like GDPR?
I would say leave it as is for now, and if we decide to give users that option, then we can simply allow for this setting to be
null
in later versions while preserving bwcUh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I disagree with this statement. Salted (and not keyed) hashes offer protection against precomputed hashes (rainbow tables) and using slow algorithms offers protection against brute force attacks. However, the use cases of pseudonymization/anonymization for hashing processor also include processing short length, non secret data.
We do want to protect against linking and correlating such compromised/leaked pseudonymous/anonymous data and keyed hash functions also help us mitigate this (assuming the key is secret of course)
Let's assume that two companies used the hashing processor to anynomize usernames of existing users and they both use the same key, so falling back to a salted hash processor (I'll use SHA1 for simplicity)
Company A would store SHA1(ioannis+salt1) :
[salt1$b68e23bb10067b6a816786783a8d530f39b0f19b]
Company B would store SHA1(ioannis+salt2) :
[salt2$cd428522855de2eaabce53084acb64c0df847b73]
Now if both data sets leaked and since salts are not supposed to be secret and we store them along with the hashes, if I would want to see if I am a user in both Companies, I would calculate Hash(ioannis+saltX) for all the salt values I could find in leaked data and see if I get matches in both data sets. No need to brute force the entire dataset, no need to use rainbow tables, just to see If I can get a match.
Finally GDPR defines pseudonymization as (emphasis mine)
If an HMAC is not used, then there is no key (additional information that is kept separately) and hence this is not pseudonymization, at least according to GDPR.
edit: I wrote this comment before going through the PR in detail. We do store the salt separately(in the config) but not securely so that would imply that it violates the
parameter.
All that said, the Hashing Processor is not GDPR specific and I think there is value in offering non keyed hash functions for other use cases. I would prefer if we make that explicit though by allowing the use of such algorithms instead of HMACs with hard coded keys.
Another alternative would be the use of generated keys for HMACs if the user doesn't set one in the settings ( similar to what Kibana does for its auth cookie encryption key )
WDYT ?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkakavas - thanks for the thoughtful reply. I don't want to derail this PR and will open an issue for further discussion.
EDIT: Issue logged: #31692
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @jkakavas! I will leave as is and let the #31692 govern how it changes in the future