-
Notifications
You must be signed in to change notification settings - Fork 425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Sha1HashKey to CityHash128Key for generating PreparedStatement handle and metadata cache keys #717
Change Sha1HashKey to CityHash128Key for generating PreparedStatement handle and metadata cache keys #717
Changes from 2 commits
ad1fd3e
cc30eff
24cf6c5
85ef29c
16649a3
b4759e3
ee347d8
d20e5d5
8f8067e
27da9e2
d349ff5
b442832
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -53,8 +53,6 @@ | |
import javax.sql.XAConnection; | ||
|
||
import org.ietf.jgss.GSSCredential; | ||
import org.ietf.jgss.GSSException; | ||
|
||
import mssql.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap; | ||
import mssql.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.Builder; | ||
import mssql.googlecode.concurrentlinkedhashmap.EvictionListener; | ||
|
@@ -119,37 +117,31 @@ public class SQLServerConnection implements ISQLServerConnection { | |
|
||
private String originalHostNameInCertificate = null; | ||
|
||
static class Sha1HashKey { | ||
private byte[] bytes; | ||
static class CityHash128Key { | ||
private long[] segments; | ||
private int hashCode; | ||
|
||
Sha1HashKey(String sql, | ||
CityHash128Key(String sql, | ||
String parametersDefinition) { | ||
this(String.format("%s%s", sql, parametersDefinition)); | ||
} | ||
|
||
Sha1HashKey(String s) { | ||
bytes = getSha1Digest().digest(s.getBytes()); | ||
CityHash128Key(String s) { | ||
segments = CityHash.cityHash128(s.getBytes(), 0, s.length()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Performance of this can be improved, especially in the case of large SQL strings. I recommend using CityHash128Key(String s) {
byte[] bytes = byte[s.length()];
s.getBytes(0, s.length(), bytes, 0);
segments = CityHash.cityHash128(bytes, 0, s.length());
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. s.getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin) truncates the upper 8 bits of Java's internal Unicode-16 representation. Since this is SQL before inserting parameter values, it should be mostly 7-bit ASCII -- the only exception I can think of would be database names/table names/column names, at least some of which can be full Unicode. For European or other alphabetic languages, their Unicode-16 spaces tend to be 256 characters wide and the odds of this truncation causing a spurious collision seem very small -- but I'd be less confident if the language in question was, say, Chinese, that you couldn't ever have two SQLs that differed only by one ideogram in a table/column name, and the two ideograms in question unfortunately happened to have the same lower 8 bits. It's not very likely, but it's significantly less astronomically unlikely than a 128-bit hash collision. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So yes, we need to avoid character reencoding, but I think we also need to avoid throwing away hash entropy while we do so. I was wondering about something like getBytes(Charset.UTF_16BE) -- if it matches the internal representation used by String, will it skip reencoding? So maybe something like:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @RDearnaley Looking at the code of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @RDearnaley It appears only option would be a custom There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How strange -- yes, taking a look in the Java code for them, it appears that none of UTF_16BE, UTF_16 etc make use of the obvious optimization that one of them has to be a no-op. I'm wondering if someone might have already created a raw double byte encoding to speed this up -- it seems a fairly obvious performance optimization for cases like this where you care about the entropy content rather than the specific values. If not, it might make a good addition to the Java version of CityHash There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is another possible approach whose speed would be worth testing:
or just inline this since we only use it once. Old-school, I know. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Of course it would work, but getChars() is going to create an additional copy, incurring a CPU hit as well as increasing GC pressure. Regarding inlining, probably unnecessary as the JIT will inline it when it gets hot. I think a custom encoder would offer better performance, as the internal char array is passed without copy. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would the str.charAt(i) approach also be making a copy, or would that just go on the stack/in a register? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. str.charAt(i) is just a simple array access, with the value passed on the stack. If the compiler inlines the call, it could be a viable option. You'd have to find it in the output of |
||
} | ||
|
||
public boolean equals(Object obj) { | ||
if (!(obj instanceof Sha1HashKey)) | ||
if (!(obj instanceof CityHash128Key)) | ||
return false; | ||
|
||
return java.util.Arrays.equals(bytes, ((Sha1HashKey)obj).bytes); | ||
return java.util.Arrays.equals(segments, ((CityHash128Key) obj).segments); | ||
} | ||
|
||
public int hashCode() { | ||
return java.util.Arrays.hashCode(bytes); | ||
} | ||
|
||
private java.security.MessageDigest getSha1Digest() { | ||
try { | ||
return java.security.MessageDigest.getInstance("SHA-1"); | ||
} | ||
catch (final java.security.NoSuchAlgorithmException e) { | ||
// This is not theoretically possible, but we're forced to catch it anyway | ||
throw new RuntimeException(e); | ||
if (0 == hashCode) { | ||
hashCode = java.util.Arrays.hashCode(segments); | ||
} | ||
return hashCode; | ||
} | ||
} | ||
|
||
|
@@ -162,9 +154,9 @@ class PreparedStatementHandle { | |
private boolean isDirectSql; | ||
private volatile boolean evictedFromCache; | ||
private volatile boolean explicitlyDiscarded; | ||
private Sha1HashKey key; | ||
private CityHash128Key key; | ||
|
||
PreparedStatementHandle(Sha1HashKey key, int handle, boolean isDirectSql, boolean isEvictedFromCache) { | ||
PreparedStatementHandle(CityHash128Key key, int handle, boolean isDirectSql, boolean isEvictedFromCache) { | ||
this.key = key; | ||
this.handle = handle; | ||
this.isDirectSql = isDirectSql; | ||
|
@@ -200,7 +192,7 @@ int getHandle() { | |
} | ||
|
||
/** Get the cache key. */ | ||
Sha1HashKey getKey() { | ||
CityHash128Key getKey() { | ||
return key; | ||
} | ||
|
||
|
@@ -248,21 +240,21 @@ void removeReference() { | |
static final private int PARSED_SQL_CACHE_SIZE = 100; | ||
|
||
/** Cache of parsed SQL meta data */ | ||
static private ConcurrentLinkedHashMap<Sha1HashKey, ParsedSQLCacheItem> parsedSQLCache; | ||
static private ConcurrentLinkedHashMap<CityHash128Key, ParsedSQLCacheItem> parsedSQLCache; | ||
|
||
static { | ||
parsedSQLCache = new Builder<Sha1HashKey, ParsedSQLCacheItem>() | ||
parsedSQLCache = new Builder<CityHash128Key, ParsedSQLCacheItem>() | ||
.maximumWeightedCapacity(PARSED_SQL_CACHE_SIZE) | ||
.build(); | ||
} | ||
|
||
/** Get prepared statement cache entry if exists, if not parse and create a new one */ | ||
static ParsedSQLCacheItem getCachedParsedSQL(Sha1HashKey key) { | ||
static ParsedSQLCacheItem getCachedParsedSQL(CityHash128Key key) { | ||
return parsedSQLCache.get(key); | ||
} | ||
|
||
/** Parse and create a information about parsed SQL text */ | ||
static ParsedSQLCacheItem parseAndCacheSQL(Sha1HashKey key, String sql) throws SQLServerException { | ||
static ParsedSQLCacheItem parseAndCacheSQL(CityHash128Key key, String sql) throws SQLServerException { | ||
JDBCSyntaxTranslator translator = new JDBCSyntaxTranslator(); | ||
|
||
String parsedSql = translator.translate(sql); | ||
|
@@ -282,9 +274,9 @@ static ParsedSQLCacheItem parseAndCacheSQL(Sha1HashKey key, String sql) throws | |
private int statementPoolingCacheSize = DEFAULT_STATEMENT_POOLING_CACHE_SIZE; | ||
|
||
/** Cache of prepared statement handles */ | ||
private ConcurrentLinkedHashMap<Sha1HashKey, PreparedStatementHandle> preparedStatementHandleCache; | ||
private ConcurrentLinkedHashMap<CityHash128Key, PreparedStatementHandle> preparedStatementHandleCache; | ||
/** Cache of prepared statement parameter metadata */ | ||
private ConcurrentLinkedHashMap<Sha1HashKey, SQLServerParameterMetaData> parameterMetadataCache; | ||
private ConcurrentLinkedHashMap<CityHash128Key, SQLServerParameterMetaData> parameterMetadataCache; | ||
/** | ||
* Checks whether statement pooling is enabled or disabled. The default is set to true; | ||
*/ | ||
|
@@ -5808,39 +5800,39 @@ public void setStatementPoolingCacheSize(int value) { | |
* @param value | ||
*/ | ||
private void prepareCache() { | ||
preparedStatementHandleCache = new Builder<Sha1HashKey, PreparedStatementHandle>().maximumWeightedCapacity(getStatementPoolingCacheSize()) | ||
preparedStatementHandleCache = new Builder<CityHash128Key, PreparedStatementHandle>().maximumWeightedCapacity(getStatementPoolingCacheSize()) | ||
.listener(new PreparedStatementCacheEvictionListener()).build(); | ||
|
||
parameterMetadataCache = new Builder<Sha1HashKey, SQLServerParameterMetaData>().maximumWeightedCapacity(getStatementPoolingCacheSize()) | ||
parameterMetadataCache = new Builder<CityHash128Key, SQLServerParameterMetaData>().maximumWeightedCapacity(getStatementPoolingCacheSize()) | ||
.build(); | ||
} | ||
|
||
/** Get a parameter metadata cache entry if statement pooling is enabled */ | ||
final SQLServerParameterMetaData getCachedParameterMetadata(Sha1HashKey key) { | ||
final SQLServerParameterMetaData getCachedParameterMetadata(CityHash128Key key) { | ||
if(!isStatementPoolingEnabled()) | ||
return null; | ||
|
||
return parameterMetadataCache.get(key); | ||
} | ||
|
||
/** Register a parameter metadata cache entry if statement pooling is enabled */ | ||
final void registerCachedParameterMetadata(Sha1HashKey key, SQLServerParameterMetaData pmd) { | ||
final void registerCachedParameterMetadata(CityHash128Key key, SQLServerParameterMetaData pmd) { | ||
if(!isStatementPoolingEnabled() || null == pmd) | ||
return; | ||
|
||
parameterMetadataCache.put(key, pmd); | ||
} | ||
|
||
/** Get or create prepared statement handle cache entry if statement pooling is enabled */ | ||
final PreparedStatementHandle getCachedPreparedStatementHandle(Sha1HashKey key) { | ||
final PreparedStatementHandle getCachedPreparedStatementHandle(CityHash128Key key) { | ||
if(!isStatementPoolingEnabled()) | ||
return null; | ||
|
||
return preparedStatementHandleCache.get(key); | ||
} | ||
|
||
/** Get or create prepared statement handle cache entry if statement pooling is enabled */ | ||
final PreparedStatementHandle registerCachedPreparedStatementHandle(Sha1HashKey key, int handle, boolean isDirectSql) { | ||
final PreparedStatementHandle registerCachedPreparedStatementHandle(CityHash128Key key, int handle, boolean isDirectSql) { | ||
if(!isStatementPoolingEnabled() || null == key) | ||
return null; | ||
|
||
|
@@ -5866,8 +5858,8 @@ final void evictCachedPreparedStatementHandle(PreparedStatementHandle handle) { | |
} | ||
|
||
// Handle closing handles when removed from cache. | ||
final class PreparedStatementCacheEvictionListener implements EvictionListener<Sha1HashKey, PreparedStatementHandle> { | ||
public void onEviction(Sha1HashKey key, PreparedStatementHandle handle) { | ||
final class PreparedStatementCacheEvictionListener implements EvictionListener<CityHash128Key, PreparedStatementHandle> { | ||
public void onEviction(CityHash128Key key, PreparedStatementHandle handle) { | ||
if(null != handle) { | ||
handle.setIsEvictedFromCache(true); // Mark as evicted from cache. | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String.format()
is a terrible way to concatenate two Strings. If I wrote this in my original commit, I must have been high. Just change tothis(sql + parametersDefinition)
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having taken another look, yes, this showed up in my YourKit runs -- in fact I saw it roughly as much as the character encoding issue we've been discussing above.