Description
openedon May 3, 2019
Driver version
7.2.1
JAVA/JVM version
Oracle OpenJDK Java 11
Problem description
- Expected behaviour:
When searching for an enum, you should never call .values(). Java will always call an Array.copy on the internal Enum values array (to prevent someone from doing weird things like reordering the values).
JDBCType::of ends up being called very frequently in tight loops (particularly for things like TVPs). It should not allocate new memory.
- Actual behaviour:
JDBCType::of doesfor (JDBCType value: values())
This was 56% of the memory allocated while pushing in a TVP into sql server. (32 GiB for this vs 25 GiB the rest of the JDBC operations)
- Any other details that can be helpful:
Code in question
https://github.com/Microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/DataTypes.java#L908
This particular problem is causing excessive memory thrashing/GCs when we TVPs in our JDBC driver with larger TVPs.
There are a few approaches I've taken for this problem in the past
For small enums, you can keep the same lookup method, but create a private static inner array of values.
So,
private static final JDBCType[] VALUES = values()
Replacing .values() with VALUES (Unless it is being later modified, almost never is) is a simple change with little risk of reduced performance.
If the Enum is large, it may make sense to create a map up front
private static final Map<Integer, JdbcType> mapping = new HashMap<>();
static {
for (var type : values())
mapping.put(type.intValue, type);
}
public JDBCTye of(int intValue) {
return mapping.get(intValue)
}
etc... That will be faster for large enums, slower for small. Whether or not to do it is a judgement call (preferably after using JMH to verify which is faster).
Finally, the fastest approach is going to be an exhaustive switch statement over the incoming ID. I'll do this if I want high speed, but you really need to have a unit test in place to ensure the switch remains exhaustive.
Not applicable to this case, but if the ENUM ids are closely packed, a faster way then the map approach (but still slower than the switch statement) is to allocate a new JDBCType array where size = maxId - minId. You place the enum values into the array based on [id - minId] and you look them up using the same. For very tightly packed ids, this ends up being faster than the map with low to no memory impact (only 1 memory indirection, no potential int->Integer conversions). However, sparse ids make this much worse memory wise than the map.