Skip to content

[BUG] Performance issue - JDBCType::of calls .values() on enum #1051

Closed

Description

Driver version

7.2.1

JAVA/JVM version

Oracle OpenJDK Java 11

Problem description

  1. Expected behaviour:
    When searching for an enum, you should never call .values(). Java will always call an Array.copy on the internal Enum values array (to prevent someone from doing weird things like reordering the values).

JDBCType::of ends up being called very frequently in tight loops (particularly for things like TVPs). It should not allocate new memory.

  1. Actual behaviour:
    JDBCType::of does for (JDBCType value: values())

This was 56% of the memory allocated while pushing in a TVP into sql server. (32 GiB for this vs 25 GiB the rest of the JDBC operations)

  1. Any other details that can be helpful:
    Code in question
    https://github.com/Microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/DataTypes.java#L908

This particular problem is causing excessive memory thrashing/GCs when we TVPs in our JDBC driver with larger TVPs.

There are a few approaches I've taken for this problem in the past
For small enums, you can keep the same lookup method, but create a private static inner array of values.

So,

private static final JDBCType[] VALUES = values()

Replacing .values() with VALUES (Unless it is being later modified, almost never is) is a simple change with little risk of reduced performance.

If the Enum is large, it may make sense to create a map up front

private static final Map<Integer, JdbcType> mapping = new HashMap<>();
static {
  for (var type : values())
   mapping.put(type.intValue, type);
}

public JDBCTye of(int intValue) {
   return mapping.get(intValue)
}

etc... That will be faster for large enums, slower for small. Whether or not to do it is a judgement call (preferably after using JMH to verify which is faster).

Finally, the fastest approach is going to be an exhaustive switch statement over the incoming ID. I'll do this if I want high speed, but you really need to have a unit test in place to ensure the switch remains exhaustive.

Not applicable to this case, but if the ENUM ids are closely packed, a faster way then the map approach (but still slower than the switch statement) is to allocate a new JDBCType array where size = maxId - minId. You place the enum values into the array based on [id - minId] and you look them up using the same. For very tightly packed ids, this ends up being faster than the map with low to no memory impact (only 1 memory indirection, no potential int->Integer conversions). However, sparse ids make this much worse memory wise than the map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

Labels

BugA bug in the driver. A high priority item that one can expect to be addressed quickly.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions