-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Charset throughout #26
Conversation
@@ -587,9 +590,9 @@ final Encoding checkSupported() throws UnsupportedEncodingException | |||
// This works for all of the code pages above in SE 5 and later. | |||
try | |||
{ | |||
" ".getBytes(charsetName); | |||
charset = Charset.forName(charsetName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of assigning the charset object to charset variable, is it possible to do the assignment outside the checkSupported() method? e.g. we could do the assignment after checkSupported() is called in charset()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do this.
The driver spends a lot of time converting from Charset to String and from String to Charset. Every time checked exceptions have to be caught. In addition the Charset lookup code is quite involved. It would be simpler to use Charset everywhere.
checkSupported(); | ||
if (charset == null) | ||
{ | ||
charset = Charset.forName(charsetName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marschall thank you very much for your great idea and fix. I just have one more question and hope you could consider. Is it possible to delete the local variable 'private Charset charset'? I am not sure why we need to check if the variable is null here. instead of assigning it to charset and returning charset., could we just do "return Charset.forName(charsetName);" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I considered this. It would be possible but:
'private Charset charset' ist not a local variable but an instance/member variable. Encoding#charset()
is called quite frequently and by many users both directly and indirectly. If you look at the implementation of Charset#forName
it is quite involved and the worst case is quite bad, eg. some of the charsets are in the Extended Encoding Set. I would be more comfortable with looking them up just once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marschall thank you for your feedback.
I totally agree to what you said, we definitely need to look up once.
However, what I meant was something like this.:
try
{
checkSupported();
}
catch (UnsupportedEncodingException e)
...
return Charset.forName(charsetName);
the look up is already called within checkSupported
(by Charset.isSupported()
)
I am not sure why we need the member variable, or why we need to check if the variable is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the member variable in order to cache the look up. Otherwise every time somebody calls Encoding#charset()
directly or indirectly a Charset lookup is performed. Examples include:
TypeInfo.Builder.Strategy#apply(TypeInfo, TDSReader)
Util#readUnicodeString(byte[], int, int, SQLServerConnection)
SQLServerSQLXML#getString()
SQLServerSQLXML#getValue()
TDSReader#readGUID(int, JDBCType, StreamType)
SQLServerBulkCopy#writeColumnToTdsWriter(TDSWriter, int, int, int, boolean, int, int, boolean, Object)
Without the member variable every time one of this methods is called directly or indirectly a lookup would be performed anew. The null check avoids the look up the second and following times the method is called.
It is for the same reason that jvmSupportConfirmed
is cached in a member variable instead of computed every time the method is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marschall thanks for the very detailed explanation, now I understand why you have introduced a new member variable here. I wanted the code to be consistent with the current implementation, but your cache implementation is better than the current. I am going to accept the PR once the testing is done. thanks again.
@marschall thank you very much |
The driver spends a lot of time converting from Charset to String and
from String to Charset. Every time checked exceptions have to be
caught. In addition the Charset lookup code is quite involved. It would
be simpler to use Charset everywhere.