bpo-44859: Improve error handling in sqlite3 and change some errors#27654
bpo-44859: Improve error handling in sqlite3 and change some errors#27654serhiy-storchaka merged 2 commits intopython:mainfrom
Conversation
* MemoryError is now raised instead of sqlite3.Warning when memory is not enough for encoding a statement to UTF-8 in Connection.__call__() and Cursor.execute(). * UnicodEncodeError is now raised instead of sqlite3.Warning when the statement contains surrogate characters in Connection.__call__() and Cursor.execute(). * TypeError is now raised instead of ValueError for non-string script argument in Cursor.executescript(). * ValueError is now raised for script containing the null character instead of truncating it in Cursor.executescript(). * Correctly handle exceptions raised when getting boolean value of the result of the progress handler. * Add may tests covering different exceptional cases.
erlend-aasland
left a comment
There was a problem hiding this comment.
LGTM, in general. Region coverage for util.c is now above 80%; sqlite3 coverage is getting better and better.
I left some minor comments.
Misc/NEWS.d/next/Library/2021-08-07-17-28-56.bpo-44859.CCopjk.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Library/2021-08-07-17-28-56.bpo-44859.CCopjk.rst
Outdated
Show resolved
Hide resolved
| } | ||
| } else { | ||
| PyErr_SetString(PyExc_ValueError, "script argument must be unicode."); | ||
| sql_len = strlen(sql_script); |
There was a problem hiding this comment.
What about adding a custom AC converter that can extract the string length, so we don't have to iterate over the string again? Is it worth it?
There was a problem hiding this comment.
I think it is not worth. The code is simpler now (some checks were delegated to Argument Clinic), and it I prefer simplicity at the cost of additional strlen(). Custom converter would make it more complex than the original code. strlen() should be called in any case to check for embedded null characters. Now it is called 2 times instead of 1 time. The overhead is not too large.
| sql_len = strlen(sql_script); | ||
| int max_length = sqlite3_limit(self->connection->db, | ||
| SQLITE_LIMIT_LENGTH, -1); | ||
| if (sql_len >= (unsigned)max_length) { |
There was a problem hiding this comment.
| if (sql_len >= (unsigned)max_length) { | |
| if (sql_len >= (size_t)max_length) { |
There was a problem hiding this comment.
How does it convert: int -> unsigned -> size_t or int -> ssize_t -> size_t? The standard specifies what the intermediate conversion the compiler should used, but without looking into it I cannot say that it will be the correct one.
If convert explicitly to unsigned, the conversion sequence will be unambiguous: int -> unsigned -> size_t.
|
BTW, that's a neat enhancement of the |
|
We should consider changing some more errors, as discussed in #27642 (comment), #27645 (comment), and #27645 (comment) |
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
memory is not enough for encoding a statement to UTF-8
in Connection.call() and Cursor.execute().
the statement contains surrogate characters
in
Connection.__call__()and Cursor.execute().script argument in Cursor.executescript().
character instead of truncating it in Cursor.executescript().
of the result of the progress handler.
https://bugs.python.org/issue44859