
gh-118761: Improve import time of sqlite3 #131796


Open: wants to merge 6 commits into main


Wulian233
Contributor

@Wulian233 Wulian233 commented Mar 27, 2025

Make importing sqlite3 1.5x faster.

Benchmark on Windows 10, CPython 3.14.0b1:

D:\Python314>hyperfine -i --warmup 8 "./python -c 'from sqlite3 import *'" "./python -c 'from sqlite3_new import *'"
Benchmark 1: ./python -c 'from sqlite3 import *'
  Time (mean ± σ):     176.8 µs ± 179.3 µs    [User: 1985.8 µs, System: 2886.4 µs]
  Range (min … max):     0.0 µs … 1018.1 µs    406 runs

Benchmark 2: ./python -c 'from sqlite3_new import *'
  Time (mean ± σ):     245.4 µs ± 200.7 µs    [User: 914.9 µs, System: 2002.8 µs]
  Range (min … max):     0.0 µs … 1787.5 µs    401 runs

Summary
  ./python -c 'from sqlite3 import *' ran
    1.39 ± 1.81 times faster than ./python -c 'from sqlite3_new import *'

I just realized today that I accidentally deleted an unmerged branch last month, which closed #129118. My apologies for the oversight. I've now recreated the pull request; the content remains exactly the same as before.

| Module | sqlite3 self (us) | sqlite3 cumulative (us) | sqlite3_new self (us) | sqlite3_new cumulative (us) |
| --- | --- | --- | --- | --- |
| _sqlite3 | 2658 | 6641 | 2455 | 6462 |
| sqlite3.dbapi2 | 6431 | 31714 | 5607 | 28860 |
| sqlite3 (main) | 8592 | 40306 | 4213 | 35121 |

  • The time taken by _sqlite3 is not much different.
  • The time for sqlite3.dbapi2 decreased significantly from 6431 to 5607
  • The time for the main sqlite3 module dropped very noticeably from 8592 to 4213

@eendebakpt
Contributor

@Wulian233 In your benchmark the sqlite3_new is slower (takes more time to import). Based on the name I would expect it to be the faster one.

Also, time is a built-in module. Is it really a bottleneck?

@brianschubert
Contributor

brianschubert commented Mar 27, 2025

I'm also a bit surprised that time would be a bottleneck. When I run ./python -X importtime -c 'import sqlite3' on my system, I get

import time: self [us] | cumulative | imported package
[...]
import time:        80 |         80 |   time
[...]
import time:      2011 |       8096 | sqlite3

Which, if I'm reading it right, says that importing time can account for only a small fraction of sqlite3's import time?
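
To put a number on that reading, the `-X importtime` lines can be parsed and the share computed directly. This is a minimal sketch: `parse_importtime` is a hypothetical helper, the regular expression assumes the line layout shown in the output above, and the sample holds only the two rows quoted there.

```python
import re

# One data row of `-X importtime` output: "import time: <self> | <cumulative> | <name>".
# The header row does not match because it has no leading integer.
ROW = re.compile(r"import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)")


def parse_importtime(text):
    """Return {module name: (self_us, cumulative_us)} for every data row."""
    result = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            self_us, cumulative_us, name = match.groups()
            result[name] = (int(self_us), int(cumulative_us))
    return result


sample = """\
import time: self [us] | cumulative | imported package
import time:        80 |         80 |   time
import time:      2011 |       8096 | sqlite3
"""

times = parse_importtime(sample)
# Share of sqlite3's cumulative import time spent importing `time`.
share = times["time"][1] / times["sqlite3"][1]
print(f"time accounts for {share:.1%} of the sqlite3 import")
```

With these two rows the share comes out at roughly one percent, so removing the import entirely could not explain a large speed-up on this system.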

@Wulian233
Contributor Author

Wulian233 commented Mar 27, 2025

> @Wulian233 In your benchmark the sqlite3_new is slower (takes more time to import). Based on the name I would expect it to be the faster one.
>
> Also time is a builtin module. Is that one really a bottleneck?

I think I got the names wrong at the time; it was a copy of the previous PR's content. Here is the test I just ran:

D:\Python313>hyperfine -i --warmup 8 "./python -c 'from sqlite3 import *'" "./python -c 'from sqlite3_new import *'"
Benchmark 1: ./python -c 'from sqlite3 import *'
  Time (mean ± σ):     697.1 µs ± 1059.7 µs    [User: 2577.5 µs, System: 3063.5 µs]
  Range (min … max):     0.0 µs … 10960.7 µs    391 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.
  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: ./python -c 'from sqlite3_new import *'
  Time (mean ± σ):     635.0 µs ± 755.4 µs    [User: 1821.0 µs, System: 3691.3 µs]
  Range (min … max):     0.0 µs … 3622.5 µs    399 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.
  Warning: Ignoring non-zero exit code.

Summary
  ./python -c 'from sqlite3_new import *' ran
    1.10 ± 2.12 times faster than ./python -c 'from sqlite3 import *'

@@ -20,9 +20,9 @@
# misrepresented as being the original software.
# 3. This notice may not be removed or altered from any source distribution.

import datetime
import time
Contributor

What's the point of the PR if the import is not removed? Lazy imports are therefore not used, so why should any speedup be expected? A slowdown is more likely.
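
The difference can be shown with two throwaway modules built from source strings. This is only a sketch: `colorsys` stands in for any dependency that is not already loaded at startup, and `loads_dependency_on_import` and `to_hsv` are hypothetical names, not code from this PR.

```python
import sys
import types

# Top-level import kept: the dependency is loaded as soon as the module is imported.
EAGER = """
import colorsys

def to_hsv(r, g, b):
    import colorsys  # redundant: by now this is only a sys.modules lookup
    return colorsys.rgb_to_hsv(r, g, b)
"""

# Top-level import removed: the dependency is loaded on the first call only.
LAZY = """
def to_hsv(r, g, b):
    import colorsys
    return colorsys.rgb_to_hsv(r, g, b)
"""


def loads_dependency_on_import(source, dependency="colorsys"):
    """Execute `source` as a fresh module and report whether it loaded `dependency`."""
    sys.modules.pop(dependency, None)  # start from a state where it is not loaded
    module = types.ModuleType("demo")
    exec(source, module.__dict__)      # equivalent to running the module body
    return dependency in sys.modules


eager_loaded = loads_dependency_on_import(EAGER)
lazy_loaded = loads_dependency_on_import(LAZY)
print(eager_loaded, lazy_loaded)  # True False
```

Only the second variant defers the work, which is why keeping the module-level import cancels any benefit of the function-level one.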

Contributor

Already pointed out in #131796 (comment) :)

(But the PR author marked the review comment as resolved themselves, which is a controversial feature for GitHub to even allow.)

@picnixz
Member

picnixz commented Mar 29, 2025

To be precise, it's not importing sqlite3 that is faster, it's importing sqlite3 and loading its content in the global namespace (the benchmarks are done for from sqlite3 import * not for a plain import sqlite3 statement).

Also, the benchmarks are quite unstable. The standard deviation is much higher than the mean itself! In practice, I think we don't gain anything other than stability (the stdev becomes smaller with the new implementation, but again this could be noise).

Can you check if import sqlite3 has less noise?
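
One way to run that check without a shell in the loop (the hyperfine warnings above note that shell startup is hard to calibrate at this scale) is to launch the interpreter directly. A minimal sketch, where `time_statement` and `summarize` are hypothetical helper names and the run count is arbitrary:

```python
import statistics
import subprocess
import sys
import time


def time_statement(statement, runs=10):
    """Wall-clock seconds for `python -c statement`, one fresh interpreter per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # No shell is involved: the interpreter is started directly.
        subprocess.run([sys.executable, "-c", statement], check=True)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) in seconds."""
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    for statement in ("import sqlite3", "from sqlite3 import *"):
        mean, stdev = summarize(time_statement(statement))
        print(f"{statement!r}: {mean * 1e3:.1f} ms ± {stdev * 1e3:.1f} ms")
```

These timings include full interpreter startup, so they only show whether the two statements differ in noise; they do not isolate the cost of the import itself.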


And to be precise, we're not gaining a 2x speed-up. On average, we're only gaining a 1.1x speed-up and the standard deviation of this speed-up is $\pm2$, which is again not really indicative =/
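
The summary figure is consistent with first-order error propagation for a ratio of two independent means. The sketch below assumes that formula (it is not taken from hyperfine's source) and plugs in the means and standard deviations from the second benchmark run above; `speedup_with_uncertainty` is a hypothetical name.

```python
import math


def speedup_with_uncertainty(mean_slow, std_slow, mean_fast, std_fast):
    """Ratio of two means and its propagated standard deviation.

    Assumes independent measurements and first-order propagation:
    sigma_ratio = ratio * sqrt((std_slow / mean_slow)**2 + (std_fast / mean_fast)**2)
    """
    ratio = mean_slow / mean_fast
    relative = math.hypot(std_slow / mean_slow, std_fast / mean_fast)
    return ratio, ratio * relative


# sqlite3: 697.1 µs ± 1059.7 µs, sqlite3_new: 635.0 µs ± 755.4 µs
ratio, sigma = speedup_with_uncertainty(697.1, 1059.7, 635.0, 755.4)
print(f"{ratio:.2f} ± {sigma:.2f}")  # 1.10 ± 2.12
```

Because the interval extends well below 1, these measurements cannot distinguish a speed-up from a slowdown.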

And yes, I'm also surprised that we're gaining "that much" when we just make import time local. Considering it's a built-in module (though I'm not sure if it's present at startup), I don't think we need to make this change (we're maybe gaining a small speed-up, but maybe not much; and sqlite3 is already a heavy module to import). So benchmarks using -X importtime are much more important here, IMO, than benchmarks of the interpreter's startup.

@erlend-aasland
Contributor

@Wulian233, can you address @picnixz's last remark?

@erlend-aasland erlend-aasland added the pending The issue will be closed if no feedback is provided label Jun 5, 2025
@Wulian233
Contributor Author

Wulian233 commented Jun 6, 2025

I've reworked another version and it's going to be faster now

I ran the command D:\Python314>python -X importtime -c "import sqlite3" and obtained the following output. Below is a summary table comparing sqlite3 and sqlite3_new import times (GPT helped):

| Module | sqlite3 self (us) | sqlite3 cumulative (us) | sqlite3_new self (us) | sqlite3_new cumulative (us) |
| --- | --- | --- | --- | --- |
| _sqlite3 | 2658 | 6641 | 2455 | 6462 |
| sqlite3.dbapi2 | 6431 | 31714 | 5607 | 28860 |
| sqlite3 (main) | 8592 | 40306 | 4213 | 35121 |

  • _sqlite3 is not much different.
  • sqlite3.dbapi2 decreased from 6431 -> 5607
  • sqlite3 main dropped very noticeably from 8592 -> 4213
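
The ratios implied by these rows can be checked with a few lines of arithmetic. This sketch only divides the numbers from the summary above; `speedups` is a hypothetical helper, and a single `-X importtime` run per variant carries no information about run-to-run variance.

```python
# (old self, old cumulative, new self, new cumulative), all in microseconds.
ROWS = {
    "_sqlite3": (2658, 6641, 2455, 6462),
    "sqlite3.dbapi2": (6431, 31714, 5607, 28860),
    "sqlite3 (main)": (8592, 40306, 4213, 35121),
}


def speedups(rows):
    """Return {module: (self-time ratio, cumulative-time ratio)}, old divided by new."""
    return {
        name: (old_self / new_self, old_cum / new_cum)
        for name, (old_self, old_cum, new_self, new_cum) in rows.items()
    }


for name, (self_ratio, cum_ratio) in speedups(ROWS).items():
    print(f"{name:15} self {self_ratio:.2f}x  cumulative {cum_ratio:.2f}x")
```

The self time of the top-level module roughly halves, but the cumulative time, which is what a caller of the import actually waits for, changes by about 1.15x in this run.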

Raw output:

import time: self [us] | cumulative | imported package
import time: 296 | 296 | winreg
import time: 248 | 248 | _io
import time: 70 | 70 | marshal
import time: 312 | 312 | nt
import time: 1637 | 2266 | _frozen_importlib_external
import time: 1103 | 1103 | time
import time: 1089 | 2191 | zipimport
import time: 75 | 75 | _codecs
import time: 793 | 868 | codecs
import time: 1926 | 1926 | encodings.aliases
import time: 797 | 797 | encodings._win_cp_codecs
import time: 3400 | 6989 | encodings
import time: 960 | 960 | encodings.utf_8
import time: 75 | 75 | _codecs_cn
import time: 126 | 126 | _multibytecodec
import time: 1389 | 1590 | encodings.gbk
import time: 86 | 86 | _signal
import time: 59 | 59 | _abc
import time: 323 | 382 | abc
import time: 104 | 104 | _stat
import time: 527 | 630 | stat
import time: 1745 | 1745 | _collections_abc
import time: 126 | 126 | genericpath
import time: 229 | 229 | _winapi
import time: 3109 | 3463 | ntpath
import time: 2543 | 8760 | os
import time: 584 | 584 | _sitebuiltins
import time: 1004 | 1004 | sitecustomize
import time: 563 | 563 | usercustomize
import time: 2043 | 12953 | site
import time: 1104 | 1104 | linecache
import time: 787 | 787 | _datetime
import time: 1003 | 1790 | datetime
import time: 704 | 704 | itertools
import time: 1147 | 1147 | keyword
import time: 128 | 128 | _operator
import time: 1412 | 1540 | operator
import time: 2293 | 2293 | reprlib
import time: 164 | 164 | _collections
import time: 4104 | 9950 | collections
import time: 2917 | 12866 | collections.abc
import time: 64 | 64 | _types
import time: 1134 | 1197 | types
import time: 136 | 136 | _functools
import time: 2652 | 3984 | functools
import time: 2658 | 6641 | _sqlite3
import time: 74 | 74 | _contextvars
import time: 1406 | 1479 | _py_warnings
import time: 2508 | 3987 | warnings
import time: 6431 | 31714 | sqlite3.dbapi2
import time: 8592 | 40306 | sqlite3

D:\Python314>python -X importtime -c "import sqlite3_new"
import time: self [us] | cumulative | imported package
import time: 142 | 142 | winreg
import time: 284 | 284 | _io
import time: 119 | 119 | marshal
import time: 461 | 461 | nt
import time: 2051 | 2914 | _frozen_importlib_external
import time: 849 | 849 | time
import time: 651 | 1500 | zipimport
import time: 131 | 131 | _codecs
import time: 4036 | 4166 | codecs
import time: 1493 | 1493 | encodings.aliases
import time: 1159 | 1159 | encodings._win_cp_codecs
import time: 6495 | 13311 | encodings
import time: 1028 | 1028 | encodings.utf_8
import time: 72 | 72 | _codecs_cn
import time: 124 | 124 | _multibytecodec
import time: 1305 | 1500 | encodings.gbk
import time: 119 | 119 | _signal
import time: 58 | 58 | _abc
import time: 366 | 424 | abc
import time: 100 | 100 | _stat
import time: 2987 | 3086 | stat
import time: 1215 | 1215 | _collections_abc
import time: 135 | 135 | genericpath
import time: 238 | 238 | _winapi
import time: 673 | 1045 | ntpath
import time: 2186 | 7956 | os
import time: 1146 | 1146 | _sitebuiltins
import time: 1021 | 1021 | sitecustomize
import time: 545 | 545 | usercustomize
import time: 1849 | 12515 | site
import time: 1417 | 1417 | linecache
import time: 163 | 163 | _datetime
import time: 809 | 972 | datetime
import time: 570 | 570 | itertools
import time: 1219 | 1219 | keyword
import time: 154 | 154 | _operator
import time: 1384 | 1538 | operator
import time: 1604 | 1604 | reprlib
import time: 151 | 151 | _collections
import time: 3696 | 8776 | collections
import time: 2846 | 11621 | collections.abc
import time: 68 | 68 | _types
import time: 1052 | 1119 | types
import time: 174 | 174 | _functools
import time: 2716 | 4007 | functools
import time: 2455 | 6462 | _sqlite3
import time: 77 | 77 | _contextvars
import time: 2803 | 2879 | _py_warnings
import time: 1321 | 4200 | warnings
import time: 5607 | 28860 | sqlite3.dbapi2
import time: 1298 | 30158 | sqlite3
import time: 751 | 30908 | sqlite3.dbapi2
import time: 4213 | 35121 | sqlite3_new

hyperfine test:

D:\Python314>hyperfine -i --warmup 8 "./python -c 'from sqlite3 import *'" "./python -c 'from sqlite3_new import *'"
Benchmark 1: ./python -c 'from sqlite3 import *'
  Time (mean ± σ):     176.8 µs ± 179.3 µs    [User: 1985.8 µs, System: 2886.4 µs]
  Range (min … max):     0.0 µs … 1018.1 µs    406 runs

Benchmark 2: ./python -c 'from sqlite3_new import *'
  Time (mean ± σ):     245.4 µs ± 200.7 µs    [User: 914.9 µs, System: 2002.8 µs]
  Range (min … max):     0.0 µs … 1787.5 µs    401 runs

Summary
  ./python -c 'from sqlite3 import *' ran
    1.39 ± 1.81 times faster than ./python -c 'from sqlite3_new import *'

Comment on lines 66 to +69
def main(*args):
    from argparse import ArgumentParser
    from textwrap import dedent

Contributor

I do not see a valid reason to do code churn and delay imports for the sake of speeding up import sqlite3.__main__ while assuming that sqlite3.__main__.main() is never called.

Labels: awaiting core review, pending (the issue will be closed if no feedback is provided), topic-sqlite3

8 participants