Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exponential memory allocation in Exec and improve performance #1296

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

charlievieth
Copy link
Contributor

@charlievieth charlievieth commented Nov 7, 2024

This commit changes SQLiteConn.Exec to use the raw Go query string instead of repeatedly converting it to a C string (which it would do for every statement in the provided query). This yields a ~20% performance improvement for a query containing one statement and a significantly larger improvement when the query contains multiple statements as is common when importing a SQL dump (our benchmark shows a 5x improvement for handling 1k SQL statements).

Additionally, this commit improves the performance of Exec by 2x or more and makes number and size of allocations constant when there are no bind parameters (the performance improvement scales with the number of SQL statements in the query). This is achieved by having the entire query processed in C code thus requiring only one CGO call.

The speedup for Exec'ing single statement queries means that wrapping simple statements in a transaction is now twice as fast.

This commit also improves the test coverage of Exec, which previously failed to test that Exec could process multiple statements like INSERT. It also adds some Exec specific benchmarks that highlight both the improvements here and the overhead of using a cancellable Context.

This commit is a slimmed down and improved version of PR #1133:

goos: darwin
goarch: arm64
pkg: github.com/mattn/go-sqlite3
cpu: Apple M1 Max
                                       │    b.txt     │                n.txt                │
                                       │    sec/op    │   sec/op     vs base                │
Suite/BenchmarkExec/Params-10             1.434µ ± 1%   1.186µ ± 0%  -17.27% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10          1267.5n ± 0%   759.2n ± 1%  -40.10% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10      2.886µ ± 0%   2.517µ ± 0%  -12.80% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10    2.605µ ± 1%   1.829µ ± 1%  -29.81% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               1852.6µ ± 1%   582.3µ ± 0%  -68.57% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        3053.3µ ± 3%   582.0µ ± 0%  -80.94% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                  4.126µ ± 2%   2.200µ ± 1%  -46.67% (p=0.000 n=10)
geomean                                   16.40µ        8.455µ       -48.44%

                                       │      b.txt      │                n.txt                │
                                       │      B/op       │    B/op     vs base                 │
Suite/BenchmarkExec/Params-10                 248.0 ± 0%   240.0 ± 0%    -3.23% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10              128.00 ± 0%   64.00 ± 0%   -50.00% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10          408.0 ± 0%   400.0 ± 0%    -1.96% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10        288.0 ± 0%   208.0 ± 0%   -27.78% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               5406674.50 ± 0%   64.00 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10         5566758.5 ± 0%   208.0 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                      712.0 ± 0%   520.0 ± 0%   -26.97% (p=0.000 n=10)
geomean                                     4.899Ki        189.7        -96.22%

                                       │     b.txt     │               n.txt                │
                                       │   allocs/op   │ allocs/op   vs base                │
Suite/BenchmarkExec/Params-10              10.000 ± 0%   9.000 ± 0%  -10.00% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10             7.000 ± 0%   4.000 ± 0%  -42.86% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10        12.00 ± 0%   11.00 ± 0%   -8.33% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10      9.000 ± 0%   6.000 ± 0%  -33.33% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               7000.000 ± 0%   4.000 ± 0%  -99.94% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        9001.000 ± 0%   6.000 ± 0%  -99.93% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                    27.00 ± 0%   18.00 ± 0%  -33.33% (p=0.000 n=10)
geomean                                     74.60        7.224       -90.32%

@charlievieth charlievieth force-pushed the cev/exec-allocs branch 2 times, most recently from 74a60e6 to 9fb17fc Compare November 9, 2024 03:27
@charlievieth charlievieth changed the title Fix exponential memory allocation in SQLiteConn.Exec Fix exponential memory allocation in Exec and improve performance Nov 9, 2024
@charlievieth
Copy link
Contributor Author

Note: This PR has been updated to include logic for Exec'ing queries that have no bind parameters.

@charlievieth charlievieth force-pushed the cev/exec-allocs branch 10 times, most recently from 0a54f03 to 89769f9 Compare November 12, 2024 04:38
@charlievieth
Copy link
Contributor Author

@rittneje Any chance I could get this reviewed?

This commit changes SQLiteConn.Exec to use the raw Go query string
instead of repeatedly converting it to a C string (which it would do for
every statement in the provided query). This yields a ~20% performance
improvement for a query containing one statement and a significantly
larger improvement when the query contains multiple statements as is
common when importing a SQL dump (our benchmark shows a 5x improvement
for handling 1k SQL statements).

Additionally, this commit improves the performance of Exec by 2x or more
and makes number and size of allocations constant when there are no bind
parameters (the performance improvement scales with the number of SQL
statements in the query). This is achieved by having the entire query
processed in C code thus requiring only one CGO call.

The speedup for Exec'ing single statement queries means that wrapping
simple statements in a transaction is now twice as fast.

This commit also improves the test coverage of Exec, which previously
failed to test that Exec could process multiple statements like INSERT.
It also adds some Exec specific benchmarks that highlight both the
improvements here and the overhead of using a cancellable Context.

This commit is a slimmed down and improved version of PR mattn#1133:
  mattn#1133

```
goos: darwin
goarch: arm64
pkg: github.com/mattn/go-sqlite3
cpu: Apple M1 Max
                                       │    b.txt     │                n.txt                │
                                       │    sec/op    │   sec/op     vs base                │
Suite/BenchmarkExec/Params-10             1.434µ ± 1%   1.186µ ± 0%  -17.27% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10          1267.5n ± 0%   759.2n ± 1%  -40.10% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10      2.886µ ± 0%   2.517µ ± 0%  -12.80% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10    2.605µ ± 1%   1.829µ ± 1%  -29.81% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               1852.6µ ± 1%   582.3µ ± 0%  -68.57% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        3053.3µ ± 3%   582.0µ ± 0%  -80.94% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                  4.126µ ± 2%   2.200µ ± 1%  -46.67% (p=0.000 n=10)
geomean                                   16.40µ        8.455µ       -48.44%

                                       │      b.txt      │                n.txt                │
                                       │      B/op       │    B/op     vs base                 │
Suite/BenchmarkExec/Params-10                 248.0 ± 0%   240.0 ± 0%    -3.23% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10              128.00 ± 0%   64.00 ± 0%   -50.00% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10          408.0 ± 0%   400.0 ± 0%    -1.96% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10        288.0 ± 0%   208.0 ± 0%   -27.78% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               5406674.50 ± 0%   64.00 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10         5566758.5 ± 0%   208.0 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                      712.0 ± 0%   520.0 ± 0%   -26.97% (p=0.000 n=10)
geomean                                     4.899Ki        189.7        -96.22%

                                       │     b.txt     │               n.txt                │
                                       │   allocs/op   │ allocs/op   vs base                │
Suite/BenchmarkExec/Params-10              10.000 ± 0%   9.000 ± 0%  -10.00% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10             7.000 ± 0%   4.000 ± 0%  -42.86% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10        12.00 ± 0%   11.00 ± 0%   -8.33% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10      9.000 ± 0%   6.000 ± 0%  -33.33% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               7000.000 ± 0%   4.000 ± 0%  -99.94% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        9001.000 ± 0%   6.000 ± 0%  -99.93% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                    27.00 ± 0%   18.00 ± 0%  -33.33% (p=0.000 n=10)
geomean                                     74.60        7.224       -90.32%
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant