Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore ERROR: prepared statement already exists (SQLSTATE 42P05) error when preparing #1847

Closed
jameshartig opened this issue Dec 22, 2023 · 31 comments
Labels

Comments

@jameshartig
Copy link
Contributor

jameshartig commented Dec 22, 2023

Describe the bug
It's not clear what happened but we had a bunch of the following errors:
expected statement description, got *pgconn.ResultReader
expected statement description, got *pgconn.PipelineSync
unexpected pipeline result: *pgconn.PipelineSync
unexpected pipeline result: *pgconn.StatementDescription

Then the connection forever was throwing the error ERROR: prepared statement 'stmtcache_2fdb49e13e67d391f9a11b5965c6bdc9427a2b5e65613919' already exists (SQLSTATE 42P05).

To Reproduce
Unfortunately, I have no idea how to reproduce this. There was a bunch of packet loss in GCP at the time this occurred so it could have been something related to that.

We issue this particular SQL like:

b := new(pgx.Batch)
b.Queue("BEGIN READ ONLY")
b.Queue(query, arg)
b.Queue("COMMIT")
pool.SendBatch(ctx, b)

Expected behavior
Despite the pipeline parsing failing, I would expect future prepares to handle 42P051 especially since Prepare says it's idempotent.

Alternatively, if we get an unexpected pipeline type, we should maybe call asyncClose so the connection can be thrown away?

Actual behavior
Instead that query forever failed on the connection with the error ERROR: prepared statement 'stmtcache_2fdb49e13e67d391f9a11b5965c6bdc9427a2b5e65613919' already exists (SQLSTATE 42P05).

Version

  • Go: $ go version -> 1.21.5
  • PostgreSQL: $ psql --no-psqlrc --tuples-only -c 'select version()' -> PostgreSQL 11.2-YB-2.18.2.2-b0 on x86_64-pc-linux-gnu, compiled by clang version 15.0.3 (https://github.com/yugabyte/llvm-project.git 0b8d1183745fd3998d8beffeec8cbe99c1b20529), 64-bit
  • pgx: $ grep 'github.com/jackc/pgx/v[0-9]' go.mod -> v5.5.0

Additional context

@jameshartig
Copy link
Contributor Author

My best guess is that there was maybe an error here and we failed to close the connection:

diff --git a/pgconn/pgconn.go b/pgconn/pgconn.go
index d5a67bea..49319350 100644
--- a/pgconn/pgconn.go
+++ b/pgconn/pgconn.go
@@ -2117,6 +2117,7 @@ func (p *Pipeline) getResults() (results any, err error) {
                case *pgproto3.ParseComplete:
                        peekedMsg, err := p.conn.peekMessage()
                        if err != nil {
+                               p.conn.asyncClose()
                                return nil, err
                        }
                        if _, ok := peekedMsg.(*pgproto3.ParameterDescription); ok {

If I happen to get a network error right at that peekMessage() then I can reproduce the issue.

jackc added a commit that referenced this issue Dec 23, 2023
@jackc
Copy link
Owner

jackc commented Dec 23, 2023

I think you are correct about the peekMessage error. Fixed in cbc5a70 along with normalizing the error.

Not sure if that actually will resolve the original issue though.

The idempotency of Prepare is based on it keeping track of all prepared statements and being a no-op on repeats. It doesn't handle getting out of sync with the server. I suppose we could handle that case, but it seems like it might be masking an underlying issue.

@jameshartig
Copy link
Contributor Author

The idempotency of Prepare is based on it keeping track of all prepared statements and being a no-op on repeats. It doesn't handle getting out of sync with the server. I suppose we could handle that case, but it seems like it might be masking an underlying issue.

That makes sense, I'm just trying to figure out how to recover from this scenario. The solution might be to just kill the connection using the new OnPgError. I could try to call Deallocate but it isn't simple for me to generate the hash for the SQL since that's an internal detail.

@jacobmikesell
Copy link

jacobmikesell commented Jan 26, 2024

@jackc We just saw this error on pgx 5.5.3!
ERROR: prepared statement "stmtcache_e3bbdda88e6fb7fb4a24ad6c6d84a6a52aba9fd5c8667677" already exists (SQLSTATE 42P05)

for more information, we have a pgxpool that runs out of statement cache space relatively frequently (few times a day) and we also have very strict context timeouts so we often have conns getting timed out. Not sure if that info helps

@pascal-za
Copy link

We also experienced this error intermittently and had to roll back to 5.4, but unfortunately it's quite hard to track down the situation that leads to this. I will add that we also use context timeouts extensively, so perhaps there could be clue in there.

jackc added a commit that referenced this issue Feb 3, 2024
When a conn is going to execute a query, the first thing it does is to
deallocate any invalidated prepared statements from the statement cache.
However, the statements were removed from the cache regardless of
whether the deallocation succeeded. This would cause subsequent calls of
the same SQL to fail with "prepared statement already exists" error.

This problem is easy to trigger by running a query with a context that
is already canceled.

This commit changes the deallocate invalidated cached statements logic
so that the statements are only removed from the cache if the
deallocation was successful on the server.

#1847
@jackc
Copy link
Owner

jackc commented Feb 3, 2024

I think I found the underlying issue. If an error occurred when deallocating invalidated statements from the cache they would be removed from the cache but not actually deallocated on the server. In particular this could happen when a context was canceled before the query was even started.

This problem is fixed in 832b4f9 and included in the just tagged v5.5.3.

@jacobmikesell
Copy link

@jackc Looking over that commit (and checking my PG knowledge) is it correct to say that the pipeline only succeeds or fails atomically? Or is it that deallocates are idempotent so potentially resending them is safe?

@jackc
Copy link
Owner

jackc commented Feb 6, 2024

is it correct to say that the pipeline only succeeds or fails atomically?

Technically no, but practically yes. Preparing and deallocating statements at the protocol layer is not transactional. If I recall correctly it may even work in a broken transaction. (This might vary between real PostgreSQL and more or less compatible databases like CockroachDB.) But pretty much the only way it could partially succeed is if the network connection was interrupted. But if that happened the connection is dead anyway.

Or is it that deallocates are idempotent so potentially resending them is safe?

Protocol level deallocates are idempotent.

@jacobmikesell
Copy link

FYI @jackc we just saw the same error again on 5.5.3 Interestingly this time it was in a pgx.Batch execution that is part of a larger transaction

@jackc
Copy link
Owner

jackc commented Feb 24, 2024

@jacobmikesell The problem only occurred in very specific circumstances.

If the size of the statement cache is exceeded then a query is evicted from the cache, but it is only marked as invalid it is not actually deallocated on the server.

Before the next query is sent all invalidated statements are deallocated.

However, deallocating invalidated statements did not occur inside of a transaction. I believe this was a relic of the older deallocate code that actually called a DEALLOCATE SQL statement instead of using the underlying protocol operation.

This wasn't a problem to regular queries as the entry still exists in Conn.preparedStatements so the call to Prepare() is idempotent. However, batch queries do not call Prepare(). They send all prepare statement requests in bulk.

So if a transaction was in progress, and a statement was evicted from the cache, and the batch then tried to send that exact query then the error would occur.

Should be resolved in 046f497. It allows deallocating invalidated statements inside of a transaction that is not in an error state.

@jameshartig
Copy link
Contributor Author

@jackc sorry for the comment spam but just wanted to give you kudos for the investigations in this issue! Hopefully this is the last instance of this and we can close it out.

@jacobmikesell
Copy link

Thanks for looking into and fixing :D is there an estimate for when this will hit a minor version? We bump into it about once or twice a week (no good repro on our end) but I'd love to pull the new minor version and test it out!

@jackc
Copy link
Owner

jackc commented Mar 2, 2024

@jacobmikesell I expect a new patch release next week.

@jeanscherf
Copy link

Thank you for working on a fix for that. We needed to roll it back after bumping into this problem as well.
Is there any news on the patch release?

@jackc
Copy link
Owner

jackc commented May 14, 2024

@jacobmikesell The fix was included into v5.5.5.

@vytskalt
Copy link

vytskalt commented Jun 27, 2024

I was running pgx (v5.6.0) on AWS lambda when the function timeout was exceeded and context was cancelled. Now every other run of the function fails with the error:

{"errorMessage":"ERROR: prepared statement \"stmtcache_7fb03f1b51f17f5fd2622a2d28406b66f300cdef9f045c81\" already exists (SQLSTATE 42P05)","errorType":"PgError"}

In the lambda function I create a *pgxpool.Pool and only call pool.Exec on it.

EDIT: turns out the context doesn't even have to be canceled, here is minimal reproduction code:

package main

import (
	"context"
	"os"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	pgConfig, err := pgxpool.ParseConfig(os.Getenv("POSTGRES_URL"))
	if err != nil {
		panic(err)
	}

	postgresPool, err := pgxpool.NewWithConfig(context.Background(), pgConfig)
	if err != nil {
		panic(err)
	}
	defer postgresPool.Close()

	_, err = postgresPool.Exec(context.Background(), "SELECT $1", "hello world")
	if err != nil {
		panic(err)
	}
}

run the code two times to get this error:

panic: ERROR: prepared statement "stmtcache_165a22d8eabe300a7430a1c63e8bd81b6fbc45f42f099dcb" already exists (SQLSTATE 42P05)

@jackc
Copy link
Owner

jackc commented Jun 27, 2024

@vytskalt

run the code two times to get this error:

You mentioned that you are running on AWS Lambda. It runs fine locally, as would be expected. It's impossible for two program executions to share the same connection. But that's what it sounds like is happening. It seems the there is a naive load balancer between your program and the server. When your second execution connects, it actually gets the same underlying connection with PostgreSQL that already has the statement prepared from the first execution.

The simplest solution is to connect with QueryExecModeCacheDescribe to disable the automatic prepared statement cache. Or you may be able to configure the load balancer to properly handle prepared statements. But either way, I don't think this is a pgx issue.

@patryk4815

This comment was marked as outdated.

@reVrost
Copy link

reVrost commented Jun 29, 2024

I am also running into this issue (v5.6.0) while using a lambda connecting to a supabase db (with ?pgbouncer=true), my setup looks like:

var pool *pgxpool.Pool

func init() {
	ctx := context.Background()
	pgConn := os.Getenv("PG_CONN")

	if pgConn == "" {
		log.Fatalf("PG_CONN environment variable is not set")
	}

	var err error
	pool, err = pgxpool.New(ctx, pgConn)
	if err != nil {
		log.Fatalf("Unable to connect to database: %v", err)
	}
}

func handler(ctx context.Context, event events.SQSEvent) error {
	// Use pgxpool
	serv := service.New(pool)
	...
	return nil
}

func main() {
	lambda.Start(handler)
}

Seems like it happens intermittently. The only way to fix this right now is to restart my supabase postgres instance

@CermakM
Copy link

CermakM commented Jul 3, 2024

@reVrost I am running into the same issue. When I switched from Supabase's session mode (port 5432) to transaction mode (port 6543), I am hitting this error with mostly every query.

EDIT: Restarting supabase instance doesn't help.

@reVrost
Copy link

reVrost commented Jul 6, 2024

@CermakM You're right, restarting only help for a moment but It quickly fails again soon after. Changing to session mode (5432) fixes the prepared statements issue. Although for me this led to "FATAL: Max client connections reached (SQLSTATE XX000)" error as I am using aws lambdas. And it seems that from supabase they recommend to

use transaction mode
session mode limits concurrent connections to the Supavisor pool size
use port 6543 with your pooler connection string
configure your client to disable prepared statements, all clients support this

There might be a way to disable prepared statements using pgx but I haven't thoroughly looked into the config settings. Currently I'm just reducing the number of concurrent lambdas with session mode to alleviate it for now.

@jackc
Copy link
Owner

jackc commented Jul 6, 2024

There might be a way to disable prepared statements using pgx but I haven't thoroughly looked into the config settings.

Any ConnConfig.DefaultQueryExecMode other than the default QueryExecModeCacheStatement.

@reVrost
Copy link

reVrost commented Jul 8, 2024

Just want to give an update connecting to port 6543 (transaction mode) to supabase with QueryExecModeSimpleProtocol seems to work ok.
Although I'm not sure which is preferred 5432 with default exec mode or 6543 with simple protocol. @jackc some recommendation here which one is better would be appreciated :). Thanks!

While using the simple protocol I also ran into an issue when inserting jsonb bytes into the database. Similar to the issue written up here jackc/pgtype#45

Although, I managed to solve it using the :copyfrom insert.

@jackc
Copy link
Owner

jackc commented Jul 8, 2024

@reVrost I've never used Supabase, so I can't really give any advice there. But QueryExecModeCacheDescribe is a good choice if you do not have schema changes occurring while your application is running.

@reVrost
Copy link

reVrost commented Jul 10, 2024

Thanks @jackc! QueryExecModeCacheDescribe works great with supabase's transaction mode.
Heres code for anyone interested:

config, err := pgxpool.ParseConfig(pgConnStr)
if err != nil {
	log.Fatalf("Unable to parse connection string: %v", err)
}
config.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeCacheDescribe

pool, err = pgxpool.NewWithConfig(ctx, config)
if err != nil {
	log.Fatalf("Unable to connect to database: %v", err)
}

rodrigomolter added a commit to rodrigomolter/ama-backend that referenced this issue Aug 11, 2024
This solves problem "statment already exists (SQLSTATE 42P05)" when
using transcation mode in Supabase as mention here
jackc/pgx#1847 (comment)
@urjitbhatia
Copy link

In case someone runs into this with Supabase: use the port 5432 instead of the session mode 6543

@evanlurvey
Copy link

I had this issue using a batched query and the first execution has a schema error and swallowed the error the second attempt of the same query produces this error.

@jackc
Copy link
Owner

jackc commented Sep 13, 2024

@evanlurvey Can you provide example code demonstrating this?

@evanlurvey
Copy link

Sure

Setup DB

docker run --rm -it -p 5432:5432 -e POSTGRES_PASSWORD=password postgres

main.go

package main

import (
	"context"
	"fmt"

	"github.com/jackc/pgx/v5"
)

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://postgres:password@localhost:5432/postgres")
	if err != nil {
		fmt.Println(err)
		return
	}
	_, _ = conn.Exec(ctx, "CREATE TABLE foo (col1 TEXT PRIMARY KEY)")
	err = query(ctx, conn)
	fmt.Println(err)
	err = query(ctx, conn)
	fmt.Println(err)
}

func query(ctx context.Context, conn *pgx.Conn) error {
	b := &pgx.Batch{}
	b.Queue("SELECT col1 FROM foo").Query(func(rows pgx.Rows) error {
		for rows.Next() {
			var col1 string
			rows.Scan(&col1)
		}
		return nil
	})
	b.Queue("SELECT col1 FROM baz").Query(func(rows pgx.Rows) error {
		for rows.Next() {
			var col1 string
			rows.Scan(&col1)
		}
		return nil
	})
	err := conn.SendBatch(ctx, b).Close()
	if err != nil {
		return err
	}
	return nil
}
go mod init whatever
go mod tidy
go run .
ERROR: relation "baz" does not exist (SQLSTATE 42P01)
ERROR: prepared statement "stmtcache_abcef3bff3ee2c6d01eb3016fe43f23a45b61c9480c7ce77" already exists (SQLSTATE 42P05)

When you play with it and change the second query you will notice the sha doesn't change because it is just driven off the first query it seems.
Also if you swap it so the first query bombs you do not get this error it handles it properly.

jackc added a commit that referenced this issue Sep 13, 2024
When a batch successfully prepared some statements, but then failed to
prepare others, the prepared statements that were successfully prepared
were not properly cleaned up. This could lead to a "prepared statement
already exists" error on subsequent attempts to prepare the same
statement.

#1847 (comment)
@jackc
Copy link
Owner

jackc commented Sep 13, 2024

@evanlurvey Thanks for the clear reproduction. Should be fixed in fd0c654.

@jackc jackc closed this as completed Sep 13, 2024
@DavidHoenisch
Copy link

In case someone runs into this with Supabase: use the port 5432 instead of the session mode 6543

That fixed it for me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests