Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tablet splitting: SqlSecondaryIndex sample app errors: Request id -1 is less than min running 0 #7096

Closed
ttyusupov opened this issue Feb 4, 2021 · 2 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug

Comments

@ttyusupov
Copy link
Contributor

./bin/yb-ctl --rf=1 create --num_shards_per_tserver=1 --ysql_num_shards_per_tserver=1 --master_flags '"tablet_split_size_threshold_bytes=300000"' --tserver_flags '"db_write_buffer_size=100000"'

java -jar ~/code/yb-sample-apps/target/yb-sample-apps.jar --workload SqlSecondaryIndex --nodes 127.0.0.1:5433 --num_threads_read 1 --num_threads_write 1 --num_unique_keys 10000000 --nouuid

Got into a state procuding errors infinitely:

1358164 [Thread-2] INFO com.yugabyte.sample.apps.SqlSecondaryIndex  - Failed writing key: key:5997
org.postgresql.util.PSQLException: ERROR: Operation failed. Try again.: [Operation failed. Try again. (yb/client/tablet_rpc.cc:417): Request id -1 is less than min running 0: Min running request ID: 0]
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:130)
	at com.yugabyte.sample.apps.SqlSecondaryIndex.doWrite(SqlSecondaryIndex.java:174)
	at com.yugabyte.sample.apps.AppBase.performWrite(AppBase.java:741)
	at com.yugabyte.sample.common.IOPSThread.run(IOPSThread.java:87)
1358446 [Thread-2] INFO com.yugabyte.sample.apps.SqlSecondaryIndex  - Failed writing key: key:5998
org.postgresql.util.PSQLException: ERROR: Operation failed. Try again.: [Operation failed. Try again. (yb/client/tablet_rpc.cc:417): Request id -1 is less than min running 0: Min running request ID: 0]
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:130)
	at com.yugabyte.sample.apps.SqlSecondaryIndex.doWrite(SqlSecondaryIndex.java:174)
	at com.yugabyte.sample.apps.AppBase.performWrite(AppBase.java:741)
	at com.yugabyte.sample.common.IOPSThread.run(IOPSThread.java:87)
1358722 [Thread-2] INFO com.yugabyte.sample.apps.SqlSecondaryIndex  - Failed writing key: key:5999
org.postgresql.util.PSQLException: ERROR: Operation failed. Try again.: [Operation failed. Try again. (yb/client/tablet_rpc.cc:417): Request id -1 is less than min running 0: Min running request ID: 0]
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:130)
	at com.yugabyte.sample.apps.SqlSecondaryIndex.doWrite(SqlSecondaryIndex.java:174)
	at com.yugabyte.sample.apps.AppBase.performWrite(AppBase.java:741)
	at com.yugabyte.sample.common.IOPSThread.run(IOPSThread.java:87)
....

While some amount of these errors is expected until #5854 is implemented, these errors should disappear after retries on client side, but this is not happening.

@ttyusupov ttyusupov added the area/docdb YugabyteDB core features label Feb 4, 2021
@ttyusupov ttyusupov self-assigned this Feb 4, 2021
@ttyusupov ttyusupov added the kind/bug This issue is a bug label Feb 4, 2021
@ttyusupov
Copy link
Contributor Author

I've found that postgres process is restarted each time it is getting an error executing a statement during that workload. This is caused by the fact that SqlSecondaryIndex sample app establishing a new connection on every error.

So, after every error newly started postgres process has a fresh embedded YBClient MetaCache instance.

We have a duplicate requests tracking mechanism on tserver, every Write request to TServer has a client ID + request ID. And tserver tracks mapping from client ID to request IDs already executed to return an error in case of getting duplicate request from the same client. When we split the tablet - this mapping is also copied into child tablets on tserver side. So, ideally, the client needs to continue from parent tablet request ID when sending requests to child tablets. But it can happen that client was “away” during subsequent splits and then it tries to send RPC to some tablet for which it didn’t know the parent. So, it can’t determine from which request ID to start for this child tablet.

There is another mechanism implemented to deal with such scenarios: f9cae11.
When MetaCache doesn’t know about the parent tablet - we use a special value as request_id_seq for post-split tablet, then on getting “request id is less than min” error - use min request ID from the server plus 2^24 (there wouldn’t be 2^24 client requests in progress from the same client to the same tablet, so it is safe to do this).
This supposed to work in a way that the same client gets an error and updates request_id_seq to use for this tablet and then retries the request. But since postgres process with embedded YBClient MetaCache is restarted - this happens again and again: fresh postgres process doesn’t know which request ID to use - sends special value, got an error and got restarted by the client app.

@ttyusupov
Copy link
Contributor Author

#5854 should fix this issue because retries for requests described above will also be handled transparently for the client.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug
Projects
None yet
Development

No branches or pull requests

1 participant