
Query in transaction may return rows with same unique index column value #24195

Open
vivid392845427 opened this issue Apr 21, 2021 · 14 comments · May be fixed by #27630

Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 affects-6.5 affects-6.6 affects-7.0 affects-7.1 help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. severity/moderate sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

vivid392845427 commented Apr 21, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

drop table if exists t;
create table t (a int, b int, primary key (a, b) /*T![clustered_index] nonclustered */);
insert into t values (1, 10);
begin optimistic;
insert into t values (1, 10);
select * from t;
commit;

2. What did you expect to see? (Required)

Only one row is returned by the select statement.

3. What did you see instead (Required)

mysql> /* t */ select * from t; -- (1, 10), (1, 10)
+---+----+
| a | b  |
+---+----+
| 1 | 10 |
| 1 | 10 |
+---+----+
2 rows in set (0.00 sec)

mysql> /* t */ commit;
ERROR 1062 (23000): Duplicate entry '1-10' for key 'PRIMARY'

4. What is your TiDB version? (Required)

Release Version: v5.0.0-43-g41871e0c8
Edition: Community
Git Commit Hash: 41871e0
Git Branch: release-5.0
UTC Build Time: 2021-04-20 09:24:52
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

vivid392845427 added the type/bug label on Apr 21, 2021
cfzjywxk (Contributor)

The difference in behaviour between int handles and non-int handles is caused by the unionScan operator, which merges or excludes rows based on the handle value. When a non-int primary key is used, the generated handle _tidb_rowid may differ between the two rows, so we see both rows in the optimistic transaction.
This behaviour exists in v4.0, v5.0, and master; the expected behaviour needs to be defined first.
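One way to observe this from the SQL layer is to read the hidden _tidb_rowid column while the transaction from the repro above is still open (a sketch; the exact handle values are illustrative):

begin optimistic;
insert into t values (1, 10);
-- _tidb_rowid is the hidden handle column of a nonclustered table:
select a, b, _tidb_rowid from t;
-- Expected: both (1, 10) rows appear, each with a different _tidb_rowid
-- (e.g. 1 and 2), which is why the unionScan merge cannot deduplicate them.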

vivid392845427 (Author) commented Apr 23, 2021

Scene 2: Pessimistic transaction

/* s1 */ drop table if exists t;
/* s1 */ create table t (c1 varchar(10), c2 int, c3 char(20), primary key (c1, c2));
/* s1 */ insert into t values ('tag', 10, 't'), ('cat', 20, 'c');
/* s2 */ begin;
/* s2 */ update t set c1=reverse(c1) where c1='tag';
/* s4 */ begin;
/* s4 */ insert into t values('dress',40,'d'),('tag', 10, 't');
/* s2 */ commit;
/* s4 */ select * from t use index(primary) order by c1,c2;

When the primary key is a clustered index, the query returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| tag   | 10 | t    |
+-------+----+------+
3 rows in set (0.00 sec)

When the primary key is a nonclustered index, the query returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| tag   | 10 | t    |
| tag   | 10 | t    |
+-------+----+------+
4 rows in set (0.00 sec)

MySQL returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| gat   | 10 | t    |
| tag   | 10 | t    |
+-------+----+------+
4 rows in set (0.00 sec)

cfzjywxk (Contributor) commented Apr 23, 2021

As described above, the difference in behaviour between int handles and non-int handles may confuse users: two rows with the same unique key values can be returned by a snapshot read inside a transaction.
Both optimistic and pessimistic transactions may encounter this phenomenon.

cfzjywxk changed the title from "Optimistic transactions delay checking, the newly inserted record is unable to merge" to "Query in transaction may return rows with same unique index column value" on Apr 25, 2021
cfzjywxk added the severity/major and help wanted labels and removed the status/discussion label on Apr 29, 2021
cfzjywxk (Contributor)

This issue needs to be fixed. The way to do so is to change the unionScan operator, which merges the in-memory contents with the snapshot results from KV storage: the unique index should be taken into account when doing the merge, as sketched below.
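In terms of the repro at the top of this issue, the intended behaviour after such a unique-key-aware merge would be (a sketch of expected output, not a verified run):

begin optimistic;
insert into t values (1, 10); -- duplicates the committed (1, 10)
select * from t;
-- Expected after the fix: one (1, 10) row, because the merge would
-- deduplicate on the unique key (a, b) instead of on the _tidb_rowid handle.
commit; -- still expected to fail with ERROR 1062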

vivid392845427 (Author) commented Apr 29, 2021

Scene 3: after inserting some records into the table, union scan cannot merge when an update on the clustered index sets the value of the first record higher than that of the second record:

/* s1 */ drop table if exists test1;
/* s1 */ create table test1(a int primary key clustered, b int);
/* s1 */ insert into test1 values(1,1),(5,5),(10,10);
/* s1 */ begin;
/* s1 */ update test1 set a=8 where a=1;
/* s2 */ begin;
/* s2 */ update test1 set a=a+1;
/* s1 */ commit;
/* s2 */ select * from test1;
Actual result:
mysql> /* s2 */ select * from test1;
+----+------+
| a  | b    |
+----+------+
|  1 |    1 |
|  6 |    5 |
|  9 |    1 |
| 11 |   10 |
+----+------+
4 rows in set (0.00 sec)

Expected result:
mysql> /* s2 */ select * from test1;
+----+------+
| a  | b    |
+----+------+
|  6 |    5 |
|  9 |    1 |
| 11 |   10 |
+----+------+
3 rows in set (0.01 sec)

cfzjywxk (Contributor) commented Apr 29, 2021

@vivid392845427
This case is a bit different, I think. Pessimistic-mode transactions in TiDB use current read for update, so the rows actually updated in s2 are 5, 8, and 10; the value 1 is not touched. After the update test1 set a=a+1; statement in s2, the new rows are written into the memory buffer, so the snapshot read of select * from test1 in s2 sees the value 1 as well as the newly inserted rows 6, 9, and 11, giving the final result 1, 6, 9, 11.
I think this is expected, as snapshot read in pessimistic mode differs from that in optimistic mode.

But this case is still incompatible between clustered and non-clustered indexes: in the non-clustered case the handle value (_tidb_rowid) does not change.

So the incompatible behaviour between clustered and non-clustered index tables remains.
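A locking read makes the current-read view in s2 visible directly (a sketch, assuming the usual select ... for update current-read semantics):

/* s2 */ select * from test1 for update;
-- Expected: (6, 5), (9, 1), (11, 10) only. The locking read performs a
-- current read, so it does not see the stale snapshot row (1, 1) that the
-- plain select above still returns.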

you06 (Contributor) commented Aug 24, 2021

There is one thing that confuses me: optimistic transactions do not perform the unique key check for insert statements before commit. This behavior is consistent across all versions of TiDB. However, it creates the problem of merging membuffer data and snapshot data together, and which data to choose is hard to answer.

tidb/executor/union_scan.go

Lines 223 to 228 in 5516d7d

checkKey := tablecodec.EncodeRecordKey(us.table.RecordPrefix(), snapshotHandle)
if _, err := us.memBufSnap.Get(context.TODO(), checkKey); err == nil {
	// If src handle appears in added rows, it means there is conflict and the transaction will fail to
	// commit, but for simplicity, we don't handle it here.
	continue
}

From the comment, we can see the developer's reasoning: the conflicting transaction will fail at commit anyway, so for simplicity the read issue is not handled (anomalies in aborted transactions can be dismissed in some theories).

IMO, there are 2 ways to solve this issue.

    1. Handle the read issue for transactions that will not be able to commit, by checking the constraint while executing write statements. This can be done by checking whether any of the unique indexes is already in the membuffer. However, some behavior may still not make sense:
/* s1 */ drop table if exists t;
/* s1 */ create table t (a int, b int, primary key (a));
/* s1 */ insert into t values (1, 10);
/* s1 */ begin optimistic;
/* s1 */ insert into t values (1, 11); -- succeeds in an optimistic transaction
/* s1 */ select * from t; -- What result set do we expect? If we take membuffer data first, we'll get (1, 11)
/* s1 */ commit; -- fails in an optimistic transaction
    2. Check the constraints for optimistic transactions as well. This makes the issue clearer, and we would not face the problem of which data to choose.

I prefer the second solution, @cfzjywxk, what's your idea?
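For reference, a behaviour close to the second solution already exists behind a session variable (assuming tidb_constraint_check_in_place covers this case; worth double-checking):

set @@tidb_constraint_check_in_place = 1;
begin optimistic;
insert into t values (1, 11); -- expected to fail with ERROR 1062 right away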

cfzjywxk (Contributor)

> There is one thing that confuses me: optimistic transactions do not perform the unique key check for insert statements before commit. […] I prefer the second solution, @cfzjywxk, what's your idea?

@you06
To keep compatibility, I think it's more reasonable to handle this conflict in the union scan operator and return the memory buffer content when merging; it's weird to see identical rows when reading a table with unique index keys.
For this solution we need to verify whether this priority strategy introduces other weird behaviours or compatibility issues; if not, I think it's an improvement for the optimistic transaction.
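A sketch of what the membuffer-first strategy would mean for you06's example above (intended behaviour, not verified output):

begin optimistic;
insert into t values (1, 11); -- the snapshot already holds (1, 10)
select * from t;
-- Expected under membuffer priority: a single (1, 11) row for a = 1, the
-- in-transaction write shadowing the conflicting snapshot row.
commit; -- still expected to fail with ERROR 1062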

you06 (Contributor) commented Aug 24, 2021

If I understand correctly, we may fix it with the first solution, handling it in union scan. The second solution may be a long-term improvement, for which we should consider whether users would suffer unexpected failures introduced by the change. @cfzjywxk is that what you mean?

cfzjywxk (Contributor)

> If I understand correctly, we may fix it with the first solution, handling it in union scan. […] @cfzjywxk is that what you mean?

@you06
Yes, we could first fix some of the unexpected or unreasonable behaviours without breaking any compatibility for now.

you06 (Contributor) commented Aug 27, 2021

> Scene 2: Pessimistic transaction

MySQL's weird behavior should be blamed on its lazy snapshot strategy: if you select right after s4's begin statement (which is what takes the snapshot for MySQL's transaction), you get the following result, which is also what I would expect.

/* s1 */ drop table if exists t;
/* s1 */ create table t (c1 varchar(10), c2 int, c3 char(20), primary key (c1, c2));
/* s1 */ insert into t values ('tag', 10, 't'), ('cat', 20, 'c');

/* s2 */ begin;
/* s2 */ update t set c1=reverse(c1) where c1='tag';

/* s4 */ begin;
/* s4 */ select * from t use index(primary) order by c1,c2;
/* s4 */ insert into t values('dress',40,'d'),('tag', 10, 't');

/* s2 */ commit;
/* s4 */ select * from t use index(primary) order by c1,c2;
-- s4 >> +-------+----+----+
-- s4    |  C1   | C2 | C3 |
-- s4    +-------+----+----+
-- s4    | cat   | 20 | c  |
-- s4    | dress | 40 | d  |
-- s4    | tag   | 10 | t  |
-- s4    +-------+----+----+
/* s4 */ commit;

/* s2 */ select * from t use index(primary) order by c1,c2;
-- s2 >> +-------+----+----+
-- s2    |  C1   | C2 | C3 |
-- s2    +-------+----+----+
-- s2    | cat   | 20 | c  |
-- s2    | dress | 40 | d  |
-- s2    | gat   | 10 | t  |
-- s2    | tag   | 10 | t  |
-- s2    +-------+----+----+

cfzjywxk (Contributor) commented Nov 22, 2021

@you06 @vivid392845427 @zyguan
As there's a compatibility risk with these related issues and they are not easy to fix, how do you think we should proceed?

zyguan (Contributor) commented Nov 23, 2021

> @you06 @vivid392845427 @zyguan As there's a compatibility risk with these related issues and they are not easy to fix, how do you think we should proceed?

Maybe we can document it as a known issue for now. IMO, it's not really conventional usage: since an optimistic transaction is not guaranteed to commit, one should handle intermediate results within the transaction carefully.

you06 (Contributor) commented Nov 24, 2021

> @you06 @vivid392845427 @zyguan As there's a compatibility risk with these related issues and they are not easy to fix, how do you think we should proceed?

This is similar to the delete-your-writes optimization; since #28806 is blocked by some complex problems, I think it's better to put this issue on hold for now.

> IMO, it's not really conventional usage

I agree with this point; maybe we can downgrade the severity of this issue.
