Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expression,table: fix insert partitioned table bug when the time zone change #14370

Merged
merged 6 commits into from
Jan 14, 2020

Conversation

tiancaiamao
Copy link
Contributor

What problem does this PR solve?

mysql> set @@time_zone = 'UTC';                                                                                                                                                                       Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `sbtest2` (
    ->        `id` int(11) NOT NULL,
    ->        `k` int(11) NOT NULL DEFAULT '0',
    ->        `c` char(120) NOT NULL DEFAULT '',
    ->        `pad` char(60) NOT NULL DEFAULT '',
    ->        `creation_dt` timestamp DEFAULT CURRENT_TIMESTAMP )  PARTITION BY RANGE ( unix_timestamp(`creation_dt`) )
    -> ( PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-01-03 15:10:00') ),
    ->   PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-01-03 15:15:00') ),
    ->   PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-01-03 15:20:00') ),
    ->   PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-01-03 15:25:00') ),
    ->   PARTITION p9 VALUES LESS THAN (MAXVALUE) )
    -> ;
Query OK, 0 rows affected (0.03 sec)

mysql> set @@time_zone = 'Asia/Shanghai';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into sbtest2 values(1,1,'123','123','2020-01-03 15:16:59');                                                                                                                             Query OK, 1 row affected (0.00 sec)

mysql> select * from sbtest2 partition (p5);
+----+---+-----+-----+---------------------+
| id | k | c   | pad | creation_dt         |
+----+---+-----+-----+---------------------+
|  1 | 1 | 123 | 123 | 2020-01-03 15:16:59 |
+----+---+-----+-----+---------------------+
1 row in set (0.01 sec)

Create a partitioned table at timezone=X, then change timezone=Y, and insert data, the data doesn't locate in the correct partition.

What is changed and how it works?

The root cause of this bug is that expression should not cache the session context (which contains the timezone information).

When a partitioned table is loaded into memory, the partition expression is computed with the session context at that time.
If we change the session context (timezone) later and still use the partition expression, things would go wrong because expression uses the old session context.

Here for example:

insert into sbtest2 values(1,1,'123','123','2020-01-03 15:16:59');

When we're deciding which partition this data should belong to, we calculate the partition expression to get the answer.
The current timezone has already changed, but the partition expression still uses the old timezone.
So the result is wrong and we insert the data to the wrong partition.


Fix the expression is not a easy work. There are too much code involved.
This commit solely fixes the current table partition bug I meet.
We need to refactor a lot of codes to totally solve the problem:

  1. define a builtinFuncNew interface
  2. implement all evalXXXWithCtx for all builtinFuncSig
  3. replace the builtinFunc with builtinFuncNew
  4. remove ctx from the baseBuiltinFunc struct

Check List

Tests

  • Unit test

Code changes

  • Has interface methods change

Side effects

  • Possible performance regression

Related changes

  • Need to cherry-pick to the release branch

Release note

  • Write release note for bug-fix or new feature.

@tiancaiamao tiancaiamao requested a review from a team as a code owner January 7, 2020 07:51
@ghost ghost requested review from qw4990 and XuHuaiyu and removed request for a team January 7, 2020 07:51
@tiancaiamao
Copy link
Contributor Author

PTAL @XuHuaiyu @bb7133

@bb7133
Copy link
Member

bb7133 commented Jan 7, 2020

Hi @tiancaiamao , please update the release note in the PR Description.

@@ -291,6 +291,9 @@ func (sf *ScalarFunction) Eval(row chunk.Row) (d types.Datum, err error) {

// EvalInt implements Expression interface.
func (sf *ScalarFunction) EvalInt(ctx sessionctx.Context, row chunk.Row) (int64, bool, error) {
if f, ok := sf.Function.(builtinFuncNew); ok {
return f.evalIntWithCtx(ctx, row)
}
return sf.Function.evalInt(row)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For using the context in current session. I think of another way:
How about add a new function setCtx for struct builtinFunc,
and change here as:

Suggested change
return sf.Function.evalInt(row)
sf.Function.setCtx(ctx)
return sf.Function.evalInt(row)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setCtx way is ugly IMHO. @Deardrops
The real problem is, the session context is not something belongs to a function.
As long as function store the ctx, the function is invalid when the session context changes.
So the right way to fix the problem is removing ctx from the struct, instead of providing setCtx as a workaround.

go.sum Outdated Show resolved Hide resolved
table/tables/partition_test.go Outdated Show resolved Hide resolved
table/tables/partition_test.go Show resolved Hide resolved
tiancaiamao and others added 2 commits January 7, 2020 23:38
Co-Authored-By: bb7133 <bb7133@gmail.com>
@Deardrops
Copy link
Contributor

In MySQL 8.0, It has same behavior with current TiDB version, does this really a bug?

mysql> set @@time_zone='+00:00';
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1(dt timestamp) partition by range ( unix_timestamp(`dt`)) (
    partition p5 values less than (unix_timestamp('2020-01-03 15:10:00')), 
    partition p6 values less than (maxvalue)
);
Query OK, 0 rows affected (0.05 sec)

mysql> set @@time_zone='+08:00';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values('2020-01-03 15:20:00');
Query OK, 1 row affected (0.01 sec)

mysql> select * from t1 partition (p5);
+---------------------+
| dt                  |
+---------------------+
| 2020-01-03 15:20:00 |
+---------------------+
1 row in set (0.04 sec)

mysql> select * from t1 partition (p6);
Empty set (0.00 sec)

@tiancaiamao
Copy link
Contributor Author

Try this one and verify the result in TiDB and MySQL:

set @@time_zone='+08:00';
create table t1(dt timestamp) partition by range ( unix_timestamp(`dt`)) (
    partition p5 values less than (unix_timestamp('2020-01-03 15:10:00')), 
    partition p6 values less than (unix_timestamp('2020-01-03 15:20:00')), 
    partition p7 values less than (maxvalue)
);
set @@time_zone='+00:00';
insert into t1 values ('2020-01-03 15:15:00');
select * from t1 partition (p7);

@Deardrops

@Deardrops
Copy link
Contributor

Deardrops commented Jan 8, 2020

TiDB: (master branch)

mysql> set @@time_zone='+08:00';
Query OK, 0 rows affected (0.04 sec)

mysql> create table t1(dt timestamp) partition by range ( unix_timestamp(`dt`)) (
    ->     partition p5 values less than (unix_timestamp('2020-01-03 15:10:00')), 
    ->     partition p6 values less than (unix_timestamp('2020-01-03 15:20:00')), 
    ->     partition p7 values less than (maxvalue)
    -> );
Query OK, 0 rows affected (0.06 sec)

mysql> set @@time_zone='+00:00';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values ('2020-01-03 15:15:00');
Query OK, 1 row affected (0.05 sec)

mysql> select * from t1 partition (p7);
Empty set (0.00 sec)

mysql> select * from t1 partition (p6);
+---------------------+
| dt                  |
+---------------------+
| 2020-01-03 15:15:00 |
+---------------------+
1 row in set (0.00 sec)

MySQL 8.0

mysql> set @@time_zone='+08:00';
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1(dt timestamp) partition by range ( unix_timestamp(`dt`)) (
    ->     partition p5 values less than (unix_timestamp('2020-01-03 15:10:00')), 
    ->     partition p6 values less than (unix_timestamp('2020-01-03 15:20:00')), 
    ->     partition p7 values less than (maxvalue)
    -> );
Query OK, 0 rows affected (0.06 sec)

mysql> set @@time_zone='+00:00';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values ('2020-01-03 15:15:00');
Query OK, 1 row affected (0.01 sec)

mysql> select * from t1 partition (p7);
+---------------------+
| dt                  |
+---------------------+
| 2020-01-03 15:15:00 |
+---------------------+
1 row in set (0.00 sec)

@tiancaiamao
Copy link
Contributor Author

PTAL @bb7133 @XuHuaiyu @Deardrops

1 similar comment
@tiancaiamao
Copy link
Contributor Author

PTAL @bb7133 @XuHuaiyu @Deardrops

Copy link
Member

@bb7133 bb7133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bb7133 bb7133 added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 10, 2020
@bb7133
Copy link
Member

bb7133 commented Jan 11, 2020

PTAL @Deardrops @XuHuaiyu

Copy link
Contributor

@Deardrops Deardrops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Deardrops Deardrops added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jan 14, 2020
@tiancaiamao
Copy link
Contributor Author

/run-all-tests

@jackysp
Copy link
Member

jackysp commented Jan 14, 2020

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 14, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Jan 14, 2020

/run-all-tests

@sre-bot sre-bot merged commit c36f83e into pingcap:master Jan 14, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Jan 14, 2020

cherry pick to release-3.0 failed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/expression status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants