-
Notifications
You must be signed in to change notification settings - Fork 66
optimize memory in hundreds-of tables scenario #52
Conversation
@GregoryIan @csuzhangxc PTAL |
return nextRowID, affectedRows, checksum, errors.Trace(err) | ||
} | ||
|
||
//////////////////////////////////////////////////////////////// | ||
|
||
type kvEncoderPool struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no more use the kvencoder pool. useless
lightning/restore/restore.go
Outdated
dbInfo.Name, tableInfo.Name, tableInfo.ID, | ||
tableInfo.Columns, cfg.TiDB.SQLMode) | ||
if err != nil { | ||
common.AppLogger.Errorf("failed to new kv encoder (%s) : %s", dbInfo.Name, err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors.ErrorStack(err)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed that line since we have trace the error.
lightning/restore/restore.go
Outdated
@@ -484,118 +504,19 @@ func (t *regionRestoreTask) Run(ctx context.Context) error { | |||
return errors.Trace(err) | |||
} | |||
t.status = statFinished | |||
|
|||
return nil | |||
err = t.encoder.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we need to do t.encoder.Close()
in defer
?
or maybe you can make encoder
's life cycle minimized
|
||
tr := &TableRestore{ | ||
ctx: ctx, | ||
cfg: cfg, | ||
dbInfo: dbInfo, | ||
tableInfo: tableInfo, | ||
tableMeta: tableMeta, | ||
encoders: encoders, | ||
encoder: encoder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems no need to save it? because we do not use it anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to save it because we need to close the encoder in the last when all the regions(of the table) finished.
Why we can't close the encoder early? The reason is that it may close the global dom.
ref: https://github.com/pingcap/tidb/blob/master/util/kvencoder/kv_encoder.go#L118
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😳 fine
lightning/restore/restore.go
Outdated
@@ -487,6 +486,11 @@ func newRegionRestoreTask( | |||
} | |||
|
|||
func (t *regionRestoreTask) Run(ctx context.Context) error { | |||
defer func() { | |||
if err := t.encoder.Close(); err != nil { | |||
common.AppLogger.Errorf("close encoder err %v", errors.ErrorStack(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to modify the returned error here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
lightning/restore/restore.go
Outdated
func (t *regionRestoreTask) Run(ctx context.Context) error { | ||
func (t *regionRestoreTask) Run(ctx context.Context) (err error) { | ||
defer func() { | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return it, and output error of encoder.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GregoryIan PTAL
Rest LGTM |
I actually want to know if there are any other memory points? |
@GregoryIan There is another memory point. Tikv-importer will allocate 1 GB memory for each open tables(1 GB is configurable). If there are many opened tables in the same time which will allocate so much memory and even causes tikv-importer OOM killed. For one hand, tidb-lightning should control the maximum opened engines(one engine is corresponding to an opened table in tikv-importer), on the other hand, tikv-importer should control it by itself but it doesn't. |
LGTM |
I known it. Has the program been specifically tested for memory usage? |
LGTM |
@GregoryIan I only tested it with multiple tables(512 tables), and I am going to test with large table. |
We'll talk more about it later |
It's mainly caused by tidb kvencoder and of course lightning use it incorrectly.
Reference: pingcap/tidb#7094
Test data:
512 tables,each table has 10000 rows. tested on my laptop.
After optimized, the heap memory costs at average 200-400 MB
data:image/s3,"s3://crabby-images/d99f5/d99f559af394741616a41a16c365fa08cbd9180b" alt="image"
About the modification:
wait #51 merged