Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The gRPC log is output to the wrong file #38941

Closed
tangenta opened this issue Nov 7, 2022 · 2 comments · Fixed by #39176
Closed

The gRPC log is output to the wrong file #38941

tangenta opened this issue Nov 7, 2022 · 2 comments · Fixed by #39176
Assignees
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.

Comments

@tangenta
Copy link
Contributor

tangenta commented Nov 7, 2022

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Start TiDB with the following config:
[log.file]
max-size = 1
  1. Enable TopSQL or set GRPC_DEBUG environment variable to let gRPC keep printing
[ERROR] [grpclogger.go:116] ["[transport]transport: Got too many pings from the client, closing the connection."] [system=grpc] [grpc_log=true]"
[ERROR] [grpclogger.go:116] ["[transport]transport: loopyWriter.run returning. Err: transport: Connection closing"] [system=grpc] [grpc_log=true]
  1. Add workload to cluster to populate the log file.
  2. When the log file is rotated, gRPC error message should be output to the origin log file.

2. What did you expect to see? (Required)

New error messages from gRPC should be output to "tidb.log".

3. What did you see instead (Required)

New error messages from gRPC are output to the rotated log file.

From tidb-2022-11-07T16-14-51.727.log:

[2022/11/07 16:15:35.553 +08:00] [ERROR] [grpclogger.go:116] ["[transport]transport: Got too many pings from the client, closing the connection."] [system=grpc] [grpc_log=true]
[2022/11/07 16:15:35.554 +08:00] [ERROR] [grpclogger.go:116] ["[transport]transport: loopyWriter.run returning. Err: transport: Connection closing"] [system=grpc] [grpc_log=true]
[2022/11/07 16:16:45.557 +08:00] [ERROR] [grpclogger.go:116] ["[transport]transport: Got too many pings from the client, closing the connection."] [system=grpc] [grpc_log=true]
[2022/11/07 16:16:45.558 +08:00] [ERROR] [grpclogger.go:116] ["[transport]transport: loopyWriter.run returning. Err: transport: Connection closing"] [system=grpc] [grpc_log=true]

4. What is your TiDB version? (Required)

30a76ce

@tangenta tangenta added the type/bug The issue is confirmed as a bug. label Nov 7, 2022
@tangenta tangenta changed the title The GRPC log is output to the wrong file The gRPC log is output to the wrong file Nov 7, 2022
@tangenta
Copy link
Contributor Author

tangenta commented Nov 7, 2022

There are two logger instances, one for TiDB and the other for gRPC:

gl, props, err := log.InitLogger(&cfg.Config, zap.AddStacktrace(zapcore.FatalLevel))

l, prop, err = log.InitLogger(&config, zap.AddStacktrace(zapcore.FatalLevel))

Each of the logger maintains its own size, checks the size in Write() function and rotates the file if necessary.

The root cause: when a logger's file descriptor changes, there is no way to inform the other.

There are possible ways to fix this issue:

  • Unify the loggers of TiDB and gRPC. This means we cannot set a different level for gRPC logger. GRPC_DEBUG will not take effect anymore.
  • Discard gPRC log. We may not be able to diagnose the gRPC problem by log.
  • Print the gPRC log to stderr. The logger is more fragmented.

@breezewish @SabaPing @crazycs520 WDYT?

@tangenta tangenta added affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 labels Nov 7, 2022
@ti-chi-bot ti-chi-bot added the may-affects-4.0 This bug maybe affects 4.0.x versions. label Nov 7, 2022
@xhebox
Copy link
Contributor

xhebox commented Nov 16, 2022

/assign xhebox

I am interested in refactoring the mess of log and cli flags, #38857 is also a related issue. I'll post a simple workaround for this first.

This was referenced Nov 16, 2022
@xhebox xhebox added affects-4.0 This bug affects 4.0.x versions. and removed may-affects-4.0 This bug maybe affects 4.0.x versions. labels Nov 16, 2022
ti-chi-bot pushed a commit that referenced this issue Nov 16, 2022
ti-chi-bot added a commit that referenced this issue Nov 16, 2022
ti-chi-bot added a commit that referenced this issue Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants