Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change some logging levels #18421

Merged
merged 4 commits into from
Jan 29, 2022

Conversation

zeripath
Copy link
Contributor

  • PlainTextWithBytes - 4xx/5xx this should just be TRACE
  • notFoundInternal - the "error" here is too noisy and should be DEBUG
  • WorkerPool - Worker pool scaling messages are normal and should be DEBUG

Signed-off-by: Andrew Thornton art27@cantab.net

* PlainTextWithBytes - 4xx/5xx this should just be TRACE
* notFoundInternal - the "error" here is too noisy and should be DEBUG
* WorkerPool - Worker pool scaling messages are normal and should be DEBUG

Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath zeripath added this to the 1.17.0 milestone Jan 27, 2022
@wxiaoguang
Copy link
Contributor

I really do not like the skip method. It's ugly, unstable and difficult to tune.

We could just output the full stack if it is necessary, then developer can get the information they need, just like most all modern languages. If not necessary we could just leave it as it is as do not touch the skip.

Or we could use a internal function to avoid exposing the skip:

plainTextBytesInternal (skip=2)
PlainTextBytes(no skip)
PlainText(no skip)

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jan 27, 2022
@zeripath zeripath closed this Jan 27, 2022
@zeripath zeripath reopened this Jan 27, 2022
@zeripath
Copy link
Contributor Author

skip is precisely the right thing to do here. This is a framework method.

PlainTextBytes, PlainText and notFoundInternal are not useful or usable places to say that an error came from - they're never the cause of the problem and referring to them in the logs is not useful. Using skip here is the same as the t.Helper() function in go tests. These are pure framework methods and using the message already you can uniquely identify the line of code that has emitted the error if you need. ​

Stacktraces are almost always unhelpful in Go as you can see if you turn on Stacktrace logging on - therefore we have to work around the problem and that means we have to decide what is best to log from. The judicious use of skip within framework methods is the right way around that.

Skip is the only way you can you as the framework writer choose what is the most important place to say where the error has been emitted from. I would agree that we need to be cautious about using skip but this is the right thing to do here.

Without skip there is NO point in this logging and the only benefit of a stack trace would be to search up the stack to find precisely the line that the skip here is providing.

@wxiaoguang
Copy link
Contributor

skip is the framework method, but it shouldn't be exposed by PlainText(...) to end users.

Stacktraces are almost always unhelpful in Go, I can not agree. Do you still remember the secret key related 2FA base64 decode error in discord? The missing of stacktrace misled everyone, if there was a stacktrace, everyone can know that the error occurs during the AES decryption. Gitea code is very old and lacks of correct error handling.

@zeripath
Copy link
Contributor Author

zeripath commented Jan 27, 2022

skip is the framework method, but it shouldn't be exposed by PlainText(...) to end users.

then your problem here is that the skip is being exposed as an optional arg?

Stacktraces are almost always unhelpful in Go, I can not agree. Do you still remember the secret key related 2FA base64 decode error in discord? The missing of stacktrace misled everyone, if there was a stacktrace, everyone can know that the error occurs during the AES decryption. Gitea code is very old and lacks of correct error handling.

The problem is code like:

if err != nil {
   return err
}

Adding a stacktrace when you eventually log the error up to 15 functions up the chain is NOT helpful - the context for that error is lost and often the context for that error isn't actually going to be in the stacktrace anyway.

I don't know how many times I have to write the same or similar examples but let's add another one:

func retErr(i int) error {
	errChan := make(chan error)
	go func() {
		err := fmt.Errorf("retErr error")
		if i > 0 {
			err = retErr(i - 1)
		}
		errChan <- err
		close(errChan)
	}()

	return <-errChan
}

func main() {
	fmt.Println(retErr(15))
}

Now what should be the "stacktrace" of the error be here? Even the stack when error arises is going to be retErr.func().1 (+ some go funcs) and that's it. This isn't particularly elucidating over simply grepping the code for "retErr error".


To get a useful stacktrace for an error would mean changing every place that returns an error to something like:

if err != nil {
  return errors.WithStackTrace(err) 
}

(Although see above how channels and go routines completely mess that)

Where WithStackTrace does some magic to add that context - BUT - it's still not going to provide the correct stack trace for the error as the only line it knows about is the return line - not the place that actually created the error. We would then have to change every single piece of code that generates an error to do something similar - but we're screwed because every external piece of code won't have that context.

However, it could be argued - (and is argued by the golang devs) that this is equivalent to :

if err != nil {
   log.Error("This code caused a problem: %v", err) // if considered helpful
   return fmt.Errorf("this error came from this bit and wraps: %w", err)
}

We need to add context to errors when we see them and we need to decide what is the correct context for that error. Whilst a stacktrace of where the error was seen helps give that context, one of the really appalling things about stacktraces is that most of the noise they produce is just not relevant.

More targeted information being added to errors can be much more useful.

The Go developers argue that the additional context provided by stacktraces is simply not worth the performance cost and noise cost of providing this. Therefore they are not provided by the language.

That is simply a feature of the language - we can disagree with that as much as we like and we could add back something like errors.WithStackTrace(...) and it could be helpful - but to be honest so would simply adding context to the errors.

Your problem with base64 was because the error didn't return or add any context.


However, this is frankly irrelevant to this PR.

This PR simply looks at the framework and determines that the correct context for this error is not the place where the error arose but actually 1 or 2 functions back - it adds context to help determine where exactly the error arose but it tells the reader of the log file where it thinks is the most important thing. That is correct error handling.

A stacktrace can be correct error handling and they can certainly be helpful in debugging, using the stacktrace here? Well that's just noise. It's not any more helpful at all in determining where this error came from as compared to skip and a bit of context in the error.

@wxiaoguang
Copy link
Contributor

then your problem here is that the skip is being exposed as an optional arg?

Yep, it's not ideal to expose it. For internal usage (limited to the package), since it just works.

Or we could use a internal function to avoid exposing the skip:
plainTextBytesInternal (skip=2)
PlainTextBytes(no skip)
PlainText(no skip)

Just like what notFoundInternal does

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Jan 28, 2022

PR: zeripath#12

@wxiaoguang wxiaoguang force-pushed the PlainBytes-404-is-not-an-error branch from 68389bd to d57bc8b Compare January 28, 2022 03:28
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Jan 28, 2022

Sorry, made a mistake that push the forked PR directly into this main repo. Let me fix it ....


fixed

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Jan 29, 2022
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Jan 29, 2022
@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (main@e19b965). Click here to learn what that means.
The diff coverage is 71.42%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main   #18421   +/-   ##
=======================================
  Coverage        ?   46.03%           
=======================================
  Files           ?      842           
  Lines           ?    93202           
  Branches        ?        0           
=======================================
  Hits            ?    42908           
  Misses          ?    43483           
  Partials        ?     6811           
Impacted Files Coverage Δ
modules/queue/workerpool.go 55.62% <25.00%> (ø)
modules/context/context.go 65.26% <90.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e19b965...b6a1e61. Read the comment docs.

@zeripath zeripath merged commit be77ede into go-gitea:main Jan 29, 2022
@zeripath zeripath deleted the PlainBytes-404-is-not-an-error branch January 29, 2022 20:52
Chianina pushed a commit to Chianina/gitea that referenced this pull request Mar 28, 2022
* Change some logging levels

* PlainTextWithBytes - 4xx/5xx this should just be TRACE
* notFoundInternal - the "error" here is too noisy and should be DEBUG
* WorkerPool - Worker pool scaling messages are normal and should be DEBUG

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants