Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reraise_exception_classes keyword argument to Logger and LogDevice #37

Merged
merged 1 commit into from
Jul 11, 2024

Conversation

jeremyevans
Copy link
Contributor

@jeremyevans jeremyevans commented Aug 16, 2019

This allows the user to specify exception classes to treat as regular
exceptions instead of being swallowed. Among other things, it is
useful for having Logger work with Timeout when having Timeout raise
a specific subclass.

Fixes Ruby Bug 9115. https://bugs.ruby-lang.org/issues/9115

@jeremyevans jeremyevans force-pushed the reraise_exception_classes branch from 931326d to 702276e Compare August 16, 2019 16:32
@ioquatix
Copy link
Member

Isn't the problem here the catch-all:

            rescue
	              warn("log shifting failed. #{$!}")

To me, that's bad design.

@ioquatix
Copy link
Member

Also, while I'm not certain of the best design, there is certainly room for figuring out if Timeout is derived from StandardError or Exception. If it's derived from Exception, it won't be caught unless there is an explicit rescue Exception or rescue Timeout...

@jeremyevans
Copy link
Contributor Author

There's already a rescue Exception later in the same method. I believe the reason for this is that logger should never break existing code, so any exceptions that occur while logging should be swallowed (by default). This patch allows a way for the developer to override the default and state that certain exception classes should not be swallowed.

@ioquatix
Copy link
Member

Would another solution just be to make logger aware of timeout error?

@jeremyevans
Copy link
Contributor Author

Timeout allows you to raise an arbitrary exception. See https://bugs.ruby-lang.org/issues/9115

@ioquatix
Copy link
Member

ioquatix commented Aug 17, 2019

So, generally, I agree with you. The bug here lies in the logger implementation and the assumption that it should eat all exceptions. But the proposal you make to fix it, IMHO, is not sufficient.

What I mean by this, is that logger is often injected into other code.

So, you have different code bases sharing the same logger.

But only one code base is responsible for instantiating it and setting up the exceptions to ignore.

So, there is no way for code to reason about how any given logger instance behaves. If you wanted to ignore a given exception, but you were given a logger instance, you'd have to add it, but you potentially break someone elses code if they didn't expect that exception to be thrown.

Logger eating all errors is wrong. Only exceptions relating to specific operations (e.g. file system permissions, run out of disk space, etc) should be captured. Even then, I think if I have a logger instance and it's silently discarding all messages, that's an error.

The correct fix, IMHO, is to remove the rescue catch all and only catch specific errors that relate to the operations the log device is doing.

@jeremyevans
Copy link
Contributor Author

If you already are aware of all exception classes that you would need to catch, and are sure that switching from swallowing all exceptions by default to only rescuing only those specific exceptions can be done without breaking backwards compatibility, I am fine with that approach. However, considering that the underlying @dev in the log device can be an arbitrary object that responds to write and close and could theoretically raise any exception, I'm not sure how you would do that.

@ioquatix
Copy link
Member

only rescuing only those specific exceptions can be done without breaking backwards compatibility

I think this is a better option, but probably you need feedback from @sonots

Regarding @dev. I think it's up to the individual implementations to correctly handle internal exceptions and signal some kind of failure state - e.g. maybe Logger::DeviceError with appropriate cause can be managed internally - i.e. what to do when the @dev has entered some kind of failure - I can't imagine how this works if it's not already handled, e.g. ENOSPC would just cascade into many other problems.

lib/logger.rb Outdated
@@ -370,14 +370,19 @@ def fatal!; self.level = FATAL; end
# +shift_period_suffix+::
# The log file suffix format for +daily+, +weekly+ or +monthly+ rotation.
# Default is '%Y%m%d'.
# +reraise_exception_classes+::
Copy link
Member

@sonots sonots Aug 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name was not clear to me. I could not recognize as it re-raises if there is an error when writing to the log device without reading description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I'll force push a new commit that changes this to reraise_write_errors.

@sonots
Copy link
Member

sonots commented Aug 17, 2019

The correct fix, IMHO, is to remove the rescue catch all and only catch specific errors that relate to the operations the log device is doing.

I agree, but

@dev in the log device can be an arbitrary object that responds to write and close and could theoretically raise any exception

this is correct.

  # +logdev+::
  #   The log device.  This is a filename (String) or IO object (typically
  #   +STDOUT+, +STDERR+, or an open file).

The @dev can be any objects implementing IO interface (only #write and #close are necessary) and I know that some people are passing a object implementing same interface with IO objects, but the internal implementation is like HTTP POST.

So, I feel

only catch specific errors

is difficult without breaking compatibilities. Any idea?

@ioquatix
Copy link
Member

Any idea?

Here is the solution:

Bump the major version, and fix the issue correctly. Correctly being, remove the catch all.

Users will have some pain, but you've signalled that via the major version bump.

I'd also suggest clearly documenting the expectations of what happens if the dev fails and what exceptions are caught.

@sonots
Copy link
Member

sonots commented Aug 19, 2019

Bump the major version, and fix the issue correctly. Correctly being, remove the catch all.

What is correct? Are you assuming we prohibit passing any IO-like objects to logdev argument?
As long as we allow passing IO-like objects, we can not narrow exceptions to be thrown.

@jeremyevans jeremyevans force-pushed the reraise_exception_classes branch from 702276e to 0864085 Compare August 19, 2019 19:30
@ioquatix
Copy link
Member

What is correct? Are you assuming we prohibit passing any IO-like objects to logdev argument?
As long as we allow passing IO-like objects, we can not narrow exceptions to be thrown.

Why not?

IO has a fairly narrow set of exceptions which can be thrown and caught in a useful way.

@ioquatix
Copy link
Member

ioquatix commented Aug 19, 2019

Sometimes, when the solution to a "problem" is worse than the problem itself, I start to wonder about the core assumptions that lead us to decide that it was a problem in the first place.

In this case, we are assuming Logger should never raise exceptions.

But there are cases where maybe it should.

What happens if someone is using logger for logging important details - if the logger fails, those details would be lost silently. Is this the right behaviour?

In general, I can understand that logger should do it's best to log information, and if that fails, it shouldn't break the program.

But from the other POV, if something is so broken by your logger that it's throwing exceptions, maybe that represents a failure which logger itself can't know how to process. Right now, it's ignoring all errors, which might cascade to other failures later, or loose important information.

So to me, the correct solution is to only rescue errors that the logger can specifically do something about.

I'm sure there are better guidelines than this regarding how to make exception safe programs. That being said, maybe one "safe" option is to introduce new parameter ignore_errors: true which becomes

rescue
    raise unless @ignore_errors

That way, users can optimistically convert their logger, i.e. Logger.new(ignore_errors: false).

If there are specific exceptions we can catch and handle correctly, that should exist too, however, I'm against any kind of rescue clause which just ignores the error. It should only exist if there is some tangible action to be taken.

With major version bump, we could make ignore_errors: false the default, depending on user feedback.

@jeremyevans
Copy link
Contributor Author

What is correct? Are you assuming we prohibit passing any IO-like objects to logdev argument?
As long as we allow passing IO-like objects, we can not narrow exceptions to be thrown.

Why not?

IO has a fairly narrow set of exceptions which can be thrown and caught in a useful way.

IO and IO-like, while they appear similar, are actually quite different things.

I'm not opposed to an ignore_errors keyword, but that would be a separate pull request. It doesn't solve the need for the current pull request (as described), because this need is treat specific exceptions differently. Potentially, you could have ignore_errors be a callable that is called with the exception instance and it returns true or false for whether to ignore it. That would handle the need addressed by this pull request, albeit in a more verbose (though more flexible) manner.

@ioquatix
Copy link
Member

IO and IO-like, while they appear similar, are actually quite different things.

Yes, that may be the case, but I'm thinking of the interface we expect. That part needs to be well defined. If we just expect code to throw any error, is it our job to handle it? I don't think so. Personally, I can think of a ton of reasons why this is might be the case, e.g.

  • If using HTTP for the IO-like output device, and the authentication is not configured correctly, all logging is silently dropped, and user doesn't know.

  • If someone was using non-blocking IO with exceptions (another poorly designed part of stdlib) that will never transit through the logger.

I get the feeling that as soon as someone choose to use a non-stderr log device, they are really fully responsible for any errors that may occur and those errors should be visible.

It doesn't solve the need for the current pull request (as described)

It absolutely does. Set ignore_errors: false and the problem is fixed - timeout errors will transit through logger. But what you are concerned about is now other exceptions will also transit through. But to me, that is correct behaviour. As you said yourself, timeout can raise any error. And as I mentioned, the person who creates logger is not often the person who uses logger. So, there is a disconnect there, and it will invariably cause pain, and it's due to the design choice we make here - there is no pain free option, but there are definitely some options which are better than others.

The goal moving forward should not be what is the minimum additional functionality we can add to fix the issue, but, what is the semantic model of logger and how can we simplify it so that we can fix these issues. Otherwise, the next bug report is going to be "how can I add exceptions to the ignored exception list". Then the next bug report will be "logger is now throwing exceptions I didn't expect".

This is a rabbit hole.

@jeremyevans
Copy link
Contributor Author

Agreed that this can be a rabbit hole. While reraising all exceptions solves the timeout problem, it potentially introduces other problems that are more serious and more widespread.

If you take the position that logging is of equal importance to the normal function of the application, then it would make sense to not handle exceptions at all (forcing callers to handle all exceptions). However, my impression is logger assumes logging is of lesser importance than normal function, and therefore it is better to swallow errors related to logging than raise them.

The goal moving forward should not be what is the minimum additional functionality we can add to fix the issue, but, what is the semantic model of logger and how can we simplify it so that we can fix these issues.

Absent backwards compatibility issues, I would agree with this. But there are significant backwards compatibility issues, and giving up on backwards compatibility and saying "just bump major version" is not a good approach for one of the most commonly used libraries in stdlib.

Otherwise, the next bug report is going to be "how can I add exceptions to the ignored exception list". Then the next bug report will be "logger is now throwing exceptions I didn't expect".

With this pull request, the answer to the first question would be "add the exception class to :reraise_write_errrors" and the answer to the second question would be "remove the exception class from :reraise_write_errors". These seem obvious enough that it is unlikely a user would file such a bug report, and if they did, it would be an easy bug report to respond to.

One thing to consider is that relatively few people have issues with how logger currently works. Timeout is probably the largest problem, which this pull request addresses. Timeout also has tons of problems in general with its design and its interaction with other libraries.

Anyway, I have said my piece and will refrain from commenting further unless @sonots requests additional changes.

@jeremyevans
Copy link
Contributor Author

@sonots I'm not intending to comment further on this, but I just want to check on the status. It's now been more than a year since the previous comment was posted.

@ioquatix
Copy link
Member

I'm not using this gem much so my original passion for this issue is much less.

However, I do feel the need to say that many people use this gem in practice, and my experience is the situation is far from ideal.

With this pull request, the answer to the first question would be "add the exception class to :reraise_write_errrors" and the answer to the second question would be "remove the exception class from :reraise_write_errors". These seem obvious enough that it is unlikely a user would file such a bug report, and if they did, it would be an easy bug report to respond to.

The problem is, most users do not explicitly create the logger instance they are using. e.g. Rails.logger.

@jeremyevans jeremyevans force-pushed the reraise_exception_classes branch from 0864085 to a67b811 Compare August 27, 2023 17:46
@jeremyevans
Copy link
Contributor Author

It's been another 3 years since the last comment, and 4 years since this was opened. This is still a significant issue in logger. I've rebased this against master. @sonots @hsbt @amatsuda Do any of you feel comfortable reviewing this?

@eregon
Copy link
Member

eregon commented Sep 14, 2023

Why not just rescue IOError? If an IO or IO-like raise something which is not an IOError on IO#write it seems clearly a bug of that IO or IO-like object.

This allows the user to specify exception classes to treat as regular
exceptions instead of being swallowed.  Among other things, it is
useful for having Logger work with Timeout.

Fixes Ruby Bug 9115.
@jeremyevans jeremyevans force-pushed the reraise_exception_classes branch from a67b811 to 8a4f87c Compare July 11, 2024 15:16
@jeremyevans jeremyevans merged commit 436a7d6 into ruby:master Jul 11, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants