flusher errors now do not fail silently #1015

Merged (1 commit), Jul 14, 2022

GeorgeEngland

The flusher could fail to write, and the user would be neither able to handle the error nor alerted to it.

This is still not ideal, since the error is reported asynchronously, but at least the user will now be alerted that it happened.
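
To illustrate the behaviour described above, here is a minimal, stdlib-only sketch of a background flusher that passes flush errors to a callback instead of dropping them. It is not the diff from this PR: `failingWriter`, `runFlusher`, `flushOnce`, and `errSimulated` are hypothetical names invented for the example, and the always-failing writer merely stands in for a broken socket.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"sync"
)

// errSimulated is a made-up error standing in for a socket write failure.
var errSimulated = errors.New("simulated write failure")

// failingWriter is a hypothetical sink whose writes always fail.
type failingWriter struct{}

func (failingWriter) Write(p []byte) (int, error) { return 0, errSimulated }

// runFlusher flushes bw each time it is kicked. A flush error is passed to
// onErr (if set) rather than being silently discarded.
func runFlusher(bw *bufio.Writer, kick <-chan struct{}, onErr func(error), wg *sync.WaitGroup) {
	defer wg.Done()
	for range kick {
		if err := bw.Flush(); err != nil && onErr != nil {
			onErr(err)
		}
	}
}

// flushOnce buffers data, kicks the background flusher once, waits for it to
// exit, and returns every error the callback received.
func flushOnce(data string) []error {
	bw := bufio.NewWriter(failingWriter{})
	bw.WriteString(data) // small payload: only buffered, nothing hits the sink yet

	var reported []error
	onErr := func(err error) { reported = append(reported, err) }

	kick := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go runFlusher(bw, kick, onErr, &wg)

	kick <- struct{}{}
	close(kick)
	wg.Wait() // after Wait returns it is safe to read reported
	return reported
}

func main() {
	for _, err := range flushOnce("PUB foo 2\r\nhi\r\n") {
		fmt.Println("async flush error:", err)
	}
}
```

The caller still learns about the failure only after the fact, which is the limitation acknowledged above, but the error is no longer invisible.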

Member

@kozlovic kozlovic left a comment

Thank you for the contribution.

My first reaction would be: is this really necessary? The reason we don't do any reporting here is that if there is a failure to write to the socket, the error will likely be detected in the readloop (or through a lack of heartbeats, etc.), and those errors are already handled.

That being said, since I don't see this change breaking any existing test, I would think that it is ok to approve the change. Will see if other maintainers agree with that or not.

Member

@wallyqs wallyqs left a comment

LGTM

@kozlovic kozlovic merged commit f4a86f3 into nats-io:main Jul 14, 2022
@GeorgeEngland
Author

What about a full TCP buffer? The default timeout of the writer is 2s, so I assume a full buffer could cause a silent failure?

@kozlovic
Member

I don't think we set any deadline so it would just block, but not fail.

@GeorgeEngland
Author

GeorgeEngland commented Jul 21, 2022

@kozlovic Yes, I think you do: the Write method of the timeoutWriter sets a write deadline if the client has set a FlusherTimeout as one of the Opts.

nats.go:L5345

// Write implements the io.Writer interface.
func (tw *timeoutWriter) Write(p []byte) (int, error) {
	if tw.err != nil {
		return 0, tw.err
	}

	var n int
	tw.conn.SetWriteDeadline(time.Now().Add(tw.timeout))
	n, tw.err = tw.conn.Write(p)
	tw.conn.SetWriteDeadline(time.Time{})
	return n, tw.err
}
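
To make the failure mode concrete, the following is a small, stdlib-only sketch of what happens when a deadline-guarded write stalls. It assumes `net.Pipe` is an acceptable stand-in for a TCP connection whose send buffer is full (a pipe write blocks until the peer reads). `deadlineWriter`, `stalledWrite`, and `isTimeout` are hypothetical names; `deadlineWriter` only mirrors the shape of the `timeoutWriter` quoted above and is not the library's type.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// deadlineWriter is a hypothetical copy of the timeoutWriter pattern:
// set a write deadline, write, clear the deadline, remember any error.
type deadlineWriter struct {
	conn    net.Conn
	timeout time.Duration
	err     error
}

func (dw *deadlineWriter) Write(p []byte) (int, error) {
	if dw.err != nil {
		return 0, dw.err // sticky: later writes fail with the first error
	}
	var n int
	dw.conn.SetWriteDeadline(time.Now().Add(dw.timeout))
	n, dw.err = dw.conn.Write(p)
	dw.conn.SetWriteDeadline(time.Time{})
	return n, dw.err
}

// stalledWrite writes twice to a peer that never reads and returns the
// errors from both writes.
func stalledWrite(timeout time.Duration) (first, second error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close() // server is never read from, so writes stall

	dw := &deadlineWriter{conn: client, timeout: timeout}
	_, first = dw.Write([]byte("PING\r\n"))
	_, second = dw.Write([]byte("PING\r\n"))
	return first, second
}

// isTimeout reports whether err was caused by an expired deadline.
func isTimeout(err error) bool {
	return errors.Is(err, os.ErrDeadlineExceeded)
}

func main() {
	first, second := stalledWrite(50 * time.Millisecond)
	fmt.Println("first write timed out:", isTimeout(first))
	fmt.Println("second write returned the stored error:", second == first)
}
```

With a deadline in place the stalled write returns an error instead of blocking forever, so a caller that ignores the return value loses that error.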

@GeorgeEngland
Author

@kozlovic Secondly, the underlying connection also has a timeout of 2 seconds by default.

L2014 nats.go:

func (nc *Conn) processConnectInit() error {

	// Set our deadline for the whole connect process
	nc.conn.SetDeadline(time.Now().Add(nc.Opts.Timeout))
	defer nc.conn.SetDeadline(time.Time{})
	// ... (remainder of the function not quoted)
}

nc.Opts.Timeout defaults to 2 seconds when it is left at 0.

@kozlovic
Member

@GeorgeEngland You are right about the timeoutWriter. As for processConnectInit(), at that stage the flusher has not started yet and things are done "synchronously". Regardless, your PR was accepted and merged, and users will now indeed get a notification if an error occurs while flushing the outgoing buffer. Thanks!
