Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCP Margin calcuation for subscription liveness check. #12009

Open
yunhanw-google opened this issue Nov 19, 2021 · 13 comments
Open

TCP Margin calcuation for subscription liveness check. #12009

yunhanw-google opened this issue Nov 19, 2021 · 13 comments

Comments

@yunhanw-google
Copy link
Contributor

TCP isn't instant though. RTO * retries can be hundreds of seconds if a link exhibits a high round-trip time. Even a single lost packet could lead to a breakage for TCP here, whereas this code might accommodate such a thing for UDP.

I don't see that it makes sense to branch like this PR attempts to. In this context, the only thing that is special about UDP+MRP as compared to TCP is that we have immediate access to retry parameters with MRP, whereas these are likely obfuscated for TCP. But they still absolutely exist.

Originally posted by @msandstedt in #11795 (comment)

@msandstedt
Copy link
Contributor

Also note the maximum idle timeout computation in #11795 is not compliant with the current spec, which has jitter and exponential backoff.

@yunhanw-google
Copy link
Contributor Author

copy from #11795,
from @msandstedt
The simplest thing would be to decouple from the exact nature of the underlying transport's reliability features. We could instead say: transport attempts to be reliable. Therefore, we should allow 1 strike before assuming the subscription is dead. So given mMaxIntervalCeilingSeconds, fail at 2x mMaxIntervalCeilingSeconds.

As an aside, this PR's computation for maximum idle-state MRP timeout is not spec compliant.

@yunhanw-google
Copy link
Contributor Author

we need to discuss here for the suitable timeout value

@yunhanw-google
Copy link
Contributor Author

@bzbarsky-apple
Copy link
Contributor

The suggestion in #12009 (comment) might be pretty reasonable....

@holbrookt
Copy link
Contributor

@bzbarsky-apple @yunhanw-google - does this need to be fixed for 1.0?

@bzbarsky-apple
Copy link
Contributor

Going to redirect to @msandstedt

@msandstedt
Copy link
Contributor

I feel like the code @yunhanw-google has in place is pretty reasonable for 1.0, and particularly so if TCP is out of scope. This is how we compute the liveness timeout:

    System::Clock::Timeout timeout =
        System::Clock::Seconds16(mMaxIntervalCeilingSeconds) + mpExchangeCtx->GetSessionHandle()->GetAckTimeout();

And this is GetAckTimeout:

System::Clock::Milliseconds32 ExchangeContext::GetAckTimeout()
{
    System::Clock::Timeout timeout;
    if (IsUDPTransport())
    {   
        timeout = GetMRPConfig().mIdleRetransTimeout * (CHIP_CONFIG_RMP_DEFAULT_MAX_RETRANS + 1); 
    }   
    else if (IsTCPTransport())
    {   
        // TODO: issue 12009, need actual tcp margin value considering restransmission
        timeout = System::Clock::Seconds16(30);
    }   
    return timeout;
}

So because we don't have a good way to know link conditions for TCP, we are just adding 30-seconds margin. But I don't believe we are going to be supporting TCP in 1.0.

@woody-apple
Copy link
Contributor

Agree, thanks @msandstedt . Moving to post V1.0

@woody-apple woody-apple added V1.X and removed V1.0 labels Feb 8, 2022
@stale
Copy link

stale bot commented Aug 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale Stale issue or PR label Aug 12, 2022
@bzbarsky-apple bzbarsky-apple removed the stale Stale issue or PR label Aug 26, 2022
@stale
Copy link

stale bot commented Mar 12, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale Stale issue or PR label Mar 12, 2023
@bzbarsky-apple bzbarsky-apple removed the stale Stale issue or PR label Mar 13, 2023
@stale
Copy link

stale bot commented Oct 15, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale Stale issue or PR label Oct 15, 2023
@bzbarsky-apple bzbarsky-apple removed the stale Stale issue or PR label Oct 16, 2023
@bzbarsky-apple
Copy link
Contributor

@pidarped You might be interested in this one...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants