Skip to content

Conversation

@vmamidi
Copy link
Contributor

@vmamidi vmamidi commented Jan 30, 2023

Motivation: In a production system that was configured to chase redirects, we noticed a dramatic increase in aborts and began caching (and serving) the intermediate 302 response instead of chasing the redirect.

Possible Fix: Revert PRs #7667 and #7807.

PR #7807 could leave the HTTP State machine in a INACTIVE state.

For example, if state_cache_open_read is rescheduled in an error using the below code.

case CACHE_EVENT_OPEN_READ_FAILED:
    if ((intptr_t)data == -ECACHE_DOC_BUSY) {
      // Somebody else is writing the object
      if (open_read_tries <= master_sm->t_state.txn_conf->max_cache_open_read_retries) {
        // Retry to read; maybe the update finishes in time
        open_read_cb = false;
        do_schedule_in();
      } else {

This code will return to event system leaving the state machine in a INACTIVE state and results in an abort.

  if (captive_action.cancelled == 1) {
    return VC_EVENT_CONT; // SM gave up on us
  }

This can be easily reproduced by setting proxy.config.http.number_of_redirections greater than zero.(https://docs.trafficserver.apache.org/en/8.1.x/admin-guide/files/records.config.en.html#proxy-config-http-number-of-redirections)

Although I like the idea behind #7667, I am proposing reverting #7667 along with #7807 as there were concerns about the crashes with 7667. I am yet to look at PRs #8423 and #8443

@maskit
Copy link
Member

maskit commented Jan 30, 2023

(Modified the description to link issues mentioned)

@vmamidi vmamidi self-assigned this Jan 30, 2023
@vmamidi vmamidi added the Revert label Jan 30, 2023
@vmamidi vmamidi added this to the 9.2.1 milestone Jan 30, 2023
@bryancall bryancall requested a review from shinrich January 31, 2023 00:28
@vmamidi
Copy link
Contributor Author

vmamidi commented Jan 31, 2023

We can leave PRs #8423 and #8443 in the code for future use.

@zwoop zwoop modified the milestones: 9.2.1, 10.0.0 Feb 2, 2023
@zwoop
Copy link
Contributor

zwoop commented Feb 2, 2023

Talking to Bryan and VJ, we'll hold off on this until we have more testing, and reviews. But this is a showstopper for a 10.0.0 release. We will revert this for 9.2.x though which will go out into v9.2.1.

Copy link
Contributor

@bneradt bneradt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just leaving a note that reverting this causes issues with the HTTP/2 to origin patch. Reverting this in 9.2.x is fine, but we cannot revert this patch in master as it is a dependency for HTTP/2 to origin.

@shinrich
Copy link
Member

We applied these changes to fix asserts before h2 to origin as well. Perhaps those asserts were only seen in the Yahoo environment? Perhaps we should push forward and fix the issue with the stalled/inactive HttpSMs?

@ezelkow1
Copy link
Member

[approve ci centos cmake]

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.

@github-actions github-actions bot added the Stale label Aug 18, 2023
@github-actions github-actions bot closed this Aug 25, 2023
@zwoop zwoop removed this from the 10.0.0 milestone Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants