Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating the session history seems pretty broken #6197

Closed
6 tasks
domfarolino opened this issue Dec 5, 2020 · 6 comments · Fixed by #6199
Closed
6 tasks

Updating the session history seems pretty broken #6197

domfarolino opened this issue Dec 5, 2020 · 6 comments · Fixed by #6199

Comments

@domfarolino
Copy link
Member

Earlier this week I noticed that we don't always set a Window's associated Document (via setting the browsing context's active document) right away after creating the Window, which can lead to some weird problems. While going down this rabbit hole and discussing this more with @domenic I realized that the following flow is pretty broken:

While trying to figure out the best way to fix this, I think the idea is that we need to plumb in the correct "new document" to the #update-the-session-history-with-the-new-page algorithm so that we can appropriately reference it in Step 1, instead of accidentally referencing the old document.

However even once we fix this, from talking with Domenic I think we still have the following problem:

Let’s analyze the two conditionals in #update-the-session-history-with-the-new-page below:


Screen Shot 2020-12-05 at 14 21 35

Let's analyze both cases:

  • entry-update

    • The spec text matches the entry-update dfn. The dfn says that we won’t be creating a new session history entry, but instead will update the existing current entry with a brand new document, and traverse to this current entry. However I am not sure how this is actually supposed to work. The flow for entry-update looks like this:
      • #traverse-the-history(document = null) => #navigate => #navigate-html
        • create and initialize a document
        • #update-the-session-history[...] (triggering the above screenshot text)
          • Sets current entry's document to the newly-created document
          • #traverse-the-history > Step 5 always evaluates false; we never activate the new document
    • I propose that we change the dfn and the spec text here to create a new session history entry, set it up with the new document etc. Then we put the new entry in the session history where current entry is. Now current entry holds an old entry that is no longer in the session history and is about to be deleted. Continue to #traverse-the-history passing in the new entry (not current entry). Step 5 evaluates to true, and we can activate the new document!
  • reload

    • The spec text here DOES NOT match what the reload's dfn says. The dfn states that we'll reload the current entry's document by: creating a brand new history entry, putting it in place of the current entry (_which from my understanding doesn't actually change current entry’s value), and navigating to the new entry
    • In other words, it expects my proposal above (I think)

Screen Shot 2020-12-05 at 14 21 43

I don’t think this is doing the right thing. I first read this as if it were setting current entry to a new session history entry, and then traversing the history to it. However after talking to Domenic I think it's doing something different, but equally wrong: Putting a new entry in the session history where current entry is (current entry's value hasn't changed), and invoking traverse-the-history with current entry. I think it should invoke #traverse-the-history with the new entry, so that #traverse-the-history > step 5 actually compares the two distinct entries. Does that make sense?


If my analysis is right, then I think to fix all of these there are the following action items:

  • Change the signature of #update-the-session-history-with-the-new-page to explicitly take a new Document
  • Plumb the new document to all call sites of this algorithm
  • Have step 1 reference this new Document, instead of the "old" one that is currently activated
  • Optional: Decouple the two concepts (a) browsing context's active document, and (b) Window's associated Document`, so that we can set a Window's associated Document ASAP, without setting the browsing context's active document.
  • Change the first conditional branch of #update-the-session-history-with-the-new-page > step 3 to my proposal above
  • Change the second conditional branch of #update-the-session-history-with-the-new-page > step 3 to the other proposal above (that is, mostly leave it as-is and invoke #traverse-the-history with the new entry, instead of current entry.

Thoughts?

@domfarolino
Copy link
Member Author

I posted #6199 as a first stab to fix this issue, purely going off the above analysis. If any of it is wrong, I'm happy to update the PR accordingly.

@domenic
Copy link
Member

domenic commented Dec 8, 2020

So I reviewed #6199 and two things make me a bit unsettled about your proposed solution:

  • It's weird that the current entry is, for a short span of time, no longer in the session history. (This happens between your modifications to #update-the-session-history-with-the-new-page, and the step in #traverse-the-history which updates the current entry.) As discussed in another thread with you recently, these temporary invariant violations aren't the end of the world, but I think it'd be good to update the definition of current entry to reflect that, if we did go that route. However...

  • "replace" seems to take a very different approach. It seems to go down the "Otherwise" path in #update-the-session-history-with-the-new-page, and then in #traverse-the-history, we have "If historyHandling is "replace", then remove the entry immediately before entry in the session history." Should we consider adopting this approach instead? In general, as you noted in the PR, "replace" and "reload" are supposed to be very similar.

I think taking the "replace" approach for the "reload" case is probably the right move. I'm less sure about entry update.

If you look at entry update's call site, it seems like it really does want to replace the current entry's document, and not create a new entry? At least conceptually, and judging from the non-normative note

The "navigate" algorithm reinvokes this "traverse" algorithm to complete the traversal, at which point entry's document is non-null.

i.e. it references the specific entry entry. But, we do want to trigger the (currently faulty) "If newDocument is different than the current entry's document, then" clause for entry-update. Hmm.

Finally, I spent some time figuring out what

If the navigation was initiated with a URL that equals newDocument's URL

means, and it has a separate bug I will file.

@domfarolino
Copy link
Member Author

It's weird that the current entry is, for a short span of time, no longer in the session history. (This happens between your modifications to #update-the-session-history-with-the-new-page, and the step in #traverse-the-history which updates the current entry.) As discussed in another thread with you recently, these temporary invariant violations aren't the end of the world, but I think it'd be good to update the definition of current entry to reflect that, if we did go that route. However...

Yeah, it's a bit awkward; as you point out, the reason I went with it is because after we "replace the current entry", the value of current entry is now different, which I don't think is what we want because any time we invoke #traverse-the-history with current entry, we're guaranteed to not trip Step 5 and never activate the document. Basically I'm trying to figure out the best way to make sure that whenever we make a Document change, #traverse-the-hisory > Step 5 catches it, as you point out later in your comment.

"replace" seems to take a very different approach. It seems to go down the "Otherwise" path in #update-the-session-history-with-the-new-page

Well per spec, if you call location.replace(theCurrentURL), then it actually would trigger the second path that you've filed a bug against, since the URL is the same. But now I'm actually super confused about the spec's handling of replace in general! All browsers that I test handle location.replace(...) just like reload, in that they do not delete any session history entries after the current entry (being replaced). It looks like the spec does does delete all entries after the current entry though, if it takes the third path whenever the URLs aren't the same. Am I missing something big?

and then in #traverse-the-history, we have "If historyHandling is "replace", then remove the entry immediately before entry in the session history." Should we consider adopting this approach instead? In general, as you noted in the PR, "replace" and "reload" are supposed to be very similar.

I saw that and did like the approach there, however I don't see a great way to weave it into other history handling modes right away while satisfying (1) Keeping the session entry the same (as you point out entry update kind of wants this, and reload's dfn mentions this) (2) Properly tripping #traverse-the-history > Step 5.

I think it would be reasonable to modify my PR to create a "correct" version of today's spec like so:

  • Modify the #traverse-the-history algorithm to accept an traverse-to-same-entry (example name) bool defaulting to false
  • Modify #traverse-the-history > Step 5 condition from
    • if (newDocument != current_entry.document), to...
    • if (newDocument != current_entry.document || traverse-to-same-entry)
    • Now we can manually enter the document activation steps even when traversing to an existing entry, which is certainly something the spec expects to work as you mentioned.
  • Modify #update-the-session[...] > Step 3's first condition to handle entry update and reload in a way that does not create a new session entry, but modifies the current entry and passes in the optional boolean to #traverse-the-history
  • Include the condition that checks if the URLs are equal
    • ...or whatever the outcome is on the issue you filed..., I need to give it a closer look but I think it might be OK for now?
  • Add a new condition to #update-the-session[...] > Step 3 that specifically handles the replace case in a way that creates a new entry after the current entry, and #traverses to it. This seems to perfectly match what #traverse-the-history > Step 9 expects
    • It is still weird to me that we handle this differently from reload and entry-update. They all ultimately end up being in-place operations, just one operates on a Document-replacement level, and the other operates on a session entry-replacement level. I guess maybe we can go with this for now and consider simplifying once we actually make things correct.
  • Modify the final path of #update-the-session[...] > Step 3 to assert that the history handling mode is default, since everything else should be handled

It's possible I'm missing something huge, but from taking a closer look those are my thoughts. WDYT?

@domenic
Copy link
Member

domenic commented Dec 10, 2020

Well per spec, if you call location.replace(theCurrentURL), then it actually would trigger the second path that you've filed a bug against, since the URL is the same. But now I'm actually super confused about the spec's handling of replace in general! All browsers that I test handle location.replace(...) just like reload, in that they do not delete any session history entries after the current entry (being replaced). It looks like the spec does does delete all entries after the current entry though, if it takes the third path whenever the URLs aren't the same. Am I missing something big?

Ugh, yeah, that seems pretty bad. This seems like it was the case even back in 2016 though, so at least it's not a recent regression.

I think it would be reasonable to modify my PR to create a "correct" version of today's spec like so:

I think this would work, although I'm partial to solving replace and the same-URL case in a single PR so that we're sure we have a plan. I could chip in on top of what you describe.

Let's consider some important features that distinguish them:

  • Entry update: we definitely want to keep the session history entry, and just replace the document

  • Reload: up until now the spec has been the same as entry update. (E.g. location.reload() used to trigger navigation "for entry update", and I preserved that when refactoring.) Should we keep that, or is it more like the following case?

  • Same URL, but not entry update or reload (but this case could include replace): we need to replace the document.

    • I suspect we also want to reset the serialized state, title, scroll restoration mode, scroll position data, browsing context name, and persisted user state.
    • We can test the serialized state and the scroll position data pretty easily.
    • If tests show we want to reset those, then we should probably replace the whole entry and delete the existing one.
  • Replace, but not same URL: we need a whole new entry, and we need to delete the existing one.

  • Everything else: must be "default". Create a whole new entry.

Let me do some testing on "reload" and same-URL but not reload...

@domenic
Copy link
Member

domenic commented Dec 10, 2020

https://boom-bath.glitch.me/session-history-1.html shows that reload is indeed very similar to entry update.

Same URL (non-reload/entry update case, potentially including replace) seems to resets everything in Firefox. In Chrome it resets the scroll position, but does not reset history.state. So, uh, that's weird. Note that the spec used to be vague about this in a way that could understandably lead to non-interop:

Replace the current entry with a new entry representing the new resource and its Document object, related state, and the default scroll restoration mode of "auto".

The results for history.length are also strange in Firefox, but that property is kind of a lost cause.


My recommendation would then be:

  • Entry update: keep the session history entry, and just replace the document

  • Reload: same as entry update

  • Same URL, but not entry update or reload (but this case could include replace): create a new entry and delete the old one, with an XXX box saying some browsers seem to copy over the serialized state. In the PR you're working on in particular, let's just do the easy bugfix from Updating the session history: same-URL case is broken #6202 so that it's comparing the right URLs; I can try to fix the action at a distance later.

  • Replace, but not same URL: create a new entry and delete the old one.

  • Everything else: must be "default". Create a whole new entry.

@domfarolino
Copy link
Member Author

Thanks for the analysis and the discussion!

My recommendation would then be:

I'm prettyyyy sure that this lines up with my latest proposal above, with the exception that you're handling the same-URL case explicitly here as best we can without solving it entirely. Given that, I've updated the PR in #6199 to reflect the latest proposal(s), so PTAL!

domenic pushed a commit that referenced this issue Dec 11, 2020
This fixes several related issues surrounding the "update the session
history with the new page" and "traverse the history" algorithms.

* A mistaken attempted refactoring in #6039 got the wrong Document for
  the newDocument variable in "traverse the history". To fix this, we
  explicitly pass in the new document to the algorithm from all its call
  sites. (Discussed in #6197.)

* As discussed more extensively in #6197, the same-URL and entry
  update/reload conditions in "update the session history with the new
  page" were not correctly triggering the new-document clause in
  "traverse the history", despite replacing the document. This was
  partially due to #6039, although the phrasing before #6039 was
  extremely ambiguous, so #6039 was mostly a transition from "unclear"
  to "clearly wrong".

* This fixes the "easy bug" discussed in #6202, where the same-URL case
  was using the wrong URL to determine sameness. That bug was also
  introduced in #6039. The harder bug, as well as the
  action-at-a-distance nature of the same-URL check, remain tracked in
  #6202.

* For years, the spec has required deleting the future session history
  entries after performing a navigation with history handling of
  "replace", e.g. via location.replace(). This is not correct; a
  "replace" navigation does not delete future session history entries.
  Instead it just replaces the current entry, leaving future entries
  alone.

* The latter point makes the handling of same-URL navigations almost
  identical to "replace" navigations (including non-same-URL "replace"
  navigations). This change makes this clear, by using the same text for
  both, but adding a pointer to #6213 which highlights the fact that
  some browsers treat them slightly differently (and some treat them the
  same).

* Finally, this adds or modifies a few assertions to check that we're in
  the "default" history handling behavior, so it's clearer to the reader
  what the non-exceptional path through the spec is.

Closes #6197.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants