Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[common] Correct definition of alternate host name #17873

Merged
merged 1 commit into from
Jul 18, 2019

Conversation

jugglinmike
Copy link
Contributor

The value of the "alternate" host is completely at the discretion of the
server operator (via the alternate_hosts configuration). It is not
guaranteed to have any resemblance to the primary host name. Despite
this, the shared get-host-info.sub.js file attempted to derive the
alternate host name in terms of the primary host name.

Replace that faulty derivation with the value which is present in the
server configuration.

The value of the "alternate" host is completely at the discretion of the
server operator (via the `alternate_hosts` configuration). It is not
guaranteed to have any resemblance to the primary host name. Despite
this, the shared `get-host-info.sub.js` file attempted to derive the
alternate host name in terms of the primary host name.

Replace that faulty derivation with the value which is present in the
server configuration.
@jugglinmike
Copy link
Contributor Author

@Hexcles this patch failed Chrome's stability check. I'm trying to determine if it's due to the content of the change, the number of affected tests, a glitch in the infrastructure.

The logs report

Identified 333 affected tests

And finally

TEST-START | /signed-exchange/sxg-referrer-policy-header.tentative.https.html
Run 1/10
mem avail: 4277 MiB (57 %), swap free: 0 MiB (0 %)

[taskcluster:error] Task timeout after 7200 seconds. Force killing container.
[taskcluster 2019-07-17 04:05:31.541Z] === Task Finished ===
[taskcluster 2019-07-17 04:05:31.542Z] Unsuccessful task run with exit code: -1 completed in 7201.772 seconds

We don't have timestamps, but I think the runner was actually hanging here. That's because we only intermittently print memory information, so it's unlikely to have occurred during a normal test execution.

Do you think we should just retry?

@Hexcles Hexcles closed this Jul 17, 2019
@Hexcles Hexcles reopened this Jul 17, 2019
@jugglinmike
Copy link
Contributor Author

...then again, if the runner was hanging for a significant length of time, then we'd expect to see a number of messages about memory availability.

@Hexcles
Copy link
Member

Hexcles commented Jul 17, 2019

I think the task is most likely timing out because of too many tests affected, but there could an infra glitch that led to incomplete (unflushed?) logs.

@jugglinmike
Copy link
Contributor Author

@Hexcles Further research supports the theory that the timeout is due to the number of tests being run, not the change itself (see below). Would you be up for admin-merging? Or would you like me to check something else, first?


This second attempt timed out on a different test:

TEST-OK | /signed-exchange/sxg-location.tentative.html | took 3478ms
TEST-START | /signed-exchange/sxg-location.tentative.html
Run 8/10

[taskcluster:error] Task timeout after 7200 seconds. Force killing container.
[taskcluster 2019-07-17 18:02:55.315Z] === Task Finished ===
[taskcluster 2019-07-17 18:02:55.315Z] Unsuccessful task run with exit code: -1 completed in 7201.861 seconds

So this patch isn't causing a deterministic timeout. It may be exposing some flaky behavior, but I can't reproduce that locally. I've run the stability checker on each of the tests that were running when the previous two builds timed out:

$ python ./wpt run --channel=dev --verify --verify-log-full --no-pause --no-restart-on-unexpected --install-fonts --no-headless chrome signed-exchange/sxg-location.tentative.html signed-exchange/sxg-referrer-policy-header.tentative.https.html

That command produces the same result on master and the pull request branch:

$ python ./wpt run --channel=dev --verify --verify-log-full --no-pause --no-restart-on-unexpected --install-fonts --no-headless chrome signed-exchange/sxg-location.tentative.html signed-exchange/sxg-referrer-policy-header.tentative.https.html
Ran 30 tests finished in 14.8 seconds.
  • 0 ran as expected. 0 tests skipped.
  • 15 tests timed out unexpectedly
  • 30 tests had unexpected subtest results

$ echo $?
0

I opened a draft pull request with an inconsequential change to the same file, and the stability checks timed out in Chrome there, as well.

@Hexcles
Copy link
Member

Hexcles commented Jul 18, 2019

I stumbled upon #15819 which seems like an easy fix (add one more field to the returned object). While we are at it, shall we fix that as well?

@jugglinmike
Copy link
Contributor Author

Reading over that request, I have differing opinions about the technical implementation (I'd rather define that as ALT_HOST to reuse the terminology from the configuration file and make their relation clear). Sorry to say, but my own objection makes it less of an easy fix.

Copy link
Member

@Hexcles Hexcles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's cool. I'll merge this now. Thanks!

@Hexcles Hexcles merged commit 08b4256 into web-platform-tests:master Jul 18, 2019
@jugglinmike
Copy link
Contributor Author

Thank you! Following up on gh-15819 presently

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants