Skip to content

Conversation

jsha
Copy link
Contributor

@jsha jsha commented Jul 11, 2019

This rolls forward #4326 after it was reverted in #4328.

Resolves #4329

The older query didn't have a LIMIT 1 so it was returning multiple results, but gorp's SelectOne was okay with multiple results when the selection was going into an int64. When I changed this to a struct in #4326, gorp started producing errors.

For this bug to manifest, an account needs to create an order, then fail validation, twice in a row for a given domain name, then create an order once more for the same domain name - that third request will fail because there are multiple orders in the orderFqdnSets table for that domain.

Note that the bug condition doesn't happen when an account does three successful issuances in a row, because finalizing an order (that is, issuing a certificate for it) deletes the row in orderFqdnSets. Failing an authorization does not delete the row in orderFqdnSets. I believe this was an intentional design decision because an authorization can participate in many orders, and those orders can have many other authorizations, so computing the updated state of all those orders would be expensive (remember, order state is not persisted in the DB but is calculated dynamically based on the authorizations it contains).

This wasn't detected in integration tests because we don't have any tests that fail validation for the same domain multiple times. I filed an issue for an integration test that would have incidentally caught this: #4332. There's also a more specific test case in #4331.

This rolls forward #4326 after it was reverted in #4328.
@jsha jsha requested a review from a team as a code owner July 11, 2019 00:23
Copy link
Contributor

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused by the root cause of this bug. I think the fix in this branch was adding the LIMIT 1 to the legacy query (used when features.FasterGetOrderForNames is disabled) but I don't understand how if that was the required fix that we didn't see this bug prior to this branch. Can you add some more context to the PR description or elsewhere?

@jsha
Copy link
Contributor Author

jsha commented Jul 11, 2019

Thanks for the reminder to add more context about what failed, and why we didn't catch it! Glad to have that written down in more detail, since it was subtle. I've updated it in the issue and in the PR desc (same content). I also filed a new issue for a nicer integration test.

cpu
cpu previously approved these changes Jul 11, 2019
Copy link
Contributor

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jsha!

When I originally made this branch, I wrote a patch on top of #4326,
then rebased that patch on top of master. That rebase had conflicts,
naturally, and I resolved those conflicts but evidently messed it up.

This brings back the missing changes by:

git show -v 9fa3607 | patch -p1

There was one rejected hunk; I manually made the changes from that hunk.
Copy link
Contributor

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A field name typo is causing CI to fail.

@cpu
Copy link
Contributor

cpu commented Jul 11, 2019

Merging with one 🔍 based on available reviewers.

@cpu cpu merged commit 7469948 into master Jul 11, 2019
@cpu cpu deleted the fix-getorderfornames branch July 11, 2019 17:43
jsha added a commit that referenced this pull request Jul 18, 2019
In #4331 I introduced this new more efficient query for
GetOrderForNames, and commented about why we needed an ORDER BY... ASC
to efficiently use the index. However, the actually query did not match
the comment, and it used DESC. This fixes the query.

To demonstrate that the index is actually used with the ASC version,
here's the EXPLAIN output after filling up the table with a bunch of
failed orders:

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES ASC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using index condition
1 row in set (0.000 sec)

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES DESC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using where
1 row in set (0.000 sec)
cpu pushed a commit that referenced this pull request Jul 18, 2019
In #4331 I introduced this new more efficient query for
GetOrderForNames, and commented about why we needed an ORDER BY... ASC
to efficiently use the index. However, the actually query did not match
the comment, and it used DESC. This fixes the query.

To demonstrate that the index is actually used with the ASC version,
here's the EXPLAIN output after filling up the table with a bunch of
failed orders:

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES ASC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using index condition
1 row in set (0.000 sec)

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES DESC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using where
1 row in set (0.000 sec)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FasterGetOrderForNames feature breaks gorp
2 participants