Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CAA presence/validity information to ValidationRecords #2165

Open
rolandshoemaker opened this issue Sep 9, 2016 · 17 comments
Open

Add CAA presence/validity information to ValidationRecords #2165

rolandshoemaker opened this issue Sep 9, 2016 · 17 comments
Labels
area/va kind/enhancement starter Ideal issues for folks getting familiar with Boulder

Comments

@rolandshoemaker
Copy link
Contributor

Since this is important information about an authorization.

@rolandshoemaker
Copy link
Contributor Author

This should probably also contain the the highest label at which a record was found if one is found at all.

@laurencemcf
Copy link

I would be interested in trying to do this. Do you have any directions/hints?

@laurencemcf
Copy link

@rolandshoemaker I would be interested in implementing this. Is it still reverent?

@laurencemcf
Copy link

@rolandshoemaker @jsha I would be interested in implementing this, so I was wondering if it is still relevant, and what information you think would be good to keep in the validation records.

@rolandshoemaker
Copy link
Contributor Author

Hey! Sorry for the silence, I completely missed your comments. This is still relevant and it would be great if you had a chance to take a stab at it. I think it makes sense to just include the CAA information for now.

@laurencemcf
Copy link

No problem. So from my understanding of the issue you would want validation record similar to below for now? Do you have any preferences on the format?

[
    {
        "url":"http://c185da94d854a28961368b75dd634b29afcf7f60.com:5002/.well-known/...",
        "hostname":"c185da94d854a28961368b75dd634b29afcf7f60.com",
        "port":"5002",
        "addressesResolved":["127.0.0.1"],
        "addressUsed":"127.0.0.1"
    },
    {
        "caaStatus": "none|valid|invalid"
    }
]

laurencemcf pushed a commit to laurencemcf/boulder that referenced this issue Jul 30, 2018
Records CAA status (i.e. missing, found and valid, or invalid) in
Validation Records. Functionally is gated behind a feature flag, and
defaulted to off. Tests included for both states of the flag.

Issue letsencrypt#2165
@cpu
Copy link
Contributor

cpu commented Jul 30, 2018

👋 @laurencemcf - thanks for your interest in taking this issue!

I talked a bit with @rolandshoemaker out of band. I think we've both come around to the consensus the CAA information should be captured at the authorization level and not in the authorization's validation records. This slightly complicates the work involved as it would require a database migration to add a new column to the authz table to maintain the CAA state.

The other "catch" related to this work is when an issuance for an identifier is backed by a valid authorization older than 8hrs we need to recheck CAA per the BRs and should also update the authorization's CAA state at that time. That makes CAA information a many-to-one relation with an authorization.

I apologize that I'm providing this guidance after you've already taken a crack at an implementation in #3806. Does the above make sense to you? Are you interested in closing #3806 and resubmitting a PR based on the above?

Thanks!

@laurencemcf
Copy link

@cpu No problem, I'll close #3806 and have a go at doing this way.

So to confirm we want to add a new json column to the pendingAuthorizations and authz tables (although I think the column in pendingAuthorizations table will be empty). When we do a recheck we'll add a new item into the array with the information from that recheck.

A example of the format of the new column:

[
    {
        "status": "none-found|found-valid|invalid",
        "timeChecked": "datetime"
    },
    {
        "status": "none-found|found-valid|invalid",
        "timeChecked": "datetime-8-hours-later"
    }
]

@cpu
Copy link
Contributor

cpu commented Jul 31, 2018

@laurencemcf Great, thanks!

So to confirm we want to add a new json column

I think we should avoid JSON columns. A separate table for CAA information would be better normalized and easier to query against. @rolandshoemaker do you concur?

@rolandshoemaker Adding an empty field to the pendingAuthz table is awkward... Do you have any thoughts there? In theory that field will get removed when we finally consolidate the two tables so maybe its a fair compromise.

@laurencemcf
Copy link

If we want to have a separate table for CAA, then due to the many-to-one relationship I think we can avoid having a column in either pendingAuthorizations or authz. These table could be left the same and we would specify a foreign key in the CAA table mapping back to authz. The definition of the table would be something like below (identifier would contain the DNS name the CAA records was found at).

CREATE TABLE `caa` (
  `id` varchar(255) NOT NULL,
  `authzID` varchar(255) DEFAULT NULL,
  `status` varchar(255) DEFAULT NULL,
  `identifier` varchar(255) DEFAULT NULL,
  `timestamp` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  CONSTRAINT `caa_authz` FOREIGN KEY (`authzID`) REFERENCES `authz` (`id`)
)

@cpu
Copy link
Contributor

cpu commented Aug 1, 2018

@laurencemcf Of course! That makes sense :-) Thanks for pointing that out.

@rolandshoemaker Are you OK with the separate table approach?

@rolandshoemaker
Copy link
Contributor Author

I think this approach makes sense, in the past we've been way too willing to just keep stuffing stuff in existing tables when a new table would make more sense.

In terms of the schema I think the ID should be a BIGINT(20) NOT NULL AUTO_INCREMENT and the rest of the fields should be NOT NULL instead of DEFAULT NULL since we'll never want them empty. Given we only expect status to be one of three things we could also go with a ENUM, but that's not super important.

Probably we'll also want a KEY on authzID since that's likely how we'll be looking these up the majority of the time. @cpu can you think of any other indexy things here?

@cpu
Copy link
Contributor

cpu commented Aug 1, 2018

@cpu can you think of any other indexy things here?

@rolandshoemaker I think it might be prudent to put an index on the identifier field. We can join back to the authz table to get the identifier for the authorization but I can imagine scenarios where we may want to be querying the caa table by identifier. E.g. maybe for the base domain in the case of a CAA decision based on the record for example.com when the authorization was for foo.example.com

@rolandshoemaker
Copy link
Contributor Author

rolandshoemaker commented Aug 1, 2018

Hm, good point. We may want to change the name of the identifier column as well, it's slightly confusing given we have a loaded definition of that term elsewhere in the codebase that this doesn't conform with. Perhaps just presentAt or something would be a bit clearer.

@laurencemcf
Copy link

In terms of the schema I think the ID should be a BIGINT(20) NOT NULL AUTO_INCREMENT and the rest of the fields should be NOT NULL instead of DEFAULT NULL since we'll never want them empty.

@rolandshoemaker makes sense, but I was just wondering what we should do for the presentAt field when we don't find any CAA records? Do you want to include the last suffix (for com for foo.example.com), or should we make this field null?

Also is there any other information which would be useful to include in this table, e.g. whether it was a wild card check, or the validation methods allowed? I don't think either of those things would be much work to add but may make the table more useful depending on the querying patterns you envision.

@laurencemcf
Copy link

I've been working on implementing this and had a question about the best way to do it.

We need to add a new RPC between the RA and SA to save the CAA recheck information (for authorizations older than 8hrs), and I was wondering if we should use this new RPC when we perform the initial validation as well. At the time of the initial validation we are already calling the FinalizeAuthorization RPC, so if we use the new RPC we would be performing two SA calls one after another. Would this be a problem (I'm not sure of the costs of preforming them)? The alternative would be to store the new CAA records into the authz object and save them during the FinalizeAuthorization call.

@rolandshoemaker
Copy link
Contributor Author

@rolandshoemaker makes sense, but I was just wondering what we should do for the presentAt field when we don't find any CAA records? Do you want to include the last suffix (for com for foo.example.com), or should we make this field null?

Yup null makes sense in that case.

Also is there any other information which would be useful to include in this table, e.g. whether it was a wild card check, or the validation methods allowed? I don't think either of those things would be much work to add but may make the table more useful depending on the querying patterns you envision.

Probably makes sense to include if it was a wildcard check, other than that I can't think of anything in particular, given our parameter work is still being changed I think it should be left out of any initial implementation and we can extend the table at a later date if we want to include that info.

I've been working on implementing this and had a question about the best way to do it.

We need to add a new RPC between the RA and SA to save the CAA recheck information (for authorizations older than 8hrs), and I was wondering if we should use this new RPC when we perform the initial validation as well. At the time of the initial validation we are already calling the FinalizeAuthorization RPC, so if we use the new RPC we would be performing two SA calls one after another. Would this be a problem (I'm not sure of the costs of preforming them)? The alternative would be to store the new CAA records into the authz object and save them during the FinalizeAuthorization call.

I'd need to take a closer look but I believe we can just extend the existing CAA recheck RPC response to include the information we want back. We can also extend the FinalizeAuthorization RPC request to include this information and simply alter the SA method to also store that (I'd also not be strictly opposed to simply doing two RPCs here from RA -> SA, which would be a slightly cleaner boundary, but since we will only ever be doing this after a FinalizeAuthorization call I think it's fine to just mash them together).

@aarongable aarongable added starter Ideal issues for folks getting familiar with Boulder and removed help wanted labels Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/va kind/enhancement starter Ideal issues for folks getting familiar with Boulder
Projects
None yet
Development

No branches or pull requests

4 participants