Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Task for DAP-02. #616

Merged
merged 14 commits into from
Oct 7, 2022
Merged

Update Task for DAP-02. #616

merged 14 commits into from
Oct 7, 2022

Conversation

branlwyd
Copy link
Member

@branlwyd branlwyd commented Oct 6, 2022

Specifically:

  • Introduce query_type field, determining how batches of reports are determined.
  • Add task_expiration field, determining when a task naturally expires.
  • Rename min_batch_duration to time_precision.
  • Rename max_batch_lifetime to max_batch_query_count.

Also, clean up Task while I'm touching it: make it immutable, with methods returning references. I also introduce a test-only TaskBuilder abstraction which allows building a Task by specifying a few required fields, and using default values for other fields (while allowing those defaults to be overridden).

Most importantly, make it immutable. This requires creating getters, and
I added a TaskBuilder abstraction for tests.
This would only be triggered if the task under test had more than one
collector auth token, which is why this was not triggered previously.
Specifically, we would configure the task under test to use the _first_
collector auth token, while the collector would use the _primary_ (aka
last) collector auth token. This would lead to a mismatch & therefore
authorization failure. Now both contexts use the primary collector auth
token.
Broken by a previous commit in this PR, the Janus/Janus integration test
set the leader & helper up to use differing VDAF verification keys. This
didn't work out so well. :-)

As part of the fix, allow TaskBuilders to be cloned; this makes it a lot
easier to generate paired tasks for leader & helper with all of the
relevant fields matching, without requiring the caller to generate those
values themselves.
I removed this earlier as it appeared to be only used in one place, and
the new TaskBuilder abstraction would have required reimplementing the
body anyway.  But I missed that it was also used in Daphne. Reintroduce
the helper function, suitably updated, to avoid unnecessarily repeating
code.
@branlwyd branlwyd requested a review from a team as a code owner October 6, 2022 23:09
@branlwyd
Copy link
Member Author

branlwyd commented Oct 6, 2022

Thought: I could get rid of TaskBuilder entirely, placing the with_foo methods as (test-util-only) methods on Task itself. What do y'all think?

I had to manually fix one formatting error before `cargo fmt` would do
anything -- otherwise it error'ed out!
Copy link
Contributor

@tgeoghegan tgeoghegan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: I could get rid of TaskBuilder entirely, placing the with_foo methods as (test-util-only) methods on Task itself. What do y'all think?

That approach would work well, too, but I'm not sure I see that it's necessarily better than what you already have, so I would choose the option where you don't have to rewrite a bunch of code. i.e., keep TaskBuilder as is.

If we wanted to invest effort in doing better, then I think we should find a crate that implements the builder pattern with a macro. For instance, derive_builder, but that's just the first thing that came up when I searched for "rust builder pattern" so I'm not sure if it's the "good" one (though it does seem to have lots of nice features).

})
}

/// Retrieves the task ID associated with this task.
pub fn task_id(&self) -> &TaskId {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

task.task_id() stutters. How about Task::id(&self) -> &TaskId? That would also be consistent with HpkeConfig::id(&self).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on this -- the reason I didn't go for it originally is that e.g. reports aren't actually identified just by report_id; their effective ID is actually (task_id, report_id). So an id() method would return something like (&TaskId, &ReportId).

Of course, that doesn't apply to tasks (since there is no higher-level entity tasks are conceptually "inside of"), and the stuttering bugs me too. Since you noticed it as well, I fixed it here as well as for ReportMetadata & AggregationJob. (The last ID type, BatchID, doesn't yet correspond directly to a defined data structure.)

@@ -568,23 +569,26 @@ where
for<'a> &'a A::AggregateShare: Into<Vec<u8>>,
{
// Check how many rows in the relevant table have an intersecting batch interval.
// Each such row consumes one unit of batch lifetime (§4.6).
let intersecting_intervals: Vec<_> = match task.role {
// Each such row consumes one unit of query count (§4.6).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this should refer to section 4.5.6 or 4.5.7 now. Though there's lots of incorrect section references in Janus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the section reference entirely. I think it's only of marginal value given how fungible the DAP specification still is.

Maybe we can do a pass or something once per major version to update all of the section references appropriately? I don't think a process based on one-off updating section references as we touch the relevant code lines will work well -- even code that doesn't need to change to update the DAP version might well have its underlying spec sections moved around under it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I've done most recently is include a link to the relevant spec section instead of the section number. Then at least it's a little more obvious that a reference is out of date, because you have the draft revision in the URL. However I'm also OK with not putting in section references until DAP stabilizes.

Affected are Task::task_id(), ReportMetadata::report_id(), and
AggregationJob::aggregation_job_id() -- all three of these methods are
now called id() instead.
(noticed by CI, --all-features is required to compile this at the
moment)
@branlwyd branlwyd merged commit a9eb40e into main Oct 7, 2022
@branlwyd branlwyd deleted the bran/task-query-type branch October 7, 2022 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants