Test encrypted input #1237

eriktaubeneck · 2024-08-22T21:16:14Z

For the in-market test (and in general in production), we're constructing 3 files of encrypted data. #1216 built a tool to encrypt, and to test decryption. This PR adds a test which reads in those files and runs a test query based on those encrypted files.

One issue here is that it requires adding another feature (in-memory-infra) to the cli::crypto module, that's only used for testing. Relatedly, it also requires making runner::oprf_ipa::OprfIpaQuery public. I think it might be preferable to move this elsewhere, and I'd love some feedback on where best this test should live to minimize these types of changes.

As for implementing this for actual use (the next step, which I will do in a separate PR), we have two options:

The report_collector binary could read in all 3 files, create the buffers, and upload them as it does today.
We could create a new helper end point / modify the existing /create endpoint so that it takes in a URL of the input data and downloads that.

We will need (2) eventually for sharding, but it's not immediately obvious all that needs to be done to make it happen here. Curious if others have thoughts on which approach to take.

codecov · 2024-08-22T21:20:43Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.62%. Comparing base (fbc56af) to head (836ff97).
Report is 9 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1237      +/-   ##
==========================================
- Coverage   92.82%   92.62%   -0.20%     
==========================================
  Files         200      200              
  Lines       31166    31413     +247     
==========================================
+ Hits        28930    29097     +167     
- Misses       2236     2316      +80

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

andyleiserson · 2024-08-23T21:50:42Z

I think it makes sense to put this test with the integration tests in ipa-core/tests/helper_networks.rs. It could either be a new test or a modification of one of the existing ones.

The integration tests may currently all be using real HTTP networking rather than the in-memory implementation, but I don't think it has to be that way -- if there's a reason this test needs to use in-memory networking, it would be fine to have some integration tests in that configuration.

(2) We could create a new helper end point / modify the existing /create endpoint so that it takes in a URL of the input data and downloads that.

We will need (2) eventually for sharding, but it's not immediately obvious all that needs to be done to make it happen here. Curious if others have thoughts on which approach to take.

There is some overlap with the data transfer involved in sharding, but I'm not sure it's really interchangeable. I think it would be better to leave a pull-from-URL input mechanism and sharding as future, separate tasks.

martinthomson · 2024-08-24T06:21:58Z

The feature usage seems fine to me. You can do a bunch of things with tags if it’s only for tests.

eriktaubeneck · 2024-08-26T19:16:32Z

I think it makes sense to put this test with the integration tests in ipa-core/tests/helper_networks.rs. It could either be a new test or a modification of one of the existing ones.
The integration tests may currently all be using real HTTP networking rather than the in-memory implementation, but I don't think it has to be that way -- if there's a reason this test needs to use in-memory networking, it would be fine to have some integration tests in that configuration.

Let me move it there and see if it works. There's no hard dependency on using in-memory networking, I was just trying to replicate the test in ipa-core/src/query/runner/oprf_ipa.rs.

There is some overlap with the data transfer involved in sharding, but I'm not sure it's really interchangeable. I think it would be better to leave a pull-from-URL input mechanism and sharding as future, separate tasks.

Sounds good. I'll just work on updating the existing report_collector cli to accept 3 encrypted files.

eriktaubeneck · 2024-08-26T20:41:36Z

ipa-core/src/test_fixture/world.rs

+
+                // Clippy complaining, but unclear why.
+                #[allow(clippy::disallowed_methods)]


For some reason, without this, I get the following compile error:

error: use of a disallowed method `futures::future::join_all` --> ipa-core/src/test_fixture/world.rs:633:21 | 633 | / join_all(zip(shares, iter::repeat(m_ctx.clone())).enumerate().map( 634 | | |(i, (share, m_ctx))| async move { 635 | | let record_id = RecordId::from(i); 636 | | let m_share = share.upgrade(m_ctx.clone(), record_id).await.unwrap(); ... | 644 | | }, 645 | | )) | |______________________^ | = note: We don't have a replacement for this method yet. Consider extending `SeqJoin` trait. (from clippy.toml) = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#disallowed_methods = note: `-D clippy::disallowed-methods` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::disallowed_methods)]`

It's not exactly clear why this change triggers this. I do import TestWorld, but I imported that when the file was in ipa-core/cli/crypto.rs as well.

In general, we disallow the join_all method because we want to steer protocols to use seq_join where appropriate. seq_join is more careful about the number of futures it will spawn.

It's fine to use join_all here, though, because it's only three items, and it's test code. There is a general waiver of the disallowed_methods lint for cfg(test), but not for feature = test-fixture.

So any of the following would be fine:

Update the comment to say why it's okay to waive the lint (e.g. // It's just 3 items. appears elsewhere in world.rs).

Use join3 instead.

Waive disallowed_methods whenever feature = test-fixture, or for all of world.rs, or for all of mod test_fixture.

I'm still somewhat confused what code path all the sudden hits this (I didn't add the join_all.) It didn't seem to come up when I had it previously in the src/cli/crypto.rs file ¯_(ツ)_/¯

I'll update the comment, presumably this will also come up when others use this code path.

andyleiserson

As-is, I don't think this test will run in CI, it probably makes sense to add a config so it does.

For integration tests, similar behavior to the cfg directives in the test file can be configured in Cargo.toml, but just mentioning that as an FYI, I don't see a reason it matters which way it's done here.

eriktaubeneck · 2024-08-27T19:26:52Z

As-is, I don't think this test will run in CI, it probably makes sense to add a config so it does.

@andyleiserson dumb question, but what config controls this?

andyleiserson · 2024-08-27T20:09:46Z

As-is, I don't think this test will run in CI, it probably makes sense to add a config so it does.

@andyleiserson dumb question, but what config controls this?

.github/workflows/check.yml controls what runs in CI (and pre-commit is what gets run before pushing).

eriktaubeneck · 2024-08-27T23:19:00Z

pre-commit

@@ -101,6 +101,9 @@ check "Web tests" \
 check "Concurrency tests" \
    cargo test -p ipa-core --release --features "shuttle multi-threading"

+check "Encrypted Input Tests" \
+    cargo test --test encrypted_input --features "cli test-fixture web-app in-memory-infra"


@andyleiserson this runs this specific test file, and maybe that's ok. leaving this out just reruns all ~700 tests which we certainly want to avoid. I couldn't figure out how to just run things in the ipa-core/tests directory. Is this an OK approach?

This seems fine to me.

eriktaubeneck added 2 commits August 22, 2024 12:32

add test reading in encrypted file. compiles but test fails

d5c30dd

tests passing

2c01fd1

eriktaubeneck added 3 commits August 26, 2024 12:45

move encrypt_and_execute_query to ipa-core/tests/encrypted_input

31bef04

move cfg for test

fddb055

add clippy allow comment

d06463d

eriktaubeneck commented Aug 26, 2024

View reviewed changes

eriktaubeneck requested a review from andyleiserson August 26, 2024 20:42

andyleiserson approved these changes Aug 27, 2024

View reviewed changes

eriktaubeneck added 2 commits August 27, 2024 16:14

make sure test runs in CI

795f3ea

fix pre-commit too

836ff97

eriktaubeneck commented Aug 27, 2024

View reviewed changes

eriktaubeneck merged commit b6276b8 into main Aug 28, 2024
12 checks passed

eriktaubeneck deleted the test-encrypted-input branch August 28, 2024 02:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test encrypted input #1237

Test encrypted input #1237

eriktaubeneck commented Aug 22, 2024

codecov bot commented Aug 22, 2024 •

edited

Loading

andyleiserson commented Aug 23, 2024

martinthomson commented Aug 24, 2024

eriktaubeneck commented Aug 26, 2024

eriktaubeneck Aug 26, 2024

andyleiserson Aug 27, 2024

eriktaubeneck Aug 27, 2024

andyleiserson left a comment

eriktaubeneck commented Aug 27, 2024

andyleiserson commented Aug 27, 2024

eriktaubeneck Aug 27, 2024

andyleiserson Aug 27, 2024


		// Clippy complaining, but unclear why.
		#[allow(clippy::disallowed_methods)]

Test encrypted input #1237

Test encrypted input #1237

Conversation

eriktaubeneck commented Aug 22, 2024

codecov bot commented Aug 22, 2024 • edited Loading

Codecov Report

andyleiserson commented Aug 23, 2024

martinthomson commented Aug 24, 2024

eriktaubeneck commented Aug 26, 2024

eriktaubeneck Aug 26, 2024

Choose a reason for hiding this comment

andyleiserson Aug 27, 2024

Choose a reason for hiding this comment

eriktaubeneck Aug 27, 2024

Choose a reason for hiding this comment

andyleiserson left a comment

Choose a reason for hiding this comment

eriktaubeneck commented Aug 27, 2024

andyleiserson commented Aug 27, 2024

eriktaubeneck Aug 27, 2024

Choose a reason for hiding this comment

andyleiserson Aug 27, 2024

Choose a reason for hiding this comment

codecov bot commented Aug 22, 2024 •

edited

Loading