Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for HDInsight Cluster with Enterprise Security Package #12866

Merged
merged 29 commits into from
Aug 31, 2021

Conversation

neil-yechenwei
Copy link
Contributor

@neil-yechenwei neil-yechenwei commented Aug 6, 2021

Currently, TF HDInsight Cluster doesn't support to integrate with Enterprise Security Package (ESP). So I submitted this PR to implement it.

--- PASS: TestAccHDInsightHadoopCluster_securityProfile (1995.55s)
--- PASS: TestAccHDInsightSparkCluster_securityProfile (2476.50s)
--- PASS: TestAccHDInsightKafkaCluster_securityProfile (2082.18s)
--- PASS: TestAccHDInsightInteractiveQueryCluster_securityProfile (1806.02s)
--- PASS: TestAccHDInsightHBaseCluster_securityProfile (3006.68s)

API Reference:
https://github.com/Azure/azure-rest-api-specs/blob/a331a455fd31e022f8df07105058ae586a02ab03/specification/hdinsight/resource-manager/Microsoft.HDInsight/stable/2018-06-01-preview/cluster.json#L801

Overview:
https://docs.microsoft.com/en-us/azure/hdinsight/domain-joined/apache-domain-joined-configure-using-azure-adds

@neil-yechenwei neil-yechenwei changed the title Add support for HDInsight Cluster with AAD Domain Service Add support for HDInsight Cluster with Enterprise Security Package Aug 6, 2021
@grayzu
Copy link
Collaborator

grayzu commented Aug 6, 2021

Thanks for this PR @neil-yechenwei!

Since this is a pretty complicated scenario, can you add your tests to the examples folder so that customers can have a starting point for how to be successfully with these new properties?

@neil-yechenwei
Copy link
Contributor Author

@grayzu , thanks for your comment. The example has been added. Thanks.

Copy link
Contributor

@manicminer manicminer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neil-yechenwei, thanks for this PR. This is mostly looking good. I've made some comments inline - there's also another PR #13011 which (if merged) will make for simpler configurations.

If you can look at the below, and the tests pass, this should be good to merge. Thanks!

Copy link
Contributor

@manicminer manicminer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neil-yechenwei Thanks for the changes. There are still some crash opportunities, noted below, but the main thing on further thought is that azurerm_active_directory_domain_service has a quota of 1 per tenant or per subscription, so we'll need a way of handling that when parallelizing tests, else it'll conflict with the test in the domainservices package.

principal_id = azurerm_user_assigned_identity.test.principal_id
}

resource "azurerm_active_directory_domain_service" "test" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst it should be fine to include an azurerm_active_directory_domain_service in this test, we'll probably have to think about scheduling and parallelization between this test and the test in the domainservices package - it's only possible to have a single domain service at any one time and they take quite a while to stand up and tear down.

Copy link
Contributor Author

@neil-yechenwei neil-yechenwei Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's possible, could you provide some suggestion on it? Thanks in advance.

Copy link
Contributor Author

@neil-yechenwei neil-yechenwei Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated code for your concern. Hopes it's what you expect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah i see. Unfortunately I don't think we can consolidate the tests into one package - the tests really need to live with their respective resources.

I'm looking at the logistics of having a long-running domain service to try and compensate for the concurrency issue, but still working out how to make that work nicely with the domainservices tests themselves.

Copy link
Contributor Author

@neil-yechenwei neil-yechenwei Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I separated them into their respective resources and updated TCs related with HDInsight Cluster ESP as parallel TCs. Maybe you have to update your test script in Teamcity to run DomainService's TCs and HDInsight Cluster ESP's TCs in parallel.

examples/hdinsight/enterprise-security-package/main.tf Outdated Show resolved Hide resolved
@neil-yechenwei
Copy link
Contributor Author

@manicminer , kindly ping. May I ask is there any update for above issue?

@neil-yechenwei
Copy link
Contributor Author

neil-yechenwei commented Aug 23, 2021

@manicminer, For this test issue, is it possible that we can skip these TCs and merge this PR first? Once you figure out this kind of test issue, then enable and update the TCs.

@manicminer
Copy link
Contributor

@neil-yechenwei This is challenging because having a persistent AADDS resource will break the tests for that resource. We'll leave this as-is for now, and we'll resolve separately when it becomes a problem - but can we change the AADDS SKU in this tests to Standard?

@neil-yechenwei
Copy link
Contributor Author

neil-yechenwei commented Aug 23, 2021

@manicminer , Sure. I've changed the AADDS SKU in TCs to Standard. Please have an another look.

@manicminer
Copy link
Contributor

manicminer commented Aug 25, 2021

@neil-yechenwei I've refactored the tests a little with a template for the common AADDS configuration. Running the tests again now (expecting them to take around 5-6 hrs), if all goes well this should be good to merge 👍

@manicminer
Copy link
Contributor

This is a challenging one to test - a single AADDS test is flaky but 5 successive ones are very prone to failure.

@neil-yechenwei
Copy link
Contributor Author

neil-yechenwei commented Aug 27, 2021

@manicminer , after checked test log you triggered, seems only one tc "testAccHDInsightInteractiveQueryCluster_securityProfile" is failed in teamcity.
Test log in teamcity:
image

Per my understanding, I think it's not TF code issue. Hence I just now reran this test case and it can pass on my local. Below is the test result.

Test Result:

$ make testacc TEST=./internal/services/hdinsight/ TESTARGS="-parallel 11 -test.run=TestAccHDInsightCluster_securityProfileSequential" TESTTIMEOUT='1440m'
==> Checking that code complies with gofmt requirements...
==> Checking that Custom Timeouts are used...
==> Checking that acceptance test packages are used...
TF_ACC=1 go test ./internal/services/hdinsight/ -v -parallel 11 -test.run=TestAccHDInsightCluster_securityProfileSequential -timeout 1440m -ldflags="-X=github.com/hashicorp/terraform-provider-azurerm/version.ProviderVersion=acc"
=== RUN   TestAccHDInsightCluster_securityProfileSequential
=== RUN   TestAccHDInsightCluster_securityProfileSequential/interactiveQuery
=== RUN   TestAccHDInsightCluster_securityProfileSequential/interactiveQuery/securityProfile
--- PASS: TestAccHDInsightCluster_securityProfileSequential (8228.09s)
    --- PASS: TestAccHDInsightCluster_securityProfileSequential/interactiveQuery (8228.09s)
        --- PASS: TestAccHDInsightCluster_securityProfileSequential/interactiveQuery/securityProfile (8228.09s)
PASS
ok      github.com/hashicorp/terraform-provider-azurerm/internal/services/hdinsight     8234.541s

@neil-yechenwei neil-yechenwei force-pushed the hdinsightwithaad branch 2 times, most recently from 866c877 to 1ba88cf Compare August 28, 2021 00:24
@neil-yechenwei
Copy link
Contributor Author

neil-yechenwei commented Aug 28, 2021

@manicminer , after rerun the TCs in TeamCity, the failed test case "testAccHDInsightInteractiveQueryCluster_securityProfile" you encountered can pass now in TeamCity. So I think it's not TF code problem.

image

@neil-yechenwei
Copy link
Contributor Author

neil-yechenwei commented Aug 30, 2021

@manicminer , all TCs passed now in TeamCity after rerun.

According to my observation, seems the test script in TeamCity would clean up the existing resources at some time every day. And as you can see, these test cases keep running around 11 hours. The execution duration is long. Probably you encountered the situation that the test script is deleting the existing resources regularly when one of these test cases is running. So one of them fail. Hence these test cases can pass after adjusting the execution time to avoid that time period of data cleaning.

image

Copy link
Contributor

@manicminer manicminer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neil-yechenwei, thanks for re-running in TC. I agree this is likely to clash with the cleanup jobs due to the extended test duration. We'll likely have to revisit this to improve the stability whilst complying with the singular quota for AADDS, but in the meantime this should be good to merge 👍

@manicminer manicminer added this to the v2.75.0 milestone Aug 31, 2021
@manicminer manicminer merged commit f2f0919 into hashicorp:main Aug 31, 2021
manicminer added a commit that referenced this pull request Aug 31, 2021
@github-actions
Copy link

github-actions bot commented Sep 2, 2021

This functionality has been released in v2.75.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions
Copy link

github-actions bot commented Oct 3, 2021

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants