Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS: Use provided glue catalog id in defaultWarehouseLocation #6223

Merged
merged 4 commits into from
Dec 19, 2022

Conversation

grbinho
Copy link
Contributor

@grbinho grbinho commented Nov 19, 2022

defaultWarehouseLocation call to glue.getDatabase did not set catalog id while building a request.
This results in software.amazon.awssdk.services.glue.model.EntityNotFoundException: Database my_database not found. error when table is being created in another AWS account.

More details are in #6215

@singhpk234
If you can take a look, it would be appreciated.
I did not add any tests for it since I saw that similar tests for other methods do not exist.

@github-actions github-actions bot added the AWS label Nov 19, 2022
Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! Thanks @grbinho .

You can add a UT for now it by mocking GetDatabase request and making sure catalogId passed in it. can ref this existing test

cc @jackye1995

@grbinho
Copy link
Contributor Author

grbinho commented Nov 20, 2022

LGTM ! Thanks @grbinho .

You can add a UT for now it by mocking GetDatabase request and making sure catalogId passed in it. can ref this existing test

cc @jackye1995

I added a unit test with AwsProperties.GLUE_CATALOG_ID property set in AwsProperties.
This seemed like the way to go with this one since test is supposed to validate that property is actually used in the method.
I'm not that experienced with Java and Mockito, so feedback on the correct way to do this is welcome!

@@ -144,6 +144,32 @@ public void testDefaultWarehouseLocationDbUri() {
Assert.assertEquals("s3://bucket2/db/table", location);
}

@Test
public void testDefaultWarehouseLocationCustomCatalogId() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integration tests in TestGlueCatalogNamespace also don't set the catalogId, do we need to set it in those files also?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there multiple AWS accounts available for integration tests?
I could not find that setup either in AwsIntegTestUtil or TestAssumeRoleAwsClientFactory (as and obvious potential cross account example).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume it is a single account.
The Integration testcase can be executed by exporting the mentioned environment variables (#4855)

Thanks for the clarification. may be no need of catalog id for these tests.

…ters are now used instead of a constructor map.
@grbinho grbinho changed the title Use provided glue catalog id in defaultWarehouseLocation AWS: Use provided glue catalog id in defaultWarehouseLocation Nov 21, 2022
@grbinho
Copy link
Contributor Author

grbinho commented Nov 21, 2022

@ajantha-bhat Can you please do another quick check. I changed how AwsProperties are initialized in the test. New way looks cleaner to me.

@ajantha-bhat
Copy link
Member

@ajantha-bhat Can you please do another quick check. I changed how AwsProperties are initialized in the test. New way looks cleaner to me.

LGTM

@grbinho
Copy link
Contributor Author

grbinho commented Nov 22, 2022

@ajantha-bhat @singhpk234
Do you need anything else from me to get this merged?

@ajantha-bhat
Copy link
Member

ajantha-bhat commented Nov 22, 2022

cc: @Fokko, @rdblue, @RussellSpitzer, @jackye1995

Copy link
Contributor

@JonasJ-ap JonasJ-ap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Additional note: I checked other request builders in GlueCatalog and GlueTableOperations and found that these two requests also not include the catalogId:
I wonder if we shall also add catalogId to these two requests (may be in another PR)

CreateTableRequest in:

private boolean createGlueTempTableIfNecessary(TableMetadata base, String metadataLocation) {
if (awsProperties.glueLakeFormationEnabled() && base == null) {
// LakeFormation credential require TableArn as input,so creating a dummy table
// beforehand for create table scenario
glue.createTable(
CreateTableRequest.builder()
.databaseName(databaseName)
.tableInput(
TableInput.builder()

DeleteTableRequest in

private void cleanupGlueTempTableIfNecessary(
boolean glueTempTableCreated, CommitStatus commitStatus) {
if (glueTempTableCreated && commitStatus != CommitStatus.SUCCESS) {
glue.deleteTable(
DeleteTableRequest.builder().databaseName(databaseName).name(tableName).build());
}
}

@grbinho
Copy link
Contributor Author

grbinho commented Nov 28, 2022

I wonder if we shall also add catalogId to these two requests (may be in another PR)

For createGlueTempTableIfNecessary I'm not sure that is necessary since it's talking about Lake Formation.
Usually with Lake Formation cross account access is handled by Lake Formation (and resource shares).
In normal deployments tables would always be created in your "local" catalog and then shared with others and/or central catalog.

That said, I'm not that familiar with the code base enough to be sure it's unnecessary.

Maybe @amogh-jahagirdar knows?

@grbinho
Copy link
Contributor Author

grbinho commented Dec 15, 2022

@JonasJ-ap @ajantha-bhat
Hi guys, can you advise how we can move this MR forward?

@rdblue
Copy link
Contributor

rdblue commented Dec 18, 2022

Running CI.

@rdblue rdblue merged commit 69bbcc7 into apache:master Dec 19, 2022
@rdblue
Copy link
Contributor

rdblue commented Dec 19, 2022

Thanks, @grbinho!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants