Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Mar 16, 2017

What changes were proposed in this pull request?

In the kerberized hadoop cluster, when Spark creates tables, the owner of tables are filled with PRINCIPAL strings instead of USER names. This is inconsistent with Hive and causes problems when using ROLE in Hive. We had better to fix this.

BEFORE

scala> sql("create table t(a int)").show
scala> sql("desc formatted t").show(false)
...
|Owner:                      |spark@EXAMPLE.COM                                         |       |

AFTER

scala> sql("create table t(a int)").show
scala> sql("desc formatted t").show(false)
...
|Owner:                      |spark                                         |       |

How was this patch tested?

Manually do create table and desc formatted because this happens in Kerberized clusters.

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin .
Could you review this PR?

@SparkQA
Copy link

SparkQA commented Mar 16, 2017

Test build #74657 has finished for PR 17311 at commit 9dfbf82.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Oh, there exists a version issue. Let me fix that.

}
hiveTable.setPartCols(partCols.asJava)
conf.foreach(c => hiveTable.setOwner(c.getUser))
conf.foreach(c => hiveTable.setOwner(SessionState.getUserFromAuthenticator()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getUserFromAuthenticator was added 0.13.

@SparkQA
Copy link

SparkQA commented Mar 16, 2017

Test build #74664 has finished for PR 17311 at commit e8ef7d1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}
hiveTable.setPartCols(partCols.asJava)
conf.foreach(c => hiveTable.setOwner(c.getUser))
conf.foreach(c => hiveTable.setOwner(SessionState.get().getAuthenticator().getUserName()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot use the following since it was introduced in 0.13.

conf.foreach(c => hiveTable.setOwner(SessionState.getUserFromAuthenticator()))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're touching this, could you make it follow the usual style?

.foreach { c => ... }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for review, @vanzin !
Which object do you mean for .foreach here?
For conf, we already use that.
For SessionState.get(), it's not Option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only one foreach call and it's using a different style than the rest of the code.

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun Mar 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. It's about removing .foreach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's about style. Parentheses v. braces.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Sorry again.

@dongjoon-hyun
Copy link
Member Author

Hi, @gatorsmile .
Could you review this issue?

@dongjoon-hyun
Copy link
Member Author

I fixed that, @vanzin . Thank you again.

}
hiveTable.setPartCols(partCols.asJava)
conf.foreach(c => hiveTable.setOwner(c.getUser))
conf.foreach { _ => hiveTable.setOwner(SessionState.get().getAuthenticator().getUserName()) }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think we can remove conf here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, you're not using it anymore. So away it goes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun Mar 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanzin . Some test cases seems to related with conf is None, it causes failures. In this PR, I'll focus on putting the correct name only with the exactly same situations of the previous existing logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should pass the user name.

@SparkQA
Copy link

SparkQA commented Mar 17, 2017

Test build #74750 has finished for PR 17311 at commit 5efdf8b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 17, 2017

Test build #74748 has finished for PR 17311 at commit 2959d98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

This reverts commit 5efdf8b.
@SparkQA
Copy link

SparkQA commented Mar 17, 2017

Test build #74753 has finished for PR 17311 at commit 258ff8d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

If there is anything to do more, please let me know.

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin and @srowen .
Could you review this PR (again) when you have sometime?
I feel guilty because this PR need to be verified by manually on kerberized clusters.

@dongjoon-hyun
Copy link
Member Author

Retest this please

@SparkQA
Copy link

SparkQA commented Mar 20, 2017

Test build #74877 has finished for PR 17311 at commit 258ff8d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}
hiveTable.setPartCols(partCols.asJava)
conf.foreach(c => hiveTable.setOwner(c.getUser))
conf.foreach { _ => hiveTable.setOwner(SessionState.get().getAuthenticator().getUserName()) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you be using the state field in the class instead of SessionState.get()? Maybe that will allow you to get rid of the foreach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state is in class HiveClientImpl, but this function is in object HiveClientImpl.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh. Now it makes sense why conf would be None.

@vanzin
Copy link
Contributor

vanzin commented Mar 20, 2017

LGTM, merging to master / 2.1.

@dongjoon-hyun
Copy link
Member Author

Thank you so much, @vanzin !

@vanzin
Copy link
Contributor

vanzin commented Mar 20, 2017

Couldn't merge to 2.1.

@dongjoon-hyun
Copy link
Member Author

Oh, I'll make a backport then.

@asfgit asfgit closed this in fc75545 Mar 20, 2017
@dongjoon-hyun
Copy link
Member Author

The difference is due to 3881f34#diff-6fd847124f8eae45ba2de1cf7d6296feR855 .

ghost pushed a commit to dbtsai/spark that referenced this pull request Mar 24, 2017
…RINCIPAL in kerberized clusters apache#17311

### What changes were proposed in this pull request?
This is a follow-up for the PR: apache#17311

- For safety, use `sessionState` to get the user name, instead of calling `SessionState.get()` in the function `toHiveTable`.
- Passing `user names` instead of `conf` when calling `toHiveTable`.

### How was this patch tested?
N/A

Author: Xiao Li <gatorsmile@gmail.com>

Closes apache#17405 from gatorsmile/user.
@dongjoon-hyun dongjoon-hyun deleted the SPARK-19970 branch January 7, 2019 07:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants