add auth and session design #501

typhoonzero · 2019-06-10T10:29:45Z

weiguoz · 2019-06-10T12:35:28Z

doc/auth_design.md

+
+```go
+type Session struct {
+    ClientEndpoint    string


What's the ClientEndpoint?

It usually means IP:port. See: https://docs.oracle.com/javase/tutorial/networking/sockets/definition.html

weiguoz · 2019-06-10T12:41:45Z

doc/auth_design.md

+Users can set auth information in SQLFlow extended SQL statement like:
+
+```sql
+SET CREDENTIAL username secretkey


I'm not sure that having a user type the credential into the notebook is a good approach, because the credential might be exposed.
Don't worry too much about this; just take it as a reminder.

Well, we should let users set the application keys, usually an access key and a secret key. Please refer to https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html and https://usercenter.console.aliyun.com/#/manage/ak

weiguoz · 2019-06-10T12:45:45Z

doc/auth_design.md

+    ClientEndpoint    string
+    DBConnStr         string  // mysql://user:pass@127.0.0.1:3306
+    Token             int64  // useful only in "side-car" design
+}


Should we consider an expiration time to eliminate zombie sessions?

We should! Thanks!
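
One way the expiration idea could look is sketched below. This is only an illustration under assumptions: the `ExpiresAt` field, the `SessionStore` type, and the `Put`/`Sweep`/`Len` methods are hypothetical names that do not appear in the design doc, and times are plain Unix seconds to keep the example small.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Session mirrors the fields quoted in this thread and adds a hypothetical
// ExpiresAt field (an assumption, not part of the reviewed design).
type Session struct {
	ClientEndpoint string // ip:port from the client
	DBConnStr      string // e.g. mysql://user:pass@127.0.0.1:3306
	ExpiresAt      int64  // hypothetical: Unix seconds after which the session is a zombie
}

// SessionStore is a hypothetical in-memory store keyed by session ID.
type SessionStore struct {
	mu       sync.Mutex
	sessions map[string]Session
}

// NewSessionStore returns an empty store.
func NewSessionStore() *SessionStore {
	return &SessionStore{sessions: make(map[string]Session)}
}

// Put stores a session that expires ttlSeconds after now (Unix seconds).
func (s *SessionStore) Put(id string, sess Session, now, ttlSeconds int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	sess.ExpiresAt = now + ttlSeconds
	s.sessions[id] = sess
}

// Sweep deletes every session whose expiry is at or before now and
// returns how many were removed.
func (s *SessionStore) Sweep(now int64) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for id, sess := range s.sessions {
		if sess.ExpiresAt <= now {
			delete(s.sessions, id)
			removed++
		}
	}
	return removed
}

// Len reports the number of live sessions.
func (s *SessionStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.sessions)
}

func main() {
	store := NewSessionStore()
	now := time.Now().Unix()
	store.Put("a", Session{ClientEndpoint: "10.0.0.1:5000"}, now, 60)
	store.Put("b", Session{ClientEndpoint: "10.0.0.2:5000"}, now, 3600)
	// Two minutes later, only the short-lived session has expired.
	fmt.Println(store.Sweep(now+120), store.Len())
}
```

A server could call `Sweep` from a periodic goroutine; whether expiry is fixed or refreshed on each request is left open here.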

Yancey0623 · 2019-06-11T13:31:02Z

doc/auth_design.md

+
+<img src="figures/auth1.png">
+
+In production environments, one SQLFlow server is designed to accept many clients'


Maybe we can implement authentication and authorization separately. My brief idea:

1. Users go to the SQLFlow website, such as https://sqlflow.domain.com; the auth-server processes the request and checks the user.
2. If the user has not logged in, the auth-server redirects to the SSO URL with a 302 redirect.
3. If the user has logged in, the SQLFlow auth-server (maybe under another name) launches the notebook Pod, if it does not already exist, with the user token as environment variables, and then redirects to the notebook URL.
4. The notebook calls the SQLFlow server with the Session struct (filled with the user token).

5/6. The SQLFlow server instance authenticates to MySQL, Kubernetes, etc. with the user token.

Updated the design with some modifications.
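
The redirect steps proposed above can be sketched roughly as follows. This is a minimal illustration, not the design that was merged: the SSO URL, notebook URL, cookie name, and the `authHandler` and `probe` functions are all hypothetical, and token validation and Pod launching are left as a comment.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// All three values are hypothetical placeholders for illustration.
const (
	ssoURL      = "https://sso.example.com/login"
	notebookURL = "https://sqlflow.example.com/notebook"
	tokenCookie = "sqlflow_token"
)

// authHandler sends users without a token cookie to the SSO service and
// everyone else to their notebook, both with a 302 redirect.
func authHandler(w http.ResponseWriter, r *http.Request) {
	c, err := r.Cookie(tokenCookie)
	if err != nil || c.Value == "" {
		http.Redirect(w, r, ssoURL, http.StatusFound)
		return
	}
	// A real auth-server would validate the token and launch the
	// notebook Pod (if it does not exist) before redirecting.
	http.Redirect(w, r, notebookURL, http.StatusFound)
}

// probe runs one in-memory request through authHandler and returns the
// status code and Location header. An empty token means "not logged in".
func probe(token string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "https://sqlflow.example.com/", nil)
	if token != "" {
		req.AddCookie(&http.Cookie{Name: tokenCookie, Value: token})
	}
	rec := httptest.NewRecorder()
	authHandler(rec, req)
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println(probe(""))           // not logged in: redirected to SSO
	fmt.Println(probe("user-token")) // logged in: redirected to the notebook
}
```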

wangkuiyi · 2019-06-11T16:44:36Z

doc/auth_design.md

+type Session struct {
+    Token          int64  // useful only in "side-car" design
+    ClientEndpoint string // ip:port from the client
+    DBConnStr      string // mysql://127.0.0.1:3306


Is the scheme "mysql://" something we invented to help identify the kind of SQL engine? I ask because I think the address of a MySQL server is something like http://user:passwd@127.0.0.1:3306, not something beginning with mysql://....

If so, how about we have

    DBKind    string // can be "mysql", "hive", ...
    DBConnStr string // e.g., "http://user:passwd@127.0.0.1:3306"

Is the scheme "mysql://" something we invented to help identify the kind of SQL engine?

Yes. The string before :// is the "driver" string; it can be mysql://, hive://, or odps://.
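
As a rough sketch of how a server might split such a string into the driver and the rest, assuming a hypothetical helper named `splitDriver` and treating only the three drivers mentioned above as supported:

```go
package main

import (
	"fmt"
	"strings"
)

// splitDriver is a hypothetical helper. It splits a connection string such
// as "mysql://user:pass@127.0.0.1:3306" into the driver ("mysql") and the
// remainder ("user:pass@127.0.0.1:3306").
func splitDriver(connStr string) (driver, dataSource string, err error) {
	parts := strings.SplitN(connStr, "://", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", "", fmt.Errorf("connection string %q has no driver prefix", connStr)
	}
	switch parts[0] {
	case "mysql", "hive", "odps":
		return parts[0], parts[1], nil
	default:
		return "", "", fmt.Errorf("unsupported driver %q", parts[0])
	}
}

func main() {
	driver, dataSource, err := splitDriver("mysql://user:pass@127.0.0.1:3306")
	fmt.Println(driver, dataSource, err)
}
```

With this approach a separate `DBKind` field is not needed, at the cost of a connection string format that the database drivers themselves do not understand.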

wangkuiyi · 2019-06-11T16:46:09Z

doc/auth_design.md

+
+To make it modulized and extensible, we prefer to introduce an authentication server, a.k.a., auth server. We use a
+[Django](https://www.djangoproject.com/) Web server so that the authentication methods
+can extend to:


Good to know that Django has so many features. Do we need to write code on top of Django, or do we only need to configure and run the Django server for authentication?

Oh, I need to delete these lines; the latest design does not involve a Django server. All authentication and authorization should be done by the Jupyter Notebook.

wangkuiyi · 2019-06-11T16:55:16Z

doc/auth_design.md

+- Database service that stores the training data
+- A training cluster that runs the SQLFlow training job, e.g. Kubernetes
+
+SQLFlow should depend on the [SSO](https://en.wikipedia.org/wiki/Single_sign-on)


Why should SQLFlow use SSO? What are other choices?

Using JupyterHub, we can add any type of authenticator, including SSO, Kerberos, etc.: https://github.com/jupyterhub/jupyterhub/wiki/Authenticators

wangkuiyi · 2019-06-11T16:56:12Z

doc/auth_design.md

+    Token          int64  // useful only in "side-car" design
+    ClientEndpoint string // ip:port from the client
+    DBConnStr      string // mysql://127.0.0.1:3306
+    AK             string // access key


Do we need only one pair of AK and SK? Or do we need multiple pairs, like one for the SQL engine and the other one for Kubernetes?

wangkuiyi · 2019-06-11T16:58:05Z

doc/auth_design.md

+
+```go
+type Session struct {
+    Token          int64  // useful only in "side-car" design


What is a side-car design?

Does the token identify an SQLFlow service user who has logged in?

Removed, it's not useful anymore.

wangkuiyi · 2019-06-11T17:02:35Z

doc/auth_design.md

+Once the user is logged in, SSO service will return the "token" represents the user's
+identity. Then the web IDE will call the "Auth Service" to get AK/SK for the database and
+training cluster. After that, the web IDE will call SQLFlow RPC service to create
+a new session, and the SQLFlow server will verify that all tokens, AK/SK are valid, then


Does the "to create a new session" imply that we need to change the gRPC service definition to add a remote call named SQLFlowService.CreateSession?

Yes, I'll add the new RPC definition to this doc.
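
Until that definition is written, the server-side behavior described in the quoted paragraph (verify the token and AK/SK, then store the session) might look roughly like the sketch below. Everything here is an assumption for illustration: `CreateSessionRequest`, `Verifier`, `Server`, and the random session ID are hypothetical and are not the actual gRPC definition.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// CreateSessionRequest is a hypothetical request shape; the field names
// follow the Session struct discussed in this thread.
type CreateSessionRequest struct {
	Token     string // user token returned by the SSO service
	DBConnStr string // e.g. mysql://127.0.0.1:3306
	AK        string // access key
	SK        string // secret key
}

// Verifier is a hypothetical hook that checks the credentials against
// the database and the training cluster.
type Verifier func(req CreateSessionRequest) error

// Server keeps sessions in memory. It is not safe for concurrent use;
// a real server would add locking or an external store.
type Server struct {
	verify   Verifier
	sessions map[string]CreateSessionRequest
}

// NewServer builds a Server that validates requests with v.
func NewServer(v Verifier) *Server {
	return &Server{verify: v, sessions: make(map[string]CreateSessionRequest)}
}

// CreateSession validates the request, stores it, and returns a new
// random session ID (32 hex characters).
func (s *Server) CreateSession(req CreateSessionRequest) (string, error) {
	if req.Token == "" || req.AK == "" || req.SK == "" {
		return "", errors.New("token, AK, and SK are all required")
	}
	if err := s.verify(req); err != nil {
		return "", fmt.Errorf("credential check failed: %w", err)
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	s.sessions[id] = req
	return id, nil
}

func main() {
	// The verifier here accepts everything; it stands in for real checks.
	srv := NewServer(func(CreateSessionRequest) error { return nil })
	id, err := srv.CreateSession(CreateSessionRequest{Token: "t", AK: "ak", SK: "sk"})
	fmt.Println(len(id), err)
}
```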

wangkuiyi · 2019-06-11T17:04:37Z

doc/auth_design.md

+Authorization is not a too much a challenge because we can rely on
+SQL engines and training clusters, which denies requests if the user
+have no access.  In this document, we focus on authentication of SQLFlow users.
+


I assume that we should clarify the concept of the "client" in this section. A client of SQLFlow server might be the SQLFlow magic command, which is an extension to Jupyter Notebook server, or a Windows-native or macOS-native GUI program. It looks to me that we introduce an authentication server because we want to support both kinds of clients?

wangkuiyi · 2019-06-11T17:10:19Z

doc/auth_design.md

+
+Users can use SQLFlow server with a simple jupyter notebook for simple deployment,
+for production deployments, users can take advantage of the cloud web IDE. The web
+IDE will redirect a user to the SSO service if the user is not logged in.


Why would "the web IDE" redirect a user to "the SSO service"? Is it configured to do so? Could users use Jupyter Notebook as their "web IDE"? If so, how should they configure it to work with SSO? And where does the SSO service come from? Who is supposed to build it?

Removed all "web IDE" stuff and moved to "JupyterHub".

wangkuiyi · 2019-06-11T17:13:17Z

doc/auth_design.md

+
+Users can use SQLFlow server with a simple jupyter notebook for simple deployment,
+for production deployments, users can take advantage of the cloud web IDE. The web
+IDE will redirect a user to the SSO service if the user is not logged in.


I know how to connect to a Jupyter Notebook server running on my laptop: I copy and paste a URL containing a token, printed by the Jupyter Notebook server on my console, into my Web browser, so that I can access the server while identifying myself. However, I don't understand how I am supposed to identify myself to a Jupyter Notebook server running remotely as part of a Kubernetes service. Do you know how we could do that? Or does this document imply that there is a Jupyter Notebook service running on a Kubernetes cluster?

wangkuiyi · 2019-06-11T17:14:22Z

doc/auth_design.md

+identity. Then the web IDE will call the "Auth Service" to get AK/SK for the database and
+training cluster. After that, the web IDE will call SQLFlow RPC service to create
+a new session, and the SQLFlow server will verify that all tokens, AK/SK are valid, then
+the session will be stored.


Where will the session "be stored"? In the etcd cluster?

@wangkuiyi I've updated the design doc on the basis of recent surveys.

typhoonzero added 2 commits June 10, 2019 18:28

add auth and session design

49b07e3

update

1b15923

weiguoz reviewed Jun 10, 2019

Yancey0623 reviewed Jun 11, 2019

typhoonzero and others added 2 commits June 11, 2019 22:37

update designs

14108d4

Wording

74f76b0

wangkuiyi reviewed Jun 11, 2019

typhoonzero added 4 commits June 17, 2019 21:15

follow comments

3e0da76

Merge branch 'develop' of https://github.com/sql-machine-learning/sql…

1120431

…flow into add_auth_and_session_design

Merge branch 'add_auth_and_session_design' of https://github.com/typh…

feeb45e

…oonzero/sqlflow into add_auth_and_session_design

update

ced88f9

wangkuiyi approved these changes Jun 17, 2019

wangkuiyi merged commit 95cc608 into sql-machine-learning:develop Jun 17, 2019

Yancey0623 mentioned this pull request Jun 21, 2019

[Feature Request] Implement SQLFlow Session #531

Closed

6 tasks

typhoonzero deleted the add_auth_and_session_design branch August 14, 2019 02:51


4 participants
